From Information to Knowledge: Harvesting Entities and Relationships from Web Sources
Cited by 7 (4 self)
There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by the advent of knowledge-sharing communities such as Wikipedia and the progress in automatically extracting entities and relationships from semistructured as well as natural-language Web sources. Recent endeavors of this kind include DBpedia, EntityCube, KnowItAll, ReadTheWeb, and our own YAGO-NAGA project (and others). The goal is to automatically construct and maintain a comprehensive knowledge base of facts about named entities, their semantic classes, and their mutual relations as well as temporal contexts, with high precision and high recall. This tutorial discusses state-of-the-art methods, research opportunities, and open challenges along this avenue of knowledge harvesting.
YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia
- Commun. ACM
Cited by 6 (2 self)
We are grateful for input from various people’s work: Edwin Lewis-Kelham for implementing the YAGO2 user interface, Gerard de Melo for his help on integrating his Universal WordNet, and Erdal Kuzey for his work on named events and time facts in Wikipedia. We would also like to thank the people who helped evaluate the quality of YAGO2 by manual assessment, most notably, Ndapandula Nakashole, Stephan Seufert, Erdal Kuzey, and ...
We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 80 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatio-temporal dimension, and our knowledge representation SPOTL, an extension of the original SPO-triple ...
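As a rough illustration of what extending a subject-predicate-object triple with time and space can look like, the following Python sketch defines a small fact record and a temporal filter. The class name `SPOTLFact`, the year-interval encoding, the helper `facts_during`, and the placeholder entities are all assumptions made for this example; they are not taken from YAGO2 itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SPOTLFact:
    """Hypothetical record: an SPO triple plus optional time and location."""
    subject: str
    predicate: str
    obj: str
    time: Optional[Tuple[int, int]] = None  # assumed (start_year, end_year), inclusive
    location: Optional[str] = None


def facts_during(facts: List[SPOTLFact], year: int) -> List[SPOTLFact]:
    """Return the facts whose time interval contains the given year.

    Facts without a time annotation are skipped, since nothing is known
    about when they hold.
    """
    return [
        f for f in facts
        if f.time is not None and f.time[0] <= year <= f.time[1]
    ]


# Placeholder data, not real knowledge-base content.
facts = [
    SPOTLFact("PersonA", "livesIn", "CityB", time=(1990, 2000), location="CityB"),
    SPOTLFact("PersonA", "bornIn", "CityC"),
]
in_1995 = facts_during(facts, 1995)
in_2005 = facts_during(facts, 2005)
```

Keeping time and location as optional fields lets plain SPO facts and spatio-temporally anchored facts share one representation.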
Find your Advisor: Robust Knowledge Gathering from the Web
Cited by 2 (1 self)
We present a robust method for gathering relational facts from the Web, based on matching generalized patterns which are automatically learned from seed facts for relations of interest. Our approach combines these generalized patterns for high recall information extraction with a rule-based, declarative reasoning approach to also ensure high precision. Newly extracted candidate facts are assigned statistical weights which reflect the strengths of the patterns used to extract them. For checking the plausibility of candidate facts with respect to existing knowledge and competing hypotheses, we use an efficient algorithm for weighted Max-Sat over propositional-logic clauses. In contrast to prior work on reasoning-based information extraction, we employ richer statistics and smart pruning to bound the number of grounded rules passed on to the Max-Sat solver.
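The plausibility check described above can be sketched as a tiny weighted Max-Sat instance. The Python example below uses exhaustive search, not the efficient algorithm or pruning the abstract refers to, and it only scales to a handful of variables. The fact names, the weights, and the "at most one advisor" constraint are hypothetical values chosen for illustration.

```python
from itertools import product


def weighted_max_sat(variables, clauses):
    """Brute-force weighted Max-Sat.

    clauses is a list of (weight, literals); each literal is (variable, polarity),
    and a clause is satisfied when at least one literal's variable takes the
    value given by its polarity. Returns (best_score, best_assignment).
    """
    best_score, best_assignment = float("-inf"), None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        score = sum(
            weight
            for weight, literals in clauses
            if any(assignment[var] == polarity for var, polarity in literals)
        )
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, best_assignment


# Two competing candidate facts (hypothetical), weighted by pattern strength.
variables = ["advisor(A,X)", "advisor(A,Y)"]
clauses = [
    (0.9, [("advisor(A,X)", True)]),   # candidate from a strong pattern
    (0.4, [("advisor(A,Y)", True)]),   # candidate from a weaker pattern
    # Assumed consistency rule: A has at most one advisor, i.e. NOT X OR NOT Y.
    (5.0, [("advisor(A,X)", False), ("advisor(A,Y)", False)]),
]
score, assignment = weighted_max_sat(variables, clauses)
```

With these weights the best assignment accepts the candidate from the stronger pattern and rejects its competitor, because violating the heavily weighted consistency clause costs more than dropping the weaker candidate.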
The Internet contains Thousands of Poorly Explored FUTS Data Sources
The Internet contains thousands of Frequently Updated, Timestamped, Structured (FUTS) data sources. This type of information represents a different class of information that is not properly handled by existing data management systems such as databases, data warehouses, search engines, pubsub, event processing, or information retrieval systems. In this position paper we describe 9ticks, a system we are designing to collect, parse, store, query, and disseminate FUTS information. 9ticks is helping us understand that all those steps raise new challenges but also bring new opportunities. In this paper we summarize the challenges identified and present our vision of an end-to-end FUTS management system.
Acquisitions Editor: Development Editor: Production Editor: Typesetters: Print Coordinator:
any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher. Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark. Library of Congress Cataloging-in-Publication Data Business intelligence applications and the web: models, systems and technologies / Marta E. Zorrilla... [et al.] editors. p. cm. Includes bibliographical references and index.
Management
Building a database of facts extracted from historical documents to enable database-like query and search would reduce the tedium of gleaning facts of interest from historical documents. We propose a solution in which historical documents themselves constitute the stored database. In our solution, we use information-extraction techniques to produce a conceptualized external annotation of facts found in each document, and we superimpose the conceptualization over the document collection. The annotation process populates the conceptualization producing a repository of extracted facts, and a reasoner obtains inferred facts from these extracted facts. Our query interface accepts free-form queries and converts them to formal queries over the extracted and inferred facts. Displayed results include, in addition to standard query results, images of original documents with results highlighted along with reasoning chains for inferred facts grounded in these highlighted facts. Along with giving the implementation status of our proof-of-concept prototype, we present results for extraction accuracy and efficiency and point to current and future work needed to enable a practical solution for the envisioned historical-document database.
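To make the idea of inferred facts with reasoning chains concrete, here is a minimal Python sketch of a single inference rule that records which extracted facts support each conclusion. The relation names `parentOf` and `grandparentOf`, the rule itself, and the placeholder individuals are assumptions for illustration; the abstract does not specify the reasoner or its rules.

```python
def infer_grandparents(facts):
    """Apply one rule: parentOf(a, b) and parentOf(b, c) => grandparentOf(a, c).

    facts is a set of (relation, subject, object) tuples. Returns a dict that
    maps each inferred fact to the list of extracted facts it was derived
    from, i.e. its reasoning chain.
    """
    chains = {}
    parents = [f for f in facts if f[0] == "parentOf"]
    for (_, a, b) in parents:
        for (_, c, d) in parents:
            if b == c:
                chains[("grandparentOf", a, d)] = [
                    ("parentOf", a, b),
                    ("parentOf", c, d),
                ]
    return chains


# Placeholder extracted facts, not drawn from any real document.
extracted = {
    ("parentOf", "P1", "P2"),
    ("parentOf", "P2", "P3"),
    ("bornIn", "P3", "TownT"),
}
inferred = infer_grandparents(extracted)
```

Storing the supporting facts alongside each conclusion is what would let a query interface highlight the original passages behind an inferred answer.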
IQ: The Case for Iterative Querying for Knowledge
Large knowledge bases, the Linked Data cloud, and Web 2.0 communities open up new opportunities for deep question answering to support the advanced information needs of knowledge workers like students, journalists, or business analysts. This calls for going beyond keyword search, towards more expressive ways of entity-relationship-oriented querying with graph constraints or even full-fledged languages like SPARQL (over graph-structured, schema-less data). However, a neglected aspect of this active research direction is the need to also support query refinements, relaxations, and interactive exploration, as single-shot queries are often insufficient for the users' tasks. This paper addresses this issue by discussing the paradigm of Iterative Querying, IQ for short. We present two instantiations for IQ, one based on keyword search over labeled graphs combined with structural constraints, and another one based on extensions of the SPARQL language. We discuss the suitability of these approaches for knowledge-centric search tasks, and we identify open research problems that deserve greater attention.
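One simple form of query relaxation can be sketched as follows: evaluate a conjunction of triple patterns over a small graph and, if nothing matches, drop constraints until some result appears. This Python example is neither of the two instantiations the paper presents; the function names, the "drop the last pattern" relaxation policy, and the toy triples are assumptions for illustration only.

```python
def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None.

    Pattern terms starting with '?' are variables; all others are constants.
    """
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to a different value
        elif p != t:
            return None  # constant mismatch
    return new


def evaluate(patterns, triples):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            b2
            for b in bindings
            for t in triples
            if (b2 := match(pattern, t, b)) is not None
        ]
    return bindings


def iterative_query(patterns, triples):
    """Relax the query by dropping its last pattern until results appear."""
    current = list(patterns)
    while current:
        results = evaluate(current, triples)
        if results:
            return current, results
        current = current[:-1]
    return [], []


# Toy graph and query (hypothetical).
triples = [
    ("alice", "worksAt", "lab1"),
    ("alice", "type", "student"),
    ("bob", "worksAt", "lab1"),
]
query = [("?x", "worksAt", "lab1"), ("?x", "type", "journalist")]
used, results = iterative_query(query, triples)
```

Here the full query has no answer, so the sketch falls back to the first pattern alone and returns both people working at `lab1`. An interactive system would more plausibly show the user which constraint was relaxed and let them choose the next refinement.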

