Results 1 - 6 of 6
YAGO2: Exploring and Querying World Knowledge in Time, Space, Context, and Many Languages
Abstract - Cited by 5 (2 self)
We present YAGO2, an extension of the YAGO knowledge base with a focus on temporal and spatial knowledge. It is automatically built from Wikipedia, GeoNames, and WordNet, and contains nearly 10 million entities and events, as well as 80 million facts representing general world knowledge. An enhanced data representation introduces time and location as first-class citizens. The wealth of spatio-temporal information in YAGO can be explored either graphically or through a special time- and space-aware query language.
Author manuscript, published in "WWW- Demos (2011)" YAGO2: Exploring and Querying World Knowledge in Time, Space, Context, and Many Languages
, 2011
Abstract
We present YAGO2, an extension of the YAGO knowledge base with a focus on temporal and spatial knowledge. It is automatically built from Wikipedia, GeoNames, and WordNet, and contains nearly 10 million entities and events, as well as 80 million facts representing general world knowledge. An enhanced data representation introduces time and location as first-class citizens. The wealth of spatio-temporal information in YAGO can be explored either graphically or through a special time- and space-aware query language.
CONSTRUCTING LARGE PROPOSITION DATABASES
Abstract
Using semantic parsing or related techniques, it is possible to extract knowledge from text in the form of predicate–argument structures. Such structures are often called propositions. With the advent of massive corpora such as Wikipedia, it has become possible to apply a systematic analysis to a wide range of documents covering a significant part of human knowledge and to build large proposition databases from them. While most approaches focus on shallow syntactic analysis and do not capture the full meaning of a sentence, semantic parsing goes deeper and discovers more information from text with higher accuracy. This deeper analysis can be applied to discover temporal and location-based propositions from documents. Medical researchers could, for instance, discover articles regarding the interaction of bacteria in a specific body part. Christensen et al. (2010) showed that using a semantic parser in information extraction can yield extractions with higher precision and recall in areas where shallow syntactic approaches have failed. This accuracy comes at the cost of longer parsing time.
From SPARQL to MapReduce: The Journey Using a Nested TripleGroup Algebra
Abstract
MapReduce-based data processing platforms offer a promising approach for cost-effective and Web-scale processing of Semantic Web data. However, one major challenge is that this computational paradigm leads to high I/O and communication costs when processing tasks with several join operations, as is typical in SPARQL queries. The goal of this demonstration is to show how RAPID+, an extension of Apache Pig, enables more efficient SPARQL query processing on MapReduce using an alternative query algebra called the Nested TripleGroup Algebra (NTGA). The demonstration will offer opportunities for users to explore NTGA-Hadoop query plans for different SPARQL query structures, as well as to explore relationships between query plans based on relational algebra operators and those using NTGA operators.
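The idea of replacing several pairwise joins with one grouping step can be sketched in a few lines. This is only a toy illustration, not RAPID+ or the NTGA operators themselves: it assumes a triple group is the set of triples sharing a subject, and the function names `group_by_subject` and `match_star` as well as the `ex:` identifiers are hypothetical.

```python
from collections import defaultdict

def group_by_subject(triples):
    """Collect all (property, object) pairs of each subject in one pass."""
    groups = defaultdict(list)
    for s, p, o in triples:
        groups[s].append((p, o))
    return dict(groups)

def match_star(groups, required_properties):
    """Keep the groups that carry every required property (a star pattern)."""
    required = set(required_properties)
    return {
        s: [(p, o) for p, o in pairs if p in required]
        for s, pairs in groups.items()
        if required <= {p for p, _ in pairs}
    }

# Made-up example data.
triples = [
    ("ex:paper1", "ex:title", "YAGO2"),
    ("ex:paper1", "ex:year", "2011"),
    ("ex:paper2", "ex:title", "RAPID+"),
]
groups = group_by_subject(triples)
star = match_star(groups, ["ex:title", "ex:year"])
```

Here a star pattern with two properties is answered by one grouping pass plus a filter, where a relational plan would join the triple table with itself once per additional property.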
Smart-Aleck: An Interestingness Algorithm for Large Semantic Datasets
Abstract
Not every fact in a large semantic dataset is of interest to an application. In the Smart-Aleck project, we have designed and implemented an interestingness algorithm that filters facts and joins them to generate new facts with higher levels of interestingness. The algorithm defines different levels of interestingness based on the semantic operations involved in generating interesting facts. The application of the algorithm is a Web site that presents a new interesting fact, rendered in English, each time users visit or refresh the page. The facts are generated from an integration of over half a billion triples from large semantic datasets including YAGO, DBpedia, DataHub and Timbl. The uniqueness of the Smart-Aleck algorithm lies in its ability not merely to select interesting facts from the datasets but to generate new facts by joining two or more facts, possibly from different sources, by applying several comparison, chaining, grouping, aggregation and quantification operations on RDF triples. The Web site implementation of Smart-Aleck lets anyone on the net satisfy their curiosity, acquire general knowledge and design quizzes. It also has business potential as a feed for "fact-of-the-day" applications on cell phones and tablets.
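Of the operations listed, chaining is the easiest to illustrate. The sketch below is not the Smart-Aleck algorithm; it only shows, under the assumption that facts are (subject, predicate, object) triples, how two facts can be joined into a new one when the object of the first is the subject of the second. The function name `chain`, the predicate names, and the combined-predicate notation are hypothetical.

```python
def chain(facts, first_pred, second_pred):
    """Join facts (a, first_pred, b) and (b, second_pred, c) into (a, ..., c)."""
    # Index the second-hop facts by their subject.
    second_hop = {}
    for s, p, o in facts:
        if p == second_pred:
            second_hop.setdefault(s, []).append(o)

    derived = []
    for s, p, o in facts:
        if p == first_pred:
            for target in second_hop.get(o, []):
                derived.append((s, f"{first_pred}/{second_pred}", target))
    return derived

# Illustrative facts; in practice they could come from different datasets.
facts = [
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Ulm", "locatedIn", "Germany"),
    ("Ulm", "hasPopulationRank", "unknown"),
]
new_facts = chain(facts, "bornIn", "locatedIn")
```

Deciding which derived facts are actually interesting, and rendering them in English, would be separate steps that this sketch does not attempt.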
Probabilistic Databases of Universal Schema
Abstract
In data integration we transform information from a source into a target schema. A general problem in this task is loss of fidelity and coverage: the source expresses more knowledge than can fit into the target schema, or knowledge that is hard to fit into any schema at all. This problem is taken to an extreme in information extraction (IE), where the source is natural language. To address this issue, one can either automatically learn a latent schema emergent in text (a brittle and ill-defined task), or manually extend schemas. We propose instead to store data in a probabilistic database of universal schema. This schema is simply the union of all source schemas, and the probabilistic database learns how to predict the cells of each source relation in this union. For example, the database could store Freebase relations and relations that correspond to natural language surface patterns. The database would learn to predict which Freebase relations hold true based on which surface patterns appear, and vice versa. We describe an analogy between such databases and collaborative filtering models, and use it to implement our paradigm with probabilistic PCA, a scalable and effective collaborative filtering method.
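The "union of all source schemas" can be pictured as one matrix whose rows are entity pairs and whose columns are relations from every source. The following is a minimal sketch of building that matrix only; it does not implement probabilistic PCA or any learning step. The function name `build_universal_schema`, the relation name `employee_of`, and the example entities are hypothetical and are not taken from Freebase.

```python
def build_universal_schema(sources):
    """Build a 0/1 matrix over the union of relations from all sources.

    `sources` maps a source name to a list of (relation, entity_pair) facts.
    """
    all_facts = [fact for facts in sources.values() for fact in facts]
    columns = sorted({rel for rel, _ in all_facts})
    rows = sorted({pair for _, pair in all_facts})
    observed = {(pair, rel) for rel, pair in all_facts}
    matrix = [
        [1 if (pair, rel) in observed else 0 for rel in columns]
        for pair in rows
    ]
    return rows, columns, matrix

# Made-up example: one structured source and one set of surface patterns.
sources = {
    "structured": [("employee_of", ("Ada", "Acme"))],
    "text": [
        ("X works for Y", ("Ada", "Acme")),
        ("X works for Y", ("Bob", "Initech")),
    ],
}
rows, columns, matrix = build_universal_schema(sources)
```

The zero in the row for `("Bob", "Initech")` under `employee_of` is an unobserved cell, which is the kind of entry a collaborative filtering model would be asked to score given the surface pattern observed in the same row.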

