Results 1 - 10
of
27
Natural Language Questions for the Web of Data
"... The Linked Data initiative comprises structured databases in the Semantic-Web data model RDF. Exploring this heterogeneous data by structured query languages is tedious and error-prone even for skilled users. To ease the task, this paper presents a methodology for translating natural language questi ..."
Abstract
-
Cited by 22 (2 self)
- Add to MetaCart
(Show Context)
The Linked Data initiative comprises structured databases in the Semantic-Web data model RDF. Exploring this heterogeneous data by structured query languages is tedious and error-prone even for skilled users. To ease the task, this paper presents a methodology for translating natural language questions into structured SPARQL queries over linked-data sources. Our method is based on an integer linear program to solve several disambiguation tasks jointly: the segmentation of questions into phrases; the mapping of phrases to semantic entities, classes, and relations; and the construction of SPARQL triple patterns. Our solution harnesses the rich type system provided by knowledge bases in the web of linked data, to constrain our semantic-coherence objective function. We present experiments on both the question translation and the resulting query answering.
Answering Table Queries on the Web using Column Keywords ABSTRACT
"... We present the design of a structured search engine which returns a multi-column table in response to a query consisting of keywords describing each of its columns. We answer such queries by exploiting the millions of tables on the Web because these are much richer sources of structured knowledge th ..."
Abstract
-
Cited by 9 (1 self)
- Add to MetaCart
(Show Context)
We present the design of a structured search engine which returns a multi-column table in response to a query consisting of keywords describing each of its columns. We answer such queries by exploiting the millions of tables on the Web because these are much richer sources of structured knowledge than free-format text. However, a corpus of tables harvested from arbitrary HTML web pages presents huge challenges of diversity and redundancy not seen in centrally edited knowledge bases. We concentrate on one concrete task in this paper. Given a set of Web tables T1,..., Tn, and a query Q with q sets of keywords Q1,..., Qq, decide for each Ti if it is relevant to Q and if so, identify the mapping between the columns of Ti and query columns. We represent this task as a graphical model that jointly maps all tables by incorporating diverse sources of clues spanning matches in different parts of the table, corpus-wide co-occurrence statistics, and content overlap across table columns. We define a novel query segmentation model for matching keywords to table columns, and a robust mechanism of exploiting content overlap across table columns. We design efficient inference algorithms based on bipartite matching and constrained graph cuts to solve the joint labeling task. Experiments on a workload of 59 queries over a 25 million web table corpus shows significant boost in accuracy over baseline IR methods. 1.
Keyword++: A Framework to Improve Keyword Search Over Entity Databases
"... Keyword search over entity databases (e.g., product, movie databases) is an important problem. Current techniques for keyword search on databases may often return incomplete and imprecise results. On the one hand, they either require that relevant entities contain all (or most) of the query keywords ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Keyword search over entity databases (e.g., product, movie databases) is an important problem. Current techniques for keyword search on databases may often return incomplete and imprecise results. On the one hand, they either require that relevant entities contain all (or most) of the query keywords, or that relevant entities and the query keywords occur together in several documents from a known collection. Neither of these requirements may be satisfied for a number of user queries. Hence results for such queries are likely to be incomplete in that highly relevant entities may not be returned. On the other hand, although some returned entities contain all (or most) of the query keywords, the intention of the keywords in the query could be different from that in the entities. Therefore, the results could also be imprecise. To remedy this problem, in this paper, we propose a general framework that can improve an existing search interface by translating a keyword query to a structured query. Specifically, we leverage the keyword to attribute value associations discovered in the results returned by the original search interface. We show empirically that the translated structured queries alleviate the above problems. 1.
Deep Answers for Naturally Asked Questions on the Web of Data
"... We present DEANNA, a framework for natural language question answering over structured knowledge bases. Given a natural language question, DEANNA translates questions into a structured SPARQL query that can be evaluated over knowledge bases such as Yago, Dbpedia, Freebase, or other Linked Data sourc ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
We present DEANNA, a framework for natural language question answering over structured knowledge bases. Given a natural language question, DEANNA translates questions into a structured SPARQL query that can be evaluated over knowledge bases such as Yago, Dbpedia, Freebase, or other Linked Data sources. DEANNA analyzes questions and maps verbal phrases to relations and noun phrases to either individual entities or semantic classes. Importantly, it judiciously generates variables for target entities or classes to express joins between multiple triple patterns. We leverage the semantic type system for entities and use constraints in jointly mapping the constituents of the question to relations, classes, and entities. We demonstrate the capabilities and interface of DEANNA, which allows advanced users to influence the translation process and to see how the different components interact to produce the final result.
IQ: The Case for Iterative Querying for Knowledge
"... Large knowledge bases, the Linked Data cloud, and Web 2.0 communities open up new opportunities for deep question answering to support the advanced information needs of knowledge workers like students, journalists, or business analysts. This calls for going beyond keyword search, towards more expres ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
(Show Context)
Large knowledge bases, the Linked Data cloud, and Web 2.0 communities open up new opportunities for deep question answering to support the advanced information needs of knowledge workers like students, journalists, or business analysts. This calls for going beyond keyword search, towards more expressive ways of entity-relationship-oriented querying with graph constraints or even full-fledged languages like SPARQL (over graph-structured, schema-less data). However, a neglected aspect of this active research direction is the need to support also query refinements, relaxations, and interactive exploration, as single-shot queries are often insufficient for the users ’ tasks. This paper addresses this issue by discussing the paradigm of Iterative Querying, IQ for short. We present two instantiations for IQ, one based on keyword search over labeled graphs combined with structural constraints, and another one based on extensions of the SPARQL language. We discuss the suitability of these approaches for knowledge-centric search tasks, and we identify open research problems that deserve greater attention. 1.
GQBE: Querying Knowledge Graphs by Example Entity Tuples
"... We present GQBE, a system that presents a simple and intuitive mechanism to query large knowledge graphs. An-swers to tasks such as “list university professors who have designed some programming languages and also won an award in Computer Science ” are best found in knowledge graphs that record enti ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
We present GQBE, a system that presents a simple and intuitive mechanism to query large knowledge graphs. An-swers to tasks such as “list university professors who have designed some programming languages and also won an award in Computer Science ” are best found in knowledge graphs that record entities and their relationships. Real-world knowledge graphs are difficult to use due to their sheer size and complexity and the challenging task of writing complex structured graph queries. Toward better usability of query systems over knowledge graphs, GQBE allows users to query knowledge graphs by example entity tuples without writing complex queries. In this demo we present: 1) a detailed description of the various features and user-friendly GUI of GQBE, 2) a brief description of the system architecture, and 3) a demonstration scenario that we intend to show the audience.
Robust Question Answering over the Web of Linked Data
"... Knowledge bases and the Web of Linked Data have become important assets for search, recommendation, and analytics. Natural-language questions are a user-friendly mode of tapping this wealth of knowledge and data. However, question answering technology does not work robustly in this setting as questi ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
(Show Context)
Knowledge bases and the Web of Linked Data have become important assets for search, recommendation, and analytics. Natural-language questions are a user-friendly mode of tapping this wealth of knowledge and data. However, question answering technology does not work robustly in this setting as questions have to be translated into structured queries and users have to be careful in phrasing their questions. This paper advocates a new approach that allows questions to be partially translated into relaxed queries, covering the essential but not necessarily all aspects of the user’s input. To compensate for the omissions, we exploit textual sources associated with entities and relational facts. Our system translates user questionsintoanextendedform ofstructured SPARQLqueries,withtextpredicatesattachedtotriplepatterns. Our solution is based on a novel optimization model, cast into an integer linear program, for joint decomposition and disambiguation of the user question. We demonstrate the quality of our methods through experiments with the QALD benchmark. Categories andSubject Descriptors
HyKSS: Hybrid keyword and semantic search
, 2011
"... The rapid production of digital information makes the task of locating relevant information increasingly difficult. Keyword search alleviates this difficulty by retrieving ..."
Abstract
-
Cited by 4 (3 self)
- Add to MetaCart
(Show Context)
The rapid production of digital information makes the task of locating relevant information increasingly difficult. Keyword search alleviates this difficulty by retrieving
Learning joint query interpretation and response ranking
- In WWW Conf
, 2013
"... Thanks to information extraction and semantic Web efforts, search on unstructured text is increasingly refined using semantic annotations and structured knowledge bases. However, most users cannot become familiar with the schema of knowledge bases and ask structured queries. Interpreting free-format ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
(Show Context)
Thanks to information extraction and semantic Web efforts, search on unstructured text is increasingly refined using semantic annotations and structured knowledge bases. However, most users cannot become familiar with the schema of knowledge bases and ask structured queries. Interpreting free-format queries into a more structured representation is of much current interest. The dominant paradigm is to segment or partition query tokens by purpose (references to types, entities, attribute names, attribute values, relations) and then launch the interpreted query on structured knowledge bases. Given that structured knowledge extraction is never complete, here we choose a less trodden path: a data representation that retains the unstructured text corpus, along with structured annotations (mentions of entities and relationships) on it. We propose two new, natural formulations for joint query interpretation and response ranking that exploit bidirectional flow of information between the knowledge base and the corpus. One, inspired by probabilistic language models, computes expected response scores over the uncertainties of query interpretation. The other is based on max-margin discriminative learning, with latent variables representing those uncertainties. In the context of typed entity search, both formulations bridge a considerable part of the accuracy gap between a generic query that does not constrain the type at all, and the upper bound where the “perfect ” target entity type of each query is provided by humans. Our formulations are also superior to a two-stage approach of first choosing a target type using recent query type prediction techniques, and then launching a type-restricted entity search query.
Entity Ranking and Relationship Queries Using an Extended Graph Model
"... There is a large amount of textual data on the Web and in Wikipedia, where mentions of entities (such as Gandhi) are annotated with a link to the disambiguated entity (such as M. K. Gandhi). Such annotation may have been done manually (as in Wikipedia) or can be done using named entity recognition/d ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
There is a large amount of textual data on the Web and in Wikipedia, where mentions of entities (such as Gandhi) are annotated with a link to the disambiguated entity (such as M. K. Gandhi). Such annotation may have been done manually (as in Wikipedia) or can be done using named entity recognition/disambiguation techniques. Such an annotated corpus allows queries to return entities, instead of documents. Entity ranking queries retrieve entities that are related to keywords in the query and belong to a given type/category specified in the query; entity ranking has been an active area of research in the past few years. More recently, there have been extensions to allow entity-relationship queries, which allow specification of multiple sets of entities as well as relationships between them. In this paper we address the problem of entity ranking (“near”) queries and entity-relationship queries on the Wikipedia corpus. We first present an extended graph model which combines the power of graph models used earlier for structured/semi-structured data, with information from textual data. Based on this model, we show how to specify entity and entity-relationship queries, and defined scoring methods for ranking answers. Finally, we provide efficient algorithms for answering such queries, exploiting a space efficient in-memory graph structure. A performance comparison with the ERQ system proposed earlier shows significant improvement in answer quality for most queries, while also handling a much larger set of entity types. 1.