Probabilistic question answering on the Web
- Journal of the American Society for Information Science and Technology, 2002
Cited by 42 (1 self)
Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this paper we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of .20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR.
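The abstract reports a total reciprocal document rank but does not define it. The sketch below assumes the metric is the sum of 1/rank over the retrieved documents that contain a correct answer; that definition, the function name, and the sample judgments are illustrative assumptions, not taken from the listing.

```python
def total_reciprocal_document_rank(relevance, cutoff=None):
    """Sum 1/rank over ranked documents judged to contain a correct answer.

    `relevance` is a list of booleans in rank order (rank 1 first).
    ASSUMPTION: this definition of the metric is not given in the abstract.
    """
    ranked = relevance if cutoff is None else relevance[:cutoff]
    return sum(1.0 / rank for rank, hit in enumerate(ranked, start=1) if hit)


# Hypothetical judgments: the documents at ranks 2 and 5 contain the answer.
judgments = [False, True, False, False, True]
score = total_reciprocal_document_rank(judgments)
top3_score = total_reciprocal_document_rank(judgments, cutoff=3)
```

Under this assumed definition, a single correct document at rank 5 would contribute .20 on its own.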
ProTDB: Probabilistic data in XML
- In Proceedings of the 28th VLDB Conference, 2002
Cited by 38 (2 self)
Whereas traditional databases manage only deterministic information, many applications that use databases involve uncertain data. This paper presents a Probabilistic Tree Data Base (ProTDB) to manage probabilistic data, represented in XML. Our approach differs from previous efforts to develop probabilistic relational systems in that we build a probabilistic XML database. This design is driven by application needs that involve data not readily amenable to a relational representation. XML data poses several modeling challenges: due to its structure, due to the possibility of uncertainty association at multiple granularities, and due to the possibility of missing and repeated sub-elements. We present a probabilistic XML model that addresses all of these challenges. We devise an implementation of XML query operations using our probability model, and demonstrate the efficiency of our implementation experimentally. We have used ProTDB to manage data from two application areas: protein chemistry data from the bioinformatics domain, and information extraction data obtained from the web using a natural language analysis system. We present a brief case study of the latter to demonstrate the value of probabilistic XML data management.
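One way to picture uncertainty attached at multiple granularities of an XML tree is sketched below. It assumes each element carries a probability conditioned on its parent being present, so an element's absolute probability is the product along its path from the root. The `prob` attribute, the sample document, and this product rule are illustrative assumptions, not the model described in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the "prob" attribute name is an assumption.
DOC = """
<person prob="0.9">
  <name prob="1.0">Ann</name>
  <address prob="0.5">
    <city prob="0.8">Paris</city>
  </address>
</person>
"""


def absolute_probabilities(root, attr="prob"):
    """Map each element path to the product of probabilities along that path.

    Elements without the attribute are treated as certain (probability 1.0).
    Repeated sibling tags would share a path key; this sketch assumes
    unique tags under each parent.
    """
    result = {}

    def walk(node, parent_prob, path):
        p = parent_prob * float(node.get(attr, "1.0"))
        here = f"{path}/{node.tag}"
        result[here] = p
        for child in node:
            walk(child, p, here)

    walk(root, 1.0, "")
    return result


probs = absolute_probabilities(ET.fromstring(DOC.strip()))
```

Here the city element inherits the uncertainty of its address and person ancestors, which is the kind of multi-level association a flat relational row would not capture directly.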
Towards Answer-Focused Summarization
Cited by 1 (0 self)
People query search engines to find answers to a variety of questions on the Internet. Search cost would be greatly reduced if search engines could accept natural language questions as queries and provide summaries that contain the answers to these questions. We introduce the notion of Answer-Focused Summarization, which combines summarization and question answering. We develop a set of criteria and performance metrics to evaluate Answer-Focused Summarization. We demonstrate that the summaries produced by Google, the most popular search engine nowadays, can be substantially improved for question answering. We develop a proximity-based summary extraction system, and then use question types, i.e., whether the question is a "person" or a "place" question, to improve the performance. We suggest that there is large application potential for Answer-Focused Summarization, such as in wireless and palmheld systems where search cost is critical. Index Terms: answer-focused summarization, summarization, question answering.
Reward System for Completing FAQs
The creation of Answer Communities around a FAQs Site is proposed to speed up the process of answering questions. Our approach combines long-term and short-term rewards. Long-term rewards are found to boost participation, motivating users to complete FAQs with proper answers faster. Keywords. Frequently Asked Questions (FAQs) Sites, Answer Communities.
Query Modulation for Web-based Question Answering
The web is now becoming one of the largest information and knowledge repositories. Many large-scale search engines (Google, Fast, Northern Light, etc.) have emerged to help users find information. In this paper, we study how we can effectively use these existing search engines to mine the Web and discover the "correct" answers to factual natural language questions. We propose a probabilistic algorithm called QASM (Question Answering using Statistical Models) that learns the best query paraphrase of a natural language question. We validate our approach for both local and web search engines using questions from the TREC evaluation.
Combining Information Seeking Services into a Meta Supply Chain of Facts
- Journal of the Association for Information Systems, Special Issue
The World Wide Web has become a vital supplier of information that allows organizations to carry on such tasks as business intelligence, security monitoring, and risk assessments. Having a quick and reliable supply of correct facts from perspective is often mission critical. By following design science guidelines, we have explored ways to recombine facts from multiple sources, each with possibly different levels of responsiveness and accuracy, into one robust supply chain. Inspired by prior research on keyword-based meta-search engines (e.g., metacrawler.com), we have adapted the existing question answering algorithms for the task of analysis and triangulation of facts. We present a first prototype for a meta approach to fact seeking. Our meta engine sends a user’s question to several fact seeking services that are publicly available on the Web (e.g., ask.com, brainboost.com, answerbus.com, NSIR, etc.) and analyzes the returned results jointly to identify and present to the user those that are most likely to be factually correct. The results of our evaluation on the standard test sets widely used in prior research support the evidence for the following: 1) the value-added of the meta approach: its performance surpasses the performance of each supplier, 2) the importance of using fact seeking services as suppliers to the meta engine rather than keyword driven search portals, and 3) the resilience of the meta approach: eliminating a single service does not noticeably impact the overall performance. We show that these properties make the meta-approach a more reliable supplier of facts than any of the currently available stand-alone services.
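The joint analysis of results from several fact seeking services can be pictured with a simple voting scheme. The sketch below is a minimal illustration, assuming each service returns a ranked list of candidate answers and that candidates are pooled with reciprocal-rank weights; the function name, service names, normalization, and scoring rule are all hypothetical and are not the algorithm used in the paper.

```python
from collections import defaultdict


def triangulate(service_answers):
    """Pool ranked candidate answers from several services.

    `service_answers` maps a service name to its ranked answer strings.
    Each service adds 1/rank to a candidate's score (counted once per
    service); candidates are matched case-insensitively.
    ASSUMPTION: this weighting is illustrative only.
    """
    scores = defaultdict(float)
    for answers in service_answers.values():
        seen = set()
        for rank, answer in enumerate(answers, start=1):
            key = answer.strip().lower()
            if key in seen:
                continue
            seen.add(key)
            scores[key] += 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical responses to "What is the capital of France?"
responses = {
    "service_a": ["Paris", "Lyon"],
    "service_b": ["paris"],
    "service_c": ["Marseille", "Paris"],
}
ranked = triangulate(responses)

# Dropping one supplier leaves the top answer unchanged in this toy case.
without_b = triangulate({k: v for k, v in responses.items() if k != "service_b"})
```

In this toy input the answer supported by all three services ranks first, and removing one service does not change that, which mirrors the resilience property the abstract claims for the meta approach.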

