Concept Based Query Expansion
1993
Cited by 147 (2 self)
Query expansion methods have been studied for a long time - with debatable success in many instances. In this paper we present a probabilistic query expansion model based on a similarity thesaurus which was constructed automatically. A similarity thesaurus reflects domain knowledge about the particular collection from which it is constructed. We address the two important issues with query expansion: the selection and the weighting of additional search terms. In contrast to earlier methods, our queries are expanded by adding those terms that are most similar to the concept of the query, rather than selecting terms that are similar to the query terms. Our experiments show that this kind of query expansion results in a notable improvement in the retrieval effectiveness when measured using both recall-precision and usefulness.
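The distinction drawn in this abstract, between terms similar to the concept of the whole query and terms similar to individual query terms, can be illustrated with a minimal sketch. It is one plausible reading, not the paper's actual formula: the function name `expand_query`, the weighted-sum scoring, and the toy similarity values are all assumptions made for illustration.

```python
from collections import defaultdict


def expand_query(query_weights, similarity, k=2):
    """Rank candidate terms by their summed, weighted similarity to ALL query terms.

    query_weights: dict mapping query term -> weight (hypothetical values).
    similarity:    dict mapping term -> {other term: similarity} (toy thesaurus).
    Returns the k best (term, score) pairs, highest score first.
    """
    scores = defaultdict(float)
    for q_term, q_weight in query_weights.items():
        for candidate, sim in similarity.get(q_term, {}).items():
            if candidate not in query_weights:  # do not re-add existing query terms
                scores[candidate] += q_weight * sim
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy, made-up similarity thesaurus (assumption, not data from the paper).
similarity = {
    "car": {"engine": 0.6, "road": 0.5, "insurance": 0.7},
    "repair": {"engine": 0.7, "mechanic": 0.8},
}
query = {"car": 1.0, "repair": 1.0}
expanded = expand_query(query, similarity, k=2)
print(expanded)
```

In this toy data, "insurance" is the single closest term to "car", yet "engine" ranks first because it is similar to both query terms. That is the kind of effect the abstract attributes to expanding against the query concept as a whole.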

An Association Thesaurus for Information Retrieval
In RIAO 94 Conference Proceedings, 1994
Cited by 132 (10 self)
Although commonly used in both commercial and experimental information retrieval systems, thesauri have not demonstrated consistent benefits for retrieval performance, and it is difficult to construct a thesaurus automatically for large text databases. In this paper, an approach, called PhraseFinder, is proposed to construct collection-dependent association thesauri automatically using large full-text document collections. The association thesaurus can be accessed through natural language queries in INQUERY, an information retrieval system based on the probabilistic inference network. Experiments are conducted in INQUERY to evaluate different types of association thesauri, and thesauri constructed for a variety of collections.
1 Introduction
A thesaurus is a set of items (phrases or words) plus a set of relations between these items. Although thesauri are commonly used in both commercial and experimental IR systems, experiments have shown inconsistent effects on retrieval effectiven...

Improving the Effectiveness of Information Retrieval with Local Context Analysis
ACM Transactions on Information Systems, 2000
Cited by 115 (4 self)
Techniques for automatic query expansion have been extensively studied in information retrieval research as a means of addressing the word mismatch between queries and documents. These techniques can be categorized as either global or local. While global techniques rely on analysis of a whole collection to discover word relationships, local techniques emphasize analysis of the top-ranked documents retrieved for a query. Both types of techniques have advantages and limitations. In this paper we propose a new technique, called local context analysis, which combines the advantages of a global technique called Phrasefinder and a local technique known as local feedback. Experiments on a number of collections, both English and non-English, show that local context analysis offers more effective and consistent retrieval results.

Term Clustering of Syntactic Phrases
Proceedings of ACM SIGIR-90, 1990
Cited by 56 (5 self)
Term clustering and syntactic phrase formation are methods for transforming natural language text. Both have had only mixed success as strategies for improving the quality of text representations for document retrieval. Since the strengths of these methods are complementary, we have explored combining them to produce superior representations. In this paper we discuss our implementation of a syntactic phrase generator, as well as our preliminary experiments with producing phrase clusters. These experiments show small improvements in retrieval effectiveness resulting from the use of phrase clusters, but it is clear that corpora much larger than standard information retrieval test collections will be required to thoroughly evaluate the use of this technique.

Pattern-Based Clustering for Database Attribute Values
In Proceedings of AAAI Workshop on Knowledge Discovery, 1993
Cited by 17 (12 self)
We present a method for automatically clustering similar attribute values in a database system spanning multiple domains. The method constructs an attribute abstraction hierarchy for each attribute using rules that are derived from the database instance. The rules have a confidence and popularity that combine to express the "usefulness" of the rule. Attribute values are clustered if they are used as the premise for rules with the same consequence. By iteratively applying the algorithm, a hierarchy of clusters can be found. The algorithm can be improved by allowing domain expert supervision during the clustering process. An example as well as experimental results from a large transportation database are included.
1 Introduction
In a conventional database system, queries are answered with absolute certainty. If a query has no exact answer, then the user's needs remain unsatisfied. A cooperative query answering (CQA) system behaves like a conventional system when the query can be answe...
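The clustering criterion stated in this abstract, grouping attribute values that serve as the premise of rules with the same consequence, can be sketched as follows. This is a simplified illustration rather than the authors' algorithm: the rule tuples, the function name `cluster_by_consequence`, and the single confidence threshold (standing in for the combined confidence and popularity measure) are assumptions.

```python
from collections import defaultdict


def cluster_by_consequence(rules, min_confidence=0.8):
    """Group premise values whose sufficiently confident rules share a consequence.

    rules: iterable of (premise_value, consequence, confidence) tuples.
    Assumption: a single confidence threshold stands in for the abstract's
    combined "usefulness" measure, which is not specified in the listing.
    """
    clusters = defaultdict(set)
    for premise, consequence, confidence in rules:
        if confidence >= min_confidence:
            clusters[consequence].add(premise)
    # Keep only consequences that actually group two or more values.
    return {c: sorted(values) for c, values in clusters.items() if len(values) > 1}


# Hypothetical rules of the form "vehicle type -> infrastructure used".
rules = [
    ("truck", "road", 0.90),
    ("bus", "road", 0.95),
    ("ferry", "water", 0.90),
    ("tram", "road", 0.30),  # below threshold, so it is ignored
]
clusters = cluster_by_consequence(rules)
print(clusters)
```

Applying the same grouping again to the resulting clusters, treated as new values, would be one way to obtain the hierarchy the abstract mentions, although the listing does not say how the iteration is actually carried out.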

Solving The Word Mismatch Problem Through Automatic Text Analysis
1997
Cited by 17 (0 self)
Information Retrieval (IR) is concerned with locating documents that are relevant for a user's information need or query from a large collection of documents. A fundamental problem for information retrieval is word mismatch. A query is usually a short and incomplete description of the underlying information need. The users of IR systems and the authors of the documents often use different words to refer to the same concepts. This thesis addresses the word mismatch problem through automatic text analysis. We investigate two text analysis techniques, corpus analysis and local context analysis, and apply them in two domains of word mismatch, stemming and general query expansion. Experimental results show that these techniques ca...

Improving the Retrieval Effectiveness by a Similarity Thesaurus
ETH Zürich, Department of Computer Science, Zürich, Switzerland, 1994
Cited by 7 (1 self)
A novel information structure and its use for query expansion are presented. The information structure, called a similarity thesaurus, consists of term-term similarities that are based on how the terms of a collection "are indexed" by the documents. In this way, the similarity thesaurus reflects domain knowledge about the collection from which it is constructed. It is used to select and weight additional query terms when expanding an existing query. In contrast to conventional query expansion methods, the similarity between candidate terms and the concept of the entire query is taken into account. Experiments on test collections show that the retrieval effectiveness is considerably higher when this method is applied. The final aspiration is that this concept-based query expansion model can also be used to produce better results in large-scale operational IR environments.
Contents: 1 Introduction; 2 Constructing a Similarity Thesaurus; 2.1 Similarity Thesaurus ...

Combining Text-, Link-, and Classification-based Retrieval Methods to Enhance Information Discovery on the Web
2002

Automatic Query Expansion for Japanese Text Retrieval
1994
Cited by 5 (1 self)
Automatic query expansion methods for English text retrieval have been studied for a long time, with debatable success in many instances. In this paper, we study what retrieval effectiveness is achieved when a successful automatic query expansion method for English text retrieval is applied to Japanese text retrieval. Our experiments show that the automatic query expansion method also results in a notable improvement in Japanese text retrieval.
1 Introduction
Authors and searchers use a great variety of words to refer to the same thing; this is why expanding or modifying the users' queries can lead to considerable improvement in retrieval results. Manual query expansion, semi-manual query expansion, and automatic query expansion have been studied to improve retrieval performance. In this paper, we focus on automatic query expansion. Automatic query expansion or modification has been studied for nearly three decades, and many methods have been proposed. The various methods ca...

Improving Document Retrieval by Automatic Query Expansion Using Collaborative Learning of Term-Based Concepts
In Proceedings of the 5th International Workshop on Document Analysis Systems (DAS), volume 2423 of Lecture Notes in Computer Science, 2002
Cited by 3 (1 self)
Query expansion methods have been studied for a long time with debatable success in many instances. In this paper, a new approach is presented based on using term concepts learned from other queries. Two important issues with query expansion are addressed: the selection and the weighting of additional search terms. In contrast to other methods, the query at hand is expanded by adding those terms which are most similar to the concept of individual query terms, rather than selecting terms that are similar to the complete query or that are directly similar to the query terms. Experiments have shown that this kind of query expansion results in notable improvements in retrieval effectiveness, measured by recall/precision, in comparison to the standard vector space model and to pseudo relevance feedback. This approach can be used to improve the retrieval of documents in Digital Libraries, in Document Management Systems, on the WWW, etc.

