Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
, 1998
Cited by 496 (1 self)
The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made about word occurrences in documents.
Evaluation of an Inference Network-Based Retrieval Model
 ACM Transactions on Information Systems
, 1991
Cited by 266 (20 self)
The use of inference networks to support document retrieval is introduced. A network-based retrieval model is described and compared to conventional probabilistic and Boolean models. The performance of a retrieval system based on the inference network model is evaluated and compared to performance with conventional retrieval models.
Inference Networks for Document Retrieval
, 1990
Cited by 264 (8 self)
The use of inference networks to support document retrieval is introduced. A network-based retrieval model is described and compared to conventional probabilistic and Boolean models.
Information Retrieval Interaction
, 1992
Cited by 242 (8 self)
this document, text or image about?' Gradually moving from the left to the right in Figure 3.1, different understandings of this concept evolve
Probabilistic Models in Information Retrieval
 The Computer Journal
, 1992
Cited by 121 (4 self)
In this paper, an introduction and survey over probabilistic information retrieval (IR) is given. First, the basic concepts of this approach are described: the probability ranking principle shows that optimum retrieval quality can be achieved under certain assumptions; a conceptual model for IR along with the corresponding event space clarify the interpretation of the probabilistic parameters involved. For the estimation of these parameters, three different learning strategies are distinguished, namely query-related, document-related and description-related learning. As a representative for each of these strategies, a specific model is described. A new approach regards IR as uncertain inference; here, imaging is used as a new technique for estimating the probabilistic parameters, and probabilistic inference networks support more complex forms of inference. Finally, the more general problems of parameter estimation, query expansion and the development of models for advanced document representations are discussed.
A Probabilistic Learning Approach for Document Indexing
 ACM Transactions on Information Systems
, 1991
Cited by 103 (13 self)
We describe a method for probabilistic document indexing using relevance feedback data that has been collected from a set of queries. Our approach is based on three new concepts: (1) Abstraction from specific terms and documents, which overcomes the restriction of limited relevance information for parameter estimation. (2) Flexibility of the representation, which allows the integration of new text analysis and knowledge-based methods in our approach as well as the consideration of document structures or different types of terms. (3) Probabilistic learning or classification methods for the estimation of the indexing weights, making better use of the available relevance information. Our approach can be applied under restrictions that hold for real applications. We give experimental results for five test collections which show improvements over other indexing methods.
Latent concept expansion using Markov random fields
 In Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
, 2007
Cited by 75 (12 self)
Query expansion, in the form of pseudo-relevance feedback or relevance feedback, is a common technique used to improve retrieval effectiveness. Most previous approaches have ignored important issues, such as the role of features and the importance of modeling term dependencies. In this paper, we propose a robust query expansion technique based on the Markov random field model for information retrieval. The technique, called latent concept expansion, provides a mechanism for modeling term dependencies during expansion. Furthermore, the use of arbitrary features within the model provides a powerful framework for going beyond simple term occurrence features that are implicitly used by most other expansion techniques. We evaluate our technique against relevance models, a state-of-the-art language modeling query expansion technique. Our model demonstrates consistent and significant improvements in retrieval effectiveness across several TREC data sets. We also describe how our technique can be used to generate meaningful multi-term concepts for tasks such as query suggestion/reformulation.
A probabilistic framework for vague queries and imprecise information in databases
 Proceedings of the 16th International Conference on Very Large Databases
, 1990
Cited by 60 (13 self)
A probabilistic learning model for vague queries and missing or imprecise information in databases is described. Instead of retrieving only a set of answers, our approach yields a ranking of objects from the database in response to a query. By using relevance judgements from the user about the objects retrieved, the ranking for the actual query as well as the overall retrieval quality of the system can be further improved. For specifying different kinds of conditions in vague queries, the notion of vague predicates is introduced. Based on the underlying probabilistic model, also imprecise or missing attribute values can be treated easily. In addition, the corresponding formulas can be applied in combination with standard predicates (from two-valued logic), thus extending standard database systems for coping with missing or imprecise data.
Term Dependence: Truncating the Bahadur Lazarsfeld Expansion
 Information Processing and Management
, 1994
Cited by 20 (9 self)
The performance of probabilistic information retrieval systems is studied where differing statistical dependence assumptions are used when estimating the probabilities inherent in the retrieval model. Experimental results using the Bahadur Lazarsfeld expansion suggest that the greatest degree of performance increase is achieved by incorporating term dependence information in estimating . It is suggested that incorporating dependence in to degree 3 be used; incorporating more dependence information results in relatively little increase in performance. Experiments examine the span of dependence in natural language text, the window of terms in which dependencies are computed and their effect on information retrieval performance. Results provide additional support for the notion of a window of to terms in width; terms in this window may be most useful when computing dependence.
1 Introduction Those who study information retrieval often assume that the features or terms use...