A Probabilistic Learning Approach for Document Indexing (1991) [75 citations — 14 self]

by Norbert Fuhr (TH Darmstadt), Chris Buckley
ACM Transactions on Information Systems

Abstract:

We describe a method for probabilistic document indexing using relevance feedback data that has been collected from a set of queries. Our approach is based on three new concepts: (1) Abstraction from specific terms and documents, which overcomes the restriction of limited relevance information for parameter estimation. (2) Flexibility of the representation, which allows the integration of new text analysis and knowledge-based methods in our approach as well as the consideration of document structures or different types of terms. (3) Probabilistic learning or classification methods for the estimation of the indexing weights making better use of the available relevance information. Our approach can be applied under restrictions that hold for real applications. We give experimental results for five test collections which show improvements over other indexing methods.
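
The abstraction idea in point (1) can be illustrated with a minimal sketch. It is not the authors' implementation: the three features in `describe`, the logistic model, and the toy relevance data are all assumptions made for illustration. Each (term, document) pair is replaced by a feature description, relevance judgments are pooled over all pairs, and a single function is fitted whose output serves as the indexing weight.

```python
import math

def describe(tf, max_tf, in_title):
    """Hypothetical feature description of a (term, document) pair.
    The feature choice is an assumption, not taken from the paper."""
    return [1.0, tf / max_tf, 1.0 if in_title else 0.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_indexing_function(samples, epochs=2000, lr=0.5):
    """Fit logistic-regression coefficients by batch gradient descent on
    (features, relevant) pairs pooled over all terms and documents."""
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for i, xi in enumerate(x):
                grad[i] += (p - y) * xi
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad)]
    return w

def indexing_weight(w, x):
    """Estimated probability of relevance, used as the indexing weight."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

# Invented relevance feedback data: (description, 1 if judged relevant else 0).
samples = [
    (describe(5, 5, True), 1),
    (describe(4, 5, False), 1),
    (describe(3, 5, True), 1),
    (describe(1, 5, False), 0),
    (describe(1, 5, True), 0),
    (describe(2, 5, False), 0),
]

w = fit_indexing_function(samples)
high = indexing_weight(w, describe(5, 5, True))   # frequent term, in title
low = indexing_weight(w, describe(1, 5, False))   # rare term, not in title
print(f"high={high:.3f} low={low:.3f}")
```

Because the coefficients are estimated on feature descriptions and not per term, the fitted function can assign a weight to a term-document pair that never appeared in the relevance data, which is the point of the abstraction step.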

Citations

915 Term-weighting approaches in automatic text retrieval – Salton, Buckley - 1988
411 Relevance Weighting of Search Terms – Robertson, Sparck-Jones - 1976
349 Approximating discrete probability distributions with dependence trees – Chow, Liu - 1968
189 Inference networks for document retrieval – Turtle, Croft - 1990
128 Experiments in automatic phrase indexing for document retrieval: a comparison of syntactic and non-syntactic methods – Fagan - 1987
127 On relevance, probabilistic indexing, and information retrieval – Maron, Kuhns - 1960
93 The effect of noise on concept learning – Quinlan - 1986
90 A theoretical basis for the use of co-occurrence data in information retrieval – Rijsbergen - 1977
78 Models for retrieval with probabilistic indexing – Fuhr - 1989
66 A theory of term importance in automatic text analysis – Salton, Yang, et al. - 1975
59 The Effectiveness of a Nonsyntactic Approach to Automatic Phrase Indexing for Document Retrieval – Fagan - 1989
58 Probabilistic and genetic algorithms for document retrieval – Gordon - 1988
40 Probabilistic models of indexing and searching – Robertson, van-Rijsbergen, et al. - 1981
32 A Neural Network for the Probabilistic Information Retrieval – Kwok - 1989
32 Probability of relevance: a unification of two competing models for information retrieval – Robertson, Maron, et al. - 1982
28 Optimum polynomial retrieval functions based on the probability ranking principle – Fuhr - 1989
27 Experiments with Representation in a Document Retrieval System – Croft - 1983
24 Synthesizing Statistical Knowledge from Incomplete Mixed-Mode Data – Wong, Chiu - 1987
23 Boolean Queries and Term Dependencies in Probabilistic Retrieval Models – Croft - 1986
23 Precision weighting - an effective automatic indexing method – Yu, Salton - 1976
21 The automatic indexing system AIR/PHYS — from research to application – Biebricher, Fuhr, et al. - 1988
20 A probability distribution model for information retrieval – Wong, Yao - 1989
18 Document representation in probabilistic models of information retrieval – Croft - 1981
16 Applied Categorical Data Analysis – Freeman - 1987
11 Applied Categorical Data Analysis – Freeman - 1987
10 Automatisches Indexieren als Erkennen abstrakter Objekte – Knorz - 1983
8 Two learning schemes in information retrieval – Yu, Mizuno - 1988
7 SILOL: A simple logical-linguistic document retrieval system – Sembok, Rijsbergen - 1990
6 Development of log-linear and linear-iterative indexing functions (in German) – Pfeifer - 1990
6 Incorporating Syntactic Information into a Document Retrieval Strategy: An Investigation – Smeaton - 1986
6 The automatic indexing system AIR/PHYS---from research to application – Biebricher, Fuhr, et al. - 1988
4 Probabilistic approaches to the document retrieval problem – Maron - 1983
4 Approximation of Discrete Probability Distributions by Dependence Trees and their Application as Indexing Functions – Tietze - 1989
3 Development of Indexing Functions Based on Probabilistic Decision Trees (in German) – Faisst - 1990
3 Probabilistisches indexing und retrieval – Fuhr - 1988
2 Experiments with document components for indexing and retrieval – Kwok, Kuan - 1988
1 Indexieren mit dem System DAISY – Beinke-Geiser, Lustig, et al. - 1986
1 Entwicklung und Anwendung des automatischen Indexierungssystems AIR/PHYS. Nachrichten für Dokumentation – Biebricher, Fuhr, et al. - 1988
1 An interpretation of index term weighting schemes based on document components – Kwok - 1986
1 Development of indexing functions based on probabilistic decision trees (in German) – Faisst - 1990
1 Automatisches Indexieren als Erkennen abstrakter Objekte – Knorz - 1983
1 Development of log-linear and linear-iterative indexing functions (in German). Diploma thesis, TH Darmstadt, FB Informatik, Datenverwaltungssysteme – Pfeifer - 1990
1 Approximation of discrete probability distributions by dependence trees and their application as indexing functions (in German) – Tietze - 1989