The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made about word occurrences in documents.