Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
, 1998
Cited by 496
The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made
Bayesian Network Classifiers
, 1997
Cited by 788
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less
A comparison of event models for Naive Bayes text classification
, 1998
Cited by 1002
Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multivariate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e
Estimating Continuous Distributions in Bayesian Classifiers
 In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
, 1995
Cited by 489
the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional
Exploiting Generative Models in Discriminative Classifiers
 In Advances in Neural Information Processing Systems 11
, 1998
Cited by 538
result in classification performance superior to that of the model based approaches. An ideal classifier should combine these two complementary approaches. In this paper, we develop a natural way of achieving this combination by deriving kernel functions for use in discriminative methods such as support
Hierarchically Classifying Documents Using Very Few Words
, 1997
Cited by 521
The proliferation of topic hierarchies for text documents has resulted in a need for tools that automatically classify new documents within such hierarchies. Existing classification schemes which ignore the hierarchical structure and treat the topics as separate classes are often inadequate in text
Region Competition: Unifying Snakes, Region Growing, and Bayes/MDL for Multiband Image Segmentation
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1996
Cited by 778
We present a novel statistical and variational approach to image segmentation based on a new algorithm named region competition. This algorithm is derived by minimizing a generalized Bayes/MDL criterion using the variational principle. The algorithm is guaranteed to converge to a local minimum
Text Classification from Labeled and Unlabeled Documents using EM
 Machine Learning
, 1999
Cited by 1033
quantities of unlabeled documents are readily available. We introduce an algorithm for learning from labeled and unlabeled documents based on the combination of Expectation-Maximization (EM) and a naive Bayes classifier. The algorithm first trains a classifier using the available labeled documents
An empirical study of the naive bayes classifier
, 2001
Cited by 183
The naive Bayes classifier greatly simplifies learning by assuming that features are independent given class. Although independence is generally a poor assumption, in practice naive Bayes often competes well with more sophisticated classifiers. Our broad goal is to understand the data characteristics
Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy
 IEEE Trans. Pattern Analysis and Machine Intelligence
Cited by 533
to select a compact set of superior features at very low cost. We perform extensive experimental comparison of our algorithm and other methods using three different classifiers (naive Bayes, support vector machine, and linear discriminant analysis) and four different data sets (handwritten digits
