Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
, 1998
The naive Bayes classifier, currently experiencing a renaissance in machine learning, has long been a core technique in information retrieval. We review some of the variations of naive Bayes models used for text retrieval and classification, focusing on the distributional assumptions made
for Naive Bayes
, 2006
The naive Bayes classifier continues to be a popular learning algorithm for data mining applications due to its simplicity and linear runtime. Many enhancements to the basic algorithm have been proposed to help mitigate its primary weakness—the assumption that attributes are independent given
The Learnability of Naive Bayes
 In: Proceedings of Canadian Artificial Intelligence Conference
, 2005
Naive Bayes is an efficient and effective learning algorithm, but previous results show that its representation ability is severely limited since it can only represent certain linearly separable functions in the binary domain. We give necessary and sufficient conditions on linearly separable
Locally Weighted Naive Bayes
 Proceedings of the Conference on Uncertainty in Artificial Intelligence
, 2003
Despite its simplicity, the naive Bayes classifier has surprised machine learning researchers by exhibiting good performance on a variety of learning problems. Encouraged by these results, researchers have looked to overcome naive Bayes' primary weakness, attribute independence
On Discriminative vs. Generative classifiers: A comparison of logistic regression and naive Bayes
, 2001
We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widely held belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size
An empirical study of the naive bayes classifier
, 2001
The naive Bayes classifier greatly simplifies learning by assuming that features are independent given class. Although independence is generally a poor assumption, in practice naive Bayes often competes well with more sophisticated classifiers. Our broad goal is to understand the data characteristics
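The independence assumption named in this abstract can be illustrated with a minimal sketch. This is not code from any of the listed papers: the function names `train` and `predict`, the toy data, and the Laplace smoothing parameter `alpha` are all assumptions made for illustration. Each class is scored by its log prior plus a sum of per-feature log likelihoods, which is exactly what treating features as independent given the class amounts to.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels, alpha=1.0):
    """Collect the counts a categorical naive Bayes model needs.

    rows   -- list of feature tuples (hypothetical toy format)
    labels -- list of class labels, one per row
    alpha  -- Laplace smoothing constant (an assumption, not from the source)
    """
    class_counts = Counter(labels)
    # feature_counts[c][j][v] = number of class-c rows whose feature j equals v
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    # values[j] = set of distinct values seen for feature j
    values = defaultdict(set)
    for x, c in zip(rows, labels):
        for j, v in enumerate(x):
            feature_counts[c][j][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values, alpha


def predict(model, x):
    """Return the class maximizing log P(c) + sum_j log P(x_j | c)."""
    class_counts, feature_counts, values, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / n)  # log prior
        for j, v in enumerate(x):
            # Independence given the class: each feature contributes its
            # own smoothed likelihood, with no interaction terms.
            num = feature_counts[c][j][v] + alpha
            den = n_c + alpha * len(values[j])
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best


# Toy data, invented for this example.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train(rows, labels)
print(predict(model, ("sunny", "hot")))
print(predict(model, ("rainy", "cool")))
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied, and the smoothing keeps an unseen feature value from driving a class score to negative infinity.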
Naive Bayes
We propose a new method to improve the accuracy of text categorization using two-dimensional clustering. In most previous probabilistic approaches, texts in the same category are implicitly assumed to be generated from an identical distribution. We show empirically that this assumption is violated and propose a new framework to alleviate the problem. In our method, training texts are clustered so that the i.i.d. assumption is more likely to hold, and at the same time features are clustered to tackle the data sparseness problem. We conduct experiments to validate this two-dimensional clustering method.
A comparison of event models for Naive Bayes text classification
, 1998
Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multivariate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e
Interval estimation naïve Bayes
 In Lecture Notes in Computer Science (Advances in Intelligent Data Analysis
, 2003
Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with assumptions of conditional independence among features given the class, called naïve Bayes, is competitive with state-of-the-art classifiers. In this paper a new naive Bayes classifier called
