Abstract:
The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences. No explanation for this has been proposed so far. In this paper we show that the SBC does not in fact assume attribute independence, and can be optimal even when this assumption is violated by a wide margin. The key to this finding lies in the distinction between classification and probability estimation: correct classification can be achieved even when the probability estimates used contain large errors. We show that the previously-assumed region of optimality of the SBC is a second-order infinitesimal fraction of the actual one. This is followed by the derivation of several necessary and several sufficient conditions for the optimality of the SBC. For example, the SBC is optimal for learning arbitrary conjunctions and disjunction...
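The distinction the abstract draws between classification and probability estimation can be illustrated with a small sketch. It is not the paper's own code. It assumes a hypothetical toy domain: two Boolean attributes, the target concept `a AND b`, and every attribute combination equally likely. The helper names `fit_naive_bayes` and `naive_posterior` are made up for this illustration. In this domain the attributes are dependent within the negative class, and the estimated posterior for the example (1, 1) is 0.75 where the true value is 1, yet every example is still assigned the correct class.

```python
from itertools import product


def fit_naive_bayes(examples):
    """Estimate P(class) and P(attribute_i = value | class) by counting."""
    class_counts = {}
    value_counts = {}  # (class, attribute index, value) -> count
    for attrs, label in examples:
        class_counts[label] = class_counts.get(label, 0) + 1
        for i, value in enumerate(attrs):
            key = (label, i, value)
            value_counts[key] = value_counts.get(key, 0) + 1
    return class_counts, value_counts, len(examples)


def naive_posterior(model, attrs):
    """Posterior over classes, treating attributes as independent given the class."""
    class_counts, value_counts, total = model
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(attrs):
            score *= value_counts.get((label, i, value), 0) / n
        scores[label] = score
    normalizer = sum(scores.values())
    return {label: s / normalizer for label, s in scores.items()}


# Assumed toy domain: all four attribute combinations, labeled by a AND b.
examples = [((a, b), int(a == 1 and b == 1)) for a, b in product((0, 1), repeat=2)]
model = fit_naive_bayes(examples)

predictions = {}
for attrs, label in examples:
    posterior = naive_posterior(model, attrs)
    predictions[attrs] = max(posterior, key=posterior.get)
    print(attrs, "true class:", label, "estimated P(class 1):", round(posterior[1], 3))

estimate_for_11 = naive_posterior(model, (1, 1))[1]
```

The classifier only needs the correct class to receive the largest estimate, so the error in the estimate for (1, 1) does not change the prediction.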
Citations
| 3356 | C4.5: Programs for Machine Learning – Quinlan, 1993 |
| 3011 | Pattern Classification and Scene Analysis – Duda, Hart, 1973 |
| 638 | UCI repository of machine learning databases. For information contact ml-repository@ics.uci.edu – Murphy, Aha, 1994 |
| 620 | The CN2 induction algorithm – Clark, Niblett, 1989 |
| 330 | Very simple classification rules perform well on most commonly used datasets – Holte, 1993 |
| 304 | Supervised and unsupervised discretization of continuous features – Dougherty, Kohavi, et al., 1995 |
| 251 | Rule induction with CN2: Some recent improvements – Clark, Boswell, 1991 |
| 242 | An analysis of Bayesian classifiers – Langley, Iba, et al., 1992 |
| 238 | A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features – Cost, Salzberg, 1993 |
| 216 | Pattern Classification and Scene Analysis – Duda, Hart, 1973 |
| 163 | Induction of selective Bayesian classifiers – Langley, Sage, 1994 |
| 99 | Semi-naive Bayesian classifier – Kononenko, 1991 |
| 87 | Wrappers for performance enhancement and oblivious decision graphs – Kohavi, 1995 |
| 40 | Induction of recursive Bayesian classifiers – Langley, 1993 |
| 30 | Towards a better understanding of memory-based reasoning systems – Rachlin, Kasif, et al., 1994 |
| 21 | Very simple classification rules perform well on most commonly used datasets – Holte, 1993 |
| 13 | An analysis of Bayesian classifiers – Langley, Iba, et al., 1992 |
| 10 | Induction of selective Bayesian classifiers – Langley, Sage, 1994 |
| 9 | Searching for attribute dependencies in Bayesian classifiers – Pazzani, 1995 |
| 5 | Semi-naive Bayesian classifier – Kononenko, 1991 |
| 5 | Induction of recursive Bayesian classifiers – Langley, 1993 |