On the optimality of the simple Bayesian classifier under zero-one loss (1997)

by Pedro Domingos, Michael Pazzani
Venue: Machine Learning
Citations: 817 (27 self)

BibTeX

@ARTICLE{Domingos97onthe,
    author = {Pedro Domingos and Michael Pazzani},
    title = {On the optimality of the simple Bayesian classifier under zero-one loss},
    journal = {Machine Learning},
    year = {1997},
    pages = {103--137}
}

Abstract

The simple Bayesian classifier is known to be optimal when attributes are independent given the class, but the question of whether other sufficient conditions for its optimality exist has so far not been explored. Empirical results showing that it performs surprisingly well in many domains containing clear attribute dependences suggest that the answer to this question may be positive. This article shows that, although the Bayesian classifier’s probability estimates are only optimal under quadratic loss if the independence assumption holds, the classifier itself can be optimal under zero-one loss (misclassification rate) even when this assumption is violated by a wide margin. The region of quadratic-loss optimality of the Bayesian classifier is in fact a second-order infinitesimal fraction of the region of zero-one optimality. This implies that the Bayesian classifier has a much greater range of applicability than previously thought. For example, in this article it is shown to be optimal for learning conjunctions and disjunctions, even though they violate the independence assumption. Further, studies in artificial domains show that it will often outperform more powerful classifiers for common training set sizes and numbers of attributes, even if its bias is a priori much less appropriate to the domain. This article’s results also imply that detecting attribute dependence is not necessarily the best way to extend the Bayesian classifier, and this is also verified empirically.
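The abstract's central distinction, probability estimates that are only optimal under quadratic loss versus class predictions that can still be optimal under zero-one loss, is easiest to see with a concrete classifier. Below is a minimal illustrative Python sketch (not the authors' code; the function names and the raw-frequency probability estimates are assumptions made for the example) of a simple Bayesian classifier evaluated by misclassification rate on a small conjunction concept, the case the abstract cites as violating the independence assumption while still being classified optimally.

from collections import Counter, defaultdict

def train_naive_bayes(X, y):
    """Estimate P(class) and P(attribute value | class) by raw frequencies."""
    class_counts = Counter(y)
    value_counts = defaultdict(Counter)   # value_counts[(j, c)][v] = count of value v
    for xi, c in zip(X, y):
        for j, v in enumerate(xi):
            value_counts[(j, c)][v] += 1
    return class_counts, value_counts

def predict(xi, class_counts, value_counts):
    """Pick the class maximizing P(c) * prod_j P(x_j | c): the independence assumption."""
    n = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for c, cc in class_counts.items():
        score = cc / n
        for j, v in enumerate(xi):
            score *= value_counts[(j, c)][v] / cc
        if score > best_score:
            best_class, best_score = c, score
    return best_class

def zero_one_loss(X, y, class_counts, value_counts):
    """Misclassification rate: the loss under which the paper studies optimality."""
    wrong = sum(predict(xi, class_counts, value_counts) != c for xi, c in zip(X, y))
    return wrong / len(y)

# Toy check: the conjunction y = x0 AND x1 violates attribute independence given
# the class, yet the classifier reproduces it without error on the full truth table.
X = [(a, b) for a in (0, 1) for b in (0, 1)]
y = [int(a and b) for a, b in X]
cc, vc = train_naive_bayes(X, y)
print(zero_one_loss(X, y, cc, vc))   # prints 0.0

In this toy case the probability scores themselves are poorly calibrated (they would not be optimal under quadratic loss), but the argmax decision is correct on every example, which is exactly the gap between the two optimality regions the abstract describes.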

Keyphrases

simple Bayesian classifier, Bayesian classifier, independence assumption, many domains, artificial domains, empirical results, article results, sufficient conditions, powerful classifiers, misclassification rate, clear attribute dependences, common training set sizes, second-order infinitesimal fraction, zero-one loss, quadratic loss, Bayesian classifier probability estimates, quadratic-loss optimality, wide margin, conditions for optimality, attribute dependence, zero-one optimality
