## Estimating Continuous Distributions in Bayesian Classifiers (1995)

Venue: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence

Citations: 343 (2 self)

### BibTeX

```bibtex
@INPROCEEDINGS{John95estimatingcontinuous,
  author    = {George John and Pat Langley},
  title     = {Estimating Continuous Distributions in Bayesian Classifiers},
  booktitle = {Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence},
  year      = {1995},
  pages     = {338--345},
  publisher = {Morgan Kaufmann}
}
```

### Abstract

When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models.
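The comparison the abstract describes can be sketched compactly: both classifiers share the same naive Bayes decision rule and differ only in how each per-class, per-attribute density p(X | C) is estimated. The sketch below is a minimal illustration, not the paper's implementation; the shrinking bandwidth h = 1/sqrt(n) is one simple choice assumed here for the kernel estimator.

```python
import math

def gaussian_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) evaluated at x.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def single_gaussian(values):
    # Parametric estimate of p(X | C): fit one Gaussian to the class's values.
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n) or 1e-6
    return lambda x: gaussian_pdf(x, mu, sigma)

def kernel_density(values):
    # Nonparametric estimate of p(X | C): average of Gaussian kernels centred
    # on each training value. Bandwidth h = 1/sqrt(n) is an assumed simple choice.
    h = 1.0 / math.sqrt(len(values))
    return lambda x: sum(gaussian_pdf(x, v, h) for v in values) / len(values)

def train_naive_bayes(data, estimator):
    # data: {class_label: [[attribute values], ...]}.
    # Fits one density per (class, attribute) pair plus class priors.
    total = sum(len(rows) for rows in data.values())
    model = {}
    for c, rows in data.items():
        densities = [estimator([row[i] for row in rows]) for i in range(len(rows[0]))]
        model[c] = (len(rows) / total, densities)
    return model

def classify(model, x):
    # Return the class maximizing log p(c) + sum_i log p(x_i | c).
    def score(c):
        prior, densities = model[c]
        return math.log(prior) + sum(
            math.log(max(d(xi), 1e-300)) for d, xi in zip(densities, x)
        )
    return max(model, key=score)
```

Swapping `single_gaussian` for `kernel_density` in `train_naive_bayes` switches between the two methods compared in the paper; everything else stays fixed, which is what makes the comparison clean.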

### Citations

9033 | Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird, et al. - 1977 |

2629 | Density Estimation for Statistics and Data Analysis - Silverman - 1986 |

Citation Context: ...this problem whenever it must estimate p(X | C) for some continuous attribute X. This is a general problem in statistics, and a variety of methods are available for solving it (Venables & Ripley 1994, Silverman 1986). In this section we discuss the theoretical properties of kernel density estimation and their implications for the Flexible Bayes algorithm. Statisticians are principally concerned with the consiste... |

1683 | Generalized Additive Models - Hastie, Tibshirani - 1990 |

1139 | A Bayesian Method for the induction of probabilistic networks from data - Cooper, Herskovits - 1992 |

953 | Learning Bayesian networks: the combination of knowledge and statistical data - Heckerman, Geiger, et al. - 1997 |

806 | The CN2 Induction Algorithm - Clark, Niblett - 1989 |

787 | Uci repository of machine learning databases, machine-readable data repository - Murphy, Aha - 1996 |

638 | Irrelevant Features and the Subset Selection Problem - John, Kohavi, et al. - 1994 |

456 | Supervised and unsupervised discretization of continuous features - Dougherty, Kohavi, et al. - 1995 |

362 | An Analysis of Bayesian Classifiers - Langley, Wayne, et al. - 1992 |

253 | Operations for learning with graphical models - Buntine - 1994 |

Citation Context: ...process. Thus, when depicted graphically, a naive Bayesian classifier has the form shown in Figure 1, in which all arcs are directed from the class attribute to the observable, predictive attributes (Buntine 1994). These assumptions support very efficient algorithms for both classification and learning. To see this, let C be the random variable denoting the class of an instance and let X be a vector of random... |
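The factored form this excerpt alludes to is the standard naive Bayes decision rule, stated here for reference in the excerpt's notation (a textbook statement, not quoted from the paper), with C the class variable and X_1, ..., X_k the predictive attributes:

```latex
p(C = c \mid X_1 = x_1, \ldots, X_k = x_k)
  \;\propto\; p(C = c) \prod_{i=1}^{k} p(X_i = x_i \mid C = c)
```

The conditional-independence assumption is what lets the joint conditional collapse into the product, so learning reduces to estimating k one-dimensional densities per class.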

252 | AutoClass: A Bayesian Classification System - Cheeseman, Kelly, et al. - 1988 |

233 | Induction of selective Bayesian classifiers - Langley, Sage - 1994 |

116 | Learning Gaussian networks - Geiger, Heckerman - 1994 |

111 | Semi-naive Bayesian classifier - Kononenko - 1991 |

100 | Recent developments in nonparametric density estimation - Izenman - 1991 |

Citation Context: ...ss the theoretical properties of kernel density estimation and their implications for the Flexible Bayes algorithm. Statisticians are principally concerned with the consistency of a density estimate (Izenman 1991). Definition 1 (Strong Pointwise Consistency) If f is a probability density function and fn is an estimate of f based on n examples, then fn is strongly pointwise consistent if fn(x) → f(x) almost surel... |
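The definition quoted in this excerpt is truncated; for reference, its standard textbook form (a completion, not a verbatim quote from the paper) is:

```latex
\hat{f}_n(x) \longrightarrow f(x)
  \ \text{almost surely as } n \to \infty, \ \text{for every } x,
```

i.e., at each point x the estimate converges to the true density with probability one as the sample grows.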

75 | Inductive and Bayesian learning in medical diagnosis - Kononenko - 1993 |

61 | Machine learning as experimental science - Kibler, Langley - 1990 |

24 | The equivalence of weak, strong and complete convergence in L1 for kernel density estimates - Devroye - 1983 |

19 | Learning Bayesian networks using feature selection - Provan, Singh |

11 | Searching for attribute dependencies in Bayesian classifiers - Pazzani - 1995 |

3 | Experience with adaptive probabilistic neural networks and adaptive general regression neural networks - Specht, Romsdahl - 1994 |