## Learning Augmented Bayesian Classifiers: A Comparison of Distribution-based and Classification-based Approaches (1999)

Citations: 57 (0 self)

### BibTeX

    @MISC{Keogh99learningaugmented,
      author = {Eamonn J. Keogh and Michael J. Pazzani},
      title = {Learning Augmented Bayesian Classifiers: A Comparison of Distribution-based and Classification-based Approaches},
      year = {1999}
    }

### Abstract

The naïve Bayes classifier is built on the assumption of conditional independence between the attributes given the class. The algorithm has been shown to be surprisingly robust to obvious violations of this condition, but it is natural to ask if it is possible to further improve the accuracy by relaxing this assumption. We examine an approach where naïve Bayes is augmented by the addition of correlation arcs between attributes. We explore two methods for finding the set of augmenting arcs: a greedy hill-climbing search, and a novel, more computationally efficient algorithm that we call SuperParent. We compare these methods to TAN, a state-of-the-art distribution-based approach to finding the augmenting arcs.

1 INTRODUCTION

The Bayesian classifier (Duda & Hart, 1973) is a simple classification method, which classifies an instance j by determining the probability of it belonging to class $C_i$. These probabilities are calculated as $P(C_i \mid A_1 = V_{1j} \ \&\ \dots\ \&\ A_N = V_{Nj})$ (1), where an exam...
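The classification rule the abstract describes can be sketched as a frequency-count classifier: estimate the class prior and the per-attribute conditional probabilities, then pick the class maximizing their product. This is a generic illustration of naïve Bayes, not the authors' implementation; the Laplace-smoothing constants and the toy attribute values are assumptions made here for the sketch.

```python
from collections import Counter

def train_naive_bayes(rows, labels):
    """Estimate P(C) and P(A_k = v | C) from frequency counts."""
    class_counts = Counter(labels)
    # cond[(k, v, c)] counts attribute k taking value v within class c
    cond = Counter()
    for row, c in zip(rows, labels):
        for k, v in enumerate(row):
            cond[(k, v, c)] += 1
    return class_counts, cond

def classify(row, class_counts, cond):
    """Pick the class maximizing P(C) * prod_k P(A_k = v_k | C),
    i.e. the conditional-independence decomposition of equation (1)."""
    total = sum(class_counts.values())
    best, best_score = None, -1.0
    for c, n_c in class_counts.items():
        score = n_c / total  # class prior P(C)
        for k, v in enumerate(row):
            # add-one (Laplace) smoothing so unseen values don't zero the product;
            # the "+2" denominator assumes roughly binary attributes (illustrative)
            score *= (cond[(k, v, c)] + 1) / (n_c + 2)
        if score > best_score:
            best, best_score = c, score
    return best
```

Training on a handful of labeled rows and calling `classify` on a new row then returns the class whose prior-times-likelihood product is largest; the augmenting-arc methods the paper compares all start from this independent-attribute baseline.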

### Citations

7089 | Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference - Pearl - 1988

Citation Context: ...k assumptions (Pazzani 1996, Friedman & Goldszmidt 1996, Sahami 1996, Kononenko 1991). The work of Friedman and Goldszmidt is particularly interesting. They compared naïve Bayes to Bayesian networks (Pearl 1988), a much more powerful and flexible representation of probabilistic dependence. Surprisingly, using unrestricted Bayesian networks did not generally lead to improvements in accuracy and even reduced ...

2880 | UCI Repository of machine learning databases - Blake, Merz - 1998

Citation Context: ...imizations described in section 2.1 2.2 EXPERIMENTAL RESULTS Our experimental methodology is closely modeled on that of Friedman and Goldszmidt (1996). We tested 13 data sets from the UCI repository (Merz et al, 1997) and one artificial data set. The accuracy of each learning method on each domain was determined by running 5*2-fold cross validation (Dietterich 1996). All classification algorithms were trained and...

1122 | Pattern recognition and neural networks - Ripley - 1996

Citation Context: ...independent attributes, then the probability is proportional to: $P(C_i) \prod_k P(A_k = V_{kj} \mid C_i)$ (2). When this independence assumption is made, the classifier is called naïve (Simple, Idiots) Bayes (Ripley 1996). Naïve Bayes has been shown to be competitive with more complex, state-of-the-art classifiers (Dougherty 1995, Kohavi & Sahami 1995). This is surprising given the explicit assumption that all attrib...

658 | Multi-interval discretization of continuous-valued attributes for classification learning - Fayyad, Irani - 1993 |

641 | Approximating discrete probability distributions with dependence trees - Chow, Liu - 1968 |

605 | On the Optimality of the Simple Bayesian Classifier under Zero-One Loss - Domingos, Pazzani - 1997

Citation Context: ...icit assumption that all attributes are independent given the class. This assumption rarely holds in real world problems. There have been recent attempts to explain its surprisingly good performance (Domingos & Pazzani 1997) and to improve performance by relaxing the independence...

531 | Statistical Test for Comparing Supervised Classification Learning Algorithms - Dietterich - 1998

Citation Context: ...We tested 13 data sets from the UCI repository (Merz et al, 1997) and one artificial data set. The accuracy of each learning method on each domain was determined by running 5*2-fold cross validation (Dietterich 1996). All classification algorithms were trained and tested on exactly the same cross validation folds. Following Friedman and Goldszmidt, instances with missing values were deleted from the database and...

411 | Supervised and unsupervised discretization of continuous features - Dougherty, Kohavi, et al. - 1995 |

213 | Induction of selective Bayesian classifiers - Langley, Sage - 1994 |

111 | Semi-naive Bayesian classifier - Kononenko - 1991 |

78 | Building classifiers using Bayesian networks - Friedman, Goldszmidt - 1996

Citation Context: ...Figure 1: An example of a Naïve Bayes Network ... assumptions (Pazzani 1996, Friedman & Goldszmidt 1996, Sahami 1996, Kononenko 1991). The work of Friedman and Goldszmidt is particularly interesting. They compared naïve Bayes to Bayesian networks (Pearl 1988), a much more powerful and flexible represen...

70 | Searching for dependencies in Bayesian classifiers - Pazzani - 1996

Citation Context: ...Figure 1: An example of a Naïve Bayes Network ... assumptions (Pazzani 1996, Friedman & Goldszmidt 1996, Sahami 1996, Kononenko 1991). The work of Friedman and Goldszmidt is particularly interesting. They compared naïve Bayes to Bayesian networks (Pearl 1988), a much more po...

38 | Constructive induction of Cartesian product attributes - Pazzani - 1996

Citation Context: ...Figure 1: An example of a Naïve Bayes Network ... assumptions (Pazzani 1996, Friedman & Goldszmidt 1996, Sahami 1996, Kononenko 1991). The work of Friedman and Goldszmidt is particularly interesting. They compared naïve Bayes to Bayesian networks (Pearl 1988), a much more po...

37 | Feature subset selection as search with probabilistic estimates - Kohavi - 1994 |

1 | Semi-naïve Bayesian classifier - Kononenko - 1991

Citation Context: ...Figure 1: An example of a Naïve Bayes Network ... assumptions (Pazzani 1996, Friedman & Goldszmidt 1996, Sahami 1996, Kononenko 1991). The work of Friedman and Goldszmidt is particularly interesting. They compared naïve Bayes to Bayesian networks (Pearl 1988), a much more powerful and flexible representation of probabilistic depen...