An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants (1998) [372 citations — 2 self]

Abstract:

Methods for voting classification algorithms, such as Bagging and AdaBoost, have been shown to be very successful in improving the accuracy of certain classifiers for artificial and real-world datasets. We review these algorithms and describe a large empirical study comparing several variants in conjunction with a decision tree inducer (three variants) and a Naive-Bayes inducer. The purpose of the study is to improve our understanding of why and when these algorithms, which use perturbation, reweighting, and combination techniques, affect classification error. We provide a bias and variance decomposition of the error to show how different methods and variants influence these two terms. This allowed us to determine that Bagging reduced variance of unstable methods, while boosting methods (AdaBoost and Arc-x4) reduced both the bias and variance of unstable methods but increased the variance for Naive-Bayes, which was very stable. We observed that Arc-x4 behaves differently than AdaBoost if r...
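As a rough illustration of the perturbation and combination steps the abstract attributes to Bagging, the sketch below trains several classifiers on bootstrap replicates of a training set and combines them by unweighted majority vote. It is not the study's setup: the one-feature decision stump, the toy data, and the names `train_stump` and `bagging` are assumptions introduced here for illustration only.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a one-feature threshold classifier (hypothetical base learner).

    points: list of (x, label) pairs with label in {0, 1}.
    Tries every observed x as a threshold and both label orientations,
    keeping the first rule with the fewest training errors.
    """
    best = None
    for t in sorted({x for x, _ in points}):
        for left_label in (0, 1):
            errors = sum(
                1 for x, y in points
                if (left_label if x <= t else 1 - left_label) != y
            )
            if best is None or errors < best[0]:
                best = (errors, t, left_label)
    _, t, left_label = best
    return lambda x: left_label if x <= t else 1 - left_label


def bagging(points, n_models, rng):
    """Train n_models stumps on bootstrap replicates; return a voting predictor."""
    models = []
    for _ in range(n_models):
        # Perturbation: sample len(points) examples with replacement.
        replicate = [rng.choice(points) for _ in points]
        models.append(train_stump(replicate))

    def predict(x):
        # Combination: unweighted majority vote over all trained stumps.
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


if __name__ == "__main__":
    # Toy data (an assumption): label 0 for x in 0..9, label 1 for x in 10..19.
    data = [(x, 0) for x in range(10)] + [(x, 1) for x in range(10, 20)]
    predict = bagging(data, n_models=25, rng=random.Random(0))
    print(predict(0), predict(19))
```

An odd number of models is used so that a two-class vote cannot tie. Boosting variants such as AdaBoost differ in that they reweight training examples between rounds and weight the votes, which this sketch does not attempt.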
