Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods (1997) [501 citations — 44 self]

Abstract:

One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated hypothesis usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error. We also show theoretically and experimentally that boosting is especially effective at increasing the margins of the training examples. Finally, we compare our explanation to those based on th...
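The margin definition in the abstract can be illustrated with a minimal Python sketch. It assumes unweighted votes (each voter casts exactly one vote), which is a simplification made here for illustration; the function names `vote_margin` and `margin_distribution` are hypothetical and do not come from the paper.

```python
from collections import Counter


def vote_margin(votes, true_label):
    """Margin of one example under an unweighted vote (an assumption).

    Returns the number of votes for the correct label minus the largest
    number of votes received by any single incorrect label.
    """
    counts = Counter(votes)
    correct = counts.get(true_label, 0)
    # Largest vote count among incorrect labels; 0 if every vote is correct.
    wrong = max(
        (n for label, n in counts.items() if label != true_label),
        default=0,
    )
    return correct - wrong


def margin_distribution(all_votes, true_labels):
    """Sorted margins of a set of examples, one vote list per example."""
    return sorted(vote_margin(v, y) for v, y in zip(all_votes, true_labels))


if __name__ == "__main__":
    # Five voters, three of which pick the correct label "a".
    print(vote_margin(["a", "a", "b", "c", "a"], "a"))
    print(margin_distribution(
        [["a", "a", "b"], ["b", "b", "b"], ["a", "b", "b"]],
        ["a", "b", "a"],
    ))
```

Under this sketch, a positive margin means the correct label wins the plurality vote, and a larger margin means it wins by more votes. A negative margin means some incorrect label received more votes than the correct one.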

Citations

5065 The Nature of Statistical Learning Theory – Vapnik - 1995
2579 Classification and Regression Trees – Breiman, Friedman, et al. - 1984
2229 UCI repository of machine learning databases – Blake, Merz - 1998
1570 Bagging predictors – Breiman - 1996
1210 A decision-theoretic generalization of on-line learning and an application to boosting – Freund, Schapire - 1997
1095 Support vector networks – Cortes, Vapnik - 1995
1048 Experiments with a new boosting algorithm – Freund, Schapire - 1996
719 A training algorithm for optimal margin classifiers – Boser, Guyon, et al. - 1992
683 On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and its Applications – Vapnik, Chervonenkis - 1971
460 The strength of weak learnability – Schapire - 1990
400 Improved boosting algorithms using confidence-rated predictions – Schapire, Singer - 1999
301 An experimental comparison of three methods for constructing ensembles of decision trees – Dietterich - 2000
296 Boosting a weak learning algorithm by majority – Freund - 1995
282 What size net gives valid generalization – Baum, Haussler - 1989
222 Bagging, boosting, and C4.5 – Quinlan - 1996
177 On the density of families of sets – Sauer - 1972
130 The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network – Bartlett - 1998
115 Error-correcting output coding corrects bias and variance – Kong, Dietterich - 1995
97 A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training – Jones - 1992
94 Game theory, on-line prediction, and boosting – Freund, Schapire - 1996
79 Boosting decision trees – Drucker, Cortes - 1995
72 An empirical evaluation of bagging and boosting – Maclin, Opitz - 1997
66 Using output codes to boost multiclass learning problems – Schapire - 1997
51 Efficient agnostic learning of neural networks with bounded fan-in – Lee, Bartlett, et al. - 1996
49 Improving regressors using boosting techniques – Drucker - 1997
38 Solving multiclass learning problems via error-correcting output codes – Dietterich, Bakiri - 1995
24 Prediction games and arcing classifiers – Breiman - 1997
13 A framework for structural risk minimisation – Shawe-Taylor, Bartlett, et al. - 1996
11 Rate of convex approximation in non-Hilbert spaces, Constructive Approximation 13 – Donahue, Gurvits, et al. - 1997
11 Bias, variance and prediction error for classification rules – Tibshirani - 1996
8 Boosting in the limit: Maximizing the margin of learned ensembles – Grove, Schuurmans - 1998
1 Bounds for the uniform deviation of empirical measures – Devroye - 1982
1 Structural risk minimization over data-dependent hierarchies – Shawe-Taylor, Bartlett, et al. - 1996