Abstract:
In an earlier paper, we introduced a new “boosting” algorithm called AdaBoost which, theoretically, can be used to significantly reduce the error of any learning algorithm that consistently generates classifiers whose performance is a little better than random guessing. We also introduced the related notion of a “pseudo-loss”, which is a method for forcing a learning algorithm of multi-label concepts to concentrate on the labels that are hardest to discriminate. In this paper, we describe experiments we carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems. We performed two sets of experiments. The first set compared boosting to Breiman’s “bagging” method when used to aggregate various classifiers (including decision trees and single attribute-value tests). We compared the performance of the two methods on a collection of machine-learning benchmarks. In the second set of experiments, we studied in more detail the performance of boosting using a nearest-neighbor classifier on an OCR problem.
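To make the boosting idea in the abstract concrete, here is a minimal sketch of two-class AdaBoost over single attribute-value tests (decision stumps). It is an illustration only, not the code used in the experiments: it assumes labels in {-1, +1} and uses the common `alpha = 0.5 * ln((1 - err) / err)` weighting, which may differ in notation from the paper's formulation. It does not cover pseudo-loss or the multi-label case, and all function names (`train_stump`, `adaboost`, `predict`) are hypothetical.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) for the best single-attribute test."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                # The stump predicts `sign` when x[j] <= thr, and `-sign` otherwise.
                preds = [sign if x[j] <= thr else -sign for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, x):
    _, j, thr, sign = stump
    return sign if x[j] <= thr else -sign

def adaboost(X, y, rounds=10):
    """Assumed setting: y[i] in {-1, +1}. Returns a list of (alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n                      # start from uniform example weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = stump[0]
        if err >= 0.5:                     # weak learner no better than random guessing
            break
        err = max(err, 1e-10)              # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]       # renormalize to a distribution
    return ensemble

def predict(ensemble, x):
    """Weighted majority vote of the stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Toy one-dimensional data (made up for illustration) that no single stump can fit.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
print([predict(model, x) for x in X])
```

On this toy data a single stump misclassifies two of the six points, while the weighted vote of three stumps fits all of them, which is the "a little better than random guessing" premise in miniature.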
Citations
|
1570
|
Bagging predictors
– Breiman
- 1996
|
|
1210
|
A decision-theoretic generalization of on-line learning and an application to boosting
– Freund, Schapire
- 1997
|
|
654
|
Fast effective rule induction
– Cohen
- 1995
|
|
460
|
The strength of weak learnability
– Schapire
- 1990
|
|
330
|
Very simple classification rules perform well on most commonly used datasets
– Holte
- 1993
|
|
296
|
Boosting a weak learning algorithm by majority
– Freund
- 1995
|
|
222
|
Bagging, boosting, and C4.5
– Quinlan
- 1996
|
|
190
|
Efficient pattern recognition using a new transformation distance
– Simard, Le Cun, et al.
- 1993
|
|
186
|
The condensed nearest neighbor rule
– Hart
- 1968
|
|
115
|
Error-correcting output coding corrects bias and variance
– Kong, Dietterich
- 1995
|
|
91
|
Improving performance in neural networks using a boosting algorithm
– Drucker, Schapire, et al.
- 1993
|
|
89
|
The reduced nearest neighbor rule
– Gates
- 1972
|
|
84
|
Incremental reduced error pruning
– Fürnkranz, Widmer
- 1994
|
|
79
|
Boosting decision trees
– Drucker, Cortes
- 1995
|
|
69
|
Boosting and other ensemble methods
– Drucker, Cortes, et al.
- 1994
|
|
68
|
Boosting performance in neural networks
– Drucker, Schapire, et al.
- 1993
|
|
65
|
On the boosting ability of top-down decision tree learning algorithms
– Kearns, Mansour
- 1996
|
|
40
|
Applying the weak learning framework to understand and improve C4.5
– Dietterich, Kearns, et al.
- 1996
|
|
19
|
Learning sparse perceptrons
– Jackson, Craven
- 1996
|
|
5
|
Bias, variance, and arcing classifiers. Unpublished manuscript
– Breiman
- 1996
|
|
3
|
A decision-theoretic generalization of online learning and an application to boosting. Unpublished manuscript available electronically (on our web pages, or by email request). An extended abstract appeared
– Freund, Schapire
- 1995
|
|
1
|
Improving performance in neural networks using a boosting algorithm
– Drucker, Schapire, et al.
- 1993
|