Results 1–10 of 18
The strength of weak learnability
 Machine Learning
, 1990
Abstract

Cited by 667 (23 self)
Abstract. This paper addresses the problem of improving the accuracy of a hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output a hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce a hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error ε.
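As a hedged illustration of the weak-to-strong intuition (not the paper's actual construction, which is a recursive majority-of-majorities): if many hypotheses each beat random guessing by a margin γ and, for the sake of this toy calculation, err independently, a majority vote over them becomes arbitrarily accurate. Real boosting does not need the independence assumption; it reweights examples instead.

```python
from math import comb

def majority_accuracy(t, gamma):
    """Probability that a majority vote of t independent weak hypotheses,
    each correct with probability 0.5 + gamma, is itself correct.
    Toy model only: assumes the hypotheses' errors are independent."""
    p = 0.5 + gamma
    # the majority is correct when more than half the individual votes are
    return sum(comb(t, k) * p**k * (1 - p)**(t - k)
               for k in range(t // 2 + 1, t + 1))

# accuracy of the vote grows toward 1 as more weak hypotheses are combined
for t in (1, 11, 101, 1001):
    print(t, majority_accuracy(t, 0.05))
```

This is the sense in which "slightly better than random" is already enough: the edge γ compounds across voters.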
Boosting a Weak Learning Algorithm By Majority
, 1995
Abstract

Cited by 419 (16 self)
We present an algorithm for improving the accuracy of algorithms for learning binary concepts. The improvement is achieved by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples. Our algorithm is based on ideas presented by Schapire in his paper "The strength of weak learnability", and represents an improvement over his results. The analysis of our algorithm provides general upper bounds on the resources required for learning in Valiant's polynomial PAC learning framework, which are the best general upper bounds known today. We show that the number of hypotheses that are combined by our algorithm is the smallest number possible. Other outcomes of our analysis are results regarding the representational power of threshold circuits, the relation between learnability and compression, and a method for parallelizing PAC learning algorithms. We provide extensions of our algorithms to cases in which the conc...
Theoretical Views of Boosting and Applications
, 1999
Abstract

Cited by 50 (2 self)
Boosting is a general method for improving the accuracy of any given learning algorithm. Focusing primarily on the AdaBoost algorithm, we briefly survey theoretical work on boosting, including analyses of AdaBoost's training error and generalization error, connections between boosting and game theory, methods of estimating probabilities using boosting, and extensions of AdaBoost for multiclass classification problems. Some empirical work and applications are also described.

Background. Boosting is a general method which attempts to "boost" the accuracy of any given learning algorithm. Kearns and Valiant [29, 30] were the first to pose the question of whether a "weak" learning algorithm which performs just slightly better than random guessing in Valiant's PAC model [44] can be "boosted" into an arbitrarily accurate "strong" learning algorithm. Schapire [36] came up with the first provable polynomial-time boosting algorithm in 1989. A year later, Freund [16] developed a much more effici...
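Since the survey centers on AdaBoost, a minimal self-contained sketch of its reweighting loop may be useful. This is a textbook-style rendering on toy 1-D data with threshold stumps, not code from the survey; the variable names and the stump base learner are this sketch's own choices.

```python
import math

def train_adaboost(xs, ys, rounds):
    """Minimal AdaBoost on 1-D data with threshold stumps.
    xs: list of floats; ys: list of +1/-1 labels."""
    n = len(xs)
    w = [1.0 / n] * n              # example weights, updated each round
    ensemble = []                  # list of (alpha, threshold, sign)
    for _ in range(rounds):
        # choose the stump h(x) = sign * sgn(x - thr) with least weighted error
        best = None
        for thr in xs:
            for sgn in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if sgn * (1 if x > thr else -1) != y)
                if best is None or err < best[0]:
                    best = (err, thr, sgn)
        err, thr, sgn = best
        err = max(err, 1e-10)      # guard against a perfect stump (division by zero)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, sgn))
        # re-weight: misclassified examples gain weight, correct ones lose it
        w = [wi * math.exp(-alpha * y * sgn * (1 if x > thr else -1))
             for wi, x, y in zip(w, xs, ys)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted majority vote of the trained stumps."""
    s = sum(a * sgn * (1 if x > thr else -1) for a, thr, sgn in ensemble)
    return 1 if s >= 0 else -1
```

The reweighting step is the "adaptive" part the survey analyzes: the distribution maintained over examples concentrates on the points the ensemble still gets wrong.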
Data Filtering and Distribution Modeling Algorithms for Machine Learning
, 1993
Abstract

Cited by 18 (4 self)
Table of contents (excerpt):
Acknowledgments
1. Introduction
1.1 Boosting by majority
1.2 Query By Committee
1.3 Learning distributions of binary vectors
2. Boosting a weak learning algorithm by majority
2.1 Introduction
2.2 The majority-vote game
2.2.1 Optimality of the weighting scheme
2.2.2 The representational power of majority gates
2.3 Boosting a weak learner using a majority vote
2.3.1 Preliminaries ...
Fast Object Detection with Occlusions
 in ECCV 2004
, 2004
Abstract

Cited by 15 (1 self)
Abstract. We describe a new framework, based on boosting algorithms and cascade structures, to efficiently detect objects/faces with occlusions. While our approach is motivated by the work of Viola and Jones, several techniques have been developed for establishing a more general system, including (i) a robust boosting scheme, to select useful weak learners and to avoid overfitting; (ii) reinforcement training, to reduce false-positive rates via a more effective training procedure for boosted cascades; and (iii) cascading with evidence, to extend the system to handle occlusions without compromising detection speed. Experimental results on detecting faces under various situations are provided to demonstrate the performance of the proposed method.
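For background on the cascade idea this abstract builds on: a detection cascade runs cheap classifier stages first and rejects a candidate window as soon as any stage's score falls below its threshold, so most non-object windows exit early. The sketch below is a generic Viola-Jones-style skeleton, not this paper's "cascading with evidence" variant; `score_fn` and the stage layout are hypothetical.

```python
def cascade_detect(window, stages):
    """Evaluate a detection cascade on one candidate window.
    stages: list of (score_fn, threshold) pairs, cheapest first.
    Accepts the window only if it passes every stage."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False   # early reject: most background windows exit here
    return True            # survived every stage: report a detection
```

The speedup comes from ordering: a weak, fast first stage filters out the vast majority of windows, so the expensive later stages run on almost nothing.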
An Empirical Comparison of Three Boosting Algorithms on Real Data Sets with Artificial Class Noise
 IN FOURTH INTERNATIONAL WORKSHOP ON MULTIPLE CLASSIFIER SYSTEMS
, 2003
Abstract

Cited by 9 (0 self)
Boosting algorithms are a means of building a strong ensemble classifier by aggregating a sequence of weak hypotheses. In this paper we consider three of the best-known boosting algorithms: Adaboost [8], Logitboost [10] and Brownboost [7]. These algorithms are adaptive, and work by maintaining a set of example and class weights which focus the attention of a base learner on the examples that are hardest to classify. We conduct
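One way to see why these algorithms can react differently to class noise (an illustration added here, not from the paper): AdaBoost effectively minimizes an exponential loss of the margin, while LogitBoost minimizes the binomial log-likelihood loss (per Friedman, Hastie and Tibshirani); BrownBoost's non-convex loss is omitted from this sketch. The exponential loss grows far faster on badly misclassified examples, such as mislabeled ones, which is why AdaBoost's weights can concentrate on noise.

```python
import math

def exp_loss(margin):
    """AdaBoost's exponential loss of the margin y * f(x)."""
    return math.exp(-margin)

def logistic_loss(margin):
    """LogitBoost's binomial log-likelihood loss of the margin."""
    return math.log(1 + math.exp(-2 * margin))

# a large negative margin (a confidently misclassified, possibly mislabeled
# example) is penalized far more heavily by the exponential loss
for m in (-4, -1, 0, 1, 4):
    print(m, exp_loss(m), logistic_loss(m))
```

Both losses agree on the easy region (large positive margins cost almost nothing); they diverge exactly where noisy labels live.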
A survey on computational learning theory
 Formal Techniques in Artificial Intelligence: a Sourcebook
, 1990
A Monte Carlo Analysis of Ensemble Classification
 In: Proc. 23rd International Conference on Machine Learning
, 2004
Abstract

Cited by 2 (0 self)
In this paper we extend previous results providing a theoretical analysis of a new Monte Carlo ensemble classifier. The framework allows us to characterize the conditions under which the ensemble approach can be expected to outperform the single-hypothesis classifier. Moreover, we provide a closed-form expression for the distribution of the true ensemble accuracy, as well as of its mean and variance. We then exploit this result in order to analyze the expected error behavior in a particularly interesting case.
Efficient Estimators for Generalized Additive Models
, 2005
Abstract

Cited by 2 (0 self)
Generalized additive models are a powerful generalization of linear and logistic regression models. In this paper we show that a natural regression graph learning algorithm efficiently learns generalized additive models. Efficiency is proven in two senses: the estimator's future prediction accuracy approaches optimality at a rate inverse polynomial in the size of the training data, and its runtime is polynomial in the size of the training data. Furthermore, the guarantees are nearly linear in terms of the dimensionality (number of regressors) of the problem, and hence the algorithm does not suffer from the "curse of dimensionality." The algorithm is a simple generalization of Mansour and McAllester's classification algorithm that generates decision graphs, i.e., decision trees with merges. Our analysis can also be viewed as defining a natural extension of the original classification boosting theorems (Schapire, 1990) to the regression setting. Loosely speaking, we define a weak correlator to be a real-valued predictor whose correlation coefficient with the target function is bounded away from zero. We show how to efficiently boost weak correlators to get predictions with correlation arbitrarily close to 1 (error arbitrarily close to 0). Our boosting analysis is a natural extension of the classification boosting analysis of Kearns and Mansour (1999) and Mansour and McAllester (2002).