Results 1 - 8 of 8
Soft Margins for AdaBoost
, 1998
Cited by 256 (22 self)
Recently, ensemble methods like AdaBoost have been successfully applied to character recognition tasks, seemingly defying the problem of overfitting. This paper shows that although AdaBoost rarely overfits in the low-noise regime, it clearly does so for higher noise levels. Central to understanding this fact is the margin distribution: we find that AdaBoost, by doing gradient descent in an error function with respect to the margin, asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns (here an interesting overlap with Support Vectors emerges). This is clearly a suboptimal strategy in the noisy case, and regularization, i.e. a mistrust in the data, must be introduced in the algorithm to alleviate the distortions that difficult patterns (e.g. outliers) can cause to the margin distribution. We propose several regularization methods and generalizations of the original AdaBoost algorithm to achieve a soft margin, a ...
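The margin behaviour described in this abstract can be illustrated with a minimal sketch of plain (unregularized) AdaBoost over decision stumps. This is not the paper's code or its soft-margin variants: the function names `fit_stump`, `stump_predict` and `adaboost_margins` are hypothetical, and the normalized margin is assumed to be y_i * F(x_i) / sum(alpha).

```python
import numpy as np

def stump_predict(X, j, thr, sign):
    """Decision stump: sign * (+1 if feature j <= thr else -1)."""
    return sign * np.where(X[:, j] <= thr, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                err = w[stump_predict(X, j, thr, sign) != y].sum()
                if err < best[3]:
                    best = (j, thr, sign, err)
    return best

def adaboost_margins(X, y, rounds=50):
    """Run AdaBoost and return (normalized margins, final pattern weights)."""
    n = len(y)
    w = np.full(n, 1.0 / n)          # pattern weights, kept normalized
    F = np.zeros(n)                  # unnormalized ensemble output
    alpha_sum = 0.0
    for _ in range(rounds):
        j, thr, sign, err = fit_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        if err >= 0.5:               # no useful weak hypothesis left
            break
        alpha = 0.5 * np.log((1 - err) / err)
        h = stump_predict(X, j, thr, sign)
        F += alpha * h
        alpha_sum += alpha
        w *= np.exp(-alpha * y * h)  # emphasize misclassified patterns
        w /= w.sum()
    margins = y * F / alpha_sum if alpha_sum > 0 else np.zeros(n)
    return margins, w

# Toy 1-D data that no single stump can separate (assumed for illustration).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1, 1, -1, -1, 1, 1])
margins, weights = adaboost_margins(X, y)
```

Inspecting `weights` after many rounds shows where the algorithm concentrates its effort; with a mislabeled point added to the data, that point would typically attract much of the weight, which is the situation the proposed regularization is meant to address.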
An introduction to boosting and leveraging
 Advanced Lectures on Machine Learning, LNCS
, 2003
Sparse Regression Ensembles in Infinite and Finite Hypothesis Spaces
, 2000
Cited by 20 (9 self)
We examine methods for constructing regression ensembles based on a linear program (LP). The ensemble regression function consists of linear combinations of base hypotheses generated by some boosting-type base learning algorithm. Unlike the classification case, for regression the set of possible hypotheses producible by the base learning algorithm may be infinite. We explicitly tackle the issue of how to define and solve ensemble regression when the hypothesis space is infinite. Our approach is based on a semi-infinite linear program that has an infinite number of constraints and a finite number of variables. We show that the regression problem is well posed for infinite hypothesis spaces in both the primal and dual spaces. Most importantly, we prove there exists an optimal solution to the infinite hypothesis-space problem consisting of a finite number of hypotheses. We propose two algorithms for solving the infinite and finite hypothesis problems. One uses a column generation simplex-type algorithm and the other adopts an exponential barrier approach. Furthermore, we give sufficient conditions for the base learning algorithm and the hypothesis set to be used for infinite regression ensembles. Computational results show that these methods are extremely promising.
Barrier Boosting
Cited by 18 (7 self)
Boosting algorithms like AdaBoost and Arc-GV are iterative strategies to minimize a constrained objective function, equivalent to Barrier algorithms.
On the convergence of leveraging
 In Advances in Neural Information Processing Systems (NIPS)
, 2002
Cited by 11 (2 self)
We give a unified convergence analysis of ensemble learning methods including e.g. AdaBoost, Logistic Regression and the Least-Square-Boost algorithm for regression. These methods have in common that they iteratively call a base learning algorithm which returns hypotheses that are then linearly combined. We show that these methods are related to the Gauss-Southwell method known from numerical optimization and state non-asymptotic convergence results for all these methods. Our analysis includes ℓ1-norm regularized cost functions, leading to a clean and general way to regularize ensemble learning.
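The link to the Gauss-Southwell method can be made concrete with a small sketch for the squared-loss case only: at each step, the coordinate (base hypothesis) with the largest absolute gradient is selected and its coefficient is updated by exact line search. This is an illustration under assumptions, not the paper's analysis: the hypothesis set is assumed finite and given as the columns of a matrix `H`, and the name `gauss_southwell_ls` is hypothetical.

```python
import numpy as np

def gauss_southwell_ls(H, y, steps=200, tol=1e-12):
    """Greedy coordinate descent on 0.5 * ||y - H @ alpha||^2.

    H[:, j] holds the outputs of base hypothesis j on the training set.
    """
    n, m = H.shape
    alpha = np.zeros(m)
    residual = y.astype(float).copy()
    col_sq = (H ** 2).sum(axis=0)          # squared norm of each column
    for _ in range(steps):
        grad = -H.T @ residual             # gradient w.r.t. alpha
        j = int(np.argmax(np.abs(grad)))   # Gauss-Southwell selection rule
        if abs(grad[j]) < tol or col_sq[j] == 0:
            break
        step = -grad[j] / col_sq[j]        # exact line search along coordinate j
        alpha[j] += step
        residual -= step * H[:, j]
    return alpha

# Two orthogonal base hypotheses evaluated on three training points (toy data).
H = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
y = np.array([3.0, 4.0, 5.0])
alpha = gauss_southwell_ls(H, y)
```

Selecting the steepest coordinate plays the role of the base learner returning the hypothesis that best fits the current residual, which is the sense in which leveraging methods resemble this optimization scheme.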
Web Person Name Disambiguation by Relevance Weighting of Extended Feature Sets
Cited by 3 (0 self)
This paper describes our approach to the Person Name Disambiguation clustering task in the Third Web People Search Evaluation Campaign (WePS-3). The method focuses on two aspects: extended feature sets and feature relevance weighting. Bag-of-words and named entities are the most commonly used features in many existing web entity disambiguation algorithms, and we further extend this basic feature set with Wikipedia concepts. Two feature weighting models are then employed. One is the feature relevance to the target person name (or "query name"), and the other is the feature relevance to the text content. A similarity score is calculated according to the feature weights for clustering documents of the same person. Experiments show that the system based on our approach generated the best results among all the WePS-3 submissions.
Applying Support Vector Machines and Boosting to a Non-Intrusive Monitoring System for Household Electric Appliances with Inverters
Cited by 2 (0 self)
A non-intrusive load monitoring system has been developed for estimating the behavior of individual electrical appliances from the measurement of the total household load demand curve. The system is useful for monitoring both inverter and non-inverter type appliances that change their mode of operation over time. The total load demand is measured at the entrance of the feeder line into the house, and the operating status of household electric appliances can be identified with the help of Support Vector Machines (SVM), Boosting, RBF and neural network techniques by analyzing the characteristic frequency content of the household load curve. Load curve measurements of air conditioners, refrigerators (inverter type and non-inverter type), incandescent lights, fluorescent lights and television systems are used as examples for training and test data. So far only a small data set was measured for this feasibility study and our experiments show a great potential for machine learn...
Boosting by weighting critical and erroneous samples
 Vanessa Gómez-Verdejo, Manuel Ortega-Moral, Jerónimo Arenas-García, ...
, 2006
Real AdaBoost is a well-known, well-performing boosting method used to build machine ensembles for classification. Considering that its emphasis function can be decomposed into two factors that pay separate attention to sample errors and to their proximity to the classification border, a generalized emphasis function that combines both components by means of a selectable parameter, λ, is presented. Experiments show that simple methods of selecting λ frequently offer better performance and smaller ensembles. © 2006 Elsevier B.V. All rights reserved.