Results 1–10 of 10
New Support Vector Algorithms
, 2000
Abstract

Cited by 322 (45 self)
this article with the regression case. To explain this, we will introduce a suitable definition of a margin that is maximized in both cases
Boosting Algorithms as Gradient Descent
, 2000
Abstract

Cited by 115 (2 self)
Much recent attention, both experimental and theoretical, has been focussed on classification algorithms which produce voted combinations of classifiers. Recent theoretical work has shown that the impressive generalization performance of algorithms like AdaBoost can be attributed to the classifier having large margins on the training data. We present an abstract algorithm for finding linear combinations of functions that minimize arbitrary cost functionals (i.e. functionals that do not necessarily depend on the margin). Many existing voting methods can be shown to be special cases of this abstract algorithm. Then, following previous theoretical results bounding the generalization performance of convex combinations of classifiers in terms of general cost functions of the margin, we present a new algorithm (DOOM II) for performing a gradient descent optimization of such cost functions. Experiments on ...
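The abstract algorithm described above (boosting viewed as gradient descent on a cost functional of the margins) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `anyboost` name, the decision-stump base class, the fixed step size, and the exponential cost are all assumptions chosen for brevity.

```python
import numpy as np

def anyboost(X, y, cost_grad, rounds=20, step=0.5):
    """Functional-gradient-descent sketch of boosting.

    F is a voted combination of decision stumps; each round picks the
    stump most correlated with the negative gradient of the cost,
    evaluated at the current margins y * F(X), then takes a fixed step.
    """
    n = len(y)
    F = np.zeros(n)  # current combined scores F(x_i)
    stumps = []      # chosen (feature, threshold, sign, weight) tuples
    for _ in range(rounds):
        # Pseudo-residuals: direction that most decreases the cost of the margins.
        w = -cost_grad(y * F) * y
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j]):
                for s in (1.0, -1.0):
                    h = s * np.sign(X[:, j] - t + 1e-12)
                    corr = np.dot(w, h)  # correlation with negative gradient
                    if best is None or corr > best[0]:
                        best = (corr, j, t, s)
        _, j, t, s = best
        F += step * s * np.sign(X[:, j] - t + 1e-12)
        stumps.append((j, t, s, step))
    return stumps, F
```

With `cost_grad(m) = -exp(-m)` (the gradient of the exponential cost C(m) = exp(-m)), the per-round weighting reduces to AdaBoost-style example weights, which is the sense in which existing voting methods are special cases of the abstract algorithm.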
Improved Generalization through Explicit Optimization of Margins
 Machine Learning
, 1999
Abstract

Cited by 65 (5 self)
Recent theoretical results have shown that the generalization performance of thresholded convex combinations of base classifiers is greatly improved if the underlying convex combination has large margins on the training data (correct examples are classified well away from the decision boundary). Neural network algorithms and AdaBoost have been shown to implicitly maximize margins, thus providing some theoretical justification for their remarkably good generalization performance. In this paper we are concerned with maximizing the margin explicitly. In particular, we prove a theorem bounding the generalization performance of convex combinations in terms of general cost functions of the margin (previous results were stated in terms of the particular cost function sgn(θ − margin)). We then present an algorithm (DOOM) for directly optimizing a piecewise-linear family of cost functions satisfying the conditions of the theorem. Experiments on several of the datasets in the UC Irvine database are presented in which AdaBoost was used to generate a set of base classifiers and then DOOM was used to find the optimal convex combination of those classifiers. In all but one case the convex combination generated by DOOM had lower test error than AdaBoost's combination. In many cases DOOM achieves these lower test errors by sacrificing training error, in the interests of reducing the new cost function. The margin plots also show that the size of the minimum margin is not relevant to generalization performance.
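A piecewise-linear margin cost of the general kind DOOM optimizes can be illustrated with a minimal sketch. The specific ramp shape and the `theta` parameter below are assumptions chosen for illustration, not the paper's exact cost family: the point is only that such a cost upper-bounds the 0–1 training error while also penalizing small positive margins, which is why minimizing it can trade training error for margin.

```python
import numpy as np

def piecewise_linear_cost(margins, theta=0.5):
    """Average piecewise-linear margin cost over a sample.

    Each example contributes 1 when its margin is <= 0 (a mistake),
    slopes linearly down to 0 as the margin grows to theta, and
    contributes 0 once the margin is at least theta.
    """
    m = np.asarray(margins, dtype=float)
    return float(np.mean(np.clip(1.0 - m / theta, 0.0, 1.0)))
```

Because each term is at least the 0–1 loss, the average cost is always an upper bound on training error; an example classified correctly but with margin below `theta` still incurs positive cost.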
Direct Optimization of Margins Improves Generalization in Combined Classifiers
 Advances in Neural Information Processing Systems
, 1998
Abstract

Cited by 28 (1 self)
Figure caption (Sonar dataset): cumulative training margin distributions for AdaBoost versus our "Direct Optimization Of Margins" (DOOM) algorithm.
Combining protein secondary structure prediction models with ensemble methods of optimal complexity
, 2004
Probabilistic Analysis of Learning in Artificial Neural Networks: The PAC Model and its Variants
, 1997
Abstract

Cited by 18 (4 self)
There are a number of mathematical approaches to the study of learning and generalization in artificial neural networks. Here we survey the 'probably approximately correct' (PAC) model of learning and some of its variants. These models provide a probabilistic framework for the discussion of generalization and learning. This survey concentrates on the sample complexity questions in these models; that is, the emphasis is on how many examples should be used for training. Computational complexity considerations are briefly discussed for the basic PAC model. Throughout, the importance of the Vapnik-Chervonenkis dimension is highlighted. Particular attention is devoted to describing how the probabilistic models apply in the context of neural network learning, both for networks with binary-valued output and for networks with real-valued output.
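One classical sample-complexity bound of the kind this survey discusses (for a consistent learner over a hypothesis class of VC dimension d, in the form due to Blumer, Ehrenfeucht, Haussler and Warmuth) can be computed directly. The function name below is illustrative:

```python
import math

def pac_sample_bound(vc_dim, eps, delta):
    """Classical PAC sample-complexity upper bound:

        m >= (4 log2(2/delta) + 8 d log2(13/eps)) / eps

    examples suffice for any consistent learner over a class of
    VC dimension d to achieve error <= eps with probability >= 1 - delta.
    """
    return math.ceil((4 * math.log2(2 / delta)
                      + 8 * vc_dim * math.log2(13 / eps)) / eps)
```

The bound grows linearly in the VC dimension and roughly as (1/eps) log(1/eps) in the accuracy parameter, which is the sense in which the VC dimension controls how many training examples are needed.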
Error Bounds for Voting Classifiers Using Margin Cost Functions
, 1999
Abstract
Recent theoretical results have shown that the accuracy of thresholded real-valued functions (such as voting classifiers) is greatly improved if the underlying function has large margins on the training data (that is, correct examples are classified well away from the decision boundary). In this paper, we give bounds on the misclassification probability of convex combinations of classifiers in terms of general cost functions of the margin.
Sample Complexity of Classifiers Taking Values in R^Q, Application to Multi-Class SVMs
Abstract
Bounds on the risk play a crucial role in statistical learning theory. They usually involve, as capacity measure of the model studied, the VC dimension or one of its extensions. In classification, such VC dimensions exist for models taking values in {0, 1}, [1, Q], and R. We introduce the generalizations appropriate for the missing case, the one of models with values in R^Q. This provides us with a new guaranteed risk for M-SVMs (multi-class SVMs). For those models, a sharper bound is obtained by using the Rademacher complexity.