Results 1 – 5 of 5
Pegasos: Primal Estimated subgradient solver for SVM
Abstract

Cited by 532 (21 self)
We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ɛ²) iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total runtime of our method is Õ(d/(λɛ)), where d is a bound on the number of nonzero features in each example. Since the runtime does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach also extends to nonlinear kernels while working solely on the primal objective function, though in this case the runtime does depend linearly on the training set size. Our algorithm is particularly well suited for large text classification problems, where we demonstrate an order-of-magnitude speedup over previous SVM learning methods.
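The abstract describes an update in which each iteration draws a single training example and takes a subgradient step on the regularized hinge-loss objective. A minimal sketch of such an update, assuming the standard step size η_t = 1/(λt) (the published Pegasos algorithm also includes an optional projection step that is omitted here for brevity):

```python
import numpy as np

def pegasos(X, y, lam=0.1, n_iter=1000, seed=0):
    """Stochastic subgradient descent for a linear SVM, Pegasos-style.

    Minimizes (lam/2)*||w||^2 + (1/m) * sum_i max(0, 1 - y_i * (w @ x_i)).
    Each iteration operates on a single random training example, with
    step size eta_t = 1 / (lam * t).
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iter + 1):
        i = rng.integers(m)              # pick one example uniformly at random
        eta = 1.0 / (lam * t)
        if y[i] * (X[i] @ w) < 1:        # margin violated: hinge term contributes
            w = (1 - eta * lam) * w + eta * y[i] * X[i]
        else:                            # only the regularization gradient applies
            w = (1 - eta * lam) * w
    return w
```

Because each step touches one example, the per-iteration cost is O(d) for d nonzero features, which is what makes the total runtime independent of the training-set size for a linear kernel.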
Large scale online learning
 Advances in Neural Information Processing Systems 16, 2004
Abstract

Cited by 76 (6 self)
We consider situations where training data is abundant and computing resources are comparatively scarce. We argue that suitably designed online learning algorithms asymptotically outperform any batch learning algorithm. Both theoretical and experimental evidence is presented.
Pegasos: Primal estimated subgradient
 In ICML, 2007
Abstract

Cited by 3 (0 self)
We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ɛ²) iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total runtime of our method is Õ(d/(λɛ)), where d is a bound on the number of nonzero features in each example. Since the runtime does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach also extends to nonlinear kernels while working solely on the primal objective function, though in this case the runtime does depend linearly on the training set size. Our algorithm is particularly well suited for large text classification problems, where we demonstrate an order-of-magnitude speedup over previous SVM learning methods.
Apprentissage Stochastique pour Très Grands
Abstract
The design of very large learning systems raises a great number of unsolved problems. Do we know, for example, how to build an algorithm that "watches" television for a few weeks and learns to enumerate the objects present in those images? The scaling laws of our algorithms do not allow us to handle the massive quantities of data this implies. Experience suggests that the best-suited algorithms are stochastic algorithms. Their convergence, however, is reputed to be much slower than that of the best optimization algorithms. But this is convergence toward the empirical optimum. Our paper reformulates the question in terms of convergence toward the point of best generalization and shows the superiority of a well-designed stochastic algorithm.
Pegasos: Primal Estimated subGrAdient SOlver for SVM
 Mathematical Programming
Abstract
We describe and analyze a simple and effective stochastic subgradient descent algorithm for solving the optimization problem cast by Support Vector Machines (SVM). We prove that the number of iterations required to obtain a solution of accuracy ɛ is Õ(1/ɛ), where each iteration operates on a single training example. In contrast, previous analyses of stochastic gradient descent methods for SVMs require Ω(1/ɛ²) iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total runtime of our method is Õ(d/(λɛ)), where d is a bound on the number of nonzero features in each example. Since the runtime does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach also extends to nonlinear kernels while working solely on the primal objective function, though in this case the runtime does depend linearly on the training set size. Our algorithm is particularly well suited for large text classification problems, where we demonstrate an order-of-magnitude speedup over previous SVM learning methods.