The relaxed online maximum margin algorithm (0)

by Y Li, P Long
Venue: Mach Learn
Results 1 - 10 of 46

Online Learning with Kernels

by Jyrki Kivinen, Alexander J. Smola, Robert C. Williamson, 2003
Cited by 1512 (112 self)
Kernel based algorithms such as support vector machines have achieved considerable success in various problems in the batch setting where all of the training data is available in advance. Support vector machines combine the so-called kernel trick with the large margin idea. There has been little use of these methods in an online setting suitable for real-time applications. In this paper we consider online learning in a Reproducing Kernel Hilbert Space. By considering classical stochastic gradient descent within a feature space, and the use of some straightforward tricks, we develop simple and computationally efficient algorithms for a wide range of problems such as classification, regression, and novelty detection. In addition to allowing the exploitation of the kernel trick in an online setting, we examine the value of large margins for classification in the online setting with a drifting target. We derive worst case loss bounds and moreover we show the convergence of the hypothesis to the minimiser of the regularised risk functional. We present some experimental results that support the theory as well as illustrating the power of the new algorithms for online novelty detection. In addition
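The idea of running stochastic gradient descent directly in a kernel feature space can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the class name `KernelSGDClassifier`, the Gaussian kernel, the hinge-style margin test, and all default parameter values are hypothetical choices made for this example.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel between two equal-length sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


class KernelSGDClassifier:
    """Online classifier f(x) = sum_i alpha_i * k(x_i, x).

    Each step is one SGD step on a regularised margin loss:
    the regulariser shrinks every coefficient, and a margin
    violation adds the current example as a new expansion term.
    """

    def __init__(self, kernel=rbf_kernel, eta=0.5, lam=0.01, margin=1.0):
        self.kernel = kernel
        self.eta = eta        # learning rate (assumed value)
        self.lam = lam        # regularisation strength (assumed value)
        self.margin = margin  # target margin (assumed value)
        self.points = []
        self.alphas = []

    def decision(self, x):
        return sum(a * self.kernel(p, x) for a, p in zip(self.alphas, self.points))

    def partial_fit(self, x, y):
        """Process one example with label y in {-1, +1}; return the prediction made before the update."""
        score = self.decision(x)
        # Gradient of the regulariser: shrink all existing coefficients.
        decay = 1.0 - self.eta * self.lam
        self.alphas = [a * decay for a in self.alphas]
        # Gradient of the margin loss: add the example if the margin is violated.
        if y * score < self.margin:
            self.points.append(x)
            self.alphas.append(self.eta * y)
        return 1 if score >= 0 else -1
```

Because old coefficients decay geometrically, terms far in the past contribute little, which is one reason this style of update suits a drifting target. Note that the expansion grows with every margin violation, so a practical version would need some form of truncation.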

Online Passive-Aggressive Algorithms

by Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer - Journal of Machine Learning Research, 2006
Cited by 181 (14 self)
We present a family of margin based online learning algorithms for various prediction tasks. In particular we derive and analyze algorithms for binary and multiclass categorization, regression, uniclass prediction and sequence prediction. The update steps of our different algorithms are all based on analytical solutions to simple constrained optimization problems. This unified view allows us to prove worst-case loss bounds for the different algorithms and for the various decision problems based on a single lemma. Our bounds on the cumulative loss of the algorithms are relative to the smallest loss that can be attained by any fixed hypothesis, and as such are applicable to both realizable and unrealizable settings. We demonstrate some of the merits of the proposed algorithms in a series of experiments with synthetic and real data sets.
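For binary classification, the "analytical solution to a simple constrained optimization problem" can be illustrated with the basic passive-aggressive step: find the weight vector closest to the current one that attains zero hinge loss on the new example. The sketch below shows only that basic variant; the function name `pa_update` is hypothetical, and the relaxed variants and other prediction tasks covered by the paper are not shown.

```python
def pa_update(w, x, y):
    """One passive-aggressive step for binary classification, y in {-1, +1}.

    Solves  min_w' 0.5 * ||w' - w||^2  subject to  y * <w', x> >= 1
    in closed form and returns the new weight vector as a list.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)          # hinge loss on this example
    sq_norm = sum(xi * xi for xi in x)
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)                     # passive: constraint already satisfied
    tau = loss / sq_norm                   # aggressive: smallest step that fixes it
    return [wi + tau * y * xi for wi, xi in zip(w, x)]
```

For example, starting from `w = [0.0, 0.0]` with `x = [1.0, 2.0]` and `y = 1`, the loss is 1 and the squared norm is 5, so the step size is 0.2 and the updated vector attains a margin of exactly 1 on that example.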

Ultraconservative Online Algorithms for Multiclass Problems

by Koby Crammer, Yoram Singer - Journal of Machine Learning Research, 2001
Cited by 175 (18 self)
In this paper we study online classification algorithms for multiclass problems in the mistake bound model. The hypotheses we use maintain one prototype vector per class. Given an input instance, a multiclass hypothesis computes a similarity-score between each prototype and the input instance and then sets the predicted label to be the index of the prototype achieving the highest similarity. To design and analyze the learning algorithms in this paper we introduce the notion of ultraconservativeness. Ultraconservative algorithms are algorithms that update only the prototypes attaining similarity-scores which are higher than the score of the correct label's prototype. We start by describing a family of additive ultraconservative algorithms where each algorithm in the family updates its prototypes by finding a feasible solution for a set of linear constraints that depend on the instantaneous similarity-scores. We then discuss a specific online algorithm that seeks a set of prototypes which have a small norm. The resulting algorithm, which we term MIRA (for Margin Infused Relaxed Algorithm) is ultraconservative as well. We derive mistake bounds for all the algorithms and provide further analysis of MIRA using a generalized notion of the margin for multiclass problems.
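The prototype-per-class hypothesis and the ultraconservative restriction can be sketched with one simple additive update: the correct prototype moves toward the instance, and only the prototypes that score at least as high as the correct one are moved away, sharing the correction equally. This is an illustrative member of the additive family, not MIRA itself. The function names, the equal sharing, and the choice to treat ties as errors are assumptions of this sketch.

```python
def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def predict(prototypes, x):
    """Return the index of the prototype with the highest similarity score."""
    scores = [_dot(p, x) for p in prototypes]
    return max(range(len(scores)), key=scores.__getitem__)


def ultraconservative_update(prototypes, x, label):
    """Return updated prototypes; only the error-set rows and the correct row change."""
    scores = [_dot(p, x) for p in prototypes]
    # Error set: wrong classes scoring at least as high as the correct class.
    error_set = [r for r, s in enumerate(scores)
                 if r != label and s >= scores[label]]
    new = [list(p) for p in prototypes]
    if not error_set:
        return new                          # correct with no ties: no update
    share = 1.0 / len(error_set)            # equal share per offending class
    for r in error_set:
        new[r] = [pi - share * xi for pi, xi in zip(new[r], x)]
    new[label] = [pi + xi for pi, xi in zip(new[label], x)]
    return new
```

Prototypes of classes that already score below the correct label are left untouched, which is the defining property described in the abstract.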

A New Approximate Maximal Margin Classification Algorithm

by Claudio Gentile - Journal of Machine Learning Research, 2001
Cited by 60 (5 self)
A new incremental learning algorithm is described which approximates the maximal margin hyperplane w.r.t. norm $p \geq 2$ for a set of linearly separable data. Our algorithm, called ALMA$_p$ (Approximate Large Margin algorithm w.r.t. norm $p$), takes $O\!\left(\frac{p-1}{\alpha^2 \gamma^2}\right)$ corrections to separate the data with p-norm margin larger than $(1-\alpha)\gamma$, where $\gamma$ is the (normalized) p-norm margin of the data. ALMA$_p$ avoids quadratic (or higher-order) programming methods. It is very easy to implement and is as fast as on-line algorithms, such as Rosenblatt's Perceptron algorithm. We performed extensive experiments on both real-world and artificial datasets. We compared ALMA$_2$ (i.e., ALMA$_p$ with $p = 2$) to standard Support Vector Machines (SVM) and to two incremental algorithms: the Perceptron algorithm and Li and Long's ROMMA. The accuracy levels achieved by ALMA$_2$ are superior to those achieved by the Perceptron algorithm and ROMMA, but slightly inferior to SVM's. On the other hand, ALMA$_2$ is considerably faster and easier to implement than standard SVM training algorithms. When learning sparse target vectors, ALMA$_p$ with $p > 2$ largely outperforms Perceptron-like algorithms, such as ALMA$_2$.

Online Bayes Point Machines

by Edward Harrington, Ralf Herbrich, Jyrki Kivinen, John C. Platt, Robert C. Williamson
Cited by 55 (2 self)
We present a new and simple algorithm for learning large margin classifiers that works in a truly online manner. The algorithm generates a linear classifier by averaging the weights associated with several perceptron-like algorithms run in parallel in order to approximate the Bayes point. A random subsample of the incoming data stream is used to ensure diversity in the perceptron solutions. We experimentally study the algorithm's performance on online and batch learning settings.
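The scheme described above, several perceptrons run in parallel on random subsamples of the stream with their weights averaged, can be sketched as follows. The class name, the per-example coin flip used for subsampling, and the default values are assumptions of this sketch; the abstract does not specify them.

```python
import random


def _dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


class AveragedPerceptronEnsemble:
    """Several perceptrons, each seeing a random subsample of the stream."""

    def __init__(self, dim, n_perceptrons=10, sample_prob=0.5, seed=0):
        self.weights = [[0.0] * dim for _ in range(n_perceptrons)]
        self.sample_prob = sample_prob      # chance that a perceptron sees an example
        self.rng = random.Random(seed)

    def average(self):
        """Average weight vector, used as the approximate Bayes point."""
        n = len(self.weights)
        return [sum(col) / n for col in zip(*self.weights)]

    def predict(self, x):
        return 1 if _dot(self.average(), x) >= 0 else -1

    def partial_fit(self, x, y):
        """Process one example with label y in {-1, +1}."""
        for w in self.weights:
            if self.rng.random() >= self.sample_prob:
                continue                    # this perceptron skips the example
            if y * _dot(w, x) <= 0:         # standard perceptron mistake test
                for i, xi in enumerate(x):
                    w[i] += y * xi
```

Since each perceptron sees a different subsample, the individual weight vectors differ, and prediction uses only their average.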

Fast Kernel Classifiers With Online And Active Learning

by Antoine Bordes, Seyda Ertekin, Jason Weston, Léon Bottou - Journal of Machine Learning Research, 2005
Cited by 54 (11 self)
Very high dimensional learning systems become theoretically possible when training examples are abundant. The computing cost then becomes the limiting factor. Any efficient learning algorithm should at least take a brief look at each example. But should all examples be given equal attention? This contribution proposes an empirical answer. We first present an online SVM algorithm based on this premise. LASVM yields competitive misclassification rates after a single pass over the training examples, outspeeding state-of-the-art SVM solvers. Then we show how active example selection can yield faster training, higher accuracies, and simpler models, using only a fraction of the training example labels.

A second-order perceptron algorithm

by Nicolò Cesa-Bianchi, Alex Conconi, Claudio Gentile, 2005
Cited by 34 (12 self)
Kernel-based linear-threshold algorithms, such as support vector machines and Perceptron-like algorithms, are among the best available techniques for solving pattern classification problems. In this paper, we describe an extension of the classical Perceptron algorithm, called second-order Perceptron, and analyze its performance within the mistake bound model of on-line learning. The bound achieved by our algorithm depends on the sensitivity to second-order data information and is the best known mistake bound for (efficient) kernel-based linear-threshold classifiers to date. This mistake bound, which strictly generalizes the well-known Perceptron bound, is expressed in terms of the eigenvalues of the empirical data correlation matrix and depends on a parameter controlling the sensitivity of the algorithm to the distribution of these eigenvalues. Since the optimal setting of this parameter is not known a priori, we also analyze two variants of the second-order Perceptron algorithm: one that adaptively sets the value of the parameter in terms of the number of mistakes made so far, and one that is parameterless, based on pseudoinverses.

Worst-Case Analysis of Selective Sampling for Linear Classification

by Nicolò Cesa-Bianchi, Claudio Gentile, Luca Zaniboni - Journal of Machine Learning Research, 2006
Cited by 28 (3 self)
A selective sampling algorithm is a learning algorithm for classification that, based on the past observed data, decides whether to ask the label of each new instance to be classified. In this paper, we introduce a general technique for turning linear-threshold classification algorithms from the general additive family into randomized selective sampling algorithms. For the most popular algorithms in this family we derive mistake bounds that hold for individual sequences of examples. These bounds
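A randomized selective sampler built on the Perceptron can be sketched as below: the learner predicts every instance, but asks for the label only with a probability that shrinks as the current margin grows. The abstract does not state the querying rule, so the rule `b / (b + |margin|)`, the parameter `b`, and all names here are assumptions made for illustration.

```python
import random


def query_probability(margin, b=1.0):
    """Probability of asking for the label; certain when the margin is zero."""
    return b / (b + abs(margin))


class SelectiveSamplingPerceptron:
    def __init__(self, dim, b=1.0, seed=0):
        self.w = [0.0] * dim
        self.b = b                          # larger b means more label queries
        self.rng = random.Random(seed)
        self.queries = 0

    def step(self, x, get_label):
        """Predict x; call get_label() only if the sampler decides to query."""
        margin = sum(wi * xi for wi, xi in zip(self.w, x))
        prediction = 1 if margin >= 0 else -1
        if self.rng.random() < query_probability(margin, self.b):
            self.queries += 1
            y = get_label()
            if y != prediction:             # Perceptron update on queried mistakes only
                self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
        return prediction
```

Passing the label as a callable makes the cost model explicit: labels that are never requested are never looked at.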

Online Classification on a Budget

by Koby Crammer, Jaz Kandola, Yoram Singer - Advances in Neural Information Processing Systems 16, 2003
Cited by 27 (5 self)
Online algorithms for classification often require vast amounts of memory and computation time when employed in conjunction with kernel functions. In this paper we describe and analyze a simple approach for an on-the-fly reduction of the number of past examples used for prediction. Experiments performed with real datasets...

The Forgetron: A kernel-based perceptron on a fixed budget

by Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer - In Advances in Neural Information Processing Systems 18, 2005
Cited by 24 (5 self)
The Perceptron algorithm, despite its simplicity, often performs well on online classification problems. The Perceptron becomes especially effective when it is used in conjunction with kernels. However, a common difficulty encountered when implementing kernel-based online algorithms is the amount of memory required to store the online hypothesis, which may grow unboundedly. In this paper we describe and analyze a new infrastructure for kernel-based learning with the Perceptron while adhering to a strict limit on the number of examples that can be stored. We first describe a template algorithm, called the Forgetron, for online learning on a fixed budget. We then provide specific algorithms and derive a unified mistake bound for all of them. To our knowledge, this is the first online learning paradigm which, on one hand, maintains a strict limit on the number of examples it can store and, on the other hand, entertains a relative mistake bound. We also present experiments with real datasets which underscore the merits of our approach.
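The memory problem and the fixed-budget constraint can be illustrated with a kernel Perceptron that keeps at most a set number of stored examples and discards the oldest one when the limit is exceeded. This is a simplified stand-in, not the Forgetron: it omits any reweighting of stored examples and carries no mistake-bound guarantee. The class name and interface are hypothetical.

```python
from collections import deque


class BudgetKernelPerceptron:
    """Kernel Perceptron that stores at most `budget` examples."""

    def __init__(self, kernel, budget):
        self.kernel = kernel
        # deque with maxlen drops the oldest entry automatically on overflow.
        self.support = deque(maxlen=budget)

    def decision(self, x):
        return sum(y_i * self.kernel(x_i, x) for x_i, y_i in self.support)

    def partial_fit(self, x, y):
        """Process one example with label y in {-1, +1}; return the prediction."""
        prediction = 1 if self.decision(x) >= 0 else -1
        if prediction != y:
            self.support.append((x, y))     # store the mistake, evicting the oldest if full
        return prediction
```

With `budget=2` and a linear kernel, a third mistake evicts the first stored example, so memory use stays constant however long the stream runs.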