Results 1  10
of
833
Online Learning with Kernels
, 2003
"... Kernel based algorithms such as support vector machines have achieved considerable success in various problems in the batch setting where all of the training data is available in advance. Support vector machines combine the socalled kernel trick with the large margin idea. There has been little u ..."
Abstract

Cited by 2084 (126 self)
 Add to MetaCart
Kernel based algorithms such as support vector machines have achieved considerable success in various problems in the batch setting where all of the training data is available in advance. Support vector machines combine the socalled kernel trick with the large margin idea. There has been little use of these methods in an online setting suitable for realtime applications. In this paper we consider online learning in a Reproducing Kernel Hilbert Space. By considering classical stochastic gradient descent within a feature space, and the use of some straightforward tricks, we develop simple and computationally efficient algorithms for a wide range of problems such as classification, regression, and novelty detection. In addition to allowing the exploitation of the kernel trick in an online setting, we examine the value of large margins for classification in the online setting with a drifting target. We derive worst case loss bounds and moreover we show the convergence of the hypothesis to the minimiser of the regularised risk functional. We present some experimental results that support the theory as well as illustrating the power of the new algorithms for online novelty detection. In addition
A tutorial on support vector regression
, 2004
"... In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing ..."
Abstract

Cited by 493 (2 self)
 Add to MetaCart
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
Manifold regularization: A geometric framework for learning from labeled and unlabeled examples
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2006
"... We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning al ..."
Abstract

Cited by 343 (13 self)
 Add to MetaCart
We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semisupervised framework that incorporates labeled and unlabeled data in a generalpurpose learner. Some transductive graph learning algorithms and standard methods including Support Vector Machines and Regularized Least Squares can be obtained as special cases. We utilize properties of Reproducing Kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graphbased approaches) we obtain a natural outofsample extension to novel examples and so are able to handle both transductive and truly semisupervised settings. We present experimental evidence suggesting that our semisupervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework.
Regularization Theory and Neural Networks Architectures
 Neural Computation
, 1995
"... We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Ba ..."
Abstract

Cited by 317 (31 self)
 Add to MetaCart
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
Regularization networks and support vector machines
 Advances in Computational Mathematics
, 2000
"... Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization a ..."
Abstract

Cited by 276 (33 self)
 Add to MetaCart
Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples – in particular the regression problem of approximating a multivariate function from sparse data. Radial Basis Functions, for example, are a special case of both regularization and Support Vector Machines. We review both formulations in the context of Vapnik’s theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics. The emphasis is on regression: classification is treated as a special case.
Interpolation of Scattered Data: Distance Matrices and Conditionally Positive Definite Functions
 CONSTRUCTIVE APPROXIMATION
, 1986
"... Among other things, we prove that multiquadric surface interpolation is always solvable, thereby settling a conjecture of R. Franke. ..."
Abstract

Cited by 274 (3 self)
 Add to MetaCart
Among other things, we prove that multiquadric surface interpolation is always solvable, thereby settling a conjecture of R. Franke.
On the mathematical foundations of learning
 Bulletin of the American Mathematical Society
, 2002
"... The problem of learning is arguably at the very core of the problem of intelligence, both biological and arti cial. T. Poggio and C.R. Shelton ..."
Abstract

Cited by 227 (12 self)
 Add to MetaCart
The problem of learning is arguably at the very core of the problem of intelligence, both biological and arti cial. T. Poggio and C.R. Shelton
Support Vector Machines for Classification and Regression
 UNIVERSITY OF SOUTHAMPTON, TECHNICAL REPORT
, 1998
"... The problem of empirical data modelling is germane to many engineering applications.
In empirical data modelling a process of induction is used to build up a model of the
system, from which it is hoped to deduce responses of the system that have yet to be observed.
Ultimately the quantity and qualit ..."
Abstract

Cited by 205 (5 self)
 Add to MetaCart
The problem of empirical data modelling is germane to many engineering applications.
In empirical data modelling a process of induction is used to build up a model of the
system, from which it is hoped to deduce responses of the system that have yet to be observed.
Ultimately the quantity and quality of the observations govern the performance
of this empirical model. By its observational nature data obtained is finite and sampled;
typically this sampling is nonuniform and due to the high dimensional nature of the
problem the data will form only a sparse distribution in the input space. Consequently
the problem is nearly always ill posed (Poggio et al., 1985) in the sense of Hadamard
(Hadamard, 1923). Traditional neural network approaches have suffered difficulties with
generalisation, producing models that can overfit the data. This is a consequence of the
optimisation algorithms used for parameter selection and the statistical measures used
to select the ’best’ model. The foundations of Support Vector Machines (SVM) have
been developed by Vapnik (1995) and are gaining popularity due to many attractive
features, and promising empirical performance. The formulation embodies the Structural
Risk Minimisation (SRM) principle, which has been shown to be superior, (Gunn
et al., 1997), to traditional Empirical Risk Minimisation (ERM) principle, employed by
conventional neural networks. SRM minimises an upper bound on the expected risk,
as opposed to ERM that minimises the error on the training data. It is this difference
which equips SVM with a greater ability to generalise, which is the goal in statistical
learning. SVMs were developed to solve the classification problem, but recently they
have been extended to the domain of regression problems (Vapnik et al., 1997). In the
literature the terminology for SVMs can be slightly confusing. The term SVM is typically
used to describe classification with support vector methods and support vector
regression is used to describe regression with support vector methods. In this report
the term SVM will refer to both classification and regression methods, and the terms
Support Vector Classification (SVC) and Support Vector Regression (SVR) will be used
for specification. This section continues with a brief introduction to the structural risk
An equivalence between sparse approximation and Support Vector Machines
 A.I. Memo 1606, MIT Arti cial Intelligence Laboratory
, 1997
"... This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: aipublications/15001999/AIM1606.ps.Z This paper shows a relationship between two di erent approximation techniques: the Support Vector Machines (SVM), proposed by V.Vapnik (1995), ..."
Abstract

Cited by 204 (7 self)
 Add to MetaCart
This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. The pathname for this publication is: aipublications/15001999/AIM1606.ps.Z This paper shows a relationship between two di erent approximation techniques: the Support Vector Machines (SVM), proposed by V.Vapnik (1995), and a sparse approximation scheme that resembles the Basis Pursuit DeNoising algorithm (Chen, 1995 � Chen, Donoho and Saunders, 1995). SVM is a technique which can be derived from the Structural Risk Minimization Principle (Vapnik, 1982) and can be used to estimate the parameters of several di erent approximation schemes, including Radial Basis Functions, algebraic/trigonometric polynomials, Bsplines, and some forms of Multilayer Perceptrons. Basis Pursuit DeNoising is a sparse approximation technique, in which a function is reconstructed by using a small number of basis functions chosen from a large set (the dictionary). We show that, if the data are noiseless, the modi ed version of Basis Pursuit DeNoising proposed in this paper is equivalent to SVM in the following sense: if applied to the same data set the two techniques give the same solution, which is obtained by solving the same quadratic programming problem. In the appendix we also present a derivation of the SVM technique in the framework of regularization theory, rather than statistical learning theory, establishing a connection between SVM, sparse approximation and regularization theory.
Multicategory Support Vector Machines, theory, and application to the classification of microarray data and satellite radiance data
 Journal of the American Statistical Association
, 2004
"... Twocategory support vector machines (SVM) have been very popular in the machine learning community for classi � cation problems. Solving multicategory problems by a series of binary classi � ers is quite common in the SVM paradigm; however, this approach may fail under various circumstances. We pro ..."
Abstract

Cited by 179 (18 self)
 Add to MetaCart
Twocategory support vector machines (SVM) have been very popular in the machine learning community for classi � cation problems. Solving multicategory problems by a series of binary classi � ers is quite common in the SVM paradigm; however, this approach may fail under various circumstances. We propose the multicategory support vector machine (MSVM), which extends the binary SVM to the multicategory case and has good theoretical properties. The proposed method provides a unifying framework when there are either equal or unequal misclassi � cation costs. As a tuning criterion for the MSVM, an approximate leaveoneout crossvalidation function, called Generalized Approximate Cross Validation, is derived, analogous to the binary case. The effectiveness of the MSVM is demonstrated through the applications to cancer classi � cation using microarray data and cloud classi � cation with satellite radiance pro � les.