Results 11-20 of 196
The Kernel Recursive Least Squares Algorithm
IEEE Transactions on Signal Processing, 2003
Cited by 62 (2 self)
We present a nonlinear kernel-based version of the Recursive Least Squares (RLS) algorithm. Our Kernel-RLS (KRLS) algorithm performs linear regression in the feature space induced by a Mercer kernel, and can therefore be used to recursively construct the minimum mean squared error regressor. Sparsity of the solution is achieved by a sequential sparsification process that admits into the kernel representation a new input sample only if its feature space image cannot be sufficiently well approximated by combining the images of previously admitted samples. This sparsification procedure is crucial to the operation of KRLS, as it allows it to operate online and effectively regularizes its solutions. A theoretical analysis of the sparsification method reveals its close affinity to kernel PCA, and a data-dependent loss bound is presented, quantifying the generalization performance of the KRLS algorithm. We demonstrate the performance and scaling properties of KRLS and compare it to a state-of-the-art Support Vector Regression algorithm, using both synthetic and real data. We additionally test KRLS on two signal processing problems in which the use of traditional least-squares methods is commonplace: time series prediction and channel equalization.
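The admission rule described in this abstract can be illustrated with a minimal sketch. It assumes that "approximated by combining the images" means a least-squares linear combination in feature space, and it recomputes the kernel system from scratch at every step rather than updating it recursively as an online algorithm would. The names `rbf`, `solve`, `sparsify` and the threshold `nu` are hypothetical, chosen for illustration only.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian (RBF) kernel, used here as one example of a Mercer kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def sparsify(samples, kernel=rbf, nu=1e-3):
    """Return the samples admitted to the dictionary (assumed threshold nu)."""
    dictionary = []
    for x in samples:
        if not dictionary:
            dictionary.append(x)
            continue
        # Kernel matrix of the dictionary and kernel vector against x.
        K = [[kernel(u, v) for v in dictionary] for u in dictionary]
        k = [kernel(u, x) for u in dictionary]
        a = solve(K, k)  # best combination coefficients
        # Squared feature-space residual of the best approximation of x.
        delta = kernel(x, x) - sum(ki * ai for ki, ai in zip(k, a))
        if delta > nu:  # admit only if poorly approximated
            dictionary.append(x)
    return dictionary
```

For example, `sparsify([(0.0,), (0.0,), (5.0,), (5.0,)])` keeps one copy of each distinct point, because a repeated point is reproduced exactly by the samples already admitted.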
Regularized Least-Squares Classification
Cited by 58 (1 self)
We consider the solution of binary classification problems via Tikhonov regularization in a Reproducing Kernel Hilbert Space using the square loss, and denote the resulting algorithm Regularized Least-Squares Classification (RLSC). We sketch ...
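As a rough sketch of what Tikhonov regularization with the square loss looks like, the problem can be written as follows. The notation is standard but assumed here, not taken from the truncated abstract: $n$ training pairs $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, a kernel $K$ with RKHS $\mathcal{H}_K$, and a regularization parameter $\lambda > 0$.

```latex
\min_{f \in \mathcal{H}_K} \; \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 + \lambda \, \|f\|_K^2,
\qquad
f(x) = \sum_{i=1}^{n} c_i \, K(x, x_i),
\qquad
(\mathbf{K} + \lambda n I)\, c = y .
```

Here $\mathbf{K}_{ij} = K(x_i, x_j)$, the middle expression is the form of the minimizer given by the representer theorem, and a new point $x$ would be classified by the sign of $f(x)$.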
Active learning with Gaussian processes for object categorization
In ICCV, 2007
Cited by 50 (12 self)
Discriminative methods for visual object category recognition are typically nonprobabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) are powerful regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. The uncertainty model provided by GPs offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We derive a novel active category learning method based on our probabilistic regression model, and show that a significant boost in classification performance is possible, especially when the amount of training data for a category is ultimately very small.
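To illustrate how predictive uncertainty can drive the choice of the next point to label, here is a minimal sketch of one plausible selection rule: query the unlabeled point whose predicted label is least confident, measured as the absolute predictive mean divided by the predictive standard deviation. This criterion, the function name `select_query`, and the `noise_var` parameter are assumptions made for illustration; the abstract does not state the rule the paper actually derives.

```python
import math

def select_query(means, variances, noise_var=0.01):
    """Return the index of the unlabeled point with the least confident prediction.

    means, variances: GP predictive mean and variance for each unlabeled point
    (assumed to be computed elsewhere). noise_var is an assumed noise level.
    """
    def confidence(i):
        # Small |mean| or large variance both signal an uncertain label.
        return abs(means[i]) / math.sqrt(variances[i] + noise_var)
    return min(range(len(means)), key=confidence)
```

With equal variances the rule picks the point whose mean is closest to the decision boundary at zero; with equal means it picks the point with the largest variance.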
Almost-Everywhere Algorithmic Stability and Generalization Error
In UAI-2002: Uncertainty in Artificial Intelligence, 2002
Cited by 44 (8 self)
We introduce a new notion of algorithmic stability, which we call training stability.
Statistical performance of support vector machines
Ann. Statist., 2008
Cited by 42 (8 self)
The support vector machine (SVM) algorithm is well known to the computer learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of concentration theory and empirical processes. Our main result builds on the observation made by other authors that the SVM can be viewed as a statistical regularization procedure. From this point of view, it can also be interpreted as a model selection principle using a penalized criterion. It is then possible to adapt general methods related to model selection in this framework to study two important points: (1) what is the minimum penalty and how does it compare to the penalty actually used in the SVM algorithm; (2) is it possible to obtain “oracle inequalities” in that setting, for the specific loss function used in the SVM algorithm? We show that the answer to the latter question is positive and provides relevant insight to the former. Our result shows that it is possible to obtain fast rates of convergence for SVMs.
Best choices for regularization parameters in learning theory: on the bias-variance problem
Foundations of Computational Mathematics
Cited by 40 (9 self)
The goal of learning theory (and a goal in some other contexts as well) is to find an approximation of a function $f_\rho : X \to Y$ known only through a set of pairs $z = (x_i, y_i)_{i=1}^{m}$ drawn from an unknown probability measure $\rho$ on $X \times Y$ ($f_\rho$ is the “regression function” of $\rho$).
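For reference, the regression function of a measure is usually defined as the conditional mean of the output given the input. In the abstract's symbols, and writing $\rho(y \mid x)$ for the conditional distribution (notation assumed here, not shown in the excerpt), this reads:

```latex
f_\rho(x) = \int_Y y \, d\rho(y \mid x), \qquad x \in X .
```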
Learning theory estimates via integral operators and their approximations
Submitted, 2005. Retrievable at http://www.ttic.org/smale.html
Cited by 39 (5 self)
This report on learning theory is written in the spirit of: The best understanding of what one can see comes from theories of what one can’t see. This thought has been expressed in a number of ways by different scientists, and is supported everywhere. Obvious choices vary from gravity to economic equilibrium. For
Optimal rates for the regularized least-squares algorithm
Foundations of Computational Mathematics
Cited by 39 (8 self)
We develop a theoretical analysis of the generalization performance of regularized least-squares on reproducing kernel Hilbert spaces for supervised learning. We show that the concept of effective dimension of an integral operator plays a central role in the definition of a criterion for the choice of the regularization parameter as a function of the number of samples. In fact, a minimax analysis is performed which shows asymptotic optimality of the above-mentioned criterion.
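The effective dimension mentioned above is commonly defined, for a positive operator $T$ and regularization parameter $\lambda > 0$, as $\mathcal{N}(\lambda) = \mathrm{Tr}[(T + \lambda)^{-1} T]$, which equals the sum of $\sigma_i / (\sigma_i + \lambda)$ over the eigenvalues $\sigma_i$ of $T$. The sketch below assumes this definition and a finite list of eigenvalues supplied by the caller (for instance, those of a normalized kernel matrix); the function name is hypothetical.

```python
def effective_dimension(eigenvalues, lam):
    """Sum of sigma / (sigma + lam) over non-negative eigenvalues, for lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return sum(s / (s + lam) for s in eigenvalues)
```

Each eigenvalue much larger than `lam` contributes close to 1 and each much smaller one contributes close to 0, so the quantity roughly counts the directions that the regularization leaves unsuppressed.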