Results 1  10
of
27
Online Learning with Kernels
, 2003
"... Kernel based algorithms such as support vector machines have achieved considerable success in various problems in the batch setting where all of the training data is available in advance. Support vector machines combine the socalled kernel trick with the large margin idea. There has been little u ..."
Abstract

Cited by 2029 (128 self)
 Add to MetaCart
Kernel based algorithms such as support vector machines have achieved considerable success in various problems in the batch setting where all of the training data is available in advance. Support vector machines combine the socalled kernel trick with the large margin idea. There has been little use of these methods in an online setting suitable for realtime applications. In this paper we consider online learning in a Reproducing Kernel Hilbert Space. By considering classical stochastic gradient descent within a feature space, and the use of some straightforward tricks, we develop simple and computationally efficient algorithms for a wide range of problems such as classification, regression, and novelty detection. In addition to allowing the exploitation of the kernel trick in an online setting, we examine the value of large margins for classification in the online setting with a drifting target. We derive worst case loss bounds and moreover we show the convergence of the hypothesis to the minimiser of the regularised risk functional. We present some experimental results that support the theory as well as illustrating the power of the new algorithms for online novelty detection. In addition
A tutorial on support vector regression
, 2004
"... In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing ..."
Abstract

Cited by 473 (2 self)
 Add to MetaCart
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.
Predicting Time Series with Support Vector Machines
, 1997
"... . Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ffl insensitive loss and (ii) Huber's robust loss function and discuss how to choose the regularization ..."
Abstract

Cited by 129 (13 self)
 Add to MetaCart
. Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ffl insensitive loss and (ii) Huber's robust loss function and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show an excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by a factor of 29%. 1 Introduction Support Vector Machines have become a subject of intensive study (see e.g. [3, 14]). They have been applied successfully to classification tasks as OCR [14, 11] and more recently also to regression [5, 15]. In this contribution we use Support Vector Machines in the field of time series prediction and we find that they show an excel...
On a Kernelbased Method for Pattern Recognition, Regression, Approximation, and Operator Inversion
, 1997
"... We present a Kernelbased framework for Pattern Recognition, Regression Estimation, Function Approximation and multiple Operator Inversion. Previous approaches such as ridgeregression, Support Vector methods and regression by Smoothing Kernels are included as special cases. We will show connection ..."
Abstract

Cited by 77 (25 self)
 Add to MetaCart
We present a Kernelbased framework for Pattern Recognition, Regression Estimation, Function Approximation and multiple Operator Inversion. Previous approaches such as ridgeregression, Support Vector methods and regression by Smoothing Kernels are included as special cases. We will show connections between the costfunction and some properties up to now believed to apply to Support Vector Machines only. The optimal solution of all the problems described above can be found by solving a simple quadratic programming problem. The paper closes with a proof of the equivalence between Support Vector kernels and Greene's functions of regularization operators.
Bayesian Statistics
 in WWW', Computing Science and Statistics
, 1989
"... ∗ Signatures are on file in the Graduate School. This dissertation presents two topics from opposite disciplines: one is from a parametric realm and the other is based on nonparametric methods. The first topic is a jackknife maximum likelihood approach to statistical model selection and the second o ..."
Abstract

Cited by 20 (0 self)
 Add to MetaCart
∗ Signatures are on file in the Graduate School. This dissertation presents two topics from opposite disciplines: one is from a parametric realm and the other is based on nonparametric methods. The first topic is a jackknife maximum likelihood approach to statistical model selection and the second one is a convex hull peeling depth approach to nonparametric massive multivariate data analysis. The second topic includes simulations and applications on massive astronomical data. First, we present a model selection criterion, minimizing the KullbackLeibler distance by using the jackknife method. Various model selection methods have been developed to choose a model of minimum KullbackLiebler distance to the true model, such as Akaike information criterion (AIC), Bayesian information criterion (BIC), Minimum description length (MDL), and Bootstrap information criterion. Likewise, the jackknife method chooses a model of minimum KullbackLeibler distance through bias reduction. This bias, which is inevitable in model
On Computing Geometric Estimators of Location
, 2001
"... Let S be a data set of n points in R d , and be a point in R d which "best" describes S. Since the term "best" is subjective, there exist several definitions for finding . However, it is generally agreed that such a definition, or estimator of location, should have certain statistical propert ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
Let S be a data set of n points in R d , and be a point in R d which "best" describes S. Since the term "best" is subjective, there exist several definitions for finding . However, it is generally agreed that such a definition, or estimator of location, should have certain statistical properties which make it robust. Most estimators of location assign a depth value to any point in R d and define to be a point with maximum depth. Here, new results are presented concerning the computational complexity of estimators of location. We prove that in R 2 the computation of simplicial and halfspace depth of a point requires\Omega\Gamma n log n) time, which matches the upper bound complexities of algorithms by Rousseeuw and Ruts. Our lower bounds also apply to two sign tests, that of Hodges and that of Oja and Nyblom. In addition, we propose algorithms which reduce the time complexity of calculating the points with greatest Oja and simplicial depth. Our fastest algorithms use O(n 3 log n) and O(n 4 ) time respectively, compared to the algorithms of Rousseeuw and Ruts which use O(n 5 log n) time. One of our algorithms may also be used to find a point with minimum weighted sum of distances to a set of n lines in O(n 2 ) time. This point is called the FermatTorricelli point of n lines by Roy Barbara, whose algorithm uses O(n 3 ) time. Finally, we propose a new estimator which arises from the notion of hyperplane depth recently defined by Rousseeuw and Hubert.
Bayesian kernel methods
 LNAI 2600
, 2003
"... Bayesian methods allow for a simple and intuitive representation of the function spaces used by kernel methods. This chapter describes the basic principles of Gaussian Processes, their implementation and their connection to other kernelbased Bayesian estimation methods, such as the Relevance Vecto ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
Bayesian methods allow for a simple and intuitive representation of the function spaces used by kernel methods. This chapter describes the basic principles of Gaussian Processes, their implementation and their connection to other kernelbased Bayesian estimation methods, such as the Relevance Vector Machine.
FigueirasVidal, “Support vector method for robust ARMA system identification
 IEEE Transactions on Signal Processing
, 2004
"... Abstract—This paper presents a new approach to autoregressive and moving average (ARMA) modeling based on the support vector method (SVM) for identification applications. A statistical analysis of the characteristics of the proposed method is carried out. An analytical relationship between residual ..."
Abstract

Cited by 3 (1 self)
 Add to MetaCart
Abstract—This paper presents a new approach to autoregressive and moving average (ARMA) modeling based on the support vector method (SVM) for identification applications. A statistical analysis of the characteristics of the proposed method is carried out. An analytical relationship between residuals and SVMARMA coefficients allows the linking of the fundamentals of SVM with several classical system identification methods. Additionally, the effect of outliers can be cancelled. Application examples show the performance of SVMARMA algorithm when it is compared with other system identification methods. Index Terms—ARMA modeling, crosscorrelation, support vector method, system identification, time series.
Some Bayesian perspectives on statistical modelling
, 1988
"... I would like to thank my supervisor, Professor A. F. M. Smith, for all his advice and encourage ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
I would like to thank my supervisor, Professor A. F. M. Smith, for all his advice and encourage