Results 1 - 3 of 3
Online Learning with Kernels, 2003
Abstract

Cited by 2029 (128 self)
Kernel based algorithms such as support vector machines have achieved considerable success in various problems in the batch setting where all of the training data is available in advance. Support vector machines combine the so-called kernel trick with the large margin idea. There has been little use of these methods in an online setting suitable for real-time applications. In this paper we consider online learning in a Reproducing Kernel Hilbert Space. By considering classical stochastic gradient descent within a feature space, and the use of some straightforward tricks, we develop simple and computationally efficient algorithms for a wide range of problems such as classification, regression, and novelty detection. In addition to allowing the exploitation of the kernel trick in an online setting, we examine the value of large margins for classification in the online setting with a drifting target. We derive worst case loss bounds and moreover we show the convergence of the hypothesis to the minimiser of the regularised risk functional. We present some experimental results that support the theory as well as illustrating the power of the new algorithms for online novelty detection. In addition ...
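The abstract's core idea, stochastic gradient descent applied directly in a reproducing kernel Hilbert space, can be sketched in a few lines. The class below is a minimal, illustrative implementation of kernelised online gradient descent on the regularised hinge loss; the class name, parameter values, and the choice of Gaussian kernel are assumptions for the sketch, not the paper's exact algorithm.

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    # Gaussian RBF kernel on tuples of floats (illustrative choice).
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

class OnlineKernelClassifier:
    """Online SGD on the regularised hinge loss in an RKHS (sketch).

    The hypothesis is kept as a kernel expansion f(x) = sum_i alpha_i k(x_i, x),
    so each gradient step only touches the coefficient list."""

    def __init__(self, eta=0.1, lam=0.01, kernel=gaussian_kernel):
        self.eta, self.lam, self.kernel = eta, lam, kernel
        self.support = []   # stored examples x_i
        self.alpha = []     # their expansion coefficients alpha_i

    def predict(self, x):
        return sum(a * self.kernel(xi, x)
                   for a, xi in zip(self.alpha, self.support))

    def update(self, x, y):
        margin = y * self.predict(x)
        # Gradient of the regulariser (lam/2)||f||^2 shrinks all coefficients.
        self.alpha = [a * (1 - self.eta * self.lam) for a in self.alpha]
        if margin < 1:
            # Hinge loss is active: the loss gradient adds a new expansion term.
            self.support.append(x)
            self.alpha.append(self.eta * y)
```

Because every margin violation appends one term to the expansion, practical versions additionally truncate or discard old coefficients to bound memory, one of the "straightforward tricks" the abstract alludes to.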
Structure Identification Using Separate Validation Data - Asymptotic Properties
 In: Proc. European Control Conference, 1995
Abstract

Cited by 3 (3 self)
Model structures are compared by estimating their expected prediction performance on a validation data sequence that is independent of the data sequence used for parameter estimation. Here we show that under reasonable assumptions, this common procedure asymptotically leads to a model with the best possible expected prediction performance within the given model set. 1 Introduction The decomposition of a model set into a parametric and a structural level is common in system identification. The parameters are typically identified using a prediction error criterion, e.g. [1], while there are several possible criteria for structure identification. These include FPE [2], AIC [3], MDL [4], Bayesian criteria [5], cross-validation [6], the unbiasedness criterion [7], as well as the simplest and perhaps most popular criterion, namely the use of separate validation data to assess the prediction performance of the different identified models. The major drawback of this last procedure, compared...
On Convergence Proofs in System Identification - A General Principle Using Ideas from Learning Theory
Abstract

Cited by 3 (0 self)
this paper are different from the ones used in (Ljung 1987). The present conditions may be more restrictive, particularly if we assume, as in Theorem 3, that z_t belongs to a compact set. 4 Parameter and Structure Identification Parameterization of the model set is introduced in this section. The model set is decomposed into structural and parametric levels, which allows structure identification to be studied in the same framework. Furthermore, we will show how the well known convergence result in (Ljung 1978) now follows directly. Let S be a set of model structures. A model structure S ∈ S is a set of possibly nonlinear equations, which may be algebraic, difference, differential, or a combination. The equations may contain some unknown parameters θ ...