Results 1-10 of 38
A tutorial on support vector machines for pattern recognition
 Data Mining and Knowledge Discovery
, 1998
Cited by 2497 (11 self)
The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and nonseparable data, working through a nontrivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
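The kernel mapping mentioned in the abstract above can be illustrated with a small sketch. It is not taken from the tutorial: the function names, the explicit feature map `phi2`, and the sample points are assumptions chosen for illustration. It uses the standard definitions of the homogeneous polynomial kernel, K(x, y) = (x . y)^p, and the Gaussian radial basis function kernel, K(x, y) = exp(-||x - y||^2 / (2 sigma^2)), and checks that for p = 2 in two dimensions the kernel value equals an inner product in an explicit three-dimensional feature space.

```python
import math


def dot(x, y):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, p=2):
    """Homogeneous polynomial kernel K(x, y) = (x . y)^p."""
    return dot(x, y) ** p


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def phi2(x):
    """Explicit feature map for p = 2 in two dimensions (illustrative helper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


# Hypothetical sample points, chosen only for the demonstration.
x, y = (1.0, 2.0), (3.0, -1.0)

k_implicit = poly_kernel(x, y, p=2)   # computed in input space
k_explicit = dot(phi2(x), phi2(y))    # computed in feature space
k_rbf = rbf_kernel(x, y, sigma=1.0)
```

The two polynomial values agree up to floating-point rounding, which is the point of the kernel technique: the feature-space inner product is obtained without constructing the feature vectors.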
A tutorial on support vector regression
, 2004
Cited by 540 (2 self)
In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from an SV perspective.
The connection between regularization operators and support vector kernels
, 1998
Cited by 154 (42 self)
In this paper a correspondence is derived between regularization operators used in regularization networks and support vector kernels. We prove that the Green’s Functions associated with regularization operators are suitable support vector kernels with equivalent regularization properties. Moreover, the paper provides an analysis of currently used support vector kernels in the view of regularization theory and corresponding operators associated with the classes of both polynomial kernels and translation invariant kernels. The latter are also analyzed on periodical domains. As a byproduct we show that a large number of radial basis functions, namely conditionally positive definite
Probabilistic Kernel Regression Models
 In Proceedings of the 1999 Conference on AI and Statistics
, 1999
Cited by 109 (3 self)
We introduce a class of flexible conditional probability models and techniques for classification/regression problems. Many existing methods such as generalized linear models and support vector machines are subsumed under this class. The flexibility of this class of techniques comes from the use of kernel functions as in support vector machines, and the generality from dual formulations of standard regression models.
1 Introduction
Support vector machines [10] are linear maximum margin classifiers exploiting the idea of a kernel function. A kernel function defines an embedding of examples into (high or infinite dimensional) feature vectors and allows the classification to be carried out in the feature space without ever explicitly representing it. While support vector machines are nonprobabilistic classifiers, they can be extended and formalized for probabilistic settings [12] (recently also [8]), which is the topic of this paper. We can also identify the new formulations with other s...
Support Vector Regression and Classification Based Multiview Face Detection and Recognition
 In IEEE International Conference on Automatic Face & Gesture Recognition
, 2000
Cited by 63 (8 self)
A Support Vector Machine based multiview face detection and recognition framework is described in this paper. Face detection is carried out by constructing several detectors, each of them in charge of one specific view. The symmetrical property of face images is employed to simplify the complexity of the modelling. The estimation of head pose, which is achieved by using the Support Vector Regression technique, provides crucial information for choosing the appropriate face detector. This helps to improve the accuracy and reduce the computation in multiview face detection compared to other methods. For video sequences, further computational reduction can be achieved by using a Pose Change Smoothing strategy. When face detectors find a face in frontal view, a Support Vector Machine based multiclass classifier is activated for face recognition. All the above issues are integrated under a Support Vector Machine framework. Test results on four video sequences are presented: the detection rate is above 95%, recognition accuracy is above 90%, average pose estimation error is around 10°, and the full detection and recognition speed is up to 4 frames/second on a Pentium II 300 PC.
Structural modelling with sparse kernels
 In Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms
, 2002
Cited by 42 (0 self)
A widely acknowledged drawback of many statistical modelling techniques, commonly used in machine learning, is that the resulting model is extremely difficult to interpret. A number of new concepts and algorithms have been introduced by researchers to address this problem. They focus primarily on determining which inputs are relevant in predicting the output. This work describes a transparent, advanced nonlinear modelling approach that enables the constructed predictive models to be visualised, allowing model validation and assisting in interpretation. The technique combines the representational advantage of a sparse ANOVA decomposition with the good generalisation ability of a kernel machine. It achieves this by employing two forms of regularisation: a 1-norm based structural regulariser to enforce transparency, and a 2-norm based regulariser to control smoothness. The resulting model structure can be visualised showing the overall effects of different inputs, their interactions, and the strength of the interactions. The robustness of the technique is illustrated using a range of both artificial and “real world” datasets. The performance is compared to other modelling techniques, and it is shown to exhibit competitive generalisation performance together with improved interpretability.
Incremental Learning with Support Vector Machines
, 2002
Cited by 25 (0 self)
Support Vector Machines (SVMs) have become a popular tool for learning with large amounts of high dimensional data. However, it may sometimes be preferable to learn incrementally from previous SVM results, as computing an SVM is very costly in terms of time and memory consumption or because the SVM may be used in an online learning setting. In this paper an approach for incremental learning with Support Vector Machines is presented that improves on existing approaches. Empirical evidence is given to prove that this approach can effectively deal with changes in the target concept that are results of the incremental learning setting.
Linear Dependency Between ε and the Input Noise in ε-Support Vector Regression
, 2003
Cited by 10 (0 self)
In using the ε-support vector regression (ε-SVR) algorithm, one has to decide a suitable value for the insensitivity parameter ε. Smola et al. considered its “optimal” choice by studying the statistical efficiency in a location parameter estimation problem. While they successfully predicted a linear scaling between the optimal ε and the noise in the data, their theoretically optimal value does not have a close match with its experimentally observed counterpart in the case of Gaussian noise. In this paper, we attempt to better explain their experimental results by studying the regression problem itself. Our resultant predicted choice of ε is much closer to the experimentally observed optimal value, while again demonstrating a linear trend with the input noise.
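For readers unfamiliar with the role of ε, the sketch below shows the standard ε-insensitive loss, max(0, |y - f(x)| - ε), which ignores residuals smaller than ε. It is only an illustration of the parameter's effect and does not reproduce the paper's analysis: the function names and the sample targets and predictions are assumptions.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss: max(0, |y_true - y_pred| - eps)."""
    return max(0.0, abs(y_true - y_pred) - eps)


def mean_eps_loss(targets, predictions, eps):
    """Average epsilon-insensitive loss over paired targets and predictions."""
    losses = [eps_insensitive_loss(t, p, eps)
              for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)


# Hypothetical data: absolute residuals are roughly 0.05, 0.5 and 1.0.
targets = [1.0, 2.0, 3.0]
predictions = [1.05, 2.5, 2.0]

# A narrow tube penalises the two larger residuals.
loss_small_eps = mean_eps_loss(targets, predictions, eps=0.1)

# A tube as wide as the largest residual penalises nothing.
loss_large_eps = mean_eps_loss(targets, predictions, eps=1.0)
```

Widening the tube lowers the loss and leaves more points unpenalised, so a sensible ε plausibly depends on the noise level in the data. That dependence is what the abstract above studies.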
Support Vector Methods in Learning and Feature Extraction
, 1998
Cited by 10 (1 self)
The last years have witnessed an increasing interest in Support Vector (SV) machines, which use Mercer kernels for efficiently performing computations in high-dimensional spaces. In pattern recognition, the SV algorithm constructs nonlinear decision functions by training a classifier to perform a linear separation in some high-dimensional space which is nonlinearly related to input space. Recently, we have developed a technique for Nonlinear Principal Component Analysis (Kernel PCA) based on the same types of kernels. This way, we can for instance efficiently extract polynomial features of arbitrary order by computing projections onto principal components in the space of all products of n pixels of images. We explain the idea of Mercer kernels and associated feature spaces, and describe connections to the theory of reproducing kernels and to regularization theory, followed by an overview of the above algorithms employing these kernels.
1. Introduction
For the case of two-class pattern...