Results 1  10
of
129
Sparse Bayesian Learning and the Relevance Vector Machine
, 2001
"... This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the `relevance vect ..."
Abstract

Cited by 949 (5 self)
 Add to MetaCart
This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the `relevance vector machine’ (RVM), a model of identical functional form to the popular and stateoftheart `support vector machine ’ (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of `nuisance’ parameters, and the facility to utilise arbitrary basis functions (e.g. non`Mercer’ kernels). We detail the Bayesian framework and associated learning algorithm for the RVM, and give some illustrative examples of its application along with some comparative benchmarks. We offer some explanation for the exceptional degree of sparsity obtained, and discuss and demonstrate some of the advantageous features, and potential extensions, of Bayesian relevance learning.
The Relevance Vector Machine
, 2000
"... The support vector machine (SVM) is a stateoftheart technique for regression and classification, combining excellent generalisation properties with a sparse kernel representation. However, it does suffer from a number of disadvantages, notably the absence of probabilistic outputs, the requirement ..."
Abstract

Cited by 286 (6 self)
 Add to MetaCart
The support vector machine (SVM) is a stateoftheart technique for regression and classification, combining excellent generalisation properties with a sparse kernel representation. However, it does suffer from a number of disadvantages, notably the absence of probabilistic outputs, the requirement to estimate a tradeoff parameter and the need to utilise `Mercer' kernel functions. In this paper we introduce the Relevance Vector Machine (RVM), a Bayesian treatment of a generalised linear model of identical functional form to the SVM. The RVM suffers from none of the above disadvantages, and examples demonstrate that for comparable generalisation performance, the RVM requires dramatically fewer kernel functions.
Face recognition using kernel direct discriminant analysis algorithms
 IEEE Trans. Neural Networks
"... Abstract—Techniques that can introduce lowdimensional feature representation with enhanced discriminatory power is of paramount importance in face recognition (FR) systems. It is well known that the distribution of face images, under a perceivable variation in viewpoint, illumination or facial expr ..."
Abstract

Cited by 140 (12 self)
 Add to MetaCart
(Show Context)
Abstract—Techniques that can introduce lowdimensional feature representation with enhanced discriminatory power is of paramount importance in face recognition (FR) systems. It is well known that the distribution of face images, under a perceivable variation in viewpoint, illumination or facial expression, is highly nonlinear and complex. It is, therefore, not surprising that linear techniques, such as those based on principle component analysis (PCA) or linear discriminant analysis (LDA), cannot provide reliable and robust solutions to those FR problems with complex face variations. In this paper, we propose a kernel machinebased discriminant analysis method, which deals with the nonlinearity of the face patterns ’ distribution. The proposed method also effectively solves the socalled “small sample size ” (SSS) problem, which exists in most FR tasks. The new algorithm has been tested, in terms of classification error rate performance, on the multiview UMIST face database. Results indicate that the proposed methodology is able to achieve excellent performance with only a very small set of features being used, and its error rate is approximately 34 % and 48 % of those of two other commonly used kernel FR approaches, the kernelPCA (KPCA) and the generalized discriminant analysis (GDA), respectively. Index Terms—Face recognition (FR), kernel direct discriminant analysis (KDDA), linear discriminant analysis (LDA), principle component analysis (PCA), small sample size problem (SSS), kernel methods. I.
Online Handwriting Recognition with Support Vector Machines  A Kernel Approach
 In Proc. of the 8th IWFHR
, 2002
"... In this' contribution we describe a novel classification approach for online handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this' kernel Gaussian DTW (GDTW) ker nel. This kernel appro ..."
Abstract

Cited by 116 (8 self)
 Add to MetaCart
(Show Context)
In this' contribution we describe a novel classification approach for online handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this' kernel Gaussian DTW (GDTW) ker nel. This kernel approach haw' a main advantage over common HMM techniques. It does not assume a model for the generarive class conditional densities. Instead, it directly addresses the problem of discrimination by creating class boundaries and thus is' less sensitive to modeling assumptions. By incorporating DTW in the kernel function, general classification problems with variablesized sequential data can be handled. In this respect the proposed method can be straightforwardly applied to all classification problems, where DTW gives a reasonable distance measure, e.g. speech recognition or genome processing. We show experiments with this' kernel approach on the UNIPEN handwriting data, achieving results' comparable to an HMMbased technique.
ContentBased Audio Classification and Retrieval by Support Vector Machines
, 2000
"... Support vector machines (SVMs) have been recently proposed as a new learning algorithm for pattern recognition. In this paper, the SVMs with a binary tree recognition strategy are used to tackle the audio classification problem. We illustrate the potential of SVMs on a common audio database, which c ..."
Abstract

Cited by 67 (1 self)
 Add to MetaCart
Support vector machines (SVMs) have been recently proposed as a new learning algorithm for pattern recognition. In this paper, the SVMs with a binary tree recognition strategy are used to tackle the audio classification problem. We illustrate the potential of SVMs on a common audio database, which consists of 409 sounds of 16 classes. We compare the SVMs based classification with other popular approaches. For audio retrieval, we propose a new metric, called distancefromboundary (DFB). When a query audio is given, the system first finds a boundary inside which the query pattern is located. Then, all the audio patterns in the database are sorted by their distances to this boundary. All boundaries are learned by the SVMs and stored together with the audio database. Experimental comparisons for audio retrieval are presented to show the superiority of this novel metric to other similarity measures.
Online prediction of time series data with kernels
 IEEE TRANS. SIGNAL PROCESSING
, 2009
"... Kernelbased algorithms have been a topic of considerable interest in the machine learning community over the last ten years. Their attractiveness resides in their elegant treatment of nonlinear problems. They have been successfully applied to pattern recognition, regression and density estimation. ..."
Abstract

Cited by 49 (20 self)
 Add to MetaCart
(Show Context)
Kernelbased algorithms have been a topic of considerable interest in the machine learning community over the last ten years. Their attractiveness resides in their elegant treatment of nonlinear problems. They have been successfully applied to pattern recognition, regression and density estimation. A common characteristic of kernelbased methods is that they deal with kernel expansions whose number of terms equals the number of input data, making them unsuitable for online applications. Recently, several solutions have been proposed to circumvent this computational burden in time series prediction problems. Nevertheless, most of them require excessively elaborate and costly operations. In this paper, we investigate a new model reduction criterion that makes computationally demanding sparsification procedures unnecessary. The increase in the number of variables is controlled by the coherence parameter, a fundamental quantity that characterizes the behavior of dictionaries in sparse approximation problems. We incorporate the coherence criterion into a new kernelbased affine projection algorithm for time series prediction. We also derive the kernelbased normalized LMS algorithm as a particular case. Finally, experiments are conducted to compare our approach to existing methods.
Gabor wavelets and General Discriminant Analysis for face identification and verification
, 2007
"... ..."
Multicategory proximal support vector machine classifiers
 Machine Learning
, 2001
"... Abstract. Given a dataset, each element of which labeled by one of k labels, we construct by a very fast algorithm, a kcategory proximal support vector machine (PSVM) classifier. Proximal support vector machines and related approaches (Fung & Mangasarian, 2001; Suykens & Vandewalle, 1999) c ..."
Abstract

Cited by 29 (0 self)
 Add to MetaCart
Abstract. Given a dataset, each element of which labeled by one of k labels, we construct by a very fast algorithm, a kcategory proximal support vector machine (PSVM) classifier. Proximal support vector machines and related approaches (Fung & Mangasarian, 2001; Suykens & Vandewalle, 1999) can be interpreted as ridge regression applied to classification problems (Evgeniou, Pontil, & Poggio, 2000). Extensive computational results have shown the effectiveness of PSVM for twoclass classification problems where the separating plane is constructed in time that can be as little as two orders of magnitude shorter than that of conventional support vector machines. When PSVM is applied to problems with more than two classes, the well known onefromtherest approach is a natural choice in order to take advantage of its fast performance. However, there is a drawback associated with this onefromtherest approach. The resulting twoclass problems are often very unbalanced, leading in some cases to poor performance. We propose balancing the k classes and a novel Newton refinement modification to PSVM in order to deal with this problem. Computational results indicate that these two modifications preserve the speed of PSVM while often leading to significant test set improvement over a plain PSVM onefromtherest application. The modified approach is considerably faster than other onefromtherest methods that use conventional SVM formulations, while still giving comparable test set correctness.
An Optimization Perspective on Kernel Partial Least Squares Regression
 Advances in Learning Theory: Methods, Models and Applications
, 2003
"... Abstract. This work provides a novel derivation based on optimization for the partial least squares (PLS) algorithm for linear regression and the kernel partial least squares (KPLS) algorithm for nonlinear regression. This derivation makes the PLS algorithm, popularly and successfully used for chem ..."
Abstract

Cited by 21 (4 self)
 Add to MetaCart
(Show Context)
Abstract. This work provides a novel derivation based on optimization for the partial least squares (PLS) algorithm for linear regression and the kernel partial least squares (KPLS) algorithm for nonlinear regression. This derivation makes the PLS algorithm, popularly and successfully used for chemometrics applications, more accessible to machine learning researchers. The work introduces Direct KPLS, a novel way to kernelize PLS based on direct factorization of the kernel matrix. Computational results and discussion illustrate the relative merits of KPLS and Direct KPLS versus closely related kernel methods such as support vector machines and kernel ridge regression. ∗ This work was supported by NSF grant number IIS9979860. Many thanks to Roman Rosipal, Nello Cristianini, and Johan Suykens for many helpful discussions on PLS and kernel methods, Sean Ekans from Concurrent Pharmaceutical for providing molecule descriptions for the Albumin data set, Curt Breneman and N. Sukumar for generating descriptors for the Albumin data, and Tony Van Gestel for an efficient Gaussian kernel
Efficient face detection by a cascaded supportvector machine expansion
 Royal Society of London Proceedings Series A, 460:3283–3297
, 2004
"... We describe a fast system for the detection and localization of human faces in images using a nonlinear Support Vector Machine. We approximate the decision surface in terms of a reduced set of expansion vectors and propose a cascaded evaluation which has the property that the full support vectors ex ..."
Abstract

Cited by 21 (2 self)
 Add to MetaCart
We describe a fast system for the detection and localization of human faces in images using a nonlinear Support Vector Machine. We approximate the decision surface in terms of a reduced set of expansion vectors and propose a cascaded evaluation which has the property that the full support vectors expansion is only evaluated on the facelike parts of the image, while the largest part of typical images is classified using a single expansion vector (simpler and more efficient classifier). The cascaded evaluation offers a thirtyfold speedup over an evaluation using the full set of reduced set vectors which itself already is thirty times faster than classification using all the support vectors.