Results 1 - 10
of
42
Sparse Bayesian Learning and the Relevance Vector Machine
, 2001
"... This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classication tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the `relevance vec ..."
Abstract
-
Cited by 380 (5 self)
- Add to MetaCart
This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classication tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the `relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art `support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while oering a number of additional advantages. These include the benets of probabilistic predictions, automatic estimation of `nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-`Mercer' kernels).
An introduction to kernel-based learning algorithms
- IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2001
"... This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and ..."
Abstract
-
Cited by 279 (46 self)
- Add to MetaCart
This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and
The Relevance Vector Machine
, 2000
"... The support vector machine (SVM) is a state-of-the-art technique for regression and classification, combining excellent generalisation properties with a sparse kernel representation. However, it does suffer from a number of disadvantages, notably the absence of probabilistic outputs, the requirement ..."
Abstract
-
Cited by 169 (6 self)
- Add to MetaCart
The support vector machine (SVM) is a state-of-the-art technique for regression and classification, combining excellent generalisation properties with a sparse kernel representation. However, it does suffer from a number of disadvantages, notably the absence of probabilistic outputs, the requirement to estimate a trade-off parameter and the need to utilise `Mercer' kernel functions. In this paper we introduce the Relevance Vector Machine (RVM), a Bayesian treatment of a generalised linear model of identical functional form to the SVM. The RVM suffers from none of the above disadvantages, and examples demonstrate that for comparable generalisation performance, the RVM requires dramatically fewer kernel functions.
On-line Handwriting Recognition with Support Vector Machines - A Kernel Approach
- In Proc. of the 8th IWFHR
, 2002
"... In this' contribution we describe a novel classification approach for on-line handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this' kernel Gaussian DTW (GDTW) ker- nel. This kernel approach haw' a ..."
Abstract
-
Cited by 60 (5 self)
- Add to MetaCart
In this' contribution we describe a novel classification approach for on-line handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this' kernel Gaussian DTW (GDTW) ker- nel. This kernel approach haw' a main advantage over common HMM techniques. It does not assume a model for the generarive class conditional densities. Instead, it directly addresses the problem of discrimination by creating class boundaries and thus is' less sensitive to modeling assumptions. By incorporating DTW in the kernel function, general classification problems with variable-sized sequential data can be handled. In this respect the proposed method can be straightforwardly applied to all classification problems, where DTW gives a reasonable distance measure, e.g. speech recognition or genome processing. We show experiments with this' kernel approach on the UNIPEN handwriting data, achieving results' comparable to an HMMbased technique.
Content-Based Audio Classification and Retrieval by Support Vector Machines
, 2000
"... Support vector machines (SVMs) have been recently proposed as a new learning algorithm for pattern recognition. In this paper, the SVMs with a binary tree recognition strategy are used to tackle the audio classification problem. We illustrate the potential of SVMs on a common audio database, which c ..."
Abstract
-
Cited by 28 (1 self)
- Add to MetaCart
Support vector machines (SVMs) have been recently proposed as a new learning algorithm for pattern recognition. In this paper, the SVMs with a binary tree recognition strategy are used to tackle the audio classification problem. We illustrate the potential of SVMs on a common audio database, which consists of 409 sounds of 16 classes. We compare the SVMs based classification with other popular approaches. For audio retrieval, we propose a new metric, called distance-from-boundary (DFB). When a query audio is given, the system first finds a boundary inside which the query pattern is located. Then, all the audio patterns in the database are sorted by their distances to this boundary. All boundaries are learned by the SVMs and stored together with the audio database. Experimental comparisons for audio retrieval are presented to show the superiority of this novel metric to other similarity measures.
Multicategory proximal support vector machine classifiers
- Machine Learning
, 2001
"... Abstract. Given a dataset, each element of which labeled by one of k labels, we construct by a very fast algorithm, a k-category proximal support vector machine (PSVM) classifier. Proximal support vector machines and related approaches (Fung & Mangasarian, 2001; Suykens & Vandewalle, 1999) can be in ..."
Abstract
-
Cited by 16 (0 self)
- Add to MetaCart
Abstract. Given a dataset, each element of which labeled by one of k labels, we construct by a very fast algorithm, a k-category proximal support vector machine (PSVM) classifier. Proximal support vector machines and related approaches (Fung & Mangasarian, 2001; Suykens & Vandewalle, 1999) can be interpreted as ridge regression applied to classification problems (Evgeniou, Pontil, & Poggio, 2000). Extensive computational results have shown the effectiveness of PSVM for two-class classification problems where the separating plane is constructed in time that can be as little as two orders of magnitude shorter than that of conventional support vector machines. When PSVM is applied to problems with more than two classes, the well known one-from-the-rest approach is a natural choice in order to take advantage of its fast performance. However, there is a drawback associated with this one-from-the-rest approach. The resulting two-class problems are often very unbalanced, leading in some cases to poor performance. We propose balancing the k classes and a novel Newton refinement modification to PSVM in order to deal with this problem. Computational results indicate that these two modifications preserve the speed of PSVM while often leading to significant test set improvement over a plain PSVM one-from-the-rest application. The modified approach is considerably faster than other one-from-the-rest methods that use conventional SVM formulations, while still giving comparable test set correctness.
Gabor wavelets and General Discriminant Analysis for face identification and verification
, 2007
"... ..."
An Optimization Perspective on Kernel Partial Least Squares Regression
- Advances in Learning Theory: Methods, Models and Applications
, 2003
"... Abstract. This work provides a novel derivation based on optimization for the partial least squares (PLS) algorithm for linear regression and the kernel partial least squares (K-PLS) algorithm for nonlinear regression. This derivation makes the PLS algorithm, popularly and successfully used for chem ..."
Abstract
-
Cited by 13 (4 self)
- Add to MetaCart
Abstract. This work provides a novel derivation based on optimization for the partial least squares (PLS) algorithm for linear regression and the kernel partial least squares (K-PLS) algorithm for nonlinear regression. This derivation makes the PLS algorithm, popularly and successfully used for chemometrics applications, more accessible to machine learning researchers. The work introduces Direct K-PLS, a novel way to kernelize PLS based on direct factorization of the kernel matrix. Computational results and discussion illustrate the relative merits of K-PLS and Direct K-PLS versus closely related kernel methods such as support vector machines and kernel ridge regression. ∗ This work was supported by NSF grant number IIS-9979860. Many thanks to Roman Rosipal, Nello Cristianini, and Johan Suykens for many helpful discussions on PLS and kernel methods, Sean Ekans from Concurrent Pharmaceutical for providing molecule descriptions for the Albumin data set, Curt Breneman and N. Sukumar for generating descriptors for the Albumin data, and Tony Van Gestel for an efficient Gaussian kernel
An efficient method for simplifying support vector machines
- In Proceedings of ICML’05
, 2005
"... In this paper we describe a new method to reduce the complexity of support vector machines by reducing the number of necessary support vectors included in their solutions. The reduction process iteratively selects two nearest support vectors belonging to the same class and replaces them by a newly c ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
In this paper we describe a new method to reduce the complexity of support vector machines by reducing the number of necessary support vectors included in their solutions. The reduction process iteratively selects two nearest support vectors belonging to the same class and replaces them by a newly constructed vector. Through the analysis of relation between vectors in the input and feature spaces, we present the construction of new vectors that requires to find the unique maximum point of a one-variable function on the interval (0, 1), not to minimize a function of many variables with local minimums in former reduced set methods. Experimental results on real life datasets show that the proposed method is effective in reducing number of support vectors and preserving machine’s generalization performance.

