On-line Handwriting Recognition with Support Vector Machines - A Kernel Approach
- In Proc. of the 8th IWFHR
, 2002
Cited by 60 (5 self)
In this contribution we describe a novel classification approach for on-line handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this kernel the Gaussian DTW (GDTW) kernel. This kernel approach has a main advantage over common HMM techniques. It does not assume a model for the generative class conditional densities. Instead, it directly addresses the problem of discrimination by creating class boundaries and thus is less sensitive to modeling assumptions. By incorporating DTW in the kernel function, general classification problems with variable-sized sequential data can be handled. In this respect the proposed method can be straightforwardly applied to all classification problems where DTW gives a reasonable distance measure, e.g. speech recognition or genome processing. We show experiments with this kernel approach on the UNIPEN handwriting data, achieving results comparable to an HMM-based technique.
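The abstract above does not spell out the kernel formula. The following is a minimal sketch under the assumption that the kernel has the Gaussian-style form exp(-gamma * DTW(a, b)); the function names, the absolute-difference local cost, and the `gamma` parameter are illustrative choices, not taken from the paper. Since DTW is not a metric, a kernel built this way is not guaranteed to be positive semidefinite.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def gdtw_kernel(a, b, gamma=1.0):
    """Assumed Gaussian DTW form: exp(-gamma * DTW distance)."""
    return math.exp(-gamma * dtw_distance(a, b))


# Sequences of different lengths can be compared directly.
short = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
print(gdtw_kernel(short, stretched))
```

The two example sequences differ only by a repeated sample, so the warping path absorbs the difference, which is the property that makes DTW attractive for variable-length data.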
A Study on Sigmoid Kernels for SVM and the Training of non-PSD Kernels by SMO-type Methods
, 2003
Cited by 41 (4 self)
The sigmoid kernel was quite popular for support vector machines due to its origin from neural networks. However, as the kernel matrix may not be positive semidefinite (PSD), it is not widely used and the behavior is unknown. In this paper, we analyze such non-PSD kernels through the point of view of separability. Based on the investigation of parameters in different ranges, we show that for some parameters, the kernel matrix is conditionally positive definite (CPD), a property which explains its practical viability. Experiments are given to illustrate our analysis. Finally, we discuss how to solve the non-convex dual problems by SMO-type decomposition methods. Suitable modifications for any symmetric non-PSD kernel matrices are proposed with convergence proofs.
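A small numerical check can make the non-PSD claim concrete. This sketch assumes the usual sigmoid kernel form tanh(a * <x, y> + r); the two data points and the parameter values a = 1, r = -1 are hypothetical choices picked only because they give a negative diagonal entry, which already rules out positive semidefiniteness.

```python
import numpy as np


def sigmoid_gram(points, a=1.0, r=-1.0):
    """Gram matrix of the sigmoid kernel tanh(a * <x, y> + r)."""
    pts = np.asarray(points, dtype=float)
    return np.tanh(a * (pts @ pts.T) + r)


# Two hypothetical 1-D points; K[0, 0] = tanh(-1) is negative.
points = [[0.0], [1.0]]
gram = sigmoid_gram(points)
eigvals = np.linalg.eigvalsh(gram)  # eigenvalues of a symmetric matrix
print(gram)
print(eigvals)
```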
Learning with distance substitution kernels
- in Pattern Recognition - Proc. of the 26th DAGM Symposium
, 2004
Cited by 19 (1 self)
During recent years much effort has been spent in incorporating problem-specific a-priori knowledge into kernel methods for machine learning. A common example is a-priori knowledge given by a distance measure between objects. A simple but effective approach for kernel construction consists of substituting the Euclidean distance in ordinary kernel functions by the problem-specific distance measure. We formalize this distance substitution procedure and investigate theoretical and empirical effects. In particular we state criteria for definiteness of the resulting kernels. We demonstrate the wide applicability by solving several classification tasks with SVMs. Regularization of the kernel matrices can additionally increase the recognition accuracy.
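The substitution idea can be sketched as follows, assuming the base kernel is the Gaussian RBF exp(-gamma * d^2) and that the Euclidean d is swapped for an arbitrary distance function. The helper names, the Hamming distance, and the word list are illustrative assumptions; the paper's own kernels and definiteness criteria are not reproduced here.

```python
import numpy as np


def substituted_rbf_gram(items, dist, gamma=1.0):
    """Gram matrix of exp(-gamma * dist(x, y)**2) for a user-supplied distance."""
    n = len(items)
    gram = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            gram[i, j] = np.exp(-gamma * dist(items[i], items[j]) ** 2)
    return gram


def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    return sum(c1 != c2 for c1, c2 in zip(s, t))


# Hypothetical data: equal-length words compared by Hamming distance.
words = ["kernel", "kennel", "funnel", "tunnel"]
gram = substituted_rbf_gram(words, hamming, gamma=0.1)

# The smallest eigenvalue shows whether this particular matrix is PSD.
print(np.linalg.eigvalsh(gram).min())
```

Inspecting the smallest eigenvalue of the resulting matrix is one simple empirical check on definiteness for a given data set; it says nothing about the kernel in general.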
An analysis of transformation on non-positive semidefinite similarity matrix for kernel machines
- Proceedings of the 22nd International Conference on Machine Learning
, 2005
Cited by 11 (2 self)
Many emerging applications formulate nonpositive semidefinite similarity matrices, and hence cannot fit into the framework of kernel machines. A popular approach to this problem is to transform the spectrum of the similarity matrix so as to generate a positive semidefinite kernel matrix. In this paper, we present an analytical framework to explore four representative transformation methods: denoise, flip, diffusion, and shift. Theoretically, we interpret each transformation and analyze its influence on classification using kernel machines. Moreover, when situations arise where the test data are not available during transformation, we propose an efficient algorithm to address the problem of updating the cross-similarity matrix between test and training data. Extensive experiments have been conducted to evaluate the performance of these methods on several real-world (dis)similarity matrices with semantic meanings.
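The abstract names the four transformations without defining them. The sketch below uses common readings, all of which are assumptions here: denoise clips negative eigenvalues to zero, flip takes absolute values, diffusion exponentiates the eigenvalues with a hypothetical scale `beta`, and shift subtracts the most negative eigenvalue from all of them.

```python
import numpy as np


def transform_spectrum(sim, method, beta=1.0):
    """Rebuild a similarity matrix from a transformed eigenvalue spectrum."""
    sym = (sim + sim.T) / 2.0  # enforce symmetry before decomposing
    eigvals, eigvecs = np.linalg.eigh(sym)
    if method == "denoise":
        new_vals = np.maximum(eigvals, 0.0)
    elif method == "flip":
        new_vals = np.abs(eigvals)
    elif method == "diffusion":
        new_vals = np.exp(beta * eigvals)
    elif method == "shift":
        new_vals = eigvals - min(eigvals.min(), 0.0)
    else:
        raise ValueError(f"unknown method: {method}")
    # V * diag(new_vals) * V^T
    return (eigvecs * new_vals) @ eigvecs.T


# Hypothetical indefinite similarity matrix with eigenvalues 3 and -1.
sim = np.array([[1.0, 2.0], [2.0, 1.0]])
for name in ("denoise", "flip", "diffusion", "shift"):
    kernel = transform_spectrum(sim, name)
    print(name, np.linalg.eigvalsh(kernel).min())
```

This covers only the training matrix; the cross-similarity update for unseen test data that the abstract mentions is not sketched.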
Incorporating prior knowledge in support vector machines for classification: A review
- Grenoble University
, 1992
Cited by 9 (1 self)
For classification, support vector machines (SVMs) have recently been introduced and quickly became the state of the art. Now, the incorporation of prior knowledge into SVMs is the key element that makes it possible to increase performance in many applications. This paper gives a review of the current state of research regarding the incorporation of two general types of prior knowledge into SVMs for classification. The particular forms of prior knowledge considered here are presented in two main groups: class-invariance and knowledge on the data. The first one includes invariances to transformations, to permutations and in domains of input space, whereas the second one contains knowledge on unlabeled data, the imbalance of the training set or the quality of the data. The methods are then described and classified in the three categories that have been used in the literature: sample methods based on the modification of the training data, kernel methods based on the modification of the kernel, and optimization methods based on the modification of the problem formulation. A recent method, developed for support vector regression, considers prior knowledge on arbitrary regions of the input space. It is presented here as applied to the classification case. A discussion is then conducted to regroup sample and optimization methods under a regularization framework.
Invariance in Kernel Methods by Haar-Integration Kernels
- SCIA 2005, Scandinavian Conference on Image Analysis
, 2005
Cited by 7 (3 self)
We address the problem of incorporating transformation invariance in kernels for pattern analysis with kernel methods. We introduce a new class of kernels by so-called Haar-integration over transformations. This results in kernel functions which are positive definite, have adjustable invariance, can capture simultaneously various continuous or discrete transformations and are applicable in various kernel methods. We demonstrate these properties on toy examples and experimentally investigate the real-world applicability on an image recognition task with support vector machines. For certain transformations remarkable complexity reduction is demonstrated. The kernels hereby achieve state-of-the-art results, while omitting drawbacks of existing methods.
Invariant kernel functions for pattern analysis and machine learning
- Machine Learning
, 2007
Cited by 6 (1 self)
In many learning problems prior knowledge about pattern variations can be formalized and beneficially incorporated into the analysis system. The corresponding notion of invariance is commonly used in conceptually different ways. We propose a more distinguishing treatment, in particular in the active field of kernel methods for machine learning and pattern analysis. Additionally, the fundamental relation of invariant kernels and traditional invariant pattern analysis by means of invariant representations will be clarified. After addressing these conceptual questions, we focus on practical aspects and present two generic approaches for constructing invariant kernels. The first approach is based on a technique called invariant integration. The second approach builds on invariant distances. In principle, our approaches support general transformations, in particular covering discrete and non-group or even an infinite number of pattern transformations. Additionally, both enable a smooth interpolation between invariant and non-invariant pattern analysis, i.e. they form a general framework covering both. The wide applicability and various possible benefits of invariant kernels are demonstrated in different kernel methods.
Minimum distance between pattern transformation manifolds: Algorithm and applications
- IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 6 (5 self)
Transformation invariance is an important property in pattern recognition, where different observations of the same object typically receive the same label. This paper focuses on a transformation-invariant distance measure that represents the minimum distance between the transformation manifolds spanned by patterns of interest. Since these manifolds are typically nonlinear, the computation of the manifold distance (MD) becomes a nonconvex optimization problem. We propose representing a pattern of interest as a linear combination of a few geometric functions extracted from a structured and redundant basis. Transforming the pattern results in the transformation of its constituent parts. We show that, when the transformation is restricted to a synthesis of translations, rotations, and isotropic scalings, such a pattern representation results in a closed-form expression of the manifold equation with respect to the transformation parameters. The MD computation can then be formulated as a minimization problem whose objective function is expressed as the difference of convex functions (DC). This interesting property permits optimally solving the optimization problem with DC programming solvers that are globally convergent. We present experimental evidence which shows that our method is able to find the globally optimal solution, outperforming existing methods that yield suboptimal solutions. Index Terms: transformation invariance, pattern manifolds, sparse approximations.
Tangent vector kernels for invariant image classification with SVMs
- International Conference on Pattern Recognition
, 2003
Cited by 5 (1 self)
This paper presents an application of the general sample-to-object approach to the problem of invariant image classification. The approach results in defining new SVM kernels based on tangent vectors that take into account prior information on known invariances. Real data of face images are used for experiments. The presented approach integrates virtual sample and tangent distance methods. We observe a significant increase in performance with respect to standard approaches. The experiments also illustrate (as expected) that prior knowledge becomes more important as the amount of training data decreases.

