The pyramid match kernel: Discriminative classification with sets of image features
- In ICCV, 2005
Cited by 225 (19 self)
Discriminative learning is challenging when examples are sets of features, and the sets vary in cardinality and lack any sort of meaningful ordering. Kernel-based classification methods can learn complex decision boundaries, but a kernel over unordered set inputs must somehow solve for correspondences, generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function which maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in this space. This “pyramid match” computation is linear in the number of features, and it implicitly finds correspondences based on the finest resolution histogram cell where a matched pair first appears. Since the kernel does not penalize the presence of extra features, it is robust to clutter. We show the kernel function is positive-definite, making it valid for use in learning algorithms whose optimal solutions are guaranteed only for Mercer kernels. We demonstrate our algorithm on object recognition tasks and show it to be accurate and dramatically faster than current approaches.
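The multi-resolution histogram intersection described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the doubling of the bin side at each level, and the 1/2^i weighting of newly matched pairs are assumptions not stated in the abstract, and features are assumed to have non-negative coordinates below `diameter`.

```python
from collections import Counter
from math import ceil, log2


def histogram(points, side):
    """Count points per grid cell for cells of the given side length."""
    return Counter(tuple(int(c // side) for c in p) for p in points)


def intersection(h1, h2):
    """Histogram intersection: sum of per-cell minimum counts."""
    return sum(min(n, h2[cell]) for cell, n in h1.items())


def pyramid_match(xs, ys, diameter):
    """Unnormalized pyramid match score between two feature sets.

    Assumption: bin sides double per level (1, 2, 4, ...) and matches
    first formed at level i are weighted by 1 / 2**i.
    """
    levels = ceil(log2(diameter)) + 1
    score, previous = 0.0, 0
    for i in range(levels):
        side = 2 ** i
        matched = intersection(histogram(xs, side), histogram(ys, side))
        # Only pairs that were not already matched at a finer level count here.
        score += (matched - previous) / side
        previous = matched
    return score


xs = [(0, 0), (5, 5)]
ys = [(0, 0), (4, 5), (100, 100)]  # (100, 100) plays the role of clutter
print(pyramid_match(xs, ys, 128))
```

In this sketch the extra point in `ys` never lowers the score, which mirrors the abstract's remark that unmatched features are not penalized.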
Local features and kernels for classification of texture and object categories: a comprehensive study
- International Journal of Computer Vision, 2007
Cited by 211 (21 self)
Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ² distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels and classifiers. We then conduct a comparative evaluation with several state-of-the-art recognition methods on four texture and five object databases. On most of these databases, our implementation exceeds the best reported results and achieves comparable performance on the rest. Finally, we investigate the influence of background correlations on recognition performance via extensive tests on the PASCAL database, for which ground-truth object localization information is available. Our experiments demonstrate that image representations based on distributions of local features are surprisingly effective for classification of texture and object images under challenging real-world conditions, including significant intra-class variations and substantial background clutter.
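A kernel built from the χ² distance between two histograms might look like the sketch below. The abstract does not give the formulas, so the specific forms used here (half the sum of squared differences over sums, and an exponential of the negated scaled distance) and the `scale` parameter are assumptions based on the common definition of this kernel.

```python
from math import exp


def chi2_distance(u, w):
    """Chi-square distance between two histograms of equal length.

    Bins that are empty in both histograms contribute nothing.
    """
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(u, w) if a + b > 0)


def chi2_kernel(u, w, scale=1.0):
    """Assumed kernel form: exp(-distance / scale); `scale` is hypothetical."""
    return exp(-chi2_distance(u, w) / scale)


print(chi2_kernel([0.5, 0.5], [0.5, 0.5]))  # identical histograms
print(chi2_kernel([1.0, 0.0], [0.0, 1.0]))  # disjoint histograms
```

A precomputed matrix of such kernel values can then be handed to any SVM solver that accepts custom kernels.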
Diffusion Kernels on Statistical Manifolds
- 2004
Cited by 63 (5 self)
A family of kernels for statistical learning is introduced that exploits the geometric structure of statistical models. The kernels are based on the heat equation on the Riemannian manifold defined by the Fisher information metric associated with a statistical family, and generalize the Gaussian kernel of Euclidean space. As an important special case, kernels based on the geometry of multinomial families are derived, leading to kernel-based learning algorithms that apply naturally to discrete data. Bounds on covering numbers and Rademacher averages for the kernels are proved using bounds on the eigenvalues of the Laplacian on Riemannian manifolds. Experimental results are presented for document classification, for which the use of multinomial geometry is natural and well motivated, and improvements are obtained over the standard use of Gaussian or linear kernels, which have been the standard for text classification.
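For the multinomial special case mentioned in this abstract, a commonly quoted leading-order approximation of the diffusion kernel can be sketched as below. The closed form used here, (4πt)^(-n/2) · exp(-arccos²(Σ √(θᵢ θ'ᵢ)) / t) with n the dimension of the simplex, is an assumption not stated in the abstract, and the function name is hypothetical.

```python
from math import acos, exp, pi, sqrt


def multinomial_diffusion_kernel(p, q, t):
    """Approximate diffusion kernel between two multinomial parameter vectors.

    Assumption: leading-order heat kernel approximation on the simplex under
    the Fisher information metric; t > 0 is the diffusion time.
    """
    n = len(p) - 1  # dimension of the probability simplex
    affinity = sum(sqrt(a * b) for a, b in zip(p, q))
    # Clamp against rounding so acos never sees a value above 1.
    angle = acos(min(1.0, affinity))
    return (4 * pi * t) ** (-n / 2) * exp(-angle ** 2 / t)


doc_a = [0.7, 0.2, 0.1]  # e.g. normalized term frequencies of a document
doc_b = [0.1, 0.2, 0.7]
print(multinomial_diffusion_kernel(doc_a, doc_a, 0.5))
print(multinomial_diffusion_kernel(doc_a, doc_b, 0.5))
```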
Support vector machines using GMM supervectors for speaker verification
- IEEE Signal Processing Letters, 2006
Cited by 58 (1 self)
Probability product kernels
- Journal of Machine Learning Research, 2004
Cited by 58 (7 self)
The advantages of discriminative learning algorithms and kernel machines are combined with generative modeling using a novel kernel between distributions. In the probability product kernel, data points in the input space are mapped to distributions over the sample space and a general inner product is then evaluated as the integral of the product of pairs of distributions. The kernel is straightforward to evaluate for all exponential family models such as multinomials and Gaussians and yields interesting nonlinear kernels. Furthermore, the kernel is computable in closed form for latent distributions such as mixture models, hidden Markov models and linear dynamical systems. For intractable models, such as switching linear dynamical systems, structured mean-field approximations can be brought to bear on the kernel evaluation. For general distributions, even if an analytic expression for the kernel is not feasible, we show a straightforward sampling method to evaluate it. Thus, the kernel permits discriminative learning methods, including support vector machines, to exploit the properties, metrics and invariances of the generative models we infer from each datum. Experiments are shown using multinomial models for text, hidden Markov models for biological data sets and linear dynamical systems for time series data.
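For discrete distributions, the integral of the product of two distributions reduces to a sum, which the following sketch evaluates. The exponent `rho` applied to both distributions is an assumption not mentioned in the abstract (rho = 1/2 gives the Bhattacharyya affinity and rho = 1 the plain expected likelihood), and the function name is hypothetical.

```python
def probability_product_kernel(p, q, rho=0.5):
    """Sum over outcomes of (p_i * q_i) ** rho for two discrete distributions.

    Assumption: `rho` generalizes the plain product; rho=1 is the integral
    of the product of the two distributions as described in the abstract.
    """
    return sum((a * b) ** rho for a, b in zip(p, q))


p = [0.5, 0.5]
q = [0.9, 0.1]
print(probability_product_kernel(p, q, rho=0.5))
print(probability_product_kernel(p, q, rho=1.0))
```

In a text setting, `p` and `q` would be multinomial models fitted to two documents, and the resulting kernel matrix could be passed to a support vector machine.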
The pyramid match kernel: Efficient learning with sets of features
- Journal of Machine Learning Research, 2007
Cited by 55 (6 self)
In numerous domains it is useful to represent a single example by the set of the local features or parts that comprise it. However, this representation poses a challenge to many conventional machine learning techniques, since sets may vary in cardinality and elements lack a meaningful ordering. Kernel methods can learn complex functions, but a kernel over unordered set inputs must somehow solve for correspondences—generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function called the pyramid match that measures partial match similarity in time linear in the number of features. The pyramid match maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in order to find implicit correspondences based on the finest resolution histogram cell where a matched pair first appears. We show the pyramid match yields a Mercer kernel, and we prove bounds on its error relative to the optimal partial matching cost. We demonstrate our algorithm on both classification and regression tasks, including object recognition, 3-D human pose inference, and time of publication estimation for documents, and we show that the proposed method is accurate and significantly more efficient than current approaches.
SVM based speaker verification using a GMM supervector kernel and NAP variability compensation
- In Proceedings of ICASSP, 2006
Cited by 53 (3 self)
Gaussian mixture models with universal backgrounds (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent factor analysis of this GMM supervector is an effective method for variability compensation. We consider this GMM supervector in the context of support vector machines. We construct a support vector machine kernel using the GMM supervector. We show similarities based on this kernel between the method of SVM nuisance attribute projection (NAP) and the recent results in latent factor analysis. Experiments on a NIST SRE 2005 corpus demonstrate the effectiveness of the new technique.
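The two steps named in this abstract, MAP adaptation of the UBM means and stacking them into a supervector, can be sketched as below. This is an illustration under stated assumptions rather than the authors' recipe: diagonal covariances, the relevance-factor form of mean adaptation, the default `relevance=16.0`, and all function names are assumptions not given in the abstract.

```python
from math import exp, log, pi


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(log(2 * pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of UBM means to one utterance's frames."""
    k, d = len(means), len(means[0])
    counts = [0.0] * k
    sums = [[0.0] * d for _ in range(k)]
    for x in frames:
        logp = [log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)  # subtract the maximum for numerical stability
        post = [exp(lp - top) for lp in logp]
        z = sum(post)
        for j in range(k):
            g = post[j] / z  # responsibility of component j for frame x
            counts[j] += g
            for i in range(d):
                sums[j][i] += g * x[i]
    # Interpolate between the data mean and the UBM mean per component.
    return [[(sums[j][i] + relevance * means[j][i]) / (counts[j] + relevance)
             for i in range(d)] for j in range(k)]


def supervector(adapted_means):
    """Stack the adapted component means into one long vector."""
    return [c for mean in adapted_means for c in mean]


ubm_weights = [0.5, 0.5]
ubm_means = [[0.0], [10.0]]
ubm_vars = [[1.0], [1.0]]
frames = [[1.0]] * 16
print(supervector(map_adapt_means(frames, ubm_weights, ubm_means, ubm_vars)))
```

Components that receive little data stay close to the UBM means, so every utterance yields a supervector of the same length regardless of its duration.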
Probabilistic kernels for the classification of auto-regressive visual processes
- In IEEE Conference on Computer Vision and Pattern Recognition, 2005
Cited by 33 (13 self)
We present a framework for the classification of visual processes that are best modeled with spatio-temporal autoregressive models. The new framework combines the modeling power of a family of models known as dynamic textures and the generalization guarantees, for classification, of the support vector machine classifier. This combination is achieved by the derivation of a new probabilistic kernel based on the Kullback-Leibler divergence (KL) between Gauss-Markov processes. In particular, we derive the KL-kernel for dynamic textures in both 1) the image space, which describes both the motion and appearance components of the spatio-temporal process, and 2) the hidden state space, which describes the temporal component alone. Together, the two kernels cover a large variety of video classification problems, including the cases where classes can differ in both appearance and motion and the cases where appearance is similar for all classes and only motion is discriminant. Experimental evaluation on two databases shows that the new classifier achieves superior performance over existing solutions.
Mercer kernels for object recognition with local features
- In IEEE CVPR, 2005
Cited by 25 (0 self)
A new class of kernels for object recognition based on local image feature representations is introduced in this paper. Formal proofs are given to show that these kernels satisfy the Mercer condition. In addition, multiple types of local features and semilocal constraints are incorporated. Experimental results of SVM classifiers coupled with the proposed kernels are reported on recognition tasks with the COIL-100 database and compared with existing methods. The proposed kernels achieved competitive performance and were robust to changes in object configurations and image degradations.
Divergence estimation of continuous distributions based on data-dependent partitions
- IEEE Transactions on Information Theory, 2005
Cited by 20 (3 self)
We present a universal estimator of the divergence D(P‖Q) for two arbitrary continuous distributions P and Q satisfying certain regularity conditions. This algorithm, which observes independent and identically distributed (i.i.d.) samples from both P and Q, is based on the estimation of the Radon–Nikodym derivative dP/dQ via a data-dependent partition of the observation space. Strong convergence of this estimator is proved with an empirically equivalent segmentation of the space. This basic estimator is further improved by adaptive partitioning schemes and by bias correction. The application of the algorithms to data with memory is also investigated. In the simulations, we compare our estimators with the direct plug-in estimator and estimators based on other partitioning approaches. Experimental results show that our methods achieve the best convergence performance in most of the tested cases. Index Terms: Bias correction, data-dependent partition, divergence, Radon–Nikodym derivative, stationary and ergodic data, universal estimation of information measures.
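A one-dimensional version of a partition-based divergence estimate can be sketched as follows. This is only an illustration in the spirit of the abstract, not the paper's algorithm: it assumes scalar samples without ties, places cell boundaries so that each cell holds roughly the same number of samples from the second distribution (called Q here), and omits the adaptive partitioning and bias correction steps. The function name and the `per_cell` parameter are hypothetical.

```python
from bisect import bisect_left
from math import log


def divergence_estimate(p_samples, q_samples, per_cell):
    """Plug-in divergence estimate over cells that are equiprobable under Q's samples."""
    ys = sorted(q_samples)
    n, m = len(p_samples), len(ys)
    cells = max(1, m // per_cell)
    # Upper boundary of every cell except the last, which is unbounded
    # and absorbs any remaining Q samples.
    bounds = [ys[i * per_cell - 1] for i in range(1, cells)]
    q_counts = [per_cell] * (cells - 1) + [m - per_cell * (cells - 1)]
    p_counts = [0] * cells
    for x in p_samples:
        # Cell index = number of boundaries strictly below x.
        p_counts[bisect_left(bounds, x)] += 1
    total = 0.0
    for k, ell in zip(p_counts, q_counts):
        if k > 0:  # empty cells contribute zero to the sum
            total += (k / n) * log((k / n) / (ell / m))
    return total


q = [float(i) for i in range(100)]
print(divergence_estimate(q, q, per_cell=10))           # same sample on both sides
print(divergence_estimate([0.5] * 10, q, per_cell=10))  # P concentrated in one cell
```

Because the partition is driven by the Q samples, no cell has an empty Q count, so the logarithm of the count ratio is always defined.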

