Results 1–10 of 16
A graph cut approach to image segmentation in tensor space
In Workshop on Component Analysis Methods (CVPR), 2007
Cited by 14 (4 self)
Abstract:
This paper proposes a novel method for applying the standard graph cut technique to segmenting multimodal tensor-valued images. The Riemannian nature of the tensor space is explicitly taken into account by first mapping the data to a Euclidean space, where nonparametric kernel density estimates of the regional distributions can be calculated from user-initialized regions. These distributions are then used as regional priors in calculating graph edge weights. This approach therefore exploits the true variation of the tensor data by respecting its Riemannian structure when computing the distances that form the probability distributions. Further, the nonparametric model generalizes to arbitrary tensor distributions, unlike the Gaussian assumption made in previous work. Casting the segmentation problem in a graph cut framework yields a segmentation that is robust to initialization on the data tested.
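The map-then-estimate pipeline in the abstract can be sketched in a few lines: map symmetric positive-definite tensors to a Euclidean space with the log-Euclidean map, fit a nonparametric kernel density to a seed region, and evaluate it on new tensors to get regional priors. This is only an illustrative reconstruction, not the authors' code; the 2x2 tensors, seed region, and bandwidth are made up.

```python
import numpy as np
from scipy.stats import gaussian_kde

def log_euclidean_vec(T):
    """Map a 2x2 SPD tensor to R^3 via the matrix logarithm; the off-diagonal
    entry gets a sqrt(2) weight so Euclidean distance in the image matches
    the log-Euclidean tensor metric."""
    w, V = np.linalg.eigh(T)
    L = (V * np.log(w)) @ V.T
    return np.array([L[0, 0], L[1, 1], np.sqrt(2.0) * L[0, 1]])

rng = np.random.default_rng(0)

def random_spd(n):
    """Generate n random 2x2 SPD tensors."""
    M = rng.normal(size=(n, 2, 2))
    return np.array([m @ m.T + 0.1 * np.eye(2) for m in M])

# User-initialized seed region: its tensors define the regional distribution.
feats = np.array([log_euclidean_vec(t) for t in random_spd(50)])  # (50, 3)
prior = gaussian_kde(feats.T)   # nonparametric density in the mapped space

# Density of unseen tensors under the regional prior; in the graph cut these
# would become terminal-link edge weights, e.g. -log p.
p = prior(np.array([log_euclidean_vec(t) for t in random_spd(5)]).T)
tlink = -np.log(p + 1e-300)
```

Because all distances are taken after the log map, the density respects the Riemannian structure of the tensor space rather than treating tensors as flat vectors.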
Shared Kernel Information Embedding for Discriminative Inference
Cited by 11 (3 self)
Abstract:
Latent Variable Models (LVMs), like the Shared GPLVM and the Spectral Latent Variable Model, help mitigate overfitting when learning discriminative methods from small or moderately sized training sets. Nevertheless, existing methods suffer from several problems: 1) complexity; 2) the lack of explicit mappings to and from the latent space; 3) an inability to cope with multimodality; and 4) the lack of a well-defined density over the latent space. We propose an LVM called the Shared Kernel Information Embedding (sKIE). It defines a coherent density over a latent space and multiple input/output spaces (e.g., image features and poses), and it is easy to condition on a latent state or on combinations of the input/output states. Learning is quadratic, and it works well on small datasets. With datasets too large to learn a coherent global model, one can use sKIE to learn local online models. sKIE permits missing data during inference, and partially labelled data during learning. We use sKIE for human pose inference.
Efficient subset selection via the kernelized Rényi distance
Cited by 5 (5 self)
Abstract:
With improved sensors, the amount of data available in many vision problems has increased dramatically and allows the use of sophisticated learning algorithms to perform inference on the data. However, since these algorithms scale with data size, pruning the data is sometimes necessary. The pruning procedure must be statistically valid: a representative subset of the data must be selected without introducing selection bias. Information theoretic measures have been used to sample the data while retaining its original information content. We propose an efficient Rényi-entropy-based subset selection algorithm. The algorithm is first validated and then applied to two sample applications where machine learning and data pruning are used. In the first application, Gaussian process regression is used to learn object pose; there, the regression combined with subset selection is shown to be significantly more efficient. In the second application, our subset selection approach is used to replace vector quantization in a standard object recognition algorithm, and improvements are shown.
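One common way to build an entropy-driven subset, sketched below, is to greedily pick points that keep the subset's quadratic Rényi entropy H2 = -log((1/k^2) sum_ij K_ij) high, i.e. its kernel "information potential" low. This is a plausible stand-in for the paper's algorithm under a Gaussian kernel, not its exact procedure; the bandwidth and greedy rule are assumptions.

```python
import numpy as np

def gauss_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix between two point sets."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def renyi_subset(X, k, sigma=1.0):
    """Greedily select k points that approximately maximize the subset's
    quadratic Renyi entropy by minimizing its information potential:
    each step adds the point least similar to the points chosen so far."""
    K = gauss_kernel(X, X, sigma)
    first = int(np.argmin(K.sum(axis=0)))   # most isolated point overall
    chosen = [first]
    pot = K[first].copy()   # kernel sum from each candidate to the subset
    pot[first] = np.inf     # block re-selection of chosen points
    for _ in range(k - 1):
        nxt = int(np.argmin(pot))
        chosen.append(nxt)
        pot += K[nxt]
        pot[nxt] = np.inf
    return np.array(chosen)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
idx = renyi_subset(X, 20)   # 20 well-spread representatives of 200 points
```

The greedy rule spreads the subset over the support of the data, which is what makes it usable as a statistically representative replacement for the full set.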
Data Analysis and Representation on a General Domain using Eigenfunctions of Laplacian
, 2007
Cited by 4 (0 self)
Abstract:
We propose a new method to analyze and represent data recorded on a domain of general shape in R^d by computing the eigenfunctions of the Laplacian defined on the domain and expanding the data in these eigenfunctions. Instead of directly solving the eigenvalue problem on such a domain via the Helmholtz equation (which can be quite complicated and costly), we find the integral operator commuting with the Laplacian and diagonalize that operator. Although our eigenfunctions satisfy neither the Dirichlet nor the Neumann boundary condition, computing them via the integral operator is simple and has the potential to utilize modern fast algorithms to accelerate the computation. We also show that our method is better suited to small sample data than the Karhunen-Loève transform/principal component analysis. In fact, our eigenfunctions depend only on the shape of the domain, not on the statistics of the data. As a further application, we demonstrate the use of our Laplacian eigenfunctions for solving the heat equation on a complicated domain.
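The integral-operator route can be illustrated in one dimension: discretize a kernel built from the free-space Green's function of the 1D Laplacian on the domain and diagonalize the resulting symmetric matrix, instead of solving the Helmholtz eigenproblem with explicit boundary conditions. This is only an illustrative discretization; the grid, domain, and Nyström weighting are assumptions, and the full construction follows the paper.

```python
import numpy as np

# 1D sketch: eigenfunctions of a commuting integral operator on [0, 1].
# The kernel -|x - y| / 2 is the free-space Green's function of the 1D
# Laplacian, so no Dirichlet or Neumann condition is imposed explicitly.
n = 200
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
K = -np.abs(x[:, None] - x[None, :]) / 2.0 * dx  # Nystrom discretization

evals, evecs = np.linalg.eigh(K)                 # K is symmetric
order = np.argsort(-np.abs(evals))               # dominant modes first
phi = evecs[:, order]                            # discrete eigenfunctions
```

Since K depends only on pairwise distances between domain points, the basis depends only on the shape of the domain, not on any data recorded on it, matching the abstract's key point.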
KERNELIZED RÉNYI DISTANCE FOR SPEAKER RECOGNITION
Cited by 3 (3 self)
Abstract:
Speaker recognition systems classify a test signal as a speaker or an imposter by evaluating a matching score between input and reference signals. We propose a new information theoretic approach to computing the matching score using the Rényi entropy. The proposed entropic distance, the Kernelized Rényi distance (KRD), is formulated nonparametrically, and the resulting measure is efficiently evaluated in parallel on a graphics processor (GPU). The distance is then adapted as a scoring function, and its performance is compared with other popular scoring approaches in speaker identification and speaker verification frameworks. Index Terms — Rényi entropy, similarity score, speaker recognition, GPU, fast algorithms
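A closely related quadratic-Rényi-entropy score from information-theoretic learning is the Cauchy-Schwarz divergence between the kernel density estimates of two feature sets; it is shown below as a plausible stand-in for the KRD scoring function, not the paper's exact definition. The Gaussian kernel and bandwidth are assumptions.

```python
import numpy as np

def gauss(X, Y, sigma=1.0):
    """Gaussian kernel matrix between two feature sets."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def cs_divergence(X, Y, sigma=1.0):
    """Cauchy-Schwarz divergence built from quadratic Renyi information
    potentials; it is 0 when the two kernel density estimates coincide
    and grows as the sets separate."""
    vxy = gauss(X, Y, sigma).mean()   # cross information potential
    vxx = gauss(X, X, sigma).mean()
    vyy = gauss(Y, Y, sigma).mean()
    return -np.log(vxy ** 2 / (vxx * vyy))

rng = np.random.default_rng(2)
ref = rng.normal(size=(100, 5))                 # reference speaker features
same = cs_divergence(ref, ref)                  # matching score vs itself: 0
imposter = cs_divergence(ref, rng.normal(loc=2.0, size=(100, 5)))
```

Each term is a plain kernel-matrix mean, which is exactly the kind of pairwise summation that parallelizes well on a GPU, as the abstract notes.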
GPUML: Graphical processors for speeding up kernel machines
Cited by 3 (2 self)
Abstract:
Algorithms based on kernel methods play a central role in statistical machine learning. At their core are a number of linear algebra operations on matrices of kernel functions which take as arguments the training and testing data. These range from the simple matrix-vector product to more complex matrix decompositions, and iterative formulations of these. The algorithms often scale quadratically or cubically, in both memory and operational complexity, so as data sizes increase, kernel methods scale poorly. We use parallelized approaches on multi-core graphics processors (GPUs) to partially address this lack of scalability. GPUs are used to scale three different classes of problems: a simple kernel matrix-vector product, iterative solution of linear systems involving kernel matrices, and QR and Cholesky decomposition of kernel matrices. Applications of these accelerated approaches to scaling several kernel-based learning methods are shown, and in each case substantial speedups are obtained. The core software is released as an open-source package, GPUML.
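The central primitive, the kernel matrix-vector product, is easy to state in a few lines; below is a CPU NumPy reference of the Gaussian-kernel case, computed in row blocks so the full N x M kernel matrix is never materialized. This is only a reference sketch of the primitive GPUML accelerates on the GPU; the block size and bandwidth are arbitrary.

```python
import numpy as np

def kernel_mvp(X, Y, q, sigma=1.0, block=64):
    """Gaussian-kernel matrix-vector product
    f_i = sum_j exp(-|x_i - y_j|^2 / (2 sigma^2)) q_j,
    evaluated in row blocks to bound memory use."""
    f = np.empty(X.shape[0])
    for s in range(0, X.shape[0], block):
        d2 = ((X[s:s + block, None, :] - Y[None, :, :]) ** 2).sum(-1)
        f[s:s + block] = np.exp(-d2 / (2.0 * sigma ** 2)) @ q
    return f

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 3))
q = rng.normal(size=150)
f = kernel_mvp(X, X, q)

# Direct dense computation for comparison.
Kfull = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / 2.0)
```

The blocked loop is also the natural decomposition for a GPU: each block of rows is an independent tile of pairwise distances plus a small matrix-vector product.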
Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings
Cited by 2 (1 self)
Abstract:
Stochastic neighbor embedding (SNE) and related nonlinear manifold learning algorithms achieve high-quality low-dimensional representations of similarity data, but are notoriously slow to train. We propose a generic formulation of embedding algorithms that includes SNE and other existing algorithms, and study their relation with spectral methods and graph Laplacians. This allows us to define several partial-Hessian optimization strategies, characterize their global and local convergence, and evaluate them empirically. We achieve up to two orders of magnitude speedup over existing training methods with a strategy which we call the spectral direction.
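The core idea of a partial-Hessian strategy can be sketched as preconditioning the embedding gradient with the positive semidefinite graph-Laplacian part of the Hessian, solving (L + eps*I) d = -g instead of taking a plain gradient step. The affinities, gradient, and damping below are placeholders; the actual objective and Hessian split follow the paper.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30

# Symmetric nonnegative attractive affinities between the n points.
A = rng.random((n, n))
W = (A + A.T) / 2.0
np.fill_diagonal(W, 0.0)

L = np.diag(W.sum(axis=1)) - W          # graph Laplacian: PSD, rows sum to 0
g = rng.normal(size=n)                  # stand-in embedding gradient
eps = 1e-3                              # small damping makes L + eps*I PD

# Preconditioned search direction: one linear solve per iteration.
d = np.linalg.solve(L + eps * np.eye(n), -g)
```

Because L + eps*I is positive definite, d is always a descent direction, and the solve captures the spectral structure of the objective's attractive part, which is what yields the large speedups over plain gradient descent.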
Scalable machine learning for massive datasets: Fast summation algorithms
, 2007
Cited by 1 (0 self)
Abstract:
Huge data sets containing millions of training examples with a large number of attributes are relatively easy to gather. However, one of the bottlenecks for successful inference is the computational complexity of machine learning algorithms. Most state-of-the-art nonparametric machine learning algorithms have a computational complexity of either O(N²) or O(N³), where N is the number of training examples. This has seriously restricted the use of massive data sets. The bottleneck computational primitive at the heart of various algorithms is the multiplication of a structured matrix with a vector, which we refer to as the matrix-vector product (MVP) primitive. The goal of my thesis is to speed up some of these MVP primitives with fast approximate algorithms that scale as O(N) and also provide high accuracy guarantees. I use ideas from computational physics, scientific computing, and computational geometry to design these algorithms. The proposed algorithms have been applied to speed up kernel density estimation, optimal bandwidth estimation, projection pursuit, Gaussian process regression, implicit surface fitting, and ranking.
On the Dangers of Cross-Validation: An Experimental Evaluation
Cited by 1 (0 self)
Abstract:
Cross validation allows models to be tested using the full training set by means of repeated resampling, maximizing the total number of points used for testing and potentially helping to protect against overfitting. Improvements in computational power, recent reductions in the (computational) cost of classification algorithms, and the development of closed-form solutions (for performing cross validation in certain classes of learning algorithms) make it possible to test thousands or millions of variants of learning models on the data. Thus, it is now possible to calculate cross validation performance on a much larger number of tuned models than would have been possible otherwise. However, we empirically show that with such a large number of models the risk of overfitting increases and the performance estimated by cross validation is no longer an effective estimate of generalization; hence, this paper provides an empirical reminder of the dangers of cross validation. We use a closed-form solution that makes this evaluation possible for the cross validation problem of interest. In addition, through extensive experiments we expose and discuss the effects of the overuse/misuse of cross validation in various aspects, including model selection, feature selection, and data dimensionality. This is illustrated on synthetic, benchmark, and real-world data sets.
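The selection-bias effect the abstract warns about is easy to reproduce: score many models on the same held-out data and the best score looks far better than chance even when there is nothing to learn. The toy setup below (random labels, random "models", plain holdout-style accuracy rather than full cross validation) is an illustrative simplification of the paper's experiments, not a reproduction of them.

```python
import numpy as np

rng = np.random.default_rng(5)
n_samples, n_models = 40, 500

# Labels are pure noise: no model can truly generalize above 50%.
y = rng.integers(0, 2, size=n_samples)

# 500 "models" that are just random predictors, each scored on the same data.
preds = rng.integers(0, 2, size=(n_models, n_samples))
accs = (preds == y).mean(axis=1)

best_acc = accs.max()
# The average model sits near chance, but the best of 500 appears to be a
# strong classifier purely through selection over many models.
```

Scaling the number of compared models from a handful to thousands turns the maximum validation score into an increasingly optimistic estimate, which is exactly the regime the paper studies.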