Results 1 - 10
of
76
Shape Matching and Object Recognition Using Shape Contexts
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
"... We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by (1) solv- ing for correspondences between points on the two shapes, (2) using the correspondences to estimate an aligning transform ..."
Abstract
-
Cited by 850 (18 self)
- Add to MetaCart
We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by (1) solv- ing for correspondences between points on the two shapes, (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape con- texts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; reg- ularized thin plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning trans- form. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits and the COIL dataset.
Learning with local and global consistency
- Advances in Neural Information Processing Systems 16
, 2004
"... We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic stru ..."
Abstract
-
Cited by 286 (20 self)
- Add to MetaCart
We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points. We present a simple algorithm to obtain such a smooth solution. Our method yields encouraging experimental results on a number of classification problems and demonstrates effective use of unlabeled data. 1
An introduction to kernel-based learning algorithms
- IEEE TRANSACTIONS ON NEURAL NETWORKS
, 2001
"... This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and ..."
Abstract
-
Cited by 279 (46 self)
- Add to MetaCart
This paper provides an introduction to support vector machines (SVMs), kernel Fisher discriminant analysis, and
A fast learning algorithm for deep belief nets
- Neural Computation
, 2006
"... We show how to use “complementary priors ” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a ..."
Abstract
-
Cited by 241 (40 self)
- Add to MetaCart
We show how to use “complementary priors ” to eliminate the explaining away effects that make inference difficult in densely-connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modelled by long ravines in the free-energy landscape of the top-level associative memory and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind. 1
Support Vector Machines: Hype or Hallelujah?
- SIGKDD Explorations
, 2003
"... Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification ..."
Abstract
-
Cited by 65 (0 self)
- Add to MetaCart
Support Vector Machines (SVMs) and related kernel methods have become increasingly popular tools for data mining tasks such as classification, regression, and novelty detection. The goal of this tutorial is to provide an intuitive explanation of SVMs from a geometric perspective. The classification problem is used to investigate the basic concepts behind SVMs and to examine their strengths and weaknesses from a data mining perspective. While this overview is not comprehensive, it does provide resources for those interested in further exploring SVMs.
On-line Handwriting Recognition with Support Vector Machines - A Kernel Approach
- In Proc. of the 8th IWFHR
, 2002
"... In this' contribution we describe a novel classification approach for on-line handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this' kernel Gaussian DTW (GDTW) ker- nel. This kernel approach haw' a ..."
Abstract
-
Cited by 60 (5 self)
- Add to MetaCart
In this' contribution we describe a novel classification approach for on-line handwriting recognition. The technique combines dynamic time warping (DTW) and support vector machines (SVMs) by establishing a new SVM kernel. We call this' kernel Gaussian DTW (GDTW) ker- nel. This kernel approach haw' a main advantage over common HMM techniques. It does not assume a model for the generarive class conditional densities. Instead, it directly addresses the problem of discrimination by creating class boundaries and thus is' less sensitive to modeling assumptions. By incorporating DTW in the kernel function, general classification problems with variable-sized sequential data can be handled. In this respect the proposed method can be straightforwardly applied to all classification problems, where DTW gives a reasonable distance measure, e.g. speech recognition or genome processing. We show experiments with this' kernel approach on the UNIPEN handwriting data, achieving results' comparable to an HMMbased technique.
Learning over Sets using Kernel Principal Angles
- Journal of Machine Learning Research
, 2003
"... We consider the problem of learning with instances defined over a space of sets of vectors. We derive a new positive definite kernel f (A,B) defined over pairs of matrices A,B based on the concept of principal angles between two linear subspaces. We show that the principal angles can be recovered ..."
Abstract
-
Cited by 60 (2 self)
- Add to MetaCart
We consider the problem of learning with instances defined over a space of sets of vectors. We derive a new positive definite kernel f (A,B) defined over pairs of matrices A,B based on the concept of principal angles between two linear subspaces. We show that the principal angles can be recovered using only inner-products between pairs of column vectors of the input matrices thereby allowing the original column vectors of A,B to be mapped onto arbitrarily high-dimensional feature spaces.
Online Bayes Point Machines
"... We present a new and simple algorithm for learning large margin classi ers that works in a truly online manner. The algorithm generates a linear classi er by averaging the weights associated with several perceptron-like algorithms run in parallel in order to approximate the Bayes point. A rand ..."
Abstract
-
Cited by 55 (2 self)
- Add to MetaCart
We present a new and simple algorithm for learning large margin classi ers that works in a truly online manner. The algorithm generates a linear classi er by averaging the weights associated with several perceptron-like algorithms run in parallel in order to approximate the Bayes point. A random subsample of the incoming data stream is used to ensure diversity in the perceptron solutions. We experimentally study the algorithm's performance on online and batch learning settings.
An empirical evaluation of deep architectures on problems with many factors of variation
- In ICML
, 2007
"... Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine ..."
Abstract
-
Cited by 45 (10 self)
- Add to MetaCart
Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine learning algorithms already report reasonable results. Here, we present a series of experiments which indicate that these models show promise in solving harder learning problems that exhibit many factors of variation. These models are compared with well-established algorithms such as Support Vector Machines and single hidden-layer feed-forward neural networks. 1.
Thin Junction Trees
- Advances in Neural Information Processing Systems 14
, 2001
"... We present an algorithm that induces a class of models with thin junction trees---models that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the l ..."
Abstract
-
Cited by 41 (0 self)
- Add to MetaCart
We present an algorithm that induces a class of models with thin junction trees---models that are characterized by an upper bound on the size of the maximal cliques of their triangulated graph. By ensuring that the junction tree is thin, inference in our models remains tractable throughout the learning process. This allows both an efficient implementation of an iterative scaling parameter estimation algorithm and also ensures that inference can be performed efficiently with the final model. We illustrate the approach with applications in handwritten digit recognition and DNA splice site detection.

