Results 1-10 of 49
Mixtures of Probabilistic Principal Component Analysers
1998
Cited by 398 (6 self)
Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a combination of local linear PCA projections. However, conventional PCA does not correspond to a probability density, and so there is no unique way to combine PCA models. Previous attempts to formulate mixture models for PCA have therefore to some extent been ad hoc. In this paper, PCA is formulated within a maximum-likelihood framework, based on a specific form of Gaussian latent variable model. This leads to a well-defined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context of clustering, density modelling and local dimensionality reduction, and we demonstrate its applicat...
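For orientation, the Gaussian latent variable model that this abstract refers to is commonly written as below. This is a sketch in standard textbook notation, not text from the listing: the symbols (a d-dimensional observation x, a q-dimensional latent vector z, loading matrix W, mean mu, isotropic noise variance sigma^2, and mixing weights pi_i over M components) are assumed here for illustration.

```latex
\begin{align}
  \mathbf{x} &= \mathbf{W}\mathbf{z} + \boldsymbol{\mu} + \boldsymbol{\epsilon},
  \qquad \mathbf{z} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_q),
  \quad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \sigma^{2}\mathbf{I}_d), \\
  p(\mathbf{x}) &= \mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu},\;
      \mathbf{W}\mathbf{W}^{\top} + \sigma^{2}\mathbf{I}_d\right), \\
  p_{\text{mix}}(\mathbf{x}) &= \sum_{i=1}^{M} \pi_i\,
      \mathcal{N}\!\left(\mathbf{x} \mid \boldsymbol{\mu}_i,\;
      \mathbf{W}_i\mathbf{W}_i^{\top} + \sigma_i^{2}\mathbf{I}_d\right),
  \qquad \sum_{i=1}^{M} \pi_i = 1 .
\end{align}
```

Because each component is a proper density, the components can be combined as an ordinary mixture and fitted by EM, which is the point the abstract makes about conventional PCA lacking a probability model.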
A Unifying Review of Linear Gaussian Models
1999
Cited by 260 (17 self)
Factor analysis, principal component analysis, mixtures of gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observations and derivations made by many previous authors and introducing a new way of linking discrete and continuous state models using a simple nonlinearity. Through the use of other nonlinearities, we show how independent component analysis is also a variation of the same basic generative model. We show that factor analysis and mixtures of gaussians can be implemented in autoencoder neural networks and learned using squared error plus the same regularization term. We introduce a new model for static data, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise. We also review some of the literature involving global and local mixtures of the basic models and provide pseudocode for inference and learning for all the basic models.
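The "single basic generative model" mentioned above can be sketched as a linear dynamical system with Gaussian noise. The notation below (hidden state x_t, observation y_t, matrices A and C, noise covariances Q and R) is assumed for illustration and may differ from the paper's own symbols.

```latex
\begin{align}
  \mathbf{x}_{t+1} &= \mathbf{A}\mathbf{x}_t + \mathbf{w}_t,
  & \mathbf{w}_t &\sim \mathcal{N}(\mathbf{0}, \mathbf{Q}), \\
  \mathbf{y}_t &= \mathbf{C}\mathbf{x}_t + \mathbf{v}_t,
  & \mathbf{v}_t &\sim \mathcal{N}(\mathbf{0}, \mathbf{R}).
\end{align}
% Static special case (no dynamics, A = 0):
\begin{equation}
  \mathbf{y} \sim \mathcal{N}\!\left(\mathbf{0},\; \mathbf{C}\mathbf{Q}\mathbf{C}^{\top} + \mathbf{R}\right).
\end{equation}
```

Under this reading, the static case with Q = I and diagonal R corresponds to factor analysis, and restricting R to a multiple of the identity gives the PCA-like models; the continuous-state model with dynamics is the one underlying the Kalman filter.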
The EM Algorithm for Mixtures of Factor Analyzers
1997
Cited by 225 (18 self)
Factor analysis, a statistical method for modeling the covariance structure of high dimensional data using a small number of latent variables, can be extended by allowing different local factor models in different regions of the input space. This results in a model which concurrently performs clustering and dimensionality reduction, and can be thought of as a reduced dimension mixture of Gaussians. We present an exact Expectation-Maximization algorithm for fitting the parameters of this mixture of factor analyzers. 1 Introduction Clustering and dimensionality reduction have long been considered two of the fundamental problems in unsupervised learning (Duda & Hart, 1973; Chapter 6). In clustering, the goal is to group data points by similarity between their features. Conversely, in dimensionality reduction, the goal is to group (or compress) features that are highly correlated. In this paper we present an EM learning algorithm for a method which combines one of the basic forms of dime...
Modeling the manifolds of images of handwritten digits
IEEE Transactions on Neural Networks, 1997
"... description length, density estimation. ..."
Dimension Reduction by Local Principal Component Analysis
1997
Cited by 99 (0 self)
Reducing or eliminating statistical redundancy between the components of high-dimensional vector data enables a lower-dimensional representation without significant loss of information. Recognizing the limitations of principal component analysis (PCA), researchers in the statistics and neural network communities have developed nonlinear extensions of PCA. This article develops a local linear approach to dimension reduction that provides accurate representations and is fast to compute. We exercise the algorithms on speech and image data, and compare performance with PCA and with neural network implementations of nonlinear PCA. We find that both nonlinear techniques can provide more accurate representations than PCA and show that the local linear techniques outperform neural network implementations.
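The local linear idea can be illustrated with a small toy example: partition the data, fit a separate one-dimensional PCA in each region, and compare the reconstruction error with a single global PCA. This is a minimal sketch under stated assumptions, not the article's algorithm: the k-means partition, the 2-D parabola data, and all function names are hypothetical choices made for illustration.

```python
import math


def mean(points):
    """Centroid of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def principal_direction(points):
    """Unit vector along the leading eigenvector of the 2x2 scatter matrix."""
    mx, my = mean(points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Closed-form major-axis angle of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def reconstruction_error(points):
    """Sum of squared residuals after projecting onto a 1-D PCA subspace."""
    mx, my = mean(points)
    ux, uy = principal_direction(points)
    err = 0.0
    for x, y in points:
        dx, dy = x - mx, y - my
        t = dx * ux + dy * uy  # coordinate along the principal axis
        err += (dx - t * ux) ** 2 + (dy - t * uy) ** 2
    return err


def kmeans(points, centers, iters=20):
    """Plain k-means; an assumed stand-in for the partitioning step."""
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2 + (p[1] - centers[k][1]) ** 2,
            )
            groups[nearest].append(p)
        # Keep the old centre if a region ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


# Toy data: points on a parabola, which no single line fits well.
curve = [(-1.0 + 0.05 * i, (-1.0 + 0.05 * i) ** 2) for i in range(41)]

global_err = reconstruction_error(curve)
regions = kmeans(curve, centers=[curve[0], curve[-1]])
local_err = sum(reconstruction_error(g) for g in regions if g)

print(f"global PCA error: {global_err:.3f}")
print(f"local PCA error:  {local_err:.3f}")
```

On this curved toy data the two local fits leave a much smaller residual than the single global fit, which is the qualitative effect the abstract describes.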
Performance Animation from Low-dimensional Control Signals
ACM Transactions on Graphics, 2005
Cited by 83 (18 self)
This paper introduces an approach to performance animation that employs video cameras and a small set of retro-reflective markers to create a low-cost, easy-to-use system that might someday be practical for home use. The low-dimensional control signals from the user's performance are supplemented by a database of pre-recorded human motion. At run time, the system automatically learns a series of local models from a set of motion capture examples that are a close match to the marker locations captured by the cameras. These local models are then used to reconstruct the motion of the user as a full-body animation. We demonstrate the power of this approach with real-time control of six different behaviors using two video cameras and a small set of retro-reflective markers. We compare the resulting animation to animation from commercial motion capture equipment with a full set of markers.
Mapping a manifold of perceptual observations
Advances in Neural Information Processing Systems 10, 1998
Cited by 73 (2 self)
Nonlinear dimensionality reduction is formulated here as the problem of trying to find a Euclidean feature-space embedding of a set of observations that preserves as closely as possible their intrinsic metric structure – the distances between points on the observation manifold as measured along geodesic paths. Our isometric feature mapping procedure, or isomap, is able to reliably recover low-dimensional nonlinear structure in realistic perceptual data sets, such as a manifold of face images, where conventional global mapping methods find only local minima. The recovered map provides a canonical set of globally meaningful features, which allows perceptual transformations such as interpolation, extrapolation, and analogy – highly nonlinear transformations in the original observation space – to be computed with simple linear operations in feature space.
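The notion of distance "measured along geodesic paths" can be illustrated with one common approximation: connect each point to its nearest neighbours and take shortest-path lengths through that graph. The sketch below covers only this distance-estimation step and omits the subsequent embedding; the k-nearest-neighbour graph, the Floyd-Warshall shortest-path routine, the semicircle data, and all names are assumptions for illustration, not necessarily the construction used in the paper.

```python
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph as a dense matrix of edge lengths."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for j in others[:k]:
            d = math.dist(points[i], points[j])
            dist[i][j] = d
            dist[j][i] = d  # keep the graph undirected
    return dist


def geodesic_distances(points, k):
    """Approximate geodesic distances by all-pairs shortest paths (Floyd-Warshall)."""
    dist = knn_graph(points, k)
    n = len(points)
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Toy manifold: 21 points on a unit semicircle (a 1-D curve embedded in 2-D).
n = 21
arc = [(math.cos(math.pi * i / (n - 1)), math.sin(math.pi * i / (n - 1))) for i in range(n)]

geo = geodesic_distances(arc, k=2)
straight = math.dist(arc[0], arc[-1])  # chord through the ambient space
along_arc = geo[0][-1]                 # path that follows the curve

print(f"Euclidean distance between endpoints: {straight:.3f}")
print(f"graph-geodesic distance:              {along_arc:.3f}")
```

The straight-line distance between the endpoints ignores the curve, while the graph distance approximates the arc length (close to pi for a unit semicircle); an embedding built from the latter is what preserves the intrinsic metric structure described in the abstract.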
Using Generative Models for Handwritten Digit Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996
Cited by 69 (8 self)
We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system not only produces a classification of the digit but also a rich description of the instantiation parameters which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters and hence training is relatively easy and fast. (4) Unlike many other recognition schemes it does not rely on some form of pre-normalization of input images, but can ...
Example Based Learning for View-Based Human Face Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1995
Cited by 39 (0 self)
Finding human faces automatically in an image is a difficult yet important first step to a fully automatic face recognition system. It is also an interesting academic problem because a successful face detection system can provide valuable insight on how one might approach other similar object and pattern detection problems. This paper presents an example-based learning approach for locating vertical frontal views of human faces in complex scenes. The technique models the distribution of human face patterns by means of a few view-based "face" and "non-face" prototype clusters. At each image location, a difference feature vector is computed between the local image pattern and the distribution-based model. A trained classifier determines, based on the difference feature vector, whether or not a human face exists at the current image location. We show empirically that the prototypes we choose for our distribution-based model, and the distance metric we adopt for computing difference featur...
Transformation Invariant Autoassociation with Application to Handwritten Character Recognition
1995
Cited by 36 (8 self)
When training neural networks by the classical backpropagation algorithm the whole problem to learn must be expressed by a set of inputs and desired outputs. However, we often have high-level knowledge about the learning problem. In optical character recognition (OCR), for instance, we know that the classification should be invariant under a set of transformations like rotation or translation. We propose a new modular classification system based on several autoassociative multilayer perceptrons which allows the efficient incorporation of such knowledge. Results are reported on the NIST database of upper case handwritten letters and compared to other approaches to the invariance problem. 1 INCORPORATION OF EXPLICIT KNOWLEDGE The aim of supervised learning is to learn a mapping between the input and the output space from a set of example pairs (input, desired output). The classical implementation in the domain of neural networks is the backpropagation algorithm. If this learning set is ...