Results 1 - 10 of 61
A Unifying Review of Linear Gaussian Models
, 1999
Cited by 208 (14 self)
Factor analysis, principal component analysis, mixtures of gaussian clusters, vector quantization, Kalman filter models, and hidden Markov models can all be unified as variations of unsupervised learning under a single basic generative model. This is achieved by collecting together disparate observations and derivations made by many previous authors and introducing a new way of linking discrete and continuous state models using a simple nonlinearity. Through the use of other nonlinearities, we show how independent component analysis is also a variation of the same basic generative model. We show that factor analysis and mixtures of gaussians can be implemented in autoencoder neural networks and learned using squared error plus the same regularization term. We introduce a new model for static data, known as sensible principal component analysis, as well as a novel concept of spatially adaptive observation noise. We also review some of the literature involving global and local mixtures of the basic models and provide pseudocode for inference and learning for all the basic models.
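As a rough illustration of the "single basic generative model" for static data, the sketch below draws observations y = C x + v with a standard-normal hidden state x and Gaussian noise v. The function name, dimensions, and noise values are assumptions made for this example and are not taken from the paper. A diagonal noise covariance is used here as the factor analysis case and an isotropic one as the sensible PCA case, following the usual reading of these models.

```python
import numpy as np

def sample_static_linear_gaussian(C, R, n_samples, rng):
    """Draw y = C x + v with x ~ N(0, I) and v ~ N(0, R).

    C: (p, k) observation matrix; R: (p, p) noise covariance.
    Returns an (n_samples, p) array of observations.
    """
    p, k = C.shape
    x = rng.standard_normal((n_samples, k))                      # hidden state
    v = rng.multivariate_normal(np.zeros(p), R, size=n_samples)  # observation noise
    return x @ C.T + v

rng = np.random.default_rng(0)
C = rng.standard_normal((5, 2))                 # hypothetical 5-D data, 2-D state

R_fa = np.diag(rng.uniform(0.1, 0.5, size=5))   # diagonal noise (factor analysis case)
R_spca = 0.2 * np.eye(5)                        # isotropic noise (sensible PCA case)

y_fa = sample_static_linear_gaussian(C, R_fa, 20000, rng)
y_spca = sample_static_linear_gaussian(C, R_spca, 20000, rng)

# Under this model the marginal covariance of y is C C^T + R.
model_cov_fa = C @ C.T + R_fa
sample_cov_fa = np.cov(y_fa, rowvar=False)
```

With enough samples, sample_cov_fa should be close to model_cov_fa; only the structure of R differs between the two cases.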
Independent Factor Analysis
- Neural Computation
, 1999
Cited by 178 (8 self)
We introduce the independent factor analysis (IFA) method for recovering independent hidden sources from their observed mixtures. IFA generalizes and unifies ordinary factor analysis (FA), principal component analysis (PCA), and independent component analysis (ICA), and can handle not only square noiseless mixing, but also the general case where the number of mixtures differs from the number of sources and the data are noisy. IFA is a two-step procedure. In the first step, the source densities, mixing matrix and noise covariance are estimated from the observed data by maximum likelihood. For this purpose we present an expectation-maximization (EM) algorithm, which performs unsupervised learning of an associated probabilistic model of the mixing situation. Each source in our model is described by a mixture of Gaussians, thus all the probabilistic calculations can be performed analytically. In the second step, the sources are reconstructed from the observed data by an optimal non-linear ...
Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases
, 2000
Cited by 115 (13 self)
The problem of similarity search in large time series databases has attracted much attention recently. It is a non-trivial problem because of the inherent high dimensionality of the data. The most promising solutions involve first performing dimensionality reduction on the data, and then indexing the reduced data with a spatial access method. Three major dimensionality reduction techniques have been proposed: Singular Value Decomposition (SVD), the Discrete Fourier Transform (DFT), and more recently the Discrete Wavelet Transform (DWT). In this work we introduce a new dimensionality reduction technique which we call Piecewise Aggregate Approximation (PAA). We theoretically and empirically compare it to the other techniques and demonstrate its superiority. In addition to being competitive with or faster than the other methods, our approach has numerous other advantages. It is simple to understand and to implement, it allows more flexible distance measures, including weighted Euclidean queries, and the index can be built in linear time.
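The abstract does not spell out how PAA works, so the following is only a minimal sketch based on the commonly cited definition: split the series into equal-length segments and keep each segment's mean. The function names are hypothetical, and the series length is assumed to be an exact multiple of the number of segments. The second function computes the scaled Euclidean distance on the reduced representation that is usually given as a lower bound on the true Euclidean distance, which is what makes the reduced data safe to index.

```python
import numpy as np

def paa(series, n_segments):
    """Reduce a 1-D series to n_segments values by averaging equal-length segments."""
    series = np.asarray(series, dtype=float)
    if series.size % n_segments != 0:
        raise ValueError("series length must be a multiple of n_segments")
    return series.reshape(n_segments, -1).mean(axis=1)

def paa_lower_bound(a_reduced, b_reduced, original_length):
    """Distance between two PAA representations, scaled by the segment length.

    For equal-length segments this never exceeds the Euclidean distance
    between the two original series.
    """
    segment_length = original_length / a_reduced.size
    return np.sqrt(segment_length) * np.linalg.norm(a_reduced - b_reduced)

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.0, 2.0, 1.0, 5.0, 4.0, 8.0, 7.0, 9.0])

x_reduced = paa(x, 4)   # segment means: 1.5, 3.5, 5.5, 7.5
y_reduced = paa(y, 4)

bound = paa_lower_bound(x_reduced, y_reduced, x.size)
true_distance = np.linalg.norm(x - y)
```

Each reduction is a single pass over the series, which is consistent with the linear-time index construction claimed in the abstract.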
Trainable Videorealistic Speech Animation
- PROCEEDINGS OF SIGGRAPH 2002, SAN ANTONIO TEXAS
, 2002
Cited by 110 (5 self)
We describe how to create with machine learning techniques a generative, videorealistic, speech animation module. A human subject is first recorded using a videocamera as he/she utters a predetermined speech corpus. After processing the corpus automatically, a visual speech module is learned from the data that is capable of synthesizing the human subject's mouth uttering entirely novel utterances that were not recorded in the original video. The synthesized utterance is re-composited onto a background sequence which contains natural head and eye movement. The final output is videorealistic in the sense that it looks like a video camera recording of the subject. At run time, the input to the system can be either real audio sequences or synthetic audio produced by a text-to-speech system, as long as they have been phonetically aligned. The two key
Face Transfer with Multilinear Models
- TO APPEAR IN SIGGRAPH 2005
, 2005
Cited by 64 (1 self)
Face Transfer is a method for mapping videorecorded performances of one individual to facial animations of another. It extracts visemes (speech-related mouth articulations), expressions, and three-dimensional (3D) pose from monocular video or film footage. These parameters are then used to generate and drive a detailed 3D textured face mesh for a target identity, which can be seamlessly rendered back into target footage. The underlying face model automatically adjusts for how the target performs facial expressions and visemes. The performance data can be easily edited to change the visemes, expressions, pose, or even the identity of the target—the attributes are separably controllable. This supports
Variational Extensions to EM and Multinomial PCA
- In ECML 2002
, 2002
Cited by 64 (12 self)
Several authors in recent years have proposed discrete analogues to principal component analysis intended to handle discrete or positive-only data, for instance suited to analyzing sets of documents. Methods include non-negative matrix factorization, probabilistic latent semantic analysis, and latent Dirichlet allocation. This paper begins with a review of the basic theory of the variational extension to the expectation maximization algorithm, and then presents discrete component finding algorithms in that light. Experiments are conducted on both bigram word data and document bag-of-words data to expose some of the subtleties of this new class of algorithms.
A Framework for Robust Subspace Learning
- International Journal of Computer Vision
, 2003
Cited by 61 (5 self)
Many computer vision, signal processing and statistical problems can be posed as problems of learning low-dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications.
Diffusion snakes: introducing statistical shape knowledge into the Mumford-Shah functional
- J. OF COMPUTER VISION
, 2002
Cited by 58 (11 self)
We present a modification of the Mumford-Shah functional and its cartoon limit which facilitates the incorporation of a statistical prior on the shape of the segmenting contour. By minimizing a single energy functional, we obtain a segmentation process which maximizes both the grey value homogeneity in the separated regions and the similarity of the contour with respect to a set of training shapes. We propose a closed-form, parameter-free solution for incorporating invariance with respect to similarity transformations in the variational framework. We show segmentation results on artificial and real-world images with and without prior shape information. In the cases of noise, occlusion or strongly cluttered background the shape prior significantly improves segmentation. Finally we compare our results to those obtained by a level set implementation of geodesic active contours.
Incremental Learning for Robust Visual Tracking
, 2008
Cited by 49 (7 self)
Visual tracking, in essence, deals with nonstationary image streams that change over time. While most existing algorithms are able to track objects well in controlled environments, they usually fail in the presence of significant variation of the object’s appearance or surrounding illumination. One reason for such failures is that many algorithms employ fixed appearance models of the target. Such models are trained using only appearance data available before tracking begins, which in practice limits the range of appearances that are modeled, and ignores the large volume of information (such as shape changes or specific lighting conditions) that becomes available during tracking. In this paper, we present a tracking method that incrementally learns a low-dimensional subspace representation, efficiently adapting online to changes in the appearance of the target. The model update, based on incremental algorithms for principal component analysis, includes two important features: a method for correctly updating the sample mean, and a for-
Nonlinear Shape Statistics in Mumford-Shah Based Segmentation
- In European Conference on Computer Vision
, 2002
Cited by 47 (6 self)
We present a variational integration of nonlinear shape statistics into a Mumford-Shah based segmentation process. The nonlinear statistics are derived from a set of training silhouettes by a novel method of density estimation which can be considered as an extension of kernel PCA to a stochastic framework.

