One-shot learning of object categories
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2006
Cited by 136 (12 self)
Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned by Maximum Likelihood (ML) and Maximum A Posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
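The prior-to-posterior update described in this abstract can be illustrated with a toy conjugate model. The sketch below is not the paper's object-category model: it assumes a single scalar parameter (a Gaussian mean with known noise variance), and the names `GaussianBelief`, `posterior_over_mean`, and `ml_mean` are hypothetical. It only shows how one observation moves a prior toward the data, whereas the ML estimate simply copies that observation.

```python
from dataclasses import dataclass


@dataclass
class GaussianBelief:
    """Gaussian belief over an unknown mean (illustrative, not from the paper)."""
    mean: float
    var: float


def posterior_over_mean(prior, observations, noise_var):
    """Conjugate update for a Gaussian mean with known observation noise."""
    # Precisions (inverse variances) add: one term from the prior, one per observation.
    precision = 1.0 / prior.var + len(observations) / noise_var
    mean = (prior.mean / prior.var + sum(observations) / noise_var) / precision
    return GaussianBelief(mean=mean, var=1.0 / precision)


def ml_mean(observations):
    """Maximum-likelihood estimate of the mean: the sample average."""
    return sum(observations) / len(observations)


# Assumed prior, standing in for knowledge gained from earlier categories.
prior = GaussianBelief(mean=0.0, var=4.0)
one_shot = [3.0]  # a single training example

posterior = posterior_over_mean(prior, one_shot, noise_var=1.0)
print(ml_mean(one_shot), posterior.mean, posterior.var)
```

With a single example the ML estimate has no notion of uncertainty, while the posterior keeps a variance that shrinks as more observations arrive.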
A Graphical Model for Audiovisual Object Tracking
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2003
Cited by 36 (0 self)
We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real world scenarios using off-the-shelf equipment.
Scaling EM (Expectation-Maximization) Clustering to Large Databases
, 1999
Cited by 35 (0 self)
Practical statistical clustering algorithms typically center upon an iterative refinement optimization procedure to compute a locally optimal clustering solution that maximizes the fit to data. These algorithms typically require many database scans to converge, and within each scan they require access to every record in the data table. For large databases, the scans become prohibitively expensive. We present a scalable implementation of the Expectation-Maximization (EM) algorithm. The database community has focused on distance-based clustering schemes, and methods have been developed to cluster either numerical or categorical data. Unlike distance-based algorithms (such as K-Means), EM constructs proper statistical models of the underlying data source and naturally generalizes to cluster databases containing both discrete-valued and continuous-valued data. The scalable method is based on a decomposition of the basic statistics the algorithm needs: identifying regions of the data that...
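The phrase "decomposition of the basic statistics the algorithm needs" can be made concrete with a small sketch. The code below is not the paper's method (which, per the abstract, goes further by identifying regions of the data). It only shows the underlying fact that one EM iteration for a 1-D Gaussian mixture needs just three running sums per component, so the data can be streamed chunk by chunk. The function names and the toy data are assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_step_chunked(chunks, weights, means, variances):
    """One EM iteration that touches the data one chunk at a time."""
    k = len(weights)
    s0 = [0.0] * k  # sum of responsibilities per component
    s1 = [0.0] * k  # responsibility-weighted sum of x
    s2 = [0.0] * k  # responsibility-weighted sum of x^2
    n = 0
    for chunk in chunks:  # E-step: only the running sums persist across chunks
        for x in chunk:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            for j in range(k):
                r = joint[j] / total
                s0[j] += r
                s1[j] += r * x
                s2[j] += r * x * x
            n += 1
    # M-step: closed-form updates from the accumulated sufficient statistics.
    new_weights = [s0[j] / n for j in range(k)]
    new_means = [s1[j] / s0[j] for j in range(k)]
    new_vars = [max(s2[j] / s0[j] - new_means[j] ** 2, 1e-6) for j in range(k)]
    return new_weights, new_means, new_vars


# Toy data split into two chunks; a real system would re-read chunks from disk.
chunks = [[0.0, 0.2, -0.1], [5.0, 5.1, 4.9]]
weights, means, variances = [0.5, 0.5], [1.0, 4.0], [1.0, 1.0]
for _ in range(20):
    weights, means, variances = em_step_chunked(chunks, weights, means, variances)
print(weights, means, variances)
```

Because the M-step depends on the data only through these sums, the result does not depend on how the records are split into chunks. This still scans every record on every iteration, which is the cost the paper sets out to reduce.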
Gaussian process dynamical models for human motion
- IEEE Trans. Pattern Anal. Machine Intell
, 2007
Cited by 35 (1 self)
We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series analysis, with applications to learning models of human pose and motion from high-dimensional motion capture data. A GPDM is a latent variable model. It comprises a low-dimensional latent space with associated dynamics, as well as a map from the latent space to an observation space. We marginalize out the model parameters in closed form by using Gaussian process priors for both the dynamical and the observation mappings. This results in a nonparametric model for dynamical systems that accounts for uncertainty in the model. We demonstrate the approach and compare four learning algorithms on human motion capture data, in which each pose is 50-dimensional. Despite the use of small data sets, the GPDM learns an effective representation of the nonlinear dynamics in these spaces. Index Terms: Machine learning, motion, tracking, animation, stochastic processes, time series analysis.
A comparison of algorithms for inference and learning in probabilistic graphical models
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2005
Cited by 33 (2 self)
Computer vision is currently one of the most exciting areas of artificial intelligence research, largely because it has recently become possible to record, store and process large amounts of visual data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition and face detection, it is even more exciting that researchers may be on the verge of introducing computer vision systems that perform scene analysis, decomposing image input into its constituent objects, lighting conditions, motion patterns, and so on. Two of the main challenges in computer vision are finding efficient models of the physics of visual scenes and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms for computer vision and scene analysis. We review exact techniques and various approximate, computationally efficient techniques, including iterative conditional modes, the expectation maximization (EM) algorithm, the mean field method, variational techniques, structured variational techniques, Gibbs sampling, the sum-product algorithm and “loopy” belief propagation. We describe how each technique can be applied in a model of multiple, occluding objects, and contrast the behaviors and performances of the techniques using a unifying cost function, free energy.
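For orientation, the free energy mentioned at the end of this abstract is commonly written as below. The notation is an assumption, since the abstract gives no symbols: $h$ denotes the hidden variables, $v$ the observed ones, $P$ the model, and $Q$ an approximating distribution over $h$.

```latex
F(Q, P) \;=\; \sum_{h} Q(h)\,\ln\frac{Q(h)}{P(h, v)}
        \;=\; \mathrm{KL}\!\left(Q(h)\,\middle\|\,P(h \mid v)\right) \;-\; \ln P(v)
```

Since the KL term is non-negative, $F \ge -\ln P(v)$, with equality when $Q(h) = P(h \mid v)$. Exact EM reaches that bound in its E-step, and the approximate techniques listed above can be read as minimizing $F$ over more restricted families of $Q$.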
Bayesian Feature and Model Selection for Gaussian Mixture Models
Cited by 13 (0 self)
We present a Bayesian method for mixture model training that simultaneously treats the feature selection and the model selection problem. The method is based on the integration of a mixture model formulation that takes into account the saliency of the features and a Bayesian approach to mixture learning that can be used to estimate the number of mixture components. The proposed learning algorithm follows the variational framework and can simultaneously optimize over the number of components, the saliency of the features, and the parameters of the mixture model. Experimental results using high-dimensional artificial and real data illustrate the effectiveness of the method. Index Terms: Mixture models, feature selection, model selection, Bayesian approach, variational training.
Incremental Model-Based Clustering for Large Datasets with Small Clusters
- Journal of Computational and Graphical Statistics
, 2003
Cited by 10 (5 self)
Clustering is often useful for analyzing and summarizing information within large datasets. Model-based clustering methods have been found to be effective for determining the number of clusters, dealing with outliers, and selecting the best clustering method in datasets that are small to moderate in size. For large datasets, current model-based clustering methods tend to be limited by memory and time requirements and the increasing difficulty of maximum likelihood estimation. They may fit too many clusters in some portions of the data and/or miss clusters containing relatively few observations.
Recursive Unsupervised Learning of Finite Mixture Models
, 2004
Cited by 6 (0 self)
There are two open problems when finite mixture densities are used to model multivariate data: the selection of the number of components and the initialization. In this paper we propose an on-line (recursive) algorithm that estimates the parameters of the mixture and that simultaneously selects the number of components. The new algorithm starts with a large number of randomly initialized components. A prior is used as a bias for maximally structured models. A stochastic approximation recursive learning algorithm is proposed to search for the maximum a posteriori (MAP) solution and to discard the irrelevant components.
A globally convergent regularized ordered-subset EM algorithm for list-mode reconstruction
- IEEE Trans. Nuc. Sci
, 2004
Cited by 6 (1 self)
List-mode (LM) acquisition allows collection of data attributes at higher levels of precision than is possible with binned (i.e. histogram-mode) data. Hence it is particularly attractive for low-count data in emission tomography. An LM likelihood and convergent EM algorithm for LM reconstruction was presented in (Parra et al., TMI, v17, 1998). Faster ordered subset (OS) reconstruction algorithms for LM 3-D PET were presented in (Reader et al., Phys. Med. Bio., v43, 1998). However, these OS algorithms are not globally convergent, and they do not include regularization using convex priors, which can be beneficial in emission tomographic reconstruction. LM-OSEM algorithms incorporating regularization via inter-iteration filtering were presented in (Levkovitz et al., TMI, v20, 2001), but these are again not globally convergent. Convergent preconditioned conjugate gradient algorithms for spatio-temporal list-mode reconstruction incorporating regularization were presented in (Nichols, et al., TMI, v21, 2002), but these do not use OS for speed-up. In this work, we present a globally convergent and regularized ordered-subset algorithm for LM reconstruction. Our algorithm is derived using an incremental EM approach. We investigated the speed-up of our LM OS algorithm (vs. a non-OS version) for a SPECT simulation, and found that the speed-up was somewhat less than that enjoyed by other OS-type algorithms. Index Terms: List-mode Reconstruction, Emission Tomography.
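As background for the ordered-subset idea, the sketch below implements plain OSEM for binned data on a toy system. It is explicitly not the paper's algorithm: it is neither list-mode nor regularized, and it lacks the global convergence guarantee the abstract claims for the proposed method. The system matrix `A`, the measurements `y`, and the subset split are made-up illustrative values.

```python
def osem(A, y, subsets, n_iters):
    """Plain ordered-subset EM for y ~ Poisson(A x), with x >= 0.

    A is a list of rows (one per detector bin), y the measured counts,
    subsets a list of lists of row indices. Each sub-iteration applies
    the multiplicative EM update using only the rows in one subset.
    """
    n_pix = len(A[0])
    x = [1.0] * n_pix  # strictly positive start keeps the iterates non-negative
    for _ in range(n_iters):
        for rows in subsets:
            # Forward-project the current estimate onto this subset of bins.
            proj = {i: sum(A[i][j] * x[j] for j in range(n_pix)) for i in rows}
            # Ratio of measured to predicted counts (guard against zero projections).
            ratio = {i: (y[i] / proj[i] if proj[i] > 0 else 0.0) for i in rows}
            for j in range(n_pix):
                sens = sum(A[i][j] for i in rows)  # subset sensitivity of pixel j
                if sens > 0:
                    back = sum(A[i][j] * ratio[i] for i in rows)
                    x[j] = x[j] * back / sens
    return x


# Toy noise-free problem: two pixels, three detector bins, true image [2, 3].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
subsets = [[0, 2], [1]]
x_hat = osem(A, y, subsets, n_iters=50)
print(x_hat)
```

Using a single subset that contains every row reduces this to the standard ML-EM update. Splitting the rows gives more image updates per pass over the data, which is the source of the speed-up, and also of the convergence problems the abstract describes for earlier OS methods.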
Shape Matching and Registration by Data-driven EM
, 2007
Cited by 3 (0 self)
In this paper, we present an efficient and robust algorithm for shape matching, registration, and detection. The task is to geometrically transform a source shape to fit a target shape. The measure of similarity is defined in terms of the amount of transformation required. The shapes are represented by sparse-point or continuous-contour representations depending on the form of the data. We formulate the problem as probabilistic inference using a generative model and the EM algorithm. However, this algorithm has problems with initialization and with computing the E-step. To address these problems, we define a discriminative model which makes use of shape features. This gives a hybrid algorithm which combines the generative and discriminative models. The resulting algorithm is very fast because shape features are effective at solving correspondence: it typically requires only four iterations, and its convergence time is under a second. We demonstrate the effectiveness of the algorithm by testing it on standard datasets, such as MPEG7, for shape matching and by applying it to a range of matching, registration, and foreground/background segmentation problems.

