Results 1 - 10 of 47
Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories
, 2004
Cited by 462 (15 self)
Current computational approaches to learning visual object categories require thousands of training images, are slow, cannot learn in an incremental manner and cannot incorporate prior information into the learning process. In addition, no algorithm presented in the literature has been tested on more than a handful of object categories. We present a method for learning object categories from just a few training images. It is quick and it uses prior information in a principled way. We test it on a dataset composed of images of objects belonging to 101 widely varied categories. Our proposed method is based on making use of prior information, assembled from (unrelated) object categories which were previously learnt. A generative probabilistic model is used, which represents the shape and appearance of a constellation of features belonging to the object. The parameters of the model are learnt incrementally in a Bayesian manner. Our incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximum likelihood. The incremental and batch versions have comparable classification performance on small training sets, but incremental learning is significantly faster, making real-time learning feasible. Both Bayesian methods outperform maximum likelihood on small training sets.
A Bayesian approach to unsupervised one-shot learning of object categories
 In Proceedings of the 9th International Conference on Computer Vision
, 2003
Cited by 179 (9 self)
Learning visual models of object categories notoriously requires thousands of training examples; this is due to the diversity and richness of object appearance which requires models containing hundreds of parameters. We present a method for learning object categories from just a few images (1 to 5). It is based on incorporating “generic” knowledge which may be obtained from previously learnt models of unrelated categories. We operate in a variational Bayesian framework: object categories are represented by probabilistic models, and “prior” knowledge is represented as a probability density function on the parameters of these models. The “posterior” model for an object category is obtained by updating the prior in the light of one or more observations. Our ideas are demonstrated on four diverse categories (human faces, airplanes, motorcycles, spotted cats). Initially three categories are learnt from hundreds of training examples, and a “prior” is estimated from these. Then the model of the fourth category is learnt from 1 to 5 training examples, and is used for detecting new exemplars in a set of test images.
Variational Inference for Bayesian Mixtures of Factor Analysers
 In Advances in Neural Information Processing Systems 12
, 2000
Cited by 148 (16 self)
We present an algorithm that infers the model structure of a mixture of factor analysers using an efficient and deterministic variational approximation to full Bayesian integration over model parameters. This procedure can automatically determine the optimal number of components and the local dimensionality of each component (i.e. the number of factors in each factor analyser). Alternatively it can be used to infer posterior distributions over number of components and dimensionalities. Since all parameters are integrated out the method is not prone to overfitting. Using a stochastic procedure for adding components it is possible to perform the variational optimisation incrementally and to avoid local maxima. Results show that the method works very well in practice and correctly infers the number and dimensionality of nontrivial synthetic examples. By importance sampling from the variational approximation we show how to obtain unbiased estimates of the true evidence, the exa...
Enterprise modeling
, 1998
Cited by 119 (5 self)
... This article motivates the need for enterprise models and introduces the concepts of generic and deductive enterprise models. It reviews research to date on enterprise modeling and considers in detail the Toronto virtual enterprise effort at the University of Toronto.
Propagation Algorithms for Variational Bayesian Learning
 In Advances in Neural Information Processing Systems 13
, 2001
Cited by 110 (14 self)
Variational approximations are becoming a widespread tool for Bayesian learning of graphical models. We provide some theoretical results for the variational updates in a very general family of conjugate-exponential graphical models. We show how the belief propagation and the junction tree algorithms can be used in the inference step of variational Bayesian learning. Applying these results to the Bayesian analysis of linear-Gaussian state-space models we obtain a learning procedure that exploits the Kalman smoothing propagation, while integrating over all model parameters. We demonstrate how this can be used to infer the hidden state dimensionality of the state-space model in a variety of synthetic problems and one real high-dimensional data set.
Discriminative Density Propagation for 3D Human Motion Estimation
 In CVPR
, 2005
Cited by 92 (14 self)
We describe a mixture density propagation algorithm to estimate 3D human motion in monocular video sequences based on observations encoding the appearance of image silhouettes. Our approach is discriminative rather than generative, therefore it does not require the probabilistic inversion of a predictive observation model. Instead, it uses a large human motion capture database and a 3D computer graphics human model in order to synthesize training pairs of typical human configurations together with their realistically rendered 2D silhouettes. These are used to directly learn to predict the conditional state distributions required for 3D body pose tracking and thus avoid using the generative 3D model for inference (the learned discriminative predictors can also be used, complementary, as importance samplers in order to improve mixing or initialize generative inference algorithms). We aim for probabilistically motivated tracking algorithms and for models that can represent complex multivalued mappings common in inverse, uncertain perception inferences. Our paper has three contributions: (1) we establish the density propagation rules for discriminative inference in continuous, temporal chain models; (2) we propose flexible algorithms for learning multimodal state distributions based on compact, conditional Bayesian mixture of experts models; and (3) we demonstrate the algorithms empirically on real and motion capture-based test sequences and compare against nearest-neighbor and regression methods.
Graphical Models and Variational Methods
, 2001
Cited by 37 (2 self)
We review the use of variational methods of approximating inference and learning in probabilistic graphical models. In particular, we focus on variational approximations to the integrals required for Bayesian learning. For models in the conjugate-exponential family, a generalisation of the EM algorithm is derived that iterates between optimising hyperparameters of the distribution over parameters, and inferring the hidden variable distributions. These approximations make use of available propagation algorithms for probabilistic graphical models. We give two case studies of how the variational Bayesian approach can be used to learn model structure: inferring the number of clusters and dimensionalities in a mixture of factor analysers, and inferring the dimension of the state space of a linear dynamical system. Finally, importance sampling corrections to the variational approximations are discussed, along with their limitations.
Bayesian Treed Models
 Machine Learning
, 2000
Cited by 32 (1 self)
When simple parametric models such as linear regression fail to adequately approximate a function across an entire set of data, an alternative may be to consider a partition of the data, and then use a separate simple model within each subset of the partition. Such an alternative is provided by a treed model which uses a binary tree to identify such a partition. However, treed models go further than conventional trees (e.g. CART, C4.5) by fitting models rather than simple means or proportions across the partition. In this paper, we propose a Bayesian approach for finding and fitting parametric treed models, in particular focusing on Bayesian treed regression. The potential of this approach is illustrated by a cross-validation comparison of predictive performance with neural nets, MARS, and conventional trees on simulated and real data sets. Keywords: binary trees, Markov chain Monte Carlo, model selection, stochastic search. 1 Hugh Chipman is Associate Professor of Statistics...
Variational Bayesian learning of directed graphical models with hidden variables
 Bayesian Analysis 1
, 2006
Cited by 31 (3 self)
A key problem in statistics and machine learning is inferring suitable structure of a model given some observed data. A Bayesian approach to model comparison makes use of the marginal likelihood of each candidate model to form a posterior distribution over models; unfortunately for most models of interest, notably those containing hidden or latent variables, the marginal likelihood is intractable to compute. We present the variational Bayesian (VB) algorithm for directed graphical models, which optimises a lower bound approximation to the marginal likelihood in a procedure similar to the standard EM algorithm. We show that for a large class of models, which we call conjugate exponential, the VB algorithm is a straightforward generalisation of the EM algorithm that incorporates uncertainty over model parameters. In a thorough case study using a small class of bipartite DAGs containing hidden variables, we compare the accuracy of the VB approximation to existing asymptotic-data approximations such as the Bayesian Information Criterion (BIC) and the Cheeseman-Stutz (CS) criterion, and also to a sampling-based gold standard, Annealed Importance Sampling (AIS). We find that the VB algorithm is empirically superior to CS and BIC, and much faster than AIS. Moreover, we prove that a VB approximation can always be constructed in such a way that guarantees it to be more accurate than the CS approximation.
Bayesian Model Search for Mixture Models Based on Optimizing Variational Bounds
, 2002
Cited by 31 (4 self)
When learning a mixture model, we suffer from the local optima and model structure determination problems. In this paper, we present a method for simultaneously solving these problems based on the variational Bayesian (VB) framework. First, in the VB framework, we derive an objective function that can simultaneously optimize both model parameter distributions and model structure. Next, focusing on mixture models, we present a deterministic algorithm to approximately optimize the objective function by using the idea of the split and merge operations which we previously proposed within the maximum likelihood framework. Then, we apply the method to mixture of experts (MoE) models to experimentally show that the proposed method can find the optimal number of experts of a MoE while avoiding local maxima. © 2002 Elsevier Science Ltd. All rights reserved.