Results 1  10
of
100
Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories
, 2004
"... Abstract — Current computational approaches to learning visual object categories require thousands of training images, are slow, cannot learn in an incremental manner and cannot incorporate prior information into the learning process. In addition, no algorithm presented in the literature has been te ..."
Abstract

Cited by 462 (15 self)
 Add to MetaCart
Abstract — Current computational approaches to learning visual object categories require thousands of training images, are slow, cannot learn in an incremental manner and cannot incorporate prior information into the learning process. In addition, no algorithm presented in the literature has been tested on more than a handful of object categories. We present an method for learning object categories from just a few training images. It is quick and it uses prior information in a principled way. We test it on a dataset composed of images of objects belonging to 101 widely varied categories. Our proposed method is based on making use of prior information, assembled from (unrelated) object categories which were previously learnt. A generative probabilistic model is used, which represents the shape and appearance of a constellation of features belonging to the object. The parameters of the model are learnt incrementally in a Bayesian manner. Our incremental algorithm is compared experimentally to an earlier batch Bayesian algorithm, as well as to one based on maximumlikelihood. The incremental and batch versions have comparable classification performance on small training sets, but incremental learning is significantly faster, making realtime learning feasible. Both Bayesian methods outperform maximum likelihood on small training sets. I.
Oneshot learning of object categories
 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2006
"... Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advant ..."
Abstract

Cited by 228 (17 self)
 Add to MetaCart
Learning visual models of object categories notoriously requires hundreds or thousands of training examples. We show that it is possible to learn much information about a category from just one, or a handful, of images. The key insight is that, rather than learning from scratch, one can take advantage of knowledge coming from previously learned categories, no matter how different these categories might be. We explore a Bayesian implementation of this idea. Object categories are represented by probabilistic models. Prior knowledge is represented as a probability density function on the parameters of these models. The posterior model for an object category is obtained by updating the prior in the light of one or more observations. We test a simple implementation of our algorithm on a database of 101 diverse object categories. We compare category models learned by an implementation of our Bayesian approach to models learned from by Maximum Likelihood (ML) and Maximum A Posteriori (MAP) methods. We find that on a database of more than 100 categories, the Bayesian approach produces informative models when the number of training examples is too small for other methods to operate successfully.
A Bayesian approach to unsupervised oneshot learning of object categories
 In Proceedings of the 9th International Conference on Computer Vision
, 2003
"... Learning visual models of object categories notoriously requires thousands of training examples; this is due to the diversity and richness of object appearance which requires models containing hundreds of parameters. We present a method for learning object categories from just a few images ( � �). ..."
Abstract

Cited by 179 (9 self)
 Add to MetaCart
Learning visual models of object categories notoriously requires thousands of training examples; this is due to the diversity and richness of object appearance which requires models containing hundreds of parameters. We present a method for learning object categories from just a few images ( � �). It is based on incorporating “generic” knowledge which may be obtained from previously learnt models of unrelated categories. We operate in a variational Bayesian framework: object categories are represented by probabilistic models, and “prior ” knowledge is represented as a probability density function on the parameters of these models. The “posterior ” model for an object category is obtained by updating the prior in the light of one or more observations. Our ideas are demonstrated on four diverse categories (human faces, airplanes, motorcycles, spotted cats). Initially three categories are learnt from hundreds of training examples, and a “prior ” is estimated from these. Then the model of the fourth category is learnt from 1 to 5 training examples, and is used for detecting new exemplars a set of test images. 1.
Variational Inference for Bayesian Mixtures of Factor Analysers
 In Advances in Neural Information Processing Systems 12
, 2000
"... We present an algorithm that infers the model structure of a mixture of factor analysers using an ecient and deterministic variational approximation to full Bayesian integration over model parameters. This procedure can automatically determine the optimal number of components and the local dimension ..."
Abstract

Cited by 148 (16 self)
 Add to MetaCart
We present an algorithm that infers the model structure of a mixture of factor analysers using an ecient and deterministic variational approximation to full Bayesian integration over model parameters. This procedure can automatically determine the optimal number of components and the local dimensionality of each component (i.e. the number of factors in each factor analyser). Alternatively it can be used to infer posterior distributions over number of components and dimensionalities. Since all parameters are integrated out the method is not prone to over tting. Using a stochastic procedure for adding components it is possible to perform the variational optimisation incrementally and to avoid local maxima. Results show that the method works very well in practice and correctly infers the number and dimensionality of nontrivial synthetic examples. By importance sampling from the variational approximation we show how to obtain unbiased estimates of the true evidence, the exa...
Implementing approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations: A manual for the inlaprogram
, 2008
"... Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalised) linear models, (generalised) additive models, smoothingspline models, statespace models, semiparametric regression, spatial and spatiotemp ..."
Abstract

Cited by 79 (16 self)
 Add to MetaCart
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalised) linear models, (generalised) additive models, smoothingspline models, statespace models, semiparametric regression, spatial and spatiotemporal models, logGaussian Coxprocesses, geostatistical and geoadditive models. In this paper we consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with nonGaussian response variables. The posterior marginals are not available in closed form due to the nonGaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, both in terms of convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations
Tutorial on Variational Approximation Methods
 In Advanced Mean Field Methods: Theory and Practice
, 2000
"... We provide an introduction to the theory and use of variational methods for inference and estimation in the context of graphical models. Variational methods become useful as ecient approximate methods when the structure of the graph model no longer admits feasible exact probabilistic calculations. T ..."
Abstract

Cited by 73 (1 self)
 Add to MetaCart
We provide an introduction to the theory and use of variational methods for inference and estimation in the context of graphical models. Variational methods become useful as ecient approximate methods when the structure of the graph model no longer admits feasible exact probabilistic calculations. The emphasis of this tutorial is on illustrating how inference and estimation problems can be transformed into variational form along with describing the resulting approximation algorithms and their properties insofar as these are currently known. 1 Introduction The term variational methods refers to a large collection of optimization techniques. The classical context for these methods involves nding the extremum of an integral depending on an unknown function and its derivatives. This classical de nition, however, and the accompanying calculus of variation no longer adequately characterizes modern variational methods. Modern variational approaches have become indispensable tools in...
A Bayesian missing value estimation method for gene expression profile data
 Bioinformatics
, 2003
"... Motivation: Gene expression profile analyses have been used in numerous studies covering a broad range of areas in biology. When unreliable measurements are excluded, missing values are introduced in gene expression profiles. Although existing multivariate analysis methods have difficulty with the t ..."
Abstract

Cited by 69 (2 self)
 Add to MetaCart
Motivation: Gene expression profile analyses have been used in numerous studies covering a broad range of areas in biology. When unreliable measurements are excluded, missing values are introduced in gene expression profiles. Although existing multivariate analysis methods have difficulty with the treatment of missing values, this problem has received little attention. There are many options for dealing with missing values, each of which reaches drastically different results. Ignoring missing values is the simplest method and is frequently applied. This approach, however, has its flaws. In this article, we propose an estimation method for missing values, which is based on Bayesian principal component analysis (BPCA). Although the methodology that a probabilistic model and latent variables are estimated simultaneously within the framework of Bayes
The em algorithm for kernel matrix completion with auxiliary data
 Journal of Machine Learning Research
, 2003
"... In biological data, it is often the case that observed data are available only for a subset of samples. When a kernel matrix is derived from such data, we have to leave the entries for unavailable samples as missing. In this paper, the missing entries are completed by exploiting an auxiliary kernel ..."
Abstract

Cited by 42 (6 self)
 Add to MetaCart
In biological data, it is often the case that observed data are available only for a subset of samples. When a kernel matrix is derived from such data, we have to leave the entries for unavailable samples as missing. In this paper, the missing entries are completed by exploiting an auxiliary kernel matrix derived from another information source. The parametric model of kernel matrices is created as a set of spectral variants of the auxiliary kernel matrix, and the missing entries are estimated by fitting this model to the existing entries. For model fitting, we adopt the em algorithm (distinguished from the EM algorithm of Dempster et al., 1977) based on the information geometry of positive definite matrices. We will report promising results on bacteria clustering experiments using two marker sequences: 16S and gyrB.
Variational bayesian learning of directed graphical models with hidden variables, Bayesian Analysis 1
, 2006
"... Abstract. A key problem in statistics and machine learning is inferring suitable structure of a model given some observed data. A Bayesian approach to model comparison makes use of the marginal likelihood of each candidate model to form a posterior distribution over models; unfortunately for most mo ..."
Abstract

Cited by 31 (3 self)
 Add to MetaCart
Abstract. A key problem in statistics and machine learning is inferring suitable structure of a model given some observed data. A Bayesian approach to model comparison makes use of the marginal likelihood of each candidate model to form a posterior distribution over models; unfortunately for most models of interest, notably those containing hidden or latent variables, the marginal likelihood is intractable to compute. We present the variational Bayesian (VB) algorithm for directed graphical models, which optimises a lower bound approximation to the marginal likelihood in a procedure similar to the standard EM algorithm. We show that for a large class of models, which we call conjugate exponential, the VB algorithm is a straightforward generalisation of the EM algorithm that incorporates uncertainty over model parameters. In a thorough case study using a small class of bipartite DAGs containing hidden variables, we compare the accuracy of the VB approximation to existing asymptoticdata approximations such as the Bayesian Information Criterion (BIC) and the CheesemanStutz (CS) criterion, and also to a sampling based gold standard, Annealed Importance Sampling (AIS). We find that the VB algorithm is empirically superior to CS and BIC, and much faster than AIS. Moreover, we prove that a VB approximation can always be constructed in such a way that guarantees it to be more accurate than the CS approximation.
Bayesian Model Search for Mixture Models Based on Optimizing Variational Bounds
, 2002
"... When learning a mixture model, we suffer from the local optima and model structure determination problems. In this paper, we present a method for simultaneously solving these problems based on the variational Bayesian (VB) framework. First, in the VB framework, we derive an objective function that c ..."
Abstract

Cited by 31 (4 self)
 Add to MetaCart
When learning a mixture model, we suffer from the local optima and model structure determination problems. In this paper, we present a method for simultaneously solving these problems based on the variational Bayesian (VB) framework. First, in the VB framework, we derive an objective function that can simultaneously optimize both model parameter distributions and model structure. Next, focusing on mixture models, we present a deterministic algorithm to approximately optimize the objective function by using the idea of the split and merge operations which we previously proposed within the maximum likelihood framework. Then, we apply the method to mixture of expers (MoE) models to experimentally show that the proposed method can find the optimal number of experts of a MoE while avoiding local maxima. q 2002 Elsevier Science Ltd. All rights reserved.