Results 1-10 of 185
Hierarchical Dirichlet processes
 Journal of the American Statistical Association
, 2004
Cited by 536 (55 self)
We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of
Infinite Latent Feature Models and the Indian Buffet Process
, 2005
Cited by 181 (38 self)
We define a probability distribution over equivalence classes of binary matrices with a finite number of rows and an unbounded number of columns. This distribution
Variational inference for Dirichlet process mixtures
 Bayesian Analysis
, 2005
Cited by 128 (16 self)
Dirichlet process (DP) mixture models are the cornerstone of nonparametric Bayesian statistics, and the development of Monte Carlo Markov chain (MCMC) sampling methods for DP mixtures has enabled the application of nonparametric Bayesian methods to a variety of practical data analysis problems. However, MCMC sampling can be prohibitively slow, and it is important to explore alternatives. One class of alternatives is provided by variational methods, a class of deterministic algorithms that convert inference problems into optimization problems (Opper and Saad 2001; Wainwright and Jordan 2003). Thus far, variational methods have mainly been explored in the parametric setting, in particular within the formalism of the exponential family (Attias 2000; Ghahramani and Beal 2001; Blei et al. 2003). In this paper, we present a variational inference algorithm for DP mixtures. We present experiments that compare the algorithm to Gibbs sampling algorithms for DP mixtures of Gaussians and present an application to a large-scale image analysis problem.
Multi-task learning for classification with Dirichlet process priors
 Journal of Machine Learning Research
, 2007
Cited by 98 (9 self)
Multi-task learning (MTL) is considered for logistic-regression classifiers, based on a Dirichlet process (DP) formulation. A symmetric MTL (SMTL) formulation is considered in which classifiers for multiple tasks are learned jointly, with a variational Bayesian (VB) solution. We also consider an asymmetric MTL (AMTL) formulation in which the posterior density function from the SMTL model parameters, from previous tasks, is used as a prior for a new task; this approach has the significant advantage of not requiring storage and use of all previous data from prior tasks. The AMTL formulation is solved with a simple Markov Chain Monte Carlo (MCMC) construction. Comparisons are also made to simpler approaches, such as single-task learning, pooling of data across tasks, and simplified approximations to DP. A comprehensive analysis of algorithm performance is addressed through consideration of two data sets that are matched to the MTL problem.
A hierarchical Bayesian language model based on Pitman–Yor processes
 In Coling/ACL
, 2006
Cited by 78 (8 self)
We propose a new hierarchical Bayesian n-gram model of natural languages. Our model makes use of a generalization of the commonly used Dirichlet distributions called Pitman-Yor processes which produce power-law distributions more closely resembling those in natural languages. We show that an approximation to the hierarchical Pitman-Yor language model recovers the exact formulation of interpolated Kneser-Ney, one of the best smoothing methods for n-gram language models. Experiments verify that our model gives cross entropy results superior to interpolated Kneser-Ney and comparable to modified Kneser-Ney.
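The abstract does not spell out how a Pitman-Yor process behaves, so the following is only a minimal sketch of the standard two-parameter Chinese restaurant seating rule, not the paper's language model. The function name `pitman_yor_crp` and the parameter names `discount` and `concentration` are illustrative assumptions. A customer joins an existing table with weight proportional to its count minus the discount, and opens a new table with weight proportional to the concentration plus the discount times the current number of tables.

```python
import random


def pitman_yor_crp(num_customers, discount, concentration, rng):
    """Seat customers one by one; return the list of table sizes.

    Assumes 0 <= discount < 1 and concentration > 0 (hypothetical names).
    """
    table_counts = []
    for n in range(num_customers):
        # Total unnormalised weight: sum(count - discount) over tables
        # plus (concentration + discount * K) equals n + concentration.
        u = rng.random() * (n + concentration)
        for k, count in enumerate(table_counts):
            u -= count - discount
            if u < 0:
                table_counts[k] += 1  # join existing table k
                break
        else:
            table_counts.append(1)  # open a new table
    return table_counts


rng = random.Random(0)
table_counts = pitman_yor_crp(1000, discount=0.5, concentration=1.0, rng=rng)
```

Setting `discount` to zero reduces this to the Dirichlet process seating rule; a positive discount tends to produce many small tables, which is the power-law behaviour the abstract refers to.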
The infinite PCFG using hierarchical Dirichlet processes
 In EMNLP ’07
, 2007
Cited by 63 (6 self)
We present a nonparametric Bayesian model of tree structures based on the hierarchical Dirichlet process (HDP). Our HDP-PCFG model allows the complexity of the grammar to grow as more training data is available. In addition to presenting a fully Bayesian model for the PCFG, we also develop an efficient variational inference procedure. On synthetic data, we recover the correct grammar without having to specify its complexity in advance. We also show that our techniques can be applied to full-scale parsing applications by demonstrating their effectiveness in learning state-split grammars.
Generalized weighted Chinese restaurant processes for species sampling mixture models
 Statistica Sinica
, 2003
Cited by 53 (8 self)
The class of species sampling mixture models is introduced as an extension of semiparametric models based on the Dirichlet process to models based on the general class of species sampling priors, or equivalently the class of all exchangeable urn distributions. Using Fubini calculus in conjunction with Pitman (1995, 1996), we derive characterizations of the posterior distribution in terms of a posterior partition distribution that extend the results of Lo (1984) for the Dirichlet process. These results provide a better understanding of models and have both theoretical and practical applications. To facilitate the use of our models we generalize the work in Brunner, Chan, James and Lo (2001) by extending their weighted Chinese restaurant (WCR) Monte Carlo procedure, an i.i.d. sequential importance sampling (SIS) procedure for approximating posterior mean functionals based on the Dirichlet process, to the case of approximation of mean functionals and additionally their posterior laws in species sampling mixture models. We also discuss collapsed Gibbs sampling, Pólya urn Gibbs sampling and a Pólya urn SIS scheme. Our framework allows for numerous applications, including multiplicative counting process models subject to weighted gamma processes, as well as nonparametric and semiparametric hierarchical models based on the Dirichlet process, its two-parameter extension, the Pitman-Yor process and finite dimensional Dirichlet priors. Key words and phrases: Dirichlet process, exchangeable partition, finite dimensional Dirichlet prior, two-parameter Poisson-Dirichlet process, prediction rule, random probability measure, species sampling sequence.
Stick-breaking construction for the Indian buffet process
 In Proceedings of the International Conference on Artificial Intelligence and Statistics
Cited by 46 (8 self)
The Indian buffet process (IBP) is a Bayesian nonparametric distribution whereby objects are modelled using an unbounded number of latent features. In this paper we derive a stick-breaking representation for the IBP. Based on this new representation, we develop slice samplers for the IBP that are efficient, easy to implement and are more generally applicable than the currently available Gibbs sampler. This representation, along with the work of Thibaux and Jordan [17], also illuminates interesting theoretical connections between the IBP, Chinese restaurant processes, Beta processes and Dirichlet processes.
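To make "an unbounded number of latent features" concrete, here is a minimal sketch of the usual restaurant-style generative description of the IBP. It is not the stick-breaking representation or the slice sampler derived in the paper, and the names `sample_poisson`, `sample_ibp`, and `alpha` are illustrative assumptions. Object i takes each existing feature k with probability m_k / i, where m_k is the number of earlier objects holding it, and then adds a Poisson(alpha / i) number of new features.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_objects, alpha, rng):
    """Return a binary object-by-feature matrix as a list of rows."""
    feature_counts = []  # m_k: how many objects hold feature k so far
    rows = []
    for i in range(1, num_objects + 1):
        # Revisit existing features with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in feature_counts]
        for k, z in enumerate(row):
            feature_counts[k] += z
        # Introduce a Poisson(alpha / i) number of brand-new features.
        num_new = sample_poisson(alpha / i, rng)
        row.extend([1] * num_new)
        feature_counts.extend([1] * num_new)
        rows.append(row)
    width = len(feature_counts)
    # Pad earlier rows with zeros so the matrix is rectangular.
    return [row + [0] * (width - len(row)) for row in rows]


matrix = sample_ibp(num_objects=10, alpha=2.0, rng=random.Random(0))
```

The number of columns is not fixed in advance; it grows with the number of objects and with `alpha`.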
Describing Visual Scenes Using Transformed Objects and Parts
 International Journal of Computer Vision
, 2005
Cited by 43 (6 self)
We develop hierarchical, probabilistic models for objects, the parts composing them, and the visual scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves detection accuracy when learning from few examples. Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. The resulting transformed Dirichlet process (TDP) leads to Monte Carlo algorithms which simultaneously segment and recognize objects in street and office scenes.
Variational methods for the Dirichlet process
 In Proceedings of the 21st International Conference on Machine Learning
, 2004
Cited by 42 (5 self)
Variational inference methods, including mean field methods and loopy belief propagation, have been widely used for approximate probabilistic inference in graphical models. While often less accurate than MCMC, variational methods provide a fast deterministic approximation to marginal and conditional probabilities. Such approximations can be particularly useful in high dimensional problems where sampling methods are too slow to be effective. A limitation of current methods, however, is that they are restricted to parametric probabilistic models. MCMC does not have such a limitation; indeed, MCMC samplers have been developed for the Dirichlet process (DP), a nonparametric distribution on distributions (Ferguson, 1973) that is the cornerstone of Bayesian nonparametric statistics (Escobar & West, 1995; Neal, 2000). In this paper, we develop a mean-field variational approach to approximate inference for the Dirichlet process, where the approximate posterior is based on the truncated stick-breaking construction (Ishwaran & James, 2001). We compare our approach to DP samplers for Gaussian DP mixture models.
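The truncated stick-breaking construction named in this abstract can be illustrated with a short sketch. This only draws prior mixture weights; it is not the variational algorithm itself, and the function name `truncated_stick_breaking` and its parameters are illustrative assumptions. Each stick proportion is drawn from Beta(1, alpha), and the final proportion is set to one so that the truncated weights sum to one.

```python
import random


def truncated_stick_breaking(alpha, truncation, rng):
    """Return `truncation` mixture weights that sum to one."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for k in range(truncation):
        # The last proportion is fixed at 1 so no mass is left over.
        if k == truncation - 1:
            proportion = 1.0
        else:
            proportion = rng.betavariate(1.0, alpha)
        weights.append(proportion * remaining)
        remaining *= 1.0 - proportion
    return weights


rng = random.Random(0)
weights = truncated_stick_breaking(alpha=1.0, truncation=20, rng=rng)
```

A larger `alpha` spreads mass over more components, so a higher truncation level is needed before the leftover mass becomes negligible.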