Hierarchical topic models and the nested Chinese restaurant process
Advances in Neural Information Processing Systems, 2004
Cited by 188 (25 self)
We address the problem of learning topic hierarchies from data. The model selection problem in this domain is daunting: which of the large collection of possible trees to use? We take a Bayesian approach, generating an appropriate prior via a distribution on partitions that we refer to as the nested Chinese restaurant process. This nonparametric prior allows arbitrarily large branching factors and readily accommodates growing data collections. We build a hierarchical topic model by combining this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation. We illustrate our approach on simulated data and with an application to the modeling of NIPS abstracts.
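The nested Chinese restaurant process named in this abstract can be illustrated with a minimal sketch. It assumes the usual construction in which each document walks from the root of a tree to a fixed depth and, at every node, picks a child by an ordinary Chinese restaurant process. The function names, the fixed `depth`, and the concentration parameter `gamma` are illustrative assumptions, not details taken from the abstract, and the topic likelihood is omitted entirely.

```python
import random


def crp_choice(counts, gamma, rng):
    """Return the index of an existing child with probability proportional to
    its count, or len(counts) (a new child) with probability proportional to gamma."""
    total = sum(counts) + gamma
    r = rng.random() * total
    acc = 0.0
    for k, c in enumerate(counts):
        acc += c
        if r < acc:
            return k
    return len(counts)  # open a new branch


def sample_ncrp_paths(n_docs, depth, gamma, seed=0):
    """Draw one root-to-leaf path per document from a nested CRP prior.

    Nodes are identified by the tuple of child indices leading to them;
    child_counts[node] holds how many documents went through each child.
    """
    rng = random.Random(seed)
    child_counts = {}
    paths = []
    for _ in range(n_docs):
        node = ()
        for _ in range(depth):
            counts = child_counts.setdefault(node, [])
            k = crp_choice(counts, gamma, rng)
            if k == len(counts):
                counts.append(0)  # branching factor grows as needed
            counts[k] += 1
            node = node + (k,)
        paths.append(node)
    return paths


if __name__ == "__main__":
    for path in sample_ncrp_paths(n_docs=10, depth=3, gamma=1.0):
        print(path)
```

Because a new child can be opened at any node, the branching factor is not fixed in advance, which is the property the abstract highlights.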
Bayesian density regression
JOURNAL OF THE ROYAL STATISTICAL SOCIETY B, 2007
Cited by 40 (23 self)
This article considers Bayesian methods for density regression, allowing a random probability distribution to change flexibly with multiple predictors. The conditional response distribution is expressed as a nonparametric mixture of parametric densities, with the mixture distribution changing according to location in the predictor space. A new class of priors for dependent random measures is proposed for the collection of random mixing measures at each location. The conditional prior for the random measure at a given location is expressed as a mixture of a Dirichlet process (DP) distributed innovation measure and neighboring random measures. This specification results in a coherent prior for the joint measure, with the marginal random measure at each location being a finite mixture of DP basis measures. Integrating out the infinite-dimensional collection of mixing measures, we obtain a simple expression for the conditional distribution of the subject-specific random variables, which generalizes the Pólya urn scheme. Properties are considered and a simple Gibbs sampling algorithm is developed for posterior computation. The methods are illustrated using simulated data examples and epidemiologic studies.
Kernel stick-breaking processes
2007
Cited by 39 (11 self)
Summary. This article proposes a class of kernel stick-breaking processes (KSBP) for uncountable collections of dependent random probability measures. The KSBP is constructed by first introducing an infinite sequence of random locations. Independent random probability measures and beta-distributed random weights are assigned to each location. Predictor-dependent random probability measures are then constructed by mixing over the locations, with stick-breaking probabilities expressed as a kernel multiplied by the beta weights. Some theoretical properties of the KSBP are described, including a covariate-dependent prediction rule. A retrospective MCMC algorithm is developed for posterior computation, and the methods are illustrated using a simulated example and an epidemiologic application.
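The weight construction summarized above, stick-breaking probabilities formed as a kernel multiplied by beta weights, can be sketched as follows. This is only an illustration under stated assumptions: the infinite sequence of locations is truncated to a finite list, the kernel is taken to be Gaussian-shaped with a hypothetical `decay` parameter, and the function name `ksbp_weights` is invented here. None of these choices come from the abstract.

```python
import math
import random


def ksbp_weights(x, locations, betas, decay):
    """Truncated kernel stick-breaking weights at predictor value x.

    For each location, the break proportion is kernel(x, location) * beta,
    and the weight is that proportion times the stick length still unbroken.
    Returns the weights and the leftover stick mass beyond the truncation.
    """
    weights = []
    remaining = 1.0
    for loc, v in zip(locations, betas):
        # Assumed kernel: bounded in [0, 1], equal to 1 when x == loc.
        k = math.exp(-decay * (x - loc) ** 2)
        u = k * v
        weights.append(u * remaining)
        remaining *= 1.0 - u
    return weights, remaining


if __name__ == "__main__":
    rng = random.Random(0)
    n_locations = 20  # truncation level, an assumption of this sketch
    locations = [rng.uniform(0.0, 1.0) for _ in range(n_locations)]
    betas = [rng.betavariate(1.0, 1.0) for _ in range(n_locations)]
    for x in (0.1, 0.5, 0.9):
        w, rest = ksbp_weights(x, locations, betas, decay=10.0)
        print(x, round(sum(w), 4), round(rest, 4))
```

Locations far from `x` receive a small kernel value and so break off little of the stick, which is how the mixing weights come to depend on the predictor.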
Poisson process partition calculus with an application to Bayesian . . .
2005
Cited by 32 (10 self)
This article develops, and describes how to use, results concerning disintegrations of Poisson random measures. These results are fashioned as simple tools that can be tailor-made to address inferential questions arising in a wide range of Bayesian nonparametric and spatial statistical models. The Poisson disintegration method is based on the formal statement of two results concerning a Laplace functional change of measure and a Poisson Palm/Fubini calculus in terms of random partitions of the integers {1,...,n}. The techniques are analogous to, but much more general than, techniques for the Dirichlet process and weighted gamma process developed in [Ann. Statist. 12
Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars
Cited by 25 (4 self)
One of the reasons nonparametric Bayesian inference is attracting attention in computational linguistics is that it provides a principled way of learning the units of generalization together with their probabilities. Adaptor grammars are a framework for defining a variety of hierarchical nonparametric Bayesian models. This paper investigates some of the choices that arise in formulating adaptor grammars and associated inference procedures, and shows that they can have a dramatic impact on performance in an unsupervised word segmentation task. With appropriate adaptor grammars and inference procedures we achieve an 87% word token f-score on the standard Brent version of the Bernstein-Ratner corpus, which is an error reduction of over 35% over the best previously reported results for this corpus.
Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001
Some Further Developments for Stick-Breaking Priors: Finite and Infinite Clustering and Classification
Sankhya Series A, 2003
Cited by 15 (0 self)
this paper will be to develop new surrounding theory for the hierarchical model (7) and show how these may be used to develop computational algorithms for computing posterior quantities. Our theoretical contributions include developing key properties for the class of extended stick-breaking measures, which includes establishing a conjugacy property of their random weights to i.i.d. sampling, and a characterization of the posterior for the extended stick-breaking prior under i.i.d. sampling. See Section 3. These properties then lead us in Section 4 to a general characterization for the posterior of (7). In Section 5 we outline a collapsed Gibbs sampling algorithm and an i.i.d. SIS (sequential importance sampling) algorithm that can be used for inference in (7). One important implication is our ability to fit the posterior of (6) subject to infinite-dimensional stick-breaking measures. The paper begins with a brief discussion of stick-breaking priors in Section 2
Independent and Identically Distributed Monte Carlo Algorithms for Semiparametric Linear Mixed Models
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002
Inducing Tree-Substitution Grammars
Cited by 12 (0 self)
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce plausible grammars, the underlying model must be capable of representing the intricacies of language while also ensuring that it can be readily learned from data. The majority of existing work on grammar induction has favoured model simplicity (and thus learnability) over representational capacity by using context free grammars and first order dependency grammars, which are not sufficiently expressive to model many common linguistic constructions. We propose a novel compromise by inferring a probabilistic tree substitution grammar, a formalism which allows for arbitrarily large tree fragments and thereby better represents complex linguistic structures. To limit the model’s complexity we employ a Bayesian nonparametric prior which biases the model towards a sparse grammar with shallow productions. We demonstrate the model’s efficacy on supervised phrase-structure parsing, where we induce a latent segmentation of the training treebank, and on unsupervised dependency grammar induction. In both cases the model uncovers interesting latent linguistic structures while producing competitive results.