Results 1-10 of 215
Hierarchical Dirichlet processes
 Journal of the American Statistical Association
, 2004
Cited by 875 (74 self)
The authors wish to acknowledge helpful discussions with Lancelot James and Jim Pitman and the referees for useful comments. We consider problems involving groups of data, where each observation within a group is a draw from a mixture model, and where it is desirable to share mixture components between groups. We assume that the number of mixture components is unknown a priori and is to be inferred from the data. In this setting it is natural to consider sets of Dirichlet processes, one for each group, where the well-known clustering property of the Dirichlet process provides a nonparametric prior for the number of mixture components within each group. Given our desire to tie the mixture models in the various groups, we consider a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process. Such a base measure being discrete, the child Dirichlet processes necessarily share atoms. Thus, as desired, the mixture models in the different groups necessarily share mixture components. We discuss representations of hierarchical Dirichlet processes in terms of
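The sharing of mixture components between groups described in this abstract can be illustrated with a small prior simulation. The following is a minimal sketch, assuming the Chinese restaurant franchise view of a hierarchical Dirichlet process; the function name `sample_crf` and the parameters `alpha`, `gamma`, and `seed` are this sketch's own and do not come from the listing.

```python
import random


def sample_crf(group_sizes, alpha=1.0, gamma=1.0, seed=0):
    """Sample component labels for grouped data from a Chinese restaurant franchise prior.

    alpha: concentration of each group-level (child) process (assumed name).
    gamma: concentration of the shared top-level process (assumed name).
    Returns (assignments, dish_tables): per-group component labels, and the
    number of tables across all groups that serve each shared component.
    """
    rng = random.Random(seed)
    dish_tables = []   # dish_tables[k] = tables (over all groups) serving component k
    assignments = []
    for n in group_sizes:
        table_counts = []  # customers seated at each table of this group
        table_dish = []    # shared component served at each table
        labels = []
        for _ in range(n):
            # Existing table with weight = its occupancy, new table with weight alpha.
            weights = table_counts + [alpha]
            t = rng.choices(range(len(weights)), weights=weights)[0]
            if t == len(table_counts):
                # A new table draws its component from the shared top-level process:
                # existing component with weight = its table count, new one with gamma.
                dish_weights = dish_tables + [gamma]
                k = rng.choices(range(len(dish_weights)), weights=dish_weights)[0]
                if k == len(dish_tables):
                    dish_tables.append(0)
                dish_tables[k] += 1
                table_counts.append(0)
                table_dish.append(k)
            table_counts[t] += 1
            labels.append(table_dish[t])
        assignments.append(labels)
    return assignments, dish_tables


if __name__ == "__main__":
    groups, dishes = sample_crf([20, 20, 20])
    for j, labels in enumerate(groups):
        print(f"group {j}: components used = {sorted(set(labels))}")
    print("tables per shared component:", dishes)
```

Because every group draws its new tables from the same top-level counts, component indices can recur across groups, which is the atom-sharing property the abstract refers to.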
Church: A language for generative models
 In UAI
, 2008
Cited by 119 (25 self)
Formal languages for probabilistic modeling enable reuse, modularity, and descriptive clarity, and can foster generic inference techniques. We introduce Church, a universal language for describing stochastic generative processes. Church is based on the Lisp model of lambda calculus, containing a pure Lisp as its deterministic subset. The semantics of Church is defined in terms of evaluation histories and conditional distributions on such histories. Church also includes a novel language construct, the stochastic memoizer, which enables simple description of many complex nonparametric models. We illustrate language features through several examples, including: a generalized Bayes net in which parameters cluster over trials, infinite PCFGs, planning by inference, and various nonparametric clustering models. Finally, we show how to implement query on any Church program, exactly and approximately, using Monte Carlo techniques.
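The idea of memoizing a stochastic function can be sketched outside Church. The following is a rough Python analogue, not Church syntax: `mem` caches the first sampled value for each argument, and `dp_mem` reuses earlier samples with probability proportional to how often they have been returned, in the style of a Chinese restaurant process. Both names, and the parameter `alpha`, are assumptions made for this illustration.

```python
import random


def mem(fn):
    """Deterministic memoization of a stochastic function: one sample per argument."""
    cache = {}

    def wrapped(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    return wrapped


def dp_mem(alpha, fn, rng):
    """Stochastic memoization sketch: per argument tuple, return a previously
    sampled value with probability proportional to its use count, or a fresh
    call to fn with probability proportional to alpha."""
    tables = {}  # args -> list of [value, count]

    def wrapped(*args):
        seats = tables.setdefault(args, [])
        weights = [count for _, count in seats] + [alpha]
        i = rng.choices(range(len(weights)), weights=weights)[0]
        if i == len(seats):
            seats.append([fn(*args), 0])  # fresh draw from the underlying function
        seats[i][1] += 1
        return seats[i][0]

    return wrapped


if __name__ == "__main__":
    rng = random.Random(0)
    eye_color = mem(lambda person: rng.choice(["blue", "brown", "green"]))
    print(eye_color("bob") == eye_color("bob"))  # same person, same sampled value

    cluster_mean = dp_mem(1.0, lambda: rng.gauss(0.0, 1.0), rng)
    draws = [cluster_mean() for _ in range(20)]
    print(len(set(draws)), "distinct values among", len(draws), "draws")
```

Wrapping a base sampler in `dp_mem` yields draws that cluster on a small number of repeated values, which is why such a construct is convenient for writing nonparametric clustering models.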
Rayleigh processes, real trees, and root growth with regrafting
, 2004
Cited by 78 (17 self)
Abstract. The real trees form a class of metric spaces that extends the class of trees with edge lengths by allowing behavior such as infinite total edge length and vertices with infinite branching degree. Aldous’s Brownian continuum random tree, the random treelike object naturally associated with a standard Brownian excursion, may be thought of as a random compact real tree. The continuum random tree is a scaling limit as N → ∞ of both a critical Galton-Watson tree conditioned to have total population size N as well as a uniform random rooted combinatorial tree with N vertices. The Aldous–Broder algorithm is a Markov chain on the space of rooted combinatorial trees with N vertices that has the uniform tree as its stationary distribution. We construct and study a Markov process on the space of all rooted compact real trees that has the continuum random tree as its stationary distribution and arises as the scaling limit as N → ∞ of the Aldous–Broder chain. A key technical ingredient in this work is the use of a pointed Gromov–
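For orientation, the uniform random tree mentioned here can be sampled with the familiar random-walk form of the Aldous–Broder algorithm. The sketch below is that sampling procedure on the complete graph, not the tree-valued Markov chain studied in the paper; the function name `aldous_broder_tree` and its parameters are assumptions made for this illustration.

```python
import random


def aldous_broder_tree(n, seed=0):
    """Sample a uniform spanning tree of the complete graph on n >= 1 vertices.

    A simple random walk runs until every vertex has been visited; the edge
    used to enter each vertex for the first time is kept. The tree is returned
    as (root, parent), where parent[v] is the vertex from which v was first
    entered and the root is the walk's starting vertex.
    """
    rng = random.Random(seed)
    current = rng.randrange(n)
    root = current
    parent = {root: None}
    while len(parent) < n:
        nxt = rng.randrange(n)
        if nxt == current:
            continue  # no self-loops: resample to get a uniform neighbour
        if nxt not in parent:
            parent[nxt] = current  # first-entrance edge joins the tree
        current = nxt
    return root, parent


if __name__ == "__main__":
    root, parent = aldous_broder_tree(6, seed=1)
    print("root:", root)
    print("edges:", sorted((p, v) for v, p in parent.items() if p is not None))
```

Every non-root vertex gets exactly one parent that was visited earlier, so the result is always a tree with n - 1 edges.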
Describing Visual Scenes Using Transformed Objects and Parts
 INT J COMPUT VIS
, 2005
Cited by 71 (8 self)
We develop hierarchical, probabilistic models for objects, the parts composing them, and the visual scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves detection accuracy when learning from few examples. Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. The resulting transformed Dirichlet process (TDP) leads to Monte Carlo algorithms which simultaneously segment and recognize objects in street and office scenes.
Poisson process partition calculus with an application to Bayesian . . .
, 2005
Cited by 55 (14 self)
This article develops, and describes how to use, results concerning disintegrations of Poisson random measures. These results are fashioned as simple tools that can be tailor-made to address inferential questions arising in a wide range of Bayesian nonparametric and spatial statistical models. The Poisson disintegration method is based on the formal statement of two results concerning a Laplace functional change of measure and a Poisson Palm/Fubini calculus in terms of random partitions of the integers {1,...,n}. The techniques are analogous to, but much more general than, techniques for the Dirichlet process and weighted gamma process developed in [Ann. Statist. 12
Statistical predicate invention
 In Z. Ghahramani (Ed.), Proceedings of the 24th Annual International Conference on Machine Learning (ICML 2007)
, 2007
Cited by 46 (10 self)
We propose statistical predicate invention as a key problem for statistical relational learning. SPI is the problem of discovering new concepts, properties and relations in structured data, and generalizes hidden variable discovery in statistical models and predicate invention in ILP. We propose an initial model for SPI based on second-order Markov logic, in which predicates as well as arguments can be variables, and the domain of discourse is not fully known in advance. Our approach iteratively refines clusters of symbols based on the clusters of symbols they appear in atoms with (e.g., it clusters relations by the clusters of the objects they relate). Since different clusterings are better for predicting different subsets of the atoms, we allow multiple cross-cutting clusterings. We show that this approach outperforms Markov logic structure learning and the recently introduced infinite relational model on a number of relational datasets.
Notes on the occupancy problem with infinitely many boxes: general asymptotics and power laws
, 2008
Regenerative composition structures
 ANN. PROBAB
, 2005
Cited by 40 (21 self)
A new class of random composition structures (the ordered analog of Kingman’s partition structures) is defined by a regenerative description of component sizes. Each regenerative composition structure is represented by a process of random sampling of points from an exponential distribution on the positive half-line, and separating the points into clusters by an independent regenerative random set. Examples are composition structures derived from residual allocation models, including one associated with the Ewens sampling formula, and composition structures derived from the zero set of a Brownian motion or Bessel process. We provide characterisation results and formulas relating the distribution of the regenerative composition to the Lévy parameters of a subordinator whose range is the corresponding regenerative set. In particular, the only reversible regenerative composition structures are those associated with the interval partition of [0, 1] generated by excursions of a standard Bessel bridge of dimension 2 − 2α for some α ∈ [0, 1].
A Critical Branching Process Model for Biodiversity
, 2008
Cited by 36 (5 self)
Motivated as a null model for comparison with data, we study the following model for a phylogenetic tree on n extant species. The origin of the clade is a random time in the past, whose (improper) distribution is uniform on (0, ∞). After that origin, the process of extinctions and speciations is a continuous-time critical branching process of constant rate, conditioned on having the prescribed number n of species at the present time. We study various mathematical properties of this model as n → ∞ limits: time of origin and of most recent common ancestor; pattern of divergence times within lineage trees; time series of numbers of species; number of extinct species in total, or ancestral to extant species; and “local” structure of the tree itself. We emphasize several mathematical techniques: associating walks with trees, a point process representation of lineage trees, and Brownian limits.