Results 1–10 of 42
Gibbs Sampling Methods for Stick-Breaking Priors
Abstract

Cited by 384 (18 self)
In this paper we present two general types of Gibbs samplers that can be used to fit posteriors of Bayesian hierarchical models based on stick-breaking priors. The first type of Gibbs sampler, referred to as a Pólya urn Gibbs sampler, is a generalized version of a widely used Gibbs sampling method currently employed for Dirichlet process computing. This method applies to stick-breaking priors with a known Pólya urn characterization; that is, priors with an explicit and simple prediction rule. Our second method, the blocked Gibbs sampler, is based on an entirely different approach that works by directly sampling values from the posterior of the random measure. The blocked Gibbs sampler can be viewed as a more general approach as it works without requiring an explicit prediction rule. We find that the blocked Gibbs sampler avoids some of the limitations seen with the Pólya urn approach and should be simpler for nonexperts to use.
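The stick-breaking construction that these priors share is compact enough to sketch directly. The following is a minimal illustration of drawing truncated Dirichlet process weights, the representation the blocked Gibbs sampler operates on; it is a sketch under the standard truncation device, not the paper's sampler, and all names are illustrative.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process:
    V_k ~ Beta(1, alpha), w_k = V_k * prod_{j<k} (1 - V_j)."""
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0  # close the final stick so the weights sum exactly to one
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)
w = stick_breaking_weights(alpha=2.0, truncation=25, rng=rng)
```

Pinning the last stick to 1 is the usual way to make the truncated prior a proper finite mixture; larger `alpha` spreads mass over more components.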
Bayesian density regression
 Journal of the Royal Statistical Society, Series B
, 2007
Abstract

Cited by 70 (28 self)
This article considers Bayesian methods for density regression, allowing a random probability distribution to change flexibly with multiple predictors. The conditional response distribution is expressed as a nonparametric mixture of parametric densities, with the mixture distribution changing according to location in the predictor space. A new class of priors for dependent random measures is proposed for the collection of random mixing measures at each location. The conditional prior for the random measure at a given location is expressed as a mixture of a Dirichlet process (DP) distributed innovation measure and neighboring random measures. This specification results in a coherent prior for the joint measure, with the marginal random measure at each location being a finite mixture of DP basis measures. Integrating out the infinite-dimensional collection of mixing measures, we obtain a simple expression for the conditional distribution of the subject-specific random variables, which generalizes the Pólya urn scheme. Properties are considered and a simple Gibbs sampling algorithm is developed for posterior computation. The methods are illustrated using simulated data examples and epidemiologic studies.
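The measure-mixing idea can be sketched concretely for discrete measures. Below is a toy rendering, with a single neighbor and hypothetical names, of forming the conditional measure at a location as a convex combination of an innovation measure and a neighboring measure, each represented by its atoms and weights; this is an illustration of the combination step only, not the paper's full prior.

```python
import numpy as np

def combine_measures(pi, innov_atoms, innov_w, nbr_atoms, nbr_w):
    """Conditional measure at one location as
    pi * innovation + (1 - pi) * neighbor, both measures discrete."""
    atoms = np.concatenate([innov_atoms, nbr_atoms])
    weights = np.concatenate([pi * np.asarray(innov_w),
                              (1.0 - pi) * np.asarray(nbr_w)])
    return atoms, weights

# toy usage: two innovation atoms, one shared neighbor atom
atoms, wts = combine_measures(0.3, [0.0, 1.0], [0.5, 0.5], [2.0], [1.0])
```

If the two component measures are probability measures, the combined weights again sum to one, which is what makes the joint prior coherent across locations.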
Stick-Breaking Beta Processes and the Poisson Process
Abstract

Cited by 8 (5 self)
We show that the stick-breaking construction of the beta process due to Paisley et al. (2010) can be obtained from the characterization of the beta process as a Poisson process. Specifically, we show that the mean measure of the underlying Poisson process is equal to that of the beta process. We use this underlying representation to derive error bounds on truncated beta processes that are tighter than those in the literature. We also develop a new MCMC inference algorithm for beta processes, based in part on our new Poisson process construction.
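The Poisson process view admits a direct simulation sketch. Assuming the one-parameter beta process with Lévy (mean) measure ν(dp) = γ α p⁻¹ (1−p)^(α−1) dp on (0, 1], and truncating to atoms with weight above ε, atom weights can be drawn as a Poisson process by discretizing the measure on a grid. The grid discretization is purely an illustration device and not the paper's construction; all names are hypothetical.

```python
import numpy as np

def beta_process_weights(gamma, alpha, eps, rng, grid=20000):
    """Sample atom weights of a truncated beta process as a Poisson process
    whose mean measure is gamma * alpha * p**(-1) * (1 - p)**(alpha - 1) dp
    on (eps, 1], approximated on a regular grid.
    Assumes alpha >= 1 so the density stays finite at p = 1."""
    p = np.linspace(eps, 1.0, grid)
    density = gamma * alpha / p * (1.0 - p) ** (alpha - 1.0)
    mass = density * (p[1] - p[0])   # per-cell mean measure
    total = mass.sum()               # expected number of atoms above eps
    n = rng.poisson(total)
    return rng.choice(p, size=n, p=mass / total)

rng = np.random.default_rng(1)
w = beta_process_weights(gamma=2.0, alpha=1.0, eps=0.01, rng=rng)
```

Because p⁻¹ is not integrable near zero, the untruncated process has infinitely many small atoms; the truncation level ε is what the paper's error bounds control.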
The Block Diagonal Infinite Hidden Markov Model
Abstract

Cited by 8 (4 self)
The Infinite Hidden Markov Model (IHMM) extends hidden Markov models to have a countably infinite number of hidden states (Beal et al., 2002; Teh et al., 2006). We present a generalization of this framework that introduces nearly block-diagonal structure in the transitions between the hidden states, where blocks correspond to “sub-behaviors” exhibited by data sequences. In identifying such structure, the model classifies, or partitions, sequence data according to these sub-behaviors in an unsupervised way. We present an application of this model to artificial data, a video gesture classification task, and a musical theme labeling task, and show that components of the model can also be applied to graph segmentation.
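The kind of transition structure the model targets is easy to picture with a finite sketch: each state keeps most of its probability mass inside its own block and leaks a little to the rest. This is a deterministic illustration of the structure, not the model's prior; the function name and the `within` parameter are my own.

```python
import numpy as np

def nearly_block_diagonal(block_sizes, within=0.95):
    """Transition matrix in which each state places `within` of its mass
    uniformly over its own block and the rest uniformly over all other
    states. Assumes at least two blocks."""
    n = sum(block_sizes)
    T = np.zeros((n, n))
    start = 0
    for size in block_sizes:
        block = slice(start, start + size)
        T[block, :] = (1.0 - within) / (n - size)  # leakage to other blocks
        T[block, block] = within / size            # mass kept in-block
        start += size
    return T

T = nearly_block_diagonal([3, 2, 4], within=0.9)
```

A Markov chain with such a matrix dwells in one block (one "sub-behavior") for long stretches before switching, which is what makes the blocks recoverable from sequence data.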
Modeling topic hierarchies with the recursive Chinese restaurant process
 In Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 783–792. ACM
, 2012
Abstract

Cited by 7 (1 self)
Topic models such as latent Dirichlet allocation (LDA) and hierarchical Dirichlet processes (HDP) are simple solutions to discover topics from a set of unannotated documents. While they are simple and popular, a major shortcoming of LDA and HDP is that they do not organize the topics into a hierarchical structure, which is naturally found in many datasets. We introduce the recursive Chinese restaurant process (rCRP) and a nonparametric topic model with rCRP as a prior for discovering a hierarchical topic structure with unbounded depth and width. Unlike previous models for discovering topic hierarchies, rCRP allows the documents to be generated from a mixture over the entire set of topics in the hierarchy. We apply rCRP to a corpus of New York Times articles, a dataset of MovieLens ratings, and a set of Wikipedia articles and show the discovered topic hierarchies. We compare the predictive power of rCRP with LDA, HDP, and nested Chinese restaurant process (nCRP) using held-out likelihood to show that rCRP outperforms the others. We suggest two metrics that quantify the characteristics of a topic hierarchy to compare the discovered topic hierarchies of rCRP and nCRP. The results show that rCRP discovers a hierarchy in which the topics become more specialized toward the leaves, and topics in the immediate family exhibit more affinity than topics beyond the immediate family.
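The building block here, the Chinese restaurant process itself, is compact to sketch; the recursive variant applies this same seating rule at each node of the topic tree (the recursion is omitted in this illustration, and the names are hypothetical).

```python
import numpy as np

def crp_assignments(alpha, n, rng):
    """Seat n customers by the Chinese restaurant process: customer i joins
    existing table k with probability count_k / (i + alpha) and opens a new
    table with probability alpha / (i + alpha)."""
    counts, labels = [], []
    for _ in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()  # normalizes by (customers so far + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)   # new table
        else:
            counts[k] += 1     # existing table
        labels.append(k)
    return labels

rng = np.random.default_rng(0)
labels = crp_assignments(alpha=1.0, n=50, rng=rng)
```

The rich-get-richer seating is what produces a few large topics and a long tail of small ones; applied recursively, it yields trees of unbounded depth and width.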
Selection Sampling from Large Data Sets for Targeted Inference in Mixture Modeling
, 2009
Abstract

Cited by 4 (1 self)
One of the challenges of Markov chain Monte Carlo with large datasets is the need to scan through the whole dataset at each iteration of the sampler, which can be computationally prohibitive. Several approaches have been developed to address this, typically drawing computationally manageable subsamples of the data. Here we consider the specific case where most of the data from a mixture model provides little or no information about the parameters of interest, and we aim to select subsamples such that the information extracted is most relevant. The motivating application arises in flow cytometry, where several measurements from a vast number of cells are available. Interest lies in identifying specific rare cell subtypes and characterizing them according to their corresponding markers. We present a Markov chain Monte Carlo approach where an initial subsample of the full data is used to draw a further set of observations from a low probability region of interest, and describe how inferences can be made efficiently by reducing the dimensionality of the problem. Finally, we extend our method to a Sequential Monte Carlo framework whereby the targeted subsample is augmented sequentially as estimates improve, and introduce a stopping rule for determining the size of the targeted subsample. We implement our algorithm on a flow cytometry dataset, providing higher resolution inferences for rare cell subtypes.
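The targeting idea can be reduced to a toy sketch: score each point under an initial fit and subsample from the low-probability region where the rare subtypes live. The threshold rule and all names below are illustrative simplifications, not the paper's algorithm.

```python
import numpy as np

def targeted_subsample(data, log_density, threshold, size, rng):
    """Keep points whose log-density under an initial fit falls below
    `threshold` (a low-probability region), then draw `size` of them
    without replacement."""
    low = data[log_density(data) < threshold]
    idx = rng.choice(len(low), size=min(size, len(low)), replace=False)
    return low[idx]

# toy usage: standard-normal data, target the tails beyond |x| = 2.5
rng = np.random.default_rng(0)
x = rng.normal(size=10000)
logp = lambda v: -0.5 * v**2 - 0.5 * np.log(2 * np.pi)
tail = targeted_subsample(x, logp, threshold=logp(2.5), size=100, rng=rng)
```

In the sequential version described in the abstract, one would refit on the targeted subsample and lower the threshold as the estimates of the rare region improve.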
Learning Time-Intensity Profiles of Human Activity using Non-Parametric Bayesian Models
Abstract

Cited by 3 (0 self)
Data sets that characterize human activity over time through collections of timestamped events or counts are of increasing interest in application areas such as human-computer interaction, video surveillance, and Web data analysis. We propose a nonparametric Bayesian framework for modeling collections of such data. In particular, we use a Dirichlet process framework for learning a set of intensity functions corresponding to different categories, which form a basis set for representing individual time periods (e.g., several days) depending on which categories the time periods are assigned to. This allows the model to learn in a data-driven fashion what “factors” are generating the observations on a particular day, including (for example) weekday versus weekend effects or day-specific effects corresponding to unique (single-day) occurrences of unusual behavior, sharing information where appropriate to obtain improved estimates of the behavior associated with each category. Applications to real-world data sets of count data involving both vehicles and people are used to illustrate the technique.
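A minimal sketch of the likelihood at the heart of such a model, under my own simplification that a category's intensity function is a vector of hourly Poisson rates: a day of counts is scored against each category's profile, and these scores are what drive the data-driven category assignments. All names are illustrative.

```python
import math

def day_loglik(counts, intensity):
    """Poisson log-likelihood of one day's hourly counts under a
    category's intensity profile (one positive rate per hour)."""
    return sum(c * math.log(lam) - lam - math.lgamma(c + 1)
               for c, lam in zip(counts, intensity))

# a peaked "weekday-like" profile versus a flat profile with similar total rate
weekday = [1.0, 1.0, 8.0, 8.0, 2.0, 2.0]
flat = [3.7] * 6
counts = [1, 0, 9, 7, 2, 3]
```

The peaked profile scores these counts much higher than the flat one, illustrating how days with a morning/evening rush would gravitate to a "weekday" category.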
Clustering in linear mixed models with approximate Dirichlet process mixtures using EM algorithm
 Statistical Modelling
Abstract

Cited by 3 (0 self)
Abstract: In linear mixed models, the assumption of normally distributed random effects is often inappropriate and unnecessarily restrictive. The proposed approximate Dirichlet process mixture assumes a hierarchical Gaussian mixture that is based on the truncated version of the stick-breaking representation of the Dirichlet process. In addition to the weakening of distributional assumptions, the specification allows one to identify clusters of observations with a similar random effects structure. An Expectation-Maximization algorithm is given that solves the estimation problem and that, in certain respects, may exhibit advantages over Markov chain Monte Carlo approaches when modelling with Dirichlet processes. The method is evaluated in a simulation study and applied to the dynamics of unemployment in Germany as well as lung function growth data.
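The truncated stick-breaking device reduces the model to a finite mixture, so the EM machinery is standard. Below is a sketch of the E-step for a one-dimensional Gaussian mixture with shared variance, a deliberate simplification of the random-effects setting; names and shapes are illustrative.

```python
import numpy as np

def e_step(y, means, var, weights):
    """Responsibilities r[i, k] = P(component k | y_i) for a finite
    Gaussian mixture, computed stably in log space."""
    log_p = (np.log(weights)
             - 0.5 * np.log(2 * np.pi * var)
             - 0.5 * (y[:, None] - means) ** 2 / var)
    log_p -= log_p.max(axis=1, keepdims=True)  # guard against underflow
    r = np.exp(log_p)
    return r / r.sum(axis=1, keepdims=True)

y = np.array([-2.0, 0.0, 2.0])
r = e_step(y, means=np.array([-2.0, 2.0]), var=0.5,
           weights=np.array([0.5, 0.5]))
```

In the approximate DP mixture, the mixture weights themselves come from the truncated stick-breaking prior, and the M-step updates components and weights from these responsibilities.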
Nonparametric empirical Bayes for the Dirichlet process mixture model
 Statistics and Computing
, 2004
Abstract

Cited by 2 (0 self)
The Dirichlet process prior allows flexible nonparametric mixture modeling. The number of mixture components is not specified in advance and can grow as new data come in. However, the behavior of the model is sensitive to the choice of the parameters, including an infinite-dimensional distributional parameter G0. Most previous applications have either fixed G0 as a member of a parametric family or treated G0 in a Bayesian fashion, using parametric prior specifications. In contrast, we have developed an adaptive nonparametric method for constructing smooth estimates of G0. We combine this method with a technique for estimating α, the other Dirichlet process parameter, that is inspired by an existing characterization of its maximum-likelihood estimator. Together, these estimation procedures yield a flexible empirical Bayes treatment of Dirichlet process mixtures. Such a treatment is useful in situations where smooth point estimates of G0 are of intrinsic interest, or where the structure of G0 cannot be conveniently modeled with the usual parametric prior families. Analysis of simulated and real-world datasets illustrates the robustness of this approach.
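One well-known characterization of the maximum-likelihood estimator for the DP concentration parameter (often written α) is concrete enough to sketch: given k observed clusters among n observations, the stationarity condition of the likelihood is k = Σ_{i=0}^{n−1} α/(α+i), which also equals the prior expected number of clusters. The sketch below solves it by bisection (assuming 1 < k < n; the function name is my own, and this need not be the paper's estimation technique):

```python
def alpha_mle(n, k, lo=1e-8, hi=1e8, iters=200):
    """Concentration alpha solving k = sum_{i=0}^{n-1} alpha / (alpha + i),
    found by bisection; expected_k is strictly increasing in alpha."""
    def expected_k(a):
        return sum(a / (a + i) for i in range(n))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_k(mid) < k:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = alpha_mle(100, 10)
```

Monotonicity makes bisection safe here: more concentration always means more expected clusters.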