Results 1–10 of 24
Computational and Inferential Difficulties With Mixture Posterior Distributions
 Journal of the American Statistical Association
, 1999
Abstract

Cited by 112 (12 self)
This paper deals with both exploration and interpretation problems related to posterior distributions for mixture models. The specification of mixture posterior distributions means that the presence of k! modes is known immediately. Standard Markov chain Monte Carlo techniques usually have difficulties with well-separated modes such as occur here; the Markov chain Monte Carlo sampler stays within a neighbourhood of a local mode and fails to visit other equally important modes. We show that exploration of these modes can be imposed on the Markov chain Monte Carlo sampler using tempered transitions based on Langevin algorithms. However, as the prior distribution does not distinguish between the different components, the posterior mixture distribution is symmetric and thus standard estimators such as posterior means cannot be used. Since this is also true for most nonsymmetric priors, we propose alternatives for Bayesian inference for permutation-invariant posteriors, including a cluster...
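The Langevin building block behind the tempered transitions mentioned above can be sketched as a single Metropolis-adjusted Langevin (MALA) step. The standard-normal target, step size, and variable names below are illustrative choices, not taken from the paper, and the tempering wrapper is omitted:

```python
import numpy as np

rng = np.random.default_rng(2)

def mala_step(x, logpi, grad_logpi, eps):
    """One Metropolis-adjusted Langevin (MALA) move targeting pi."""
    # Langevin proposal: drift half a step up the gradient of log pi.
    mean_fwd = x + 0.5 * eps**2 * grad_logpi(x)
    prop = mean_fwd + eps * rng.normal()
    # Hastings correction: the Langevin proposal is not symmetric.
    mean_back = prop + 0.5 * eps**2 * grad_logpi(prop)
    log_q_fwd = -(prop - mean_fwd) ** 2 / (2 * eps**2)
    log_q_back = -(x - mean_back) ** 2 / (2 * eps**2)
    log_alpha = logpi(prop) - logpi(x) + log_q_back - log_q_fwd
    return prop if np.log(rng.uniform()) < log_alpha else x

# Illustrative run on a standard normal target, started far from the mode.
logpi = lambda x: -0.5 * x**2
grad_logpi = lambda x: -x
x, xs = 5.0, []
for _ in range(20000):
    x = mala_step(x, logpi, grad_logpi, eps=1.0)
    xs.append(x)
print(np.mean(xs[2000:]), np.var(xs[2000:]))  # both near their N(0, 1) values
```

Tempered transitions would chain such moves through a ladder of flattened versions of the target before applying a single global accept/reject.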
On Perfect Simulation for Some Mixtures of Distributions
 Statistics and Computing
, 1998
Abstract

Cited by 10 (3 self)
This paper studies the implementation of the coupling from the past (CFTP) method of Propp and Wilson (1996) in the setup of two- and three-component mixtures with known components. We show that monotonicity structures can be exhibited in both cases, but that CFTP can still be costly for three-component mixtures. We conclude with a simulation experiment exhibiting an almost perfect sampling scheme where we only consider a subset of the exhaustive set of starting values.

Keywords: coupling, mixtures, monotonicity.

1 Introduction

Following Propp and Wilson (1996), several authors have proposed devices to sample directly from a stationary distribution f at varying computational costs and for specific distributions and transitions. The nomenclature of perfect sampling for such techniques was coined by Kendall (1998), replacing the exact sampling terminology of Propp and Wilson (1996). The first implementations of perfect sampling required running the Markov chain of interest from ever...
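The monotone CFTP construction can be illustrated on a generic ordered toy chain, a reflected random walk on {0, …, 4} of our own choosing rather than the mixture setting of the paper: coupled paths started from the top and bottom states, driven by shared randomness from time -T, sandwich every other path, so their coalescence by time 0 yields an exact draw from the stationary (here uniform) distribution.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 4  # ordered state space {0, ..., M}

def update(x, u):
    """Monotone random-walk update driven by common randomness u."""
    step = 1 if u < 0.5 else -1
    return min(M, max(0, x + step))  # clamping at the ends shrinks gaps

def cftp():
    """Propp-Wilson CFTP: restart further back (doubling T) with the SAME
    random numbers until the top and bottom paths coalesce by time 0."""
    T, us = 1, []
    while True:
        while len(us) < T:           # us[t] drives the move at time -(t+1)
            us.append(rng.uniform())
        lo, hi = 0, M
        for t in range(T - 1, -1, -1):   # times -T, ..., -1
            lo, hi = update(lo, us[t]), update(hi, us[t])
        if lo == hi:
            return lo                # exact draw from the stationary law
        T *= 2

samples = [cftp() for _ in range(5000)]
print(np.bincount(samples, minlength=M + 1) / len(samples))  # near uniform
```

Reusing the randomness across the doubling restarts is essential; redrawing it would bias the output.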
Mixture Models, Latent Variables and Partitioned Importance Sampling
Abstract

Cited by 9 (3 self)
this paper. The paradoxical complexity of the mixture model is due to the product structure of the likelihood function, L(θ_1, …, θ_k | x_1, …, x_n) =
Bayesian Inference on Mixtures of Distributions
, 2008
Abstract

Cited by 6 (5 self)
This survey covers state-of-the-art Bayesian techniques for the estimation of mixtures. It complements the earlier Marin et al. (2005) by studying new types of distributions, namely the multinomial, latent class and t distributions. It also exhibits closed-form solutions for Bayesian inference in some discrete setups. Lastly, it sheds new light on the computation of Bayes factors via the approximation of Chib (1995).
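For context, the Chib (1995) approximation mentioned here rests on the basic marginal-likelihood identity; the generic notation m, f, π and the fixed point θ* below are standard, not taken from the survey:

$$
m(x) \;=\; \frac{f(x \mid \theta^{*})\,\pi(\theta^{*})}{\pi(\theta^{*} \mid x)},
$$

where the denominator π(θ* | x) is estimated from Gibbs output, and the resulting marginal likelihoods feed directly into Bayes factors.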
Eaton's Markov chain, its conjugate partner and P-admissibility
 Annals of Statistics
, 1999
Abstract

Cited by 6 (5 self)
Suppose that X is a random variable with density f(x | θ) and that π(θ | x) is a proper posterior corresponding to an improper prior ν(θ). The prior is called P-admissible if the generalized Bayes estimator of every bounded function of θ is almost-admissible under squared error loss. Eaton (1992) showed that recurrence of the Markov chain with transition density R(η | θ) = ∫ π(η | x) f(x | θ) dx is a sufficient condition for P-admissibility of ν(θ). We show that Eaton's Markov chain is recurrent if and only if its conjugate partner, with transition density
Stability Relationships Among the Gibbs Sampler and its Subchains
 Journal of Computational and Graphical Statistics
, 2001
Abstract

Cited by 6 (2 self)
The use of Gibbs samplers driven by improper posteriors has been a controversial issue in the statistics literature over the last few years. Recently, Gelfand and Sahu (1999), Liu and Wu (1999), Meng and van Dyk (1999), and van Dyk and Meng (2001) have given examples demonstrating that it is possible to make valid statistical inferences through such Gibbs samplers. Furthermore, these authors provide theoretical and empirical evidence that there are actually computational advantages to using these non-positive recurrent Markov chains rather than more standard positive recurrent chains. These results provide motivation for a general study of the behavior of the Gibbs Markov chain when it is not positive recurrent. This paper concerns stability relationships among the two-variable Gibbs sampler and its subchains. We show that these three Markov chains always share the same stability; that is, they are either all positive recurrent, all null recurrent, or all transient. In additi...
A Spectral Analytic Comparison of Trace-class Data Augmentation Algorithms and their Sandwich Variants
, 2010
Abstract

Cited by 3 (3 self)
Let f_X(x) be an intractable probability density. If f(x, y) is a joint density whose x-marginal is f_X(x), then f(x, y) can be used to build a data augmentation (DA) algorithm that simulates a Markov chain whose invariant density is f_X(x). The move from the current state of the chain, X_n = x, to the new state, X_{n+1}, involves two simulation steps: draw Y ∼ f_{Y|X}(· | x), call the result y, and then draw X_{n+1} ∼ f_{X|Y}(· | y). The sandwich algorithm is a variant that involves an extra step “sandwiched” between the two conditional draws. Let R(y, dy′) be any Markov transition function that is reversible with respect to the y-marginal, f_Y(y). The extra step entails drawing Y′ ∼ R(y, ·), and then using this draw, call it y′, in place of y in the second step. In this paper, the DA and sandwich algorithms are compared in the case where the joint density, f(x, y), satisfies ∫_X ∫_Y f_{X|Y}(x | y) f_{Y|X}(y | x) dy dx < ∞. This condition implies that the (positive) Markov operator associated with the DA Markov chain is trace-class. It is shown that, without any further assumptions, the sandwich algorithm always converges at least
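The two conditional draws of the DA move, and the extra sandwiched draw, can be sketched on a toy bivariate-normal joint with correlation ρ (our own illustrative choice, not an example from the paper). Here R(y, ·) is taken to be an independent draw from f_Y itself, which is trivially reversible with respect to f_Y and represents the extreme best case for the sandwich step:

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.95                 # correlation; the target x-marginal is N(0, 1)
s = np.sqrt(1 - rho**2)    # conditional standard deviation

def da_step(x):
    """One DA move: Y ~ f_{Y|X}(.|x), then X' ~ f_{X|Y}(.|y)."""
    y = rng.normal(rho * x, s)
    return rng.normal(rho * y, s)

def sandwich_step(x):
    """DA with an extra draw Y' ~ R(y, .) reversible w.r.t. f_Y = N(0, 1).
    Taking R to be an independent draw from f_Y decorrelates the two
    conditional draws entirely (an extreme, illustrative choice)."""
    y = rng.normal(rho * x, s)       # Y  ~ f_{Y|X}(.|x)
    y2 = rng.normal(0.0, 1.0)        # Y' ~ R(y, .) = f_Y
    return rng.normal(rho * y2, s)   # X' ~ f_{X|Y}(.|y')

x_da, x_sw, da_chain, sw_chain = 0.0, 0.0, [], []
for _ in range(20000):
    x_da, x_sw = da_step(x_da), sandwich_step(x_sw)
    da_chain.append(x_da)
    sw_chain.append(x_sw)

# Both chains have invariant density N(0, 1); the lag-1 autocorrelation
# shows how much faster the sandwich variant mixes.
print(np.corrcoef(da_chain[:-1], da_chain[1:])[0, 1])  # near rho**2
print(np.corrcoef(sw_chain[:-1], sw_chain[1:])[0, 1])  # near 0
```

In practice R is chosen to be cheap, e.g. a move on a low-dimensional group, rather than an exact draw from f_Y.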
Improving the Convergence Properties of the Data Augmentation Algorithm with an Application to Bayesian Mixture Modelling
, 2009
Abstract

Cited by 2 (2 self)
Every reversible Markov chain defines an operator whose spectrum encodes the convergence properties of the chain. When the state space is finite, the spectrum is just the set of eigenvalues of the corresponding Markov transition matrix. However, when the state space is infinite, the spectrum may be uncountable, and is nearly always impossible to calculate. In most applications of the data augmentation (DA) algorithm, the state space of the DA Markov chain is infinite. However, we show that, under regularity conditions that include the finiteness of the augmented space, the operators defined by the DA chain and Hobert and Marchev’s (2008) alternative chain are both compact, and the corresponding spectra are both finite subsets of [0, 1). Moreover, we prove that the spectrum of Hobert and Marchev’s (2008) chain dominates the spectrum of the DA chain in the sense that the ordered elements of the former are all less than or equal to the corresponding elements of the latter. As a concrete example, we study a widely used DA algorithm for the exploration of posterior densities associated with Bayesian mixture models (Diebolt and Robert, 1994). In particular, we compare this mixture DA algorithm with an alternative algorithm proposed by Frühwirth-Schnatter (2001) that is based on random label switching.
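In the finite-state case described above, the spectrum really is just the eigenvalue set of the transition matrix and can be computed directly; the 4-point target and the Metropolis kernel below are our own toy choices, not the chains studied in the paper:

```python
import numpy as np

# Target distribution on a 4-point state space (illustrative choice).
pi = np.array([0.1, 0.2, 0.3, 0.4])
n = len(pi)

# Metropolis kernel w.r.t. pi with a uniform proposal over the other
# states: reversible by construction, so its spectrum is real.
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            P[i, j] = (1.0 / (n - 1)) * min(1.0, pi[j] / pi[i])
    P[i, i] = 1.0 - P[i].sum()   # rejection mass stays put

# The leading eigenvalue is 1; the gap 1 - evals[1] governs the
# geometric rate of convergence to pi.
evals = np.sort(np.linalg.eigvals(P).real)[::-1]
print(evals)
```

The same computation is impossible for the infinite-state DA chains of the paper, which is why compactness of the operator has to be established analytically instead.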