Results 1  10
of
203
Markov chain monte carlo convergence diagnostics
 JASA
, 1996
"... A critical issue for users of Markov Chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise ..."
Abstract

Cited by 338 (6 self)
 Add to MetaCart
(Show Context)
A critical issue for users of Markov Chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise for the future but currently has yielded relatively little that is of practical use in applied work. Consequently, most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. After giving a brief overview of the area, we provide an expository review of thirteen convergence diagnostics, describing the theoretical basis and practical implementation of each. We then compare their performance in two simple models and conclude that all the methods can fail to detect the sorts of convergence failure they were designed to identify. We thus recommend a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence, including applying diagnostic procedures to a small number of parallel chains, monitoring autocorrelations and crosscorrelations, and modifying parameterizations or sampling algorithms appropriately. We emphasize, however, that it is not possible to say with certainty that a finite sample from an MCMC algorithm is representative of an underlying stationary distribution. 1
Sequential Monte Carlo Samplers
, 2002
"... In this paper, we propose a general algorithm to sample sequentially from a sequence of probability distributions known up to a normalizing constant and de ned on a common space. A sequence of increasingly large arti cial joint distributions is built; each of these distributions admits a marginal ..."
Abstract

Cited by 288 (47 self)
 Add to MetaCart
In this paper, we propose a general algorithm to sample sequentially from a sequence of probability distributions known up to a normalizing constant and de ned on a common space. A sequence of increasingly large arti cial joint distributions is built; each of these distributions admits a marginal which is a distribution of interest. To sample from these distributions, we use sequential Monte Carlo methods. We show that these methods can be interpreted as interacting particle approximations of a nonlinear FeynmanKac ow in distribution space. One interpretation of the FeynmanKac ow corresponds to a nonlinear Markov kernel admitting a speci ed invariant distribution and is a natural nonlinear extension of the standard MetropolisHastings algorithm. Many theoretical results have already been established for such ows and their particle approximations. We demonstrate the use of these algorithms through simulation.
Annealed importance sampling
 In Statistics and Computing
, 2001
"... Abstract. Simulated annealing — moving from a tractable distribution to a distribution of interest via a sequence of intermediate distributions — has traditionally been used as an inexact method of handling isolated modes in Markov chain samplers. Here, it is shown how one can use the Markov chain t ..."
Abstract

Cited by 250 (5 self)
 Add to MetaCart
(Show Context)
Abstract. Simulated annealing — moving from a tractable distribution to a distribution of interest via a sequence of intermediate distributions — has traditionally been used as an inexact method of handling isolated modes in Markov chain samplers. Here, it is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler. The Markov chain aspect allows this method to perform acceptably even for highdimensional problems, where finding good importance sampling distributions would otherwise be very difficult, while the use of importance weights ensures that the estimates found converge to the correct values as the number of annealing runs increases. This annealed importance sampling procedure resembles the second half of the previouslystudied tempered transitions, and can be seen as a generalization of a recentlyproposed variant of sequential importance sampling. It is also related to thermodynamic integration methods for estimating ratios of normalizing constants. Annealed importance sampling is most attractive when isolated modes are present, or when estimates of normalizing constants are required, but it may also be more generally useful, since its independent sampling allows one to bypass some of the problems of assessing convergence and autocorrelation in Markov chain samplers. 1
Simulating Normalized Constants: From Importance Sampling to Bridge Sampling to Path Sampling
, 1998
"... Computing (ratios of) normalizing constants of probability models is a fundamental computational problem for many statistical and scientific studies. Monte Carlo simulation is an effective technique, especially with complex and highdimensional models. This paper aims to bring to the attention of ..."
Abstract

Cited by 210 (5 self)
 Add to MetaCart
Computing (ratios of) normalizing constants of probability models is a fundamental computational problem for many statistical and scientific studies. Monte Carlo simulation is an effective technique, especially with complex and highdimensional models. This paper aims to bring to the attention of general statistical audiences of some effective methods originating from theoretical physics and at the same time to explore these methods from a more statistical perspective, through establishing theoretical connections and illustrating their uses with statistical problems. We show that the acceptance ratio method and thermodynamic integration are natural generalizations of importance sampling, which is most familiar to statistical audiences. The former generalizes importance sampling through the use of a single “bridge ” density and is thus a case of bridge sampling in the sense of Meng and Wong. Thermodynamic integration, which is also known in the numerical analysis literature as Ogata’s method for highdimensional integration, corresponds to the use of infinitely many and continuously connected bridges (and thus a “path”). Our path sampling formulation offers more flexibility and thus potential efficiency to thermodynamic integration, and the search of optimal paths turns out to have close connections with the Jeffreys prior density and the Rao and Hellinger distances between two densities. We provide an informative theoretical example as well as two empirical examples (involving 17 to 70dimensional integrations) to illustrate the potential and implementation of path sampling. We also discuss some open problems.
Sampling from Multimodal Distributions Using Tempered Transitions
 Statistics and Computing
, 1994
"... . I present a new Markov chain sampling method appropriate for distributions with isolated modes. Like the recentlydeveloped method of "simulated tempering", the "tempered transition" method uses a series of distributions that interpolate between the distribution of interest and ..."
Abstract

Cited by 108 (8 self)
 Add to MetaCart
(Show Context)
. I present a new Markov chain sampling method appropriate for distributions with isolated modes. Like the recentlydeveloped method of "simulated tempering", the "tempered transition" method uses a series of distributions that interpolate between the distribution of interest and a distribution for which sampling is easier. The new method has the advantage that it does not require approximate values for the normalizing constants of these distributions, which are needed for simulated tempering, and can be tedious to estimate. Simulated tempering performs a random walk along the series of distributions used. In contrast, the tempered transitions of the new method move systematically from the desired distribution, to the easilysampled distribution, and back to the desired distribution. This systematic movement avoids the inefficiency of a random walk, an advantage that unfortunately is cancelled by an increase in the number of interpolating distributions required. Because of this, the sa...
Honest Exploration of Intractable Probability Distributions Via Markov Chain Monte Carlo
 STATISTICAL SCIENCE
, 2001
"... Two important questions that must be answered whenever a Markov chain Monte Carlo (MCMC) algorithm is used are (Q1) What is an appropriate burnin? and (Q2) How long should the sampling continue after burnin? Developing rigorous answers to these questions presently requires a detailed study of the ..."
Abstract

Cited by 103 (33 self)
 Add to MetaCart
Two important questions that must be answered whenever a Markov chain Monte Carlo (MCMC) algorithm is used are (Q1) What is an appropriate burnin? and (Q2) How long should the sampling continue after burnin? Developing rigorous answers to these questions presently requires a detailed study of the convergence properties of the underlying Markov chain. Consequently, in most practical applications of MCMC, exact answers to (Q1) and (Q2) are not sought. The goal of this paper is to demystify the analysis that leads to honest answers to (Q1) and (Q2). The authors hope that this article will serve as a bridge between those developing Markov chain theory and practitioners using MCMC to solve practical problems. The ability to formally address (Q1) and (Q2) comes from establishing a drift condition and an associated minorization condition, which together imply that the underlying Markov chain is geometrically ergodic. In this paper, we explain exactly what drift and minorization are as well as how and why these conditions can be used to form rigorous answers to (Q1) and (Q2). The basic ideas are as follows. The results of Rosenthal (1995) and Roberts and Tweedie (1999) allow one to use drift and minorization conditions to construct a formula giving an analytic upper bound on the distance to stationarity. A rigorous answer to (Q1) can be calculated using this formula. The desired characteristics of the target distribution are typically estimated using ergodic averages. Geometric ergodicity of the underlying Markov chain implies that there are central limit theorems available for ergodic averages (Chan and Geyer 1994). The regenerative simulation technique (Mykland, Tierney and Yu 1995, Robert 1995) can be used to get a consistent estimate of the variance of the asymptotic nor...
FixedWidth Output Analysis for Markov Chain Monte Carlo
, 2005
"... Markov chain Monte Carlo is a method of producing a correlated sample in order to estimate features of a target distribution via ergodic averages. A fundamental question is when should sampling stop? That is, when are the ergodic averages good estimates of the desired quantities? We consider a metho ..."
Abstract

Cited by 90 (29 self)
 Add to MetaCart
Markov chain Monte Carlo is a method of producing a correlated sample in order to estimate features of a target distribution via ergodic averages. A fundamental question is when should sampling stop? That is, when are the ergodic averages good estimates of the desired quantities? We consider a method that stops the simulation when the width of a confidence interval based on an ergodic average is less than a userspecified value. Hence calculating a Monte Carlo standard error is a critical step in assessing the simulation output. We consider the regenerative simulation and batch means methods of estimating the variance of the asymptotic normal distribution. We give sufficient conditions for the strong consistency of both methods and investigate their finite sample properties in a variety of examples.
Markov chain Monte Carlo: Can we trust the third significant figure
 University of Minnesota, School of Statistics
, 2007
"... Abstract. Current reporting of results based on Markov chain Monte Carlo computations could be improved. In particular, a measure of the accuracy of the resulting estimates is rarely reported. Thus we have little ability to objectively assess the quality of the reported estimates. We address this is ..."
Abstract

Cited by 67 (25 self)
 Add to MetaCart
(Show Context)
Abstract. Current reporting of results based on Markov chain Monte Carlo computations could be improved. In particular, a measure of the accuracy of the resulting estimates is rarely reported. Thus we have little ability to objectively assess the quality of the reported estimates. We address this issue in that we discuss why Monte Carlo standard errors are important, how they can be easily calculated in Markov chain Monte Carlo and how they can be used to decide when to stop the simulation. We compare their use to a popular alternative in the context of two examples.
On the Applicability of Regenerative Simulation in Markov Chain Monte Carlo
, 2001
"... We consider the central limit theorem and the calculation of asymptotic standard errors for the ergodic averages constructed in Markov chain Monte Carlo. Chan & Geyer (1994) established a central limit theorem for ergodic averages by assuming that the underlying Markov chain is geometrically ..."
Abstract

Cited by 51 (30 self)
 Add to MetaCart
We consider the central limit theorem and the calculation of asymptotic standard errors for the ergodic averages constructed in Markov chain Monte Carlo. Chan & Geyer (1994) established a central limit theorem for ergodic averages by assuming that the underlying Markov chain is geometrically ergodic and that a simple moment condition is satisfied. While it is relatively straightforward to check Chan and Geyer's conditions, their theorem does not lead to a consistent and easily computed estimate of the variance of the asymptotic normal distribution. Conversely, Mykland, Tierney & Yu (1995) discuss the use of regeneration to establish an alternative central limit theorem with the advantage that a simple, consistent estimate of the asymptotic variance is readily available. However, their result assumes a pair of unwieldy moment conditions whose verification is difficult in practice. In this paper, we show that the conditions of Chan and Geyer's theorem are sucient to establish Mykland, Tierney, and Yu's central limit theorem. This result, in conjunction with other recent developments, should pave the way for more widespread use of the regenerative method in Markov chain Monte Carlo. Our results are applied to the slice sampler for illustration.
Controlled MCMC for Optimal Sampling
, 2001
"... this paper we develop an original and general framework for automatically optimizing the statistical properties of Markov chain Monte Carlo (MCMC) samples, which are typically used to evaluate complex integrals. The MetropolisHastings algorithm is the basic building block of classical MCMC methods ..."
Abstract

Cited by 49 (9 self)
 Add to MetaCart
this paper we develop an original and general framework for automatically optimizing the statistical properties of Markov chain Monte Carlo (MCMC) samples, which are typically used to evaluate complex integrals. The MetropolisHastings algorithm is the basic building block of classical MCMC methods and requires the choice of a proposal distribution, which usually belongs to a parametric family. The correlation properties together with the exploratory ability of the Markov chain heavily depend on the choice of the proposal distribution. By monitoring the simulated path, our approach allows us to learn "on the fly" the optimal parameters of the proposal distribution for several statistical criteria. Keywords: Monte Carlo, adaptive MCMC, calibration, stochastic approximation, gradient method, optimal scaling, random walk, Langevin, Gibbs, controlled Markov chain, learning algorithm, reversible jump MCMC. 1. Motivation 1.1. Introduction Markov chain Monte Carlo (MCMC) is a general strategy for generating samples x i (i = 0; 1; : : :) from complex highdimensional distributions, say defined on the space X ae R nx , from which integrals of the type I (f) = Z X f (x) (x) dx; can be calculated using the estimator b I N (f) = 1 N + 1 N X i=0 f (x i ) ; provided that the Markov chain produced is ergodic. The main building block of this class of algorithms is the MetropolisHastings (MH) algorithm. It requires the definition of a proposal distribution q whose role is to generate possible transitions for the Markov chain, say from x to y, which are then accepted or rejected according to the probabilityy ff (x; y) = min ae 1; (y) q (y; x) (x) q (x; y) oe : The simplicity and universality of this algorithm are both its strength and weakness. The choice of ...