Results 1-10 of 100
Markov chain Monte Carlo convergence diagnostics
JASA, 1996
Cited by 274 (6 self)
A critical issue for users of Markov Chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise for the future but currently has yielded relatively little that is of practical use in applied work. Consequently, most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. After giving a brief overview of the area, we provide an expository review of thirteen convergence diagnostics, describing the theoretical basis and practical implementation of each. We then compare their performance in two simple models and conclude that all the methods can fail to detect the sorts of convergence failure they were designed to identify. We thus recommend a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence, including applying diagnostic procedures to a small number of parallel chains, monitoring autocorrelations and cross-correlations, and modifying parameterizations or sampling algorithms appropriately. We emphasize, however, that it is not possible to say with certainty that a finite sample from an MCMC algorithm is representative of an underlying stationary distribution.
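As a rough illustration of the kind of output-based check this abstract recommends (diagnostics on a few parallel chains, plus autocorrelation monitoring), here is a minimal Python sketch. It assumes a Gelman-Rubin style potential scale reduction factor, which is one widely used diagnostic and is not claimed to be the exact form reviewed in the paper. The function names are hypothetical.

```python
import numpy as np


def potential_scale_reduction(chains):
    """Gelman-Rubin style R-hat for an (m, n) array: m parallel chains, n draws each.

    Assumed textbook form: compare between-chain and within-chain variance.
    Values well above 1 suggest the chains have not mixed.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)    # B: scaled variance of chain means
    pooled = (n - 1) / n * within + between / n      # pooled variance estimate
    return float(np.sqrt(pooled / within))


def lag_autocorrelation(x, lag):
    """Sample autocorrelation of a single chain at an integer lag >= 1."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))
```

A value near 1 from `potential_scale_reduction` is consistent with convergence but, as the abstract stresses, does not prove it.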
An Introduction to MCMC for Machine Learning
2003
Cited by 247 (2 self)
The purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing an introduction to the remaining papers of this special issue. Lastly, it discusses new interesting research horizons.
Annealed importance sampling
Statistics and Computing, 2001
Cited by 162 (3 self)
Simulated annealing (moving from a tractable distribution to a distribution of interest via a sequence of intermediate distributions) has traditionally been used as an inexact method of handling isolated modes in Markov chain samplers. Here, it is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler. The Markov chain aspect allows this method to perform acceptably even for high-dimensional problems, where finding good importance sampling distributions would otherwise be very difficult, while the use of importance weights ensures that the estimates found converge to the correct values as the number of annealing runs increases. This annealed importance sampling procedure resembles the second half of the previously studied tempered transitions, and can be seen as a generalization of a recently proposed variant of sequential importance sampling. It is also related to thermodynamic integration methods for estimating ratios of normalizing constants. Annealed importance sampling is most attractive when isolated modes are present, or when estimates of normalizing constants are required, but it may also be more generally useful, since its independent sampling allows one to bypass some of the problems of assessing convergence and autocorrelation in Markov chain samplers.
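The procedure described in this abstract can be sketched in a few lines of Python. The sketch below is a toy illustration under stated assumptions, not the paper's implementation: the start distribution N(0, 3^2), the unnormalized Gaussian target, the geometric path between them, the linear temperature schedule, and the single random-walk Metropolis step per temperature are all choices made here for illustration.

```python
import numpy as np

P0_SCALE = 3.0  # assumed tractable start distribution: N(0, 3^2), normalized


def log_p0(x):
    """Log density of the normalized start distribution."""
    return -0.5 * (x / P0_SCALE) ** 2 - np.log(P0_SCALE * np.sqrt(2 * np.pi))


def log_f(x, mean=1.0, scale=0.5):
    """Log of an unnormalized toy target; its true normalizer is scale * sqrt(2*pi)."""
    return -0.5 * ((x - mean) / scale) ** 2


def log_mean_exp(a):
    """Numerically stable log(mean(exp(a)))."""
    m = np.max(a)
    return float(m + np.log(np.mean(np.exp(a - m))))


def annealed_importance_sampling(n_runs=2000, n_temps=200, step=0.5, seed=0):
    """Run n_runs independent annealing runs; return final states and log weights.

    Intermediate distributions are proportional to p0^(1-beta) * f^beta.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_temps + 1)
    x = rng.normal(0.0, P0_SCALE, size=n_runs)  # exact draws from p0
    log_w = np.zeros(n_runs)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Importance-weight update for moving from beta = b_prev to beta = b.
        log_w += (b - b_prev) * (log_f(x) - log_p0(x))
        # One random-walk Metropolis step that leaves p0^(1-b) * f^b invariant.
        prop = x + step * rng.normal(size=n_runs)
        log_ratio = (1 - b) * (log_p0(prop) - log_p0(x)) + b * (log_f(prop) - log_f(x))
        accept = np.log(rng.uniform(size=n_runs)) < log_ratio
        x = np.where(accept, prop, x)
    return x, log_w
```

Because `log_p0` is normalized, `np.exp(log_mean_exp(log_w))` estimates the normalizing constant of the target, and target expectations can be estimated with weights proportional to `np.exp(log_w)`.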
General state space Markov chains and MCMC algorithms
PROBABILITY SURVEYS, 2004
Cited by 112 (28 self)
This paper surveys various results about Markov chains on general (non-countable) state spaces. It begins with an introduction to Markov chain Monte Carlo (MCMC) algorithms, which provide the motivation and context for the theory which follows. Then, sufficient conditions for geometric and uniform ergodicity are presented, along with quantitative bounds on the rate of convergence to stationarity. Many of these results are proved using direct coupling constructions based on minorisation and drift conditions. Necessary and sufficient conditions for Central Limit Theorems (CLTs) are also presented, in some cases proved via the Poisson Equation or direct regeneration constructions. Finally, optimal scaling and weak convergence results for Metropolis-Hastings algorithms are discussed. None of the results presented is new, though many of the proofs are. We also describe some Open Problems.
Honest Exploration of Intractable Probability Distributions Via Markov Chain Monte Carlo
STATISTICAL SCIENCE, 2001
Cited by 76 (20 self)
Two important questions that must be answered whenever a Markov chain Monte Carlo (MCMC) algorithm is used are (Q1) What is an appropriate burn-in? and (Q2) How long should the sampling continue after burn-in? Developing rigorous answers to these questions presently requires a detailed study of the convergence properties of the underlying Markov chain. Consequently, in most practical applications of MCMC, exact answers to (Q1) and (Q2) are not sought. The goal of this paper is to demystify the analysis that leads to honest answers to (Q1) and (Q2). The authors hope that this article will serve as a bridge between those developing Markov chain theory and practitioners using MCMC to solve practical problems. The ability to formally address (Q1) and (Q2) comes from establishing a drift condition and an associated minorization condition, which together imply that the underlying Markov chain is geometrically ergodic. In this paper, we explain exactly what drift and minorization are as well as how and why these conditions can be used to form rigorous answers to (Q1) and (Q2). The basic ideas are as follows. The results of Rosenthal (1995) and Roberts and Tweedie (1999) allow one to use drift and minorization conditions to construct a formula giving an analytic upper bound on the distance to stationarity. A rigorous answer to (Q1) can be calculated using this formula. The desired characteristics of the target distribution are typically estimated using ergodic averages. Geometric ergodicity of the underlying Markov chain implies that there are central limit theorems available for ergodic averages (Chan and Geyer 1994). The regenerative simulation technique (Mykland, Tierney and Yu 1995, Robert 1995) can be used to get a consistent estimate of the variance of the asymptotic nor...
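For orientation, one commonly stated form of the drift and minorization conditions named in this abstract is written out below. The notation (V, \lambda, b, d, \varepsilon, Q, C) is assumed here and may differ from the paper's own.

```latex
% Drift condition: for some function V : \mathsf{X} \to [0, \infty),
% and constants \lambda \in (0, 1), b < \infty,
\mathrm{E}\left[ V(X_{i+1}) \mid X_i = x \right] \le \lambda V(x) + b
\quad \text{for all } x \in \mathsf{X}.

% Associated minorization condition: for some probability measure Q,
% some \varepsilon > 0, and some d > 2b / (1 - \lambda),
P(x, A) \ge \varepsilon \, Q(A)
\quad \text{for all } x \in C = \{ x : V(x) \le d \}
\text{ and all measurable } A.
```

The drift condition says the chain tends to move toward the small set C, and the minorization condition says that from anywhere in C the chain has a fixed chance of forgetting its current state.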
Adaptive Markov Chain Monte Carlo through Regeneration
1998
Cited by 70 (4 self)
This paper is organized as follows. In Section 2 we introduce the concept of regeneration and adaptation at regeneration, and provide theoretical support. In Section 3, the splitting techniques required for adaptation are reviewed. Section 4 contains four illustrations of adaptive MCMC. Some of the proofs from Sections 2 and 3 are placed in the Appendix.
Phylogenetic Tree Construction Using Markov Chain Monte Carlo
1999
Cited by 68 (0 self)
We describe a Bayesian method based on Markov chain simulation to study the phylogenetic relationship in a group of DNA sequences. Under simple models of mutational events, our method produces a Markov chain whose stationary distribution is the conditional distribution of the phylogeny given the observed sequences. Our algorithm strikes a reasonable balance between the desire to move globally through the space of phylogenies and the need to make computationally feasible moves in areas of high probability. Since phylogenetic information is described by a tree, we have created new diagnostics to handle this type of data structure. An important byproduct of the Markov chain Monte Carlo phylogeny building technique is that it provides estimates and corresponding measures of variability for any aspect of the phylogeny under study.
Convergence rates of Markov chains
1995
Cited by 67 (4 self)
In this paper, we attempt to describe various mathematical techniques which have been used to bound such rates of convergence. In particular, we describe eigenvalue analysis, random walks on groups, coupling, and minorization conditions. Connections are made to modern areas of research wherever possible. Elements of linear algebra, probability theory, group theory, and measure theory are used, but efforts are made to keep the presentation elementary and accessible. Acknowledgements. I thank Eric Belsley for comments and corrections, and thank Persi Diaconis for introducing me to this subject and teaching me so much.
Fixed-Width Output Analysis for Markov Chain Monte Carlo
2005
Cited by 52 (18 self)
Markov chain Monte Carlo is a method of producing a correlated sample in order to estimate features of a target distribution via ergodic averages. A fundamental question is when should sampling stop? That is, when are the ergodic averages good estimates of the desired quantities? We consider a method that stops the simulation when the width of a confidence interval based on an ergodic average is less than a user-specified value. Hence calculating a Monte Carlo standard error is a critical step in assessing the simulation output. We consider the regenerative simulation and batch means methods of estimating the variance of the asymptotic normal distribution. We give sufficient conditions for the strong consistency of both methods and investigate their finite sample properties in a variety of examples.
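The stopping rule in this abstract can be illustrated with a short Python sketch. It is a simplified version under stated assumptions rather than the paper's procedure: the batch size is taken as the floor of the square root of the run length (a common choice, assumed here), a normal quantile z = 1.96 is used for the interval, and `draw_block` is a hypothetical callback that returns the next n draws of the chain.

```python
import numpy as np


def batch_means_se(x):
    """Monte Carlo standard error of the mean of a correlated sequence, via batch means."""
    x = np.asarray(x, dtype=float)
    n = x.size
    b = int(np.floor(np.sqrt(n)))            # batch size (assumed choice)
    a = n // b                               # number of full batches
    means = x[: a * b].reshape(a, b).mean(axis=1)
    var_hat = b * means.var(ddof=1)          # estimate of the asymptotic variance
    return float(np.sqrt(var_hat / (a * b)))


def fixed_width_mean(draw_block, half_width, z=1.96, block=1000,
                     min_n=4000, max_n=500_000):
    """Extend the run until z * SE drops below half_width, or max_n is reached.

    Returns (estimate, achieved half-width, number of draws used).
    """
    draws = np.empty(0)
    while draws.size < max_n:
        draws = np.concatenate([draws, draw_block(block)])
        # Only check the rule once the run is long enough for a stable SE estimate.
        if draws.size >= min_n and z * batch_means_se(draws) < half_width:
            break
    return float(draws.mean()), z * batch_means_se(draws), int(draws.size)
```

The `min_n` floor guards against stopping too early on a noisy standard-error estimate from a short run.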