Results 1  10
of
62
Monte Carlo Statistical Methods
, 1998
"... This paper is also the originator of the Markov Chain Monte Carlo methods developed in the following chapters. The potential of these two simultaneous innovations has been discovered much latter by statisticians (Hastings 1970; Geman and Geman 1984) than by of physicists (see also Kirkpatrick et al. ..."
Abstract

Cited by 909 (23 self)
 Add to MetaCart
This paper is also the originator of the Markov Chain Monte Carlo methods developed in the following chapters. The potential of these two simultaneous innovations has been discovered much latter by statisticians (Hastings 1970; Geman and Geman 1984) than by of physicists (see also Kirkpatrick et al. 1983). 5.5.5 ] PROBLEMS 211
Markov chains for exploring posterior distributions
 Annals of Statistics
, 1994
"... Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at ..."
Abstract

Cited by 753 (6 self)
 Add to MetaCart
Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at
Markov chain monte carlo convergence diagnostics
 JASA
, 1996
"... A critical issue for users of Markov Chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise ..."
Abstract

Cited by 232 (6 self)
 Add to MetaCart
A critical issue for users of Markov Chain Monte Carlo (MCMC) methods in applications is how to determine when it is safe to stop sampling and use the samples to estimate characteristics of the distribution of interest. Research into methods of computing theoretical convergence bounds holds promise for the future but currently has yielded relatively little that is of practical use in applied work. Consequently, most MCMC users address the convergence problem by applying diagnostic tools to the output produced by running their samplers. After giving a brief overview of the area, we provide an expository review of thirteen convergence diagnostics, describing the theoretical basis and practical implementation of each. We then compare their performance in two simple models and conclude that all the methods can fail to detect the sorts of convergence failure they were designed to identify. We thus recommend a combination of strategies aimed at evaluating and accelerating MCMC sampler convergence, including applying diagnostic procedures to a small number of parallel chains, monitoring autocorrelations and crosscorrelations, and modifying parameterizations or sampling algorithms appropriately. We emphasize, however, that it is not possible to say with certainty that a finite sample from an MCMC algorithm is representative of an underlying stationary distribution. 1
Rates of convergence of the Hastings and Metropolis algorithms
 ANNALS OF STATISTICS
, 1996
"... We apply recent results in Markov chain theory to Hastings and Metropolis algorithms with either independent or symmetric candidate distributions, and provide necessary and sufficient conditions for the algorithms to converge at a geometric rate to a prescribed distribution ß. In the independence ca ..."
Abstract

Cited by 157 (15 self)
 Add to MetaCart
We apply recent results in Markov chain theory to Hastings and Metropolis algorithms with either independent or symmetric candidate distributions, and provide necessary and sufficient conditions for the algorithms to converge at a geometric rate to a prescribed distribution ß. In the independence case (in IR k ) these indicate that geometric convergence essentially occurs if and only if the candidate density is bounded below by a multiple of ß; in the symmetric case (in IR only) we show geometric convergence essentially occurs if and only if ß has geometric tails. We also evaluate recently developed computable bounds on the rates of convergence in this context: examples show that these theoretical bounds can be inherently extremely conservative, although when the chain is stochastically monotone the bounds may well be effective.
Slice sampling
 Annals of Statistics
, 2000
"... Abstract. Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain th ..."
Abstract

Cited by 152 (5 self)
 Add to MetaCart
Abstract. Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical direction with uniform sampling from the horizontal ‘slice ’ defined by the current vertical position, or more generally, with some update that leaves the uniform distribution over this slice invariant. Variations on such ‘slice sampling ’ methods are easily implemented for univariate distributions, and can be used to sample from a multivariate distribution by updating each variable in turn. This approach is often easier to implement than Gibbs sampling, and more efficient than simple Metropolis updates, due to the ability of slice sampling to adaptively choose the magnitude of changes made. It is therefore attractive for routine and automated use. Slice sampling methods that update all variables simultaneously are also possible. These methods can adaptively choose the magnitudes of changes made to each variable, based on the local properties of the density function. More ambitiously, such methods could potentially allow the sampling to adapt to dependencies between variables by constructing local quadratic approximations. Another approach is to improve sampling efficiency by suppressing random walks. This can be done using ‘overrelaxed ’ versions of univariate slice sampling procedures, or by using ‘reflective ’ multivariate slice sampling methods, which bounce off the edges of the slice.
Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms
 Biometrika
, 1996
"... ..."
Simulating ratios of normalizing constants via a simple identity: A theoretical exploration
 Statistica Sinica
, 1996
"... Abstract: Let pi(w),i =1, 2, be two densities with common support where each density is known up to a normalizing constant: pi(w) =qi(w)/ci. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c1/c2. ..."
Abstract

Cited by 111 (4 self)
 Add to MetaCart
Abstract: Let pi(w),i =1, 2, be two densities with common support where each density is known up to a normalizing constant: pi(w) =qi(w)/ci. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c1/c2. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in fields such as physics and genetics. Many methods proposed in statistical and other literature (e.g., computational physics) for dealing with this problem are based on various special cases of the following simple identity: c1 c2 = E2[q1(w)α(w)] E1[q2(w)α(w)]. Here Ei denotes the expectation with respect to pi (i =1, 2), and α is an arbitrary function such that the denominator is nonzero. A main purpose of this paper is to provide a theoretical study of the usefulness of this identity, with focus on (asymptotically) optimal and practical choices of α. Using a simple but informative example, we demonstrate that with sensible (not necessarily optimal) choices of α, we can reduce the simulation error by orders of magnitude when compared to the conventional importance sampling method, which corresponds to α =1/q2. We also introduce several generalizations of this identity for handling more complicated settings (e.g., estimating several ratios simultaneously) and pose several open problems that appear to have practical as well as theoretical value. Furthermore, we discuss related theoretical and empirical work.
Bayesian phylogenetic inference via Markov chain Monte Carlo methods
 Biometrics
, 1999
"... SUMMARY. We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a prior distribution on the space of trees. A transformation of the tree into a canonical cop ..."
Abstract

Cited by 85 (3 self)
 Add to MetaCart
SUMMARY. We derive a Markov chain to sample from the posterior distribution for a phylogenetic tree given sequence information from the corresponding set of organisms, a stochastic model for these data, and a prior distribution on the space of trees. A transformation of the tree into a canonical cophenetic matrix form suggests a simple and effective proposal distribution for selecting candidate trees close to the current tree in the chain. We illustrate the algorithm with restriction site data on 9 plant species, then extend to DNA sequences from 32 species of fish. The algorithm mixes well in both examples from random starting trees, generating reproducible estimates and credible sets for the path of evolution.
The practical implementation of Bayesian model selection
 Institute of Mathematical Statistics
, 2001
"... In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is r ..."
Abstract

Cited by 84 (3 self)
 Add to MetaCart
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.