Results 1 - 10
of
49
An Introduction to MCMC for Machine Learning
, 2003
"... This purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing and introduction to the remaining papers of ..."
Abstract
-
Cited by 141 (2 self)
- Add to MetaCart
This purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing and introduction to the remaining papers of this special issue. Lastly, it discusses new interesting research horizons.
Slice sampling
- Annals of Statistics
, 2000
"... Abstract. Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain th ..."
Abstract
-
Cited by 84 (5 self)
- Add to MetaCart
Abstract. Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical direction with uniform sampling from the horizontal ‘slice ’ defined by the current vertical position, or more generally, with some update that leaves the uniform distribution over this slice invariant. Variations on such ‘slice sampling ’ methods are easily implemented for univariate distributions, and can be used to sample from a multivariate distribution by updating each variable in turn. This approach is often easier to implement than Gibbs sampling, and more efficient than simple Metropolis updates, due to the ability of slice sampling to adaptively choose the magnitude of changes made. It is therefore attractive for routine and automated use. Slice sampling methods that update all variables simultaneously are also possible. These methods can adaptively choose the magnitudes of changes made to each variable, based on the local properties of the density function. More ambitiously, such methods could potentially allow the sampling to adapt to dependencies between variables by constructing local quadratic approximations. Another approach is to improve sampling efficiency by suppressing random walks. This can be done using ‘overrelaxed ’ versions of univariate slice sampling procedures, or by using ‘reflective ’ multivariate slice sampling methods, which bounce off the edges of the slice.
Bayesian Methods for Hidden Markov Models -- Recursive Computing in the 21st Century
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
"... Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) use ..."
Abstract
-
Cited by 52 (8 self)
- Add to MetaCart
Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) used in practice can be improved by incorporating established recursive algorithms. The most important is a set of forward-backward recursions calculating conditional distributions of the hidden states given observed data and model parameters. We show how to use the recursive algorithms in an MCMC context and demonstrate mathematical and empirical results showing a Gibbs sampler using the forward-backward recursions mixes more rapidly than another sampler often used for HMM's. We introduce an augmented variables technique for obtaining unique state labels in HMM's and finite mixture models. We show how recursive computing allows statistically efficient use of MCMC output when estimating the hidden states. We directly calculate the posterior distribution of the hidden chain's state space size by MCMC, circumventing asymptotic arguments underlying the Bayesian information criterion, which is shown to be inappropriate for a frequently analyzed data set in the HMM literature. The use of log-likelihood for assessing MCMC convergence is illustrated, and posterior predictive checks are used to investigate application specific questions of model adequacy.
Parameter Expansion for Data Augmentation
- Journal of the American Statistical Association
, 1999
"... Viewing the observed data of a statistical model as incomplete and augmenting its missing parts are useful for clarifying concepts and central to the invention of two well-known statistical algorithms: expectation-maximization (EM) and data augmentation. Recently, Liu, Rubin, and Wu (1998) demonstra ..."
Abstract
-
Cited by 48 (1 self)
- Add to MetaCart
Viewing the observed data of a statistical model as incomplete and augmenting its missing parts are useful for clarifying concepts and central to the invention of two well-known statistical algorithms: expectation-maximization (EM) and data augmentation. Recently, Liu, Rubin, and Wu (1998) demonstrate that expanding the parameter space along with augmenting the missing data is useful for accelerating iterative computation in an EM algorithm. The main purpose of this article is to rigorously define a parameter expanded data augmentation (PX-DA) algorithm and to study its theoretical properties. The PX-DA is a special way of using auxiliary variables to accelerate Gibbs sampling algorithms and is closely related to reparameterization techniques. Theoretical results concerning the convergence rate of the PX-DA algorithm and the choice of prior for the expansion parameter are obtained. In order to understand the role of the expansion parameter, we establish a new theory for iterative condi...
Convergence of slice sampler Markov chains
- Journal of the Royal Statistical Society, Series B
, 1997
"... this paper, we analyse theoretical properties of the slice sampler. We find that the algorithm has extremely robust geometric ergodicity properties. For the case of just one auxiliary variable, we demonstrate that the algorithm is stochastic monotone, and deduce analytic bounds on the total variatio ..."
Abstract
-
Cited by 43 (9 self)
- Add to MetaCart
this paper, we analyse theoretical properties of the slice sampler. We find that the algorithm has extremely robust geometric ergodicity properties. For the case of just one auxiliary variable, we demonstrate that the algorithm is stochastic monotone, and deduce analytic bounds on the total variation distance from stationarity of the method using Foster-Lyapunov drift condition methodology. 1. Introduction.
Markov Chain Monte Carlo Methods Based on `Slicing' the Density Function
, 1997
"... . One way to sample from a distribution is to sample uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical direction with uniform sampling from the horizontal `sl ..."
Abstract
-
Cited by 39 (0 self)
- Add to MetaCart
. One way to sample from a distribution is to sample uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical direction with uniform sampling from the horizontal `slice' defined by the current vertical position. Variations on such `slice sampling' methods can easily be implemented for univariate distributions, and can be used to sample from a multivariate distribution by updating each variable in turn. This approach is often easier to implement than Gibbs sampling, and may be more efficient than easily-constructed versions of the Metropolis algorithm. Slice sampling is therefore attractive in routine Markov chain Monte Carlo applications, and for use by software that automatically generates a Markov chain sampler from a model specification. One can also easily devise overrelaxed versions of slice sampling, which sometimes greatly improve sampling effici...
Generalizing Swendsen-Wang to Sampling Arbitrary Posterior Probabilities
- PAMI
, 2005
"... Many vision tasks can be formulated as graph partition problems that minimize energy functions. For such problems, the Gibbs... ..."
Abstract
-
Cited by 34 (9 self)
- Add to MetaCart
Many vision tasks can be formulated as graph partition problems that minimize energy functions. For such problems, the Gibbs...
An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants
- BIOMETRICA
, 2004
"... ..."
EQUI-ENERGY SAMPLER WITH APPLICATIONS IN STATISTICAL INFERENCE AND STATISTICAL MECHANICS
, 2006
"... We introduce a new sampling algorithm, the equi-energy sampler, for efficient statistical sampling and estimation. Complementary to the widely used temperature-domain methods, the equi-energy sampler, utilizing the temperature–energy duality, targets the energy directly. The focus on the energy func ..."
Abstract
-
Cited by 18 (3 self)
- Add to MetaCart
We introduce a new sampling algorithm, the equi-energy sampler, for efficient statistical sampling and estimation. Complementary to the widely used temperature-domain methods, the equi-energy sampler, utilizing the temperature–energy duality, targets the energy directly. The focus on the energy function not only facilitates efficient sampling, but also provides a powerful means for statistical estimation, for example, the calculation of the density of states and microcanonical averages in statistical mechanics. The equi-energy sampler is applied to a variety of problems, including exponential regression in statistics, motif sampling in computational biology and protein folding in biophysics.
Fully Bayesian Estimation of Gibbs Hyperparameters for Emission Computed Tomography Data
- IEEE Transactions on Medical Imaging
, 1997
"... In recent years, many investigators have proposed Gibbs prior models to regularize images reconstructed from emission computed tomography data. Unfortunately, hyperparameters used to specify Gibbs priors can greatly influence the degree of regularity imposed by such priors, and as a result, numerous ..."
Abstract
-
Cited by 16 (3 self)
- Add to MetaCart
In recent years, many investigators have proposed Gibbs prior models to regularize images reconstructed from emission computed tomography data. Unfortunately, hyperparameters used to specify Gibbs priors can greatly influence the degree of regularity imposed by such priors, and as a result, numerous procedures have been proposed to estimate hyperparameter values from observed image data. Many of these procedures attempt to maximize the joint posterior distribution on the image scene. To implement these methods, approximations to the joint posterior densities are required, because the dependence of the Gibbs partition function on the hyperparameter values is unknown. In this paper, we use recent results in Markov Chain Monte Carlo sampling to estimate the relative values of Gibbs partition functions, and using these values, sample from joint posterior distributions on image scenes. This allows for a fully Bayesian procedure which does not fix the hyperparameters at some estimated or spe...

