An Introduction to MCMC for Machine Learning
, 2003
Cited by 235 (2 self)
The purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing an introduction to the remaining papers of this special issue. Lastly, it discusses new interesting research horizons.
Slice sampling
 Annals of Statistics
, 2000
Cited by 159 (5 self)
Markov chain sampling methods that automatically adapt to characteristics of the distribution being sampled can be constructed by exploiting the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. A Markov chain that converges to this uniform distribution can be constructed by alternating uniform sampling in the vertical direction with uniform sampling from the horizontal ‘slice’ defined by the current vertical position, or more generally, with some update that leaves the uniform distribution over this slice invariant. Variations on such ‘slice sampling’ methods are easily implemented for univariate distributions, and can be used to sample from a multivariate distribution by updating each variable in turn. This approach is often easier to implement than Gibbs sampling, and more efficient than simple Metropolis updates, due to the ability of slice sampling to adaptively choose the magnitude of changes made. It is therefore attractive for routine and automated use. Slice sampling methods that update all variables simultaneously are also possible. These methods can adaptively choose the magnitudes of changes made to each variable, based on the local properties of the density function. More ambitiously, such methods could potentially allow the sampling to adapt to dependencies between variables by constructing local quadratic approximations. Another approach is to improve sampling efficiency by suppressing random walks. This can be done using ‘overrelaxed’ versions of univariate slice sampling procedures, or by using ‘reflective’ multivariate slice sampling methods, which bounce off the edges of the slice.
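The alternation described in this abstract (a uniform draw in the vertical direction, then a uniform draw from the horizontal slice) can be illustrated with a minimal univariate sketch. It assumes a commonly used stepping-out and shrinkage scheme for locating the slice; the function name `slice_sample` and the parameters `width` and `max_steps` are illustrative choices, not taken from the listing.

```python
import math
import random


def slice_sample(log_density, x0, n_samples, width=1.0, max_steps=50, rng=None):
    """Univariate slice sampler (stepping-out + shrinkage); x0 must have positive density."""
    rng = rng or random.Random()
    x = x0
    samples = []
    for _ in range(n_samples):
        # Vertical move: draw a height uniformly under the density at x.
        # Done in log space; 1 - random() lies in (0, 1], so the log is finite.
        log_y = log_density(x) + math.log(1.0 - rng.random())

        # Stepping out: place an interval of size `width` at random around x,
        # then expand each end until it leaves the slice (or the step budget runs out).
        left = x - width * rng.random()
        right = left + width
        j = int(max_steps * rng.random())
        k = (max_steps - 1) - j
        while j > 0 and log_density(left) >= log_y:
            left -= width
            j -= 1
        while k > 0 and log_density(right) >= log_y:
            right += width
            k -= 1

        # Horizontal move with shrinkage: draw uniformly from the interval and
        # shrink it towards x whenever the candidate falls outside the slice.
        while True:
            candidate = left + (right - left) * rng.random()
            if log_density(candidate) >= log_y:
                x = candidate
                break
            if candidate < x:
                left = candidate
            else:
                right = candidate
        samples.append(x)
    return samples
```

For example, `slice_sample(lambda x: -0.5 * x * x, 0.0, 5000, width=2.0)` draws from an unnormalised standard normal. Only the log-density up to an additive constant is needed, and `width` is a rough initial guess that the stepping-out and shrinkage steps correct for.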
Transdimensional Markov chain Monte Carlo
 in Highly Structured Stochastic Systems
, 2003
Cited by 59 (0 self)
In the context of sample-based computation of Bayesian posterior distributions in complex stochastic systems, this chapter discusses some of the uses for a Markov chain with a prescribed invariant distribution whose support is a union of Euclidean spaces of differing dimensions. This leads into a reformulation of the reversible jump MCMC framework for constructing such ‘transdimensional’ Markov chains. This framework is compared to alternative approaches for the same task, including methods that involve separate sampling within different fixed-dimension models. We consider some of the difficulties researchers have encountered with obtaining adequate performance with some of these methods, attributing some of these to misunderstandings, and offer tentative recommendations about algorithm choice for various classes of problem. The chapter concludes with a look towards desirable future developments.
Controlled MCMC for Optimal Sampling
, 2001
Cited by 37 (6 self)
In this paper we develop an original and general framework for automatically optimizing the statistical properties of Markov chain Monte Carlo (MCMC) samples, which are typically used to evaluate complex integrals. The Metropolis-Hastings algorithm is the basic building block of classical MCMC methods and requires the choice of a proposal distribution, which usually belongs to a parametric family. The correlation properties together with the exploratory ability of the Markov chain heavily depend on the choice of the proposal distribution. By monitoring the simulated path, our approach allows us to learn "on the fly" the optimal parameters of the proposal distribution for several statistical criteria. Keywords: Monte Carlo, adaptive MCMC, calibration, stochastic approximation, gradient method, optimal scaling, random walk, Langevin, Gibbs, controlled Markov chain, learning algorithm, reversible jump MCMC. 1. Motivation 1.1. Introduction Markov chain Monte Carlo (MCMC) is a general strategy for generating samples $x_i$ ($i = 0, 1, \ldots$) from complex high-dimensional distributions, say $\pi$ defined on the space $X \subset \mathbb{R}^{n_x}$, from which integrals of the type $I(f) = \int_X f(x)\,\pi(x)\,dx$ can be calculated using the estimator $\hat{I}_N(f) = \frac{1}{N+1} \sum_{i=0}^{N} f(x_i)$, provided that the Markov chain produced is ergodic. The main building block of this class of algorithms is the Metropolis-Hastings (MH) algorithm. It requires the definition of a proposal distribution $q$ whose role is to generate possible transitions for the Markov chain, say from $x$ to $y$, which are then accepted or rejected according to the probability $\alpha(x, y) = \min\left\{1, \frac{\pi(y)\,q(y, x)}{\pi(x)\,q(x, y)}\right\}$. The simplicity and universality of this algorithm are both its strength and weakness. The choice of ...
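The Metropolis-Hastings accept/reject rule and the ergodic-average estimator quoted in this abstract can be sketched as follows. This is a plain, non-adaptive sampler and not the controlled scheme the paper develops; the names `metropolis_hastings`, `propose`, `log_q` and `estimate` are hypothetical, and `log_q(a, b)` is assumed to return the log-density of proposing `b` from `a`.

```python
import math
import random


def metropolis_hastings(log_pi, propose, log_q, x0, n_steps, rng):
    """Return the chain x_0, ..., x_N (N = n_steps) targeting the density pi."""
    chain = [x0]
    x = x0
    for _ in range(n_steps):
        y = propose(x, rng)
        # Log of pi(y) q(y, x) / (pi(x) q(x, y)).
        log_ratio = (log_pi(y) + log_q(y, x)) - (log_pi(x) + log_q(x, y))
        # Accept with probability min(1, ratio); otherwise stay at x.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = y
        chain.append(x)
    return chain


def estimate(chain, f):
    """Ergodic average (1 / (N + 1)) * sum_{i=0}^{N} f(x_i)."""
    return sum(f(x) for x in chain) / len(chain)
```

With a symmetric random-walk proposal such as `propose=lambda x, r: x + r.gauss(0.0, 2.4)`, the two `log_q` terms cancel and `log_q=lambda a, b: 0.0` suffices. The proposal scale is exactly the kind of parameter the paper aims to tune automatically.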
On the use of auxiliary variables in Markov chain Monte Carlo sampling
 Scandinavian Journal of Statistics
, 1997
Cited by 16 (1 self)
We study the slice sampler, a method of constructing a reversible Markov chain with a specified invariant distribution. Given an independence Metropolis-Hastings algorithm it is always possible to construct a slice sampler that dominates it in the Peskun sense. This means that the resulting Markov chain produces estimates with a smaller asymptotic variance. Furthermore the slice sampler has a smaller second-largest eigenvalue than the corresponding independence Metropolis-Hastings algorithm. This ensures faster convergence to the distribution of interest. A sufficient condition for uniform ergodicity of the slice sampler is given and an upper bound for the rate of convergence to stationarity is provided. Keywords: Auxiliary variables, Slice sampler, Peskun ordering, Metropolis-Hastings algorithm, Uniform ergodicity. 1 Introduction The slice sampler is a method of constructing a reversible Markov transition kernel with a given invariant distribution. Auxiliary variables ar...
The Normal Kernel Coupler: An adaptive Markov Chain Monte Carlo method for efficiently sampling from multimodal distributions
, 2001
Cited by 14 (2 self)
The Normal Kernel Coupler (NKC) is an adaptive Markov Chain Monte Carlo (MCMC) method which maintains a set of current state vectors. At each iteration one state vector is updated using a density estimate formed by applying a normal kernel to the full set of states. This sampler is ergodic (irreducible, Harris recurrent and aperiodic) for any continuous distribution on d-dimensional Euclidean space. The NKC outperforms standard MCMC methods on a variety of unimodal and bimodal problems in low to moderate dimension. We illustrate the utility of the NKC by fitting a mixture model for genetic instability in cancer cells. This model, which has two distinct and dissimilar modes, is not well handled by standard MCMC methods. In contrast, the NKC efficiently samples from this model and yields results that are consistent with current scientific understanding.
Adaptively scaling the Metropolis algorithm using expected squared jumped distance
, 2003
Cited by 14 (0 self)
Using existing theory on efficient jumping rules and on adaptive MCMC, we construct and demonstrate the effectiveness of a workable scheme for improving the efficiency of Metropolis algorithms. A good choice of the proposal distribution is crucial for the rapid convergence of the Metropolis algorithm. In this paper, given a family of parametric Markovian kernels, we develop an algorithm for optimizing the kernel by maximizing the expected squared jumped distance, an objective function that characterizes the Markov chain under its d-dimensional stationary distribution. The algorithm uses the information accumulated by a single path and adapts the choice of the parametric kernel in the direction of the local maximum of the objective function using multiple importance sampling techniques. We follow a two-stage approach: a series of adaptive optimization steps followed by an MCMC run with fixed kernel. It is not necessary for the adaptation itself to converge. Using several examples, we demonstrate the effectiveness of our method, even for cases in which the Metropolis transition kernel is initialized at very poor values.
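To make the objective concrete, the sketch below estimates the expected squared jumped distance from a simulated path and uses it to choose a random-walk scale. It swaps in a crude grid search over short pilot runs in place of the paper's multiple importance sampling update, and it assumes a one-dimensional target; `random_walk_metropolis`, `esjd` and `pick_scale` are hypothetical names.

```python
import math
import random


def random_walk_metropolis(log_pi, x0, scale, n_steps, rng):
    """Gaussian random-walk Metropolis chain for a one-dimensional target."""
    chain = [x0]
    x = x0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, scale)
        log_ratio = log_pi(y) - log_pi(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = y
        chain.append(x)
    return chain


def esjd(chain):
    """Average squared jump between consecutive states; rejections count as 0."""
    jumps = [(b - a) ** 2 for a, b in zip(chain, chain[1:])]
    return sum(jumps) / len(jumps)


def pick_scale(log_pi, x0, scales, pilot_steps, rng):
    """Stage 1 (simplified): keep the candidate scale whose pilot run has the largest ESJD."""
    return max(
        scales,
        key=lambda s: esjd(random_walk_metropolis(log_pi, x0, s, pilot_steps, rng)),
    )
```

The second stage of the two-stage approach would then be an ordinary call to `random_walk_metropolis` with the chosen scale held fixed. Very small scales accept nearly every move but barely travel, and very large scales are almost always rejected, so both produce a low ESJD.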
Bayesian analysis of lidar signals with multiple returns
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2007
Cited by 6 (4 self)
Time-Correlated Single Photon Counting and Burst Illumination Laser data can be used for range profiling and target classification. In general, the problem is to analyze the response from a histogram of either photon counts or integrated intensities to assess the number, positions, and amplitudes of the reflected returns from object surfaces. The goal of our work is a complete characterization of the 3D surfaces viewed by the laser imaging system. The authors present a unified theory of pixel processing that is applicable to both approaches based on a Bayesian framework, which allows for careful and thorough treatment of all types of uncertainties associated with the data. We use reversible jump Markov chain Monte Carlo (RJMCMC) techniques to evaluate the posterior distribution of the parameters and to explore spaces with different dimensionality. Further, we use a delayed rejection step to allow the generated Markov chain to mix better through the use of different proposal distributions. The approach is demonstrated on simulated and real data, showing that the return parameters can be estimated to a high degree of accuracy. We also show some practical examples from both near- and far-range depth imaging. Index Terms: Three-dimensional reconstruction, burst illumination laser, delayed rejection, Lidar, photon counting, reversible jump
2008a), Accelerating Markov chain Monte Carlo simulation using self-adaptive differD
 VRUGT ET AL.: TREATMENT OF FORCING DATA ERROR USING MCMC SAMPLING
Cited by 5 (2 self)
Markov chain Monte Carlo (MCMC) methods have found widespread use in many fields of study to estimate the average properties of complex systems, and for posterior inference in a Bayesian framework. Existing theory and experiments prove convergence of well-constructed MCMC schemes to the appropriate limiting distribution under a variety of different conditions. In practice, however, this convergence is often observed to be disturbingly slow. This is frequently caused by an inappropriate selection of the proposal distribution used to generate trial moves in the Markov chain. Here we show that significant improvements to the efficiency of MCMC simulation can be made by using a self-adaptive Differential Evolution learning strategy within a population-based evolutionary framework. This scheme, entitled DiffeRential Evolution Adaptive Metropolis or DREAM, runs multiple different chains simultaneously for global exploration, and automatically tunes the scale and orientation of the proposal distribution in randomized subspaces during the search. Ergodicity of the algorithm is proved, and various examples involving nonlinearity, high-dimensionality, and multimodality show
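The idea of letting a population of chains set the scale and orientation of the proposal can be illustrated with a basic differential-evolution Markov chain sweep. This is a deliberately simpler relative of DREAM: it omits the randomized-subspace sampling and self-adaptive tuning mentioned in the abstract. The name `de_mc_sweep`, the `jitter` parameter, and the jump factor 2.38 / sqrt(2d) are assumptions of this sketch.

```python
import math
import random


def de_mc_sweep(log_pi, chains, rng, jitter=1e-6):
    """Update every chain once, in place, using a differential-evolution proposal.

    `chains` is a list of at least three equal-length lists of floats.
    """
    n_chains = len(chains)
    dim = len(chains[0])
    gamma = 2.38 / math.sqrt(2 * dim)  # assumed default jump factor
    for i in range(n_chains):
        # Pick two distinct other chains; their difference gives the jump direction.
        others = [j for j in range(n_chains) if j != i]
        a, b = rng.sample(others, 2)
        proposal = [
            chains[i][k] + gamma * (chains[a][k] - chains[b][k]) + rng.gauss(0.0, jitter)
            for k in range(dim)
        ]
        # The proposal is symmetric, so the plain Metropolis ratio applies.
        log_ratio = log_pi(proposal) - log_pi(chains[i])
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            chains[i] = proposal
    return chains
```

Because the jump is built from the difference between two other chains, its size and direction follow the current spread of the population, with no hand-set proposal covariance.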
Reducing the Runtime of MCMC Programs by Multithreading on SMP Architectures
 IEEE WORKSHOP ON MULTITHREADED ARCHITECTURES AND APPLICATIONS (MTAAP '08)
, 2008
Cited by 3 (2 self)
The increasing availability of multicore and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov Chain Monte Carlo (MCMC) simulations are widely used for approximate counting problems, Bayesian inference and as a means for estimating very high-dimensional integrals. As such, MCMC has found a wide variety of applications in fields including computational biology and physics, financial econometrics, machine learning and image processing. This paper presents a new method for reducing the runtime of Markov Chain Monte Carlo simulations by using SMP machines to speculatively perform iterations in parallel, reducing the runtime of MCMC programs whilst producing statistically identical results to conventional sequential implementations. We calculate the theoretical reduction in runtime that may be achieved using our technique under perfect conditions, and test and compare the method on a selection of multicore and multiprocessor architectures. Experiments are presented that show reductions in runtime of 35% using two cores and 55% using four cores.