Results 1 - 10 of 87
Annealed Importance Sampling
- STATISTICS AND COMPUTING
, 1998
Cited by 110 (2 self)
Simulated annealing --- moving from a tractable distribution to a distribution of interest via a sequence of intermediate distributions --- has traditionally been used as an inexact method of handling isolated modes in Markov chain samplers. Here, it is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler. The Markov chain aspect allows this method to perform acceptably even for high-dimensional problems, where finding good importance sampling distributions would otherwise be very difficult, while the use of importance weights ensures that the estimates found converge to the correct values as the number of annealing runs increases. This annealed importance sampling procedure resembles the second half of the previously-studied tempered transitions, and can be seen as a generalization of a recently-proposed variant of sequential importance sampling. It is also related to thermodynamic integration methods for estimating ratios...
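As a rough illustration of the procedure this abstract describes, the following is a minimal sketch of annealed importance sampling on a one-dimensional toy problem. The target (a Gaussian with hypothetical parameters MU and SIGMA), the geometric interpolation between distributions, the linear temperature schedule, the random-walk Metropolis step, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math
import random

MU, SIGMA = 2.0, 0.5  # hypothetical target parameters, chosen only for illustration


def log_p0(x):
    # Tractable starting distribution: standard normal, fully normalized.
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def log_f(x):
    # Unnormalized target density; its true normalizer is SIGMA * sqrt(2*pi).
    return -0.5 * ((x - MU) / SIGMA) ** 2


def log_interp(x, beta):
    # Assumed geometric path: p0^(1 - beta) * f^beta, up to a constant.
    return (1.0 - beta) * log_p0(x) + beta * log_f(x)


def ais_run(rng, betas, step=0.5):
    # One annealing run: returns the final point and its log importance weight.
    x = rng.gauss(0.0, 1.0)  # exact draw from the tractable distribution
    log_w = 0.0
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Weight factor f_b(x) / f_b_prev(x), evaluated before moving x.
        log_w += (b - b_prev) * (log_f(x) - log_p0(x))
        # One random-walk Metropolis step that leaves the distribution at b invariant.
        prop = x + rng.gauss(0.0, step)
        log_ratio = log_interp(prop, b) - log_interp(x, b)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = prop
    return x, log_w


def ais_estimate(n_runs=500, n_temps=100, seed=0):
    # Average the weights to estimate the normalizer; use them to estimate E[x].
    rng = random.Random(seed)
    betas = [j / n_temps for j in range(n_temps + 1)]
    runs = [ais_run(rng, betas) for _ in range(n_runs)]
    weights = [math.exp(lw) for _, lw in runs]
    z_hat = sum(weights) / n_runs
    mean_hat = sum(w * x for (x, _), w in zip(runs, weights)) / sum(weights)
    return z_hat, mean_hat
```

Calling ais_estimate() returns an estimate of the target's normalizing constant and a weighted estimate of its mean. Because the starting distribution is normalized, the average weight estimates the normalizer directly, and more annealing runs tighten both estimates.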
Sequential Monte Carlo Samplers
, 2002
Cited by 82 (22 self)
In this paper, we propose a general algorithm to sample sequentially from a sequence of probability distributions known up to a normalizing constant and defined on a common space. A sequence of increasingly large artificial joint distributions is built; each of these distributions admits a marginal which is a distribution of interest. To sample from these distributions, we use sequential Monte Carlo methods. We show that these methods can be interpreted as interacting particle approximations of a nonlinear Feynman-Kac flow in distribution space. One interpretation of the Feynman-Kac flow corresponds to a nonlinear Markov kernel admitting a specified invariant distribution and is a natural nonlinear extension of the standard Metropolis-Hastings algorithm. Many theoretical results have already been established for such flows and their particle approximations. We demonstrate the use of these algorithms through simulation.
A Split-Merge Markov Chain Monte Carlo Procedure for the Dirichlet Process Mixture Model
- Journal of Computational and Graphical Statistics
, 2000
Cited by 64 (0 self)
We propose a split-merge Markov chain algorithm to address the problem of inefficient sampling for conjugate Dirichlet process mixture models. Traditional Markov chain Monte Carlo methods for Bayesian mixture models, such as Gibbs sampling, can become trapped in isolated modes corresponding to an inappropriate clustering of data points. This article describes a Metropolis-Hastings procedure that can escape such local modes by splitting or merging mixture components. Our Metropolis-Hastings algorithm employs a new technique in which an appropriate proposal for splitting or merging components is obtained by using a restricted Gibbs sampling scan. We demonstrate empirically that our method outperforms the Gibbs sampler in situations where two or more components are similar in structure.
Key words: Dirichlet process mixture model, Markov chain Monte Carlo, Metropolis-Hastings algorithm, Gibbs sampler, split-merge updates
1 Introduction
Mixture models are often applied to density estim...
Auxiliary Variable Methods for Markov Chain Monte Carlo with Applications
- Journal of the American Statistical Association
, 1997
Cited by 54 (1 self)
Suppose one wishes to sample from the density π(x) using Markov chain Monte Carlo (MCMC). An auxiliary variable u and its conditional distribution π(u|x) can be defined, giving the joint distribution π(x, u) = π(x)π(u|x). An MCMC scheme which samples over this joint distribution can lead to substantial gains in efficiency compared to standard approaches. The revolutionary algorithm of Swendsen and Wang (1987) is one such example. In addition to reviewing the Swendsen-Wang algorithm and its generalizations, this paper introduces a new auxiliary variable method called partial decoupling. Two applications in Bayesian image analysis are considered. The first is a binary classification problem in which partial decoupling outperforms SW and single site Metropolis. The second is a PET reconstruction which uses the gray level prior of Geman and McClure (1987). A generalized Swendsen-Wang algorithm is developed for this problem, which reduces the computing time to the point that MCMC is a viabl...
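The auxiliary variable construction in this abstract can be illustrated with a much simpler scheme than the ones the paper covers. The sketch below is a basic slice sampler for a standard normal target. It is not Swendsen-Wang and not partial decoupling, and the target, the function name, and the parameters are assumptions chosen for the example. The auxiliary variable u is uniform on (0, f(x)] given x, so the joint density is uniform under the curve of f, and alternating the two conditional draws leaves the target as the marginal distribution of x.

```python
import math
import random


def slice_sample_std_normal(n_samples, seed=0):
    # Auxiliary-variable (slice) sampler for the unnormalized density f(x) = exp(-x^2 / 2).
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        # Draw u | x uniformly on (0, f(x)]; 1 - random() lies in (0, 1], so u > 0.
        u = math.exp(-0.5 * x * x) * (1.0 - rng.random())
        # Draw x | u uniformly on the slice {x : f(x) >= u}, an interval around zero.
        half_width = math.sqrt(-2.0 * math.log(u))
        x = rng.uniform(-half_width, half_width)
        samples.append(x)
    return samples
```

For this target the slice is available in closed form, which keeps the example short. In general, finding the slice (or an equivalent conditional update) is the hard part of designing an auxiliary variable method.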
Hidden Markov models and disease mapping
- Journal of the American Statistical Association
, 2001
Cited by 34 (3 self)
We present new methodology to extend Hidden Markov models to the spatial domain, and use this class of models to analyse spatial heterogeneity of count data on a rare phenomenon. This situation occurs commonly in many domains of application, particularly in disease mapping. We assume that the counts follow a Poisson model at the lowest level of the hierarchy, and introduce a finite mixture model for the Poisson rates at the next level. The novelty lies in the model for allocation to the mixture components, which follows a spatially correlated process, the Potts model, and in treating the number of components of the spatial mixture as unknown. Inference is performed in a Bayesian framework using reversible jump MCMC. The model introduced can be viewed as a Bayesian semiparametric approach to specifying flexible spatial distribution in hierarchical models. Performance of the model and comparison with an alternative well-known Markov random field specification for the Poisson rates are demonstrated on synthetic data sets. We show that our allocation model avoids the problem of oversmoothing in cases where the underlying rates exhibit discontinuities, while giving equally good results in cases of smooth gradient-like or highly autocorrelated rates. The methodology is illustrated on an epidemiological application to data on a rare cancer in France.
Inference in Curved Exponential Family Models for Networks
- Journal of Computational and Graphical Statistics
, 2006
Cited by 31 (8 self)
Network data arise in a wide variety of applications. Although descriptive statistics for networks abound in the literature, the science of fitting statistical models to complex network data is still in its infancy. The models considered in this article are based on exponential families; therefore, we refer to them as exponential random graph models (ERGMs). Although ERGMs are easy to postulate, maximum likelihood estimation of parameters in these models is very difficult. In this article, we first review the method of maximum likelihood estimation using Markov chain Monte Carlo in the context of fitting linear ERGMs. We then extend this methodology to the situation where the model comes from a curved exponential family. The curved exponential family methodology is applied to new specifications of ERGMs, proposed by Snijders et al. (2004), having non-linear parameters to represent structural properties of networks such as transitivity and heterogeneity of degrees. We review the difficult topic of implementing likelihood ratio tests for these models, then apply all these model-fitting and testing techniques to the estimation of linear and non-linear parameters for a collaboration network between partners in a New England law firm.
Variational Bayesian learning of directed graphical models with hidden variables
- Bayesian Analysis 1
, 2006
Cited by 27 (2 self)
A key problem in statistics and machine learning is inferring suitable structure of a model given some observed data. A Bayesian approach to model comparison makes use of the marginal likelihood of each candidate model to form a posterior distribution over models; unfortunately for most models of interest, notably those containing hidden or latent variables, the marginal likelihood is intractable to compute. We present the variational Bayesian (VB) algorithm for directed graphical models, which optimises a lower bound approximation to the marginal likelihood in a procedure similar to the standard EM algorithm. We show that for a large class of models, which we call conjugate exponential, the VB algorithm is a straightforward generalisation of the EM algorithm that incorporates uncertainty over model parameters. In a thorough case study using a small class of bipartite DAGs containing hidden variables, we compare the accuracy of the VB approximation to existing asymptotic-data approximations such as the Bayesian Information Criterion (BIC) and the Cheeseman-Stutz (CS) criterion, and also to a sampling based gold standard, Annealed Importance Sampling (AIS). We find that the VB algorithm is empirically superior to CS and BIC, and much faster than AIS. Moreover, we prove that a VB approximation can always be constructed in such a way that guarantees it to be more accurate than the CS approximation.
Assessing approximate inference for binary Gaussian process classification
- Journal of Machine Learning Research
, 2005
Cited by 26 (2 self)
Gaussian process priors can be used to define flexible, probabilistic classification models. Unfortunately exact Bayesian inference is analytically intractable and various approximation techniques have been proposed. In this work we review and compare Laplace’s method and Expectation Propagation for approximate Bayesian inference in the binary Gaussian process classification model. We present a comprehensive comparison of the approximations, their predictive performance and marginal likelihood estimates to results obtained by MCMC sampling. We explain theoretically and corroborate empirically the advantages of Expectation Propagation compared to Laplace’s method. Keywords: Gaussian process priors, probabilistic classification, Laplace’s approximation, expectation propagation, marginal likelihood, evidence, MCMC
Blocking Gibbs Sampling for Linkage Analysis in Large Pedigrees with Many Loops
- American Journal of Human Genetics
, 1996
Cited by 17 (1 self)
We will apply the method of blocking Gibbs sampling to a problem of great importance and complexity -- linkage analysis. Blocking Gibbs combines exact local computations with Gibbs sampling in a way that complements the strengths of both. The method is able to handle problems with very high complexity such as linkage analysis in large pedigrees with many loops; a task that no other known method is able to handle. New developments of the method are outlined, and it is applied to a highly complex linkage problem.
Keywords: Bayesian network, junction tree, pedigree analysis, Markov chain Monte Carlo, Gibbs sampling, loops, inbreeding
1 Introduction
For linkage analysis - the problem of estimating the relative positions of the genes on the chromosomes - many methods have been developed over recent years. Fast and exact methods for computation in Bayesian networks (e.g., pedigrees) (Cannings, Thompson & Skolnick 1976; Pearl 1986; Lauritzen & Spiegelhalter 1988; Shenoy & Shafer 1990; Lauri...
Bayesian Monte Carlo
Cited by 16 (2 self)
We investigate Bayesian alternatives to classical Monte Carlo methods for evaluating integrals. Bayesian Monte Carlo (BMC) allows the incorporation of prior knowledge, such as smoothness of the integrand, into the estimation. In a simple problem we show that this outperforms any classical importance sampling method. We also attempt more challenging multidimensional integrals involved in computing marginal likelihoods of statistical models (a.k.a. partition functions and model evidences). We find that Bayesian Monte Carlo outperformed Annealed Importance Sampling, although for very high dimensional problems or problems with massive multimodality BMC may be less adequate. One advantage of the Bayesian approach to Monte Carlo is that samples can be drawn from any distribution. This allows for the possibility of active design of sample points so as to maximise information gain.

