Results 1–10 of 91
An Introduction to MCMC for Machine Learning
, 2003
Cited by 247 (2 self)
Abstract
The purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing an introduction to the remaining papers of this special issue. Lastly, it discusses interesting new research horizons.
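To make the building blocks the paper surveys concrete, here is a minimal random-walk Metropolis sampler; the target density (a standard normal) and all parameter choices below are illustrative, not taken from the paper.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D target density.

    log_target: log of the (possibly unnormalised) target density.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)   # symmetric random-walk proposal
        # Accept with probability min(1, pi(proposal) / pi(x)).
        log_alpha = log_target(proposal) - log_target(x)
        if math.log(rng.random()) < log_alpha:
            x = proposal
        samples.append(x)
    return samples

# Illustrative target: unnormalised standard normal, log pi(x) = -x^2 / 2.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
burned = samples[5000:]                        # discard burn-in
mean = sum(burned) / len(burned)
```

After burn-in the retained draws approximate the target, so `mean` should be close to 0 and the sample variance close to 1.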
Learning Author-Topic Models from Text Corpora
 ACM TRANSACTIONS ON INFORMATION SYSTEMS
, 2008
Cited by 20 (2 self)
Abstract
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a two-stage stochastic process. An author is represented by a probability distribution over topics, and each topic is represented as a probability distribution over words. The probability distribution over topics in a multi-author paper is a mixture of the distributions associated with the authors. The topic-word and author-topic distributions are learned from data in an unsupervised manner using a Markov chain Monte Carlo algorithm. We apply the methodology to three large text corpora: 150,000 abstracts from the CiteSeer digital library, 1,740 papers from the Neural Information Processing Systems (NIPS) conferences, and 121,000 emails from the Enron Corporation. We discuss in detail the interpretation of the results discovered by the system, including specific topic and author models, ranking of authors by topic and topics by author, parsing of abstracts by topics and authors, and detection of unusual papers by specific authors. Experiments based on perplexity scores for test documents and precision-recall for document retrieval are used to illustrate systematic differences between the proposed author-topic model and a number of alternatives. Extensions to the model, allowing (for example) generalizations of the notion of an author, are also briefly discussed.
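The two-stage generative process can be sketched with toy, made-up tables (the authors, topics, and probabilities below are hypothetical, not learned from the corpora the paper describes): each word of a multi-author paper first picks an author, then a topic from that author's distribution, then a word from that topic's distribution.

```python
import random

rng = random.Random(42)

# Hypothetical toy parameters: 2 authors, 2 topics, tiny vocabulary.
author_topic = {                      # P(topic | author)
    "smith": [0.9, 0.1],
    "jones": [0.2, 0.8],
}
topic_word = [                        # P(word | topic)
    {"markov": 0.5, "chain": 0.4, "gene": 0.1},     # topic 0
    {"gene": 0.6, "protein": 0.3, "markov": 0.1},   # topic 1
]

def sample_word(authors):
    # In a multi-author paper, each word first picks one of the authors,
    # so the document's topic mixture blends the authors' distributions.
    author = rng.choice(authors)
    topics = author_topic[author]
    topic = rng.choices(range(len(topics)), weights=topics)[0]
    words = topic_word[topic]
    return rng.choices(list(words), weights=list(words.values()))[0]

doc = [sample_word(["smith", "jones"]) for _ in range(20)]
```

Inference in the paper runs this process in reverse: given documents and author lists, MCMC recovers the `author_topic` and `topic_word` tables.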
Convergence Assessment for Reversible Jump MCMC Simulations
, 1998
Cited by 15 (0 self)
Abstract
In this paper we introduce the problem of assessing convergence of reversible jump MCMC algorithms on the basis of simulation output. We discuss the various direct approaches which could be employed, together with their associated drawbacks. Using the example of fitting a graphical Gaussian model via RJMCMC, we show how the simulation output, for models that can be parameterised so that parameters of primary interest retain a coherent interpretation throughout the simulation, can be used to assess convergence. In the context of this example, we extend the work of Gelman and Rubin (1992) and Brooks and Gelman (1998) to provide convergence assessment procedures that are developed for graphical model determination problems but may be applied to any form of model choice problem and, indeed, to MCMC simulations more generally.
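As a reference point for the Gelman and Rubin (1992) approach the paper extends, a minimal potential scale reduction factor (R-hat) computation over parallel chains might look like the following; the two synthetic chains are illustrative only.

```python
import random

def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) of Gelman and Rubin (1992).

    chains: list of equal-length lists of scalar draws from parallel chains.
    Values well above 1 indicate the chains have not yet mixed.
    """
    m = len(chains)                       # number of chains
    n = len(chains[0])                    # draws per chain
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    var_plus = (n - 1) / n * W + B / n    # pooled variance estimate
    return (var_plus / W) ** 0.5

rng = random.Random(1)
# Two well-mixed chains sampling the same distribution: R-hat near 1.
chains = [[rng.gauss(0, 1) for _ in range(2000)] for _ in range(2)]
rhat = gelman_rubin(chains)
```

Two chains stuck in different regions would instead produce a large between-chain variance `B` and hence an R-hat well above 1.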
A Bayesian Approach to Characterizing Uncertainty in Inverse Problems Using Coarse and Fine Scale Information
, 2001
Cited by 12 (3 self)
Abstract
The Bayesian approach allows one to easily quantify uncertainty, at least in theory. In practice, however, MCMC can be computationally expensive, particularly in complicated inverse problems. Here we present methodology for improving the speed and efficiency of an MCMC analysis by combining runs on different scales. By using a coarser scale, the chain can run faster (particularly when an external forward simulator is involved in the likelihood evaluation) and better explore the posterior, being less likely to become stuck in local maxima. We discuss methods for linking the coarse chain back to the original fine-scale chain of interest. The resulting coupled chain can thus be run more efficiently without sacrificing the accuracy achieved at the finer scale.
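One simple way to link scales, not necessarily the authors' exact scheme, is a delayed-acceptance-style step: screen each proposal with the cheap coarse model first, and only evaluate the expensive fine model for proposals that survive, with a correction that keeps the fine-scale posterior exact. A sketch with stand-in Gaussian "fine" and "coarse" densities:

```python
import math
import random

rng = random.Random(0)

def log_fine(x):
    # Stand-in for an expensive fine-scale posterior (here just N(0, 1)).
    return -0.5 * x * x

def log_coarse(x):
    # Cheap, slightly-too-wide coarse approximation of the same posterior.
    return -0.5 * (x / 1.2) ** 2

def coarse_step(x, step=1.5):
    # Ordinary random-walk Metropolis step against the coarse density.
    y = x + rng.gauss(0, step)
    if math.log(rng.random()) < log_coarse(y) - log_coarse(x):
        return y
    return x

def coupled_fine_step(x):
    y = coarse_step(x)
    if y == x:
        return x  # already rejected at the cheap coarse stage
    # Second-stage correction: divide out the coarse ratio so the chain
    # still targets the fine-scale posterior exactly.
    log_alpha = (log_fine(y) - log_fine(x)) - (log_coarse(y) - log_coarse(x))
    if math.log(rng.random()) < log_alpha:
        return y
    return x

x, draws = 0.0, []
for _ in range(20000):
    x = coupled_fine_step(x)
    draws.append(x)
```

The payoff is that `log_fine` is only evaluated for proposals the coarse model already likes, which matters when the fine likelihood calls an external forward simulator.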
Bayesian Methods for Neural Networks
, 1999
Cited by 10 (0 self)
Abstract
The application of the Bayesian learning paradigm to neural networks results in a flexible and powerful nonlinear modelling framework that can be used for regression, density estimation, prediction and classification. Within this framework, all sources of uncertainty are expressed and measured by probabilities. This formulation allows for a probabilistic treatment of our a priori knowledge, domain-specific knowledge, model selection schemes, parameter estimation methods and noise estimation techniques. Many researchers have contributed towards the development of the Bayesian learning approach for neural networks. This thesis advances this research by proposing several novel extensions in the areas of sequential learning, model selection, optimisation and convergence assessment. The first contribution is a regularisation strategy for sequential learning based on extended Kalman filtering and noise estimation via evidence maximisation. Using the expectation maximisation (EM) algorithm, a similar algorithm is derived for batch learning. Much of the thesis is, however, devoted to Monte Carlo simulation methods. A robust Bayesian method is proposed to estimate ...
Some Issues in Monitoring Convergence of Iterative Simulations
 In Proceedings of the Section on Statistical Computing. ASA
, 1998
Cited by 10 (0 self)
Abstract
In this paper, we discuss some recent results and open questions concerning monitoring convergence of iterative simulations. We begin by discussing the various approaches to convergence assessment proposed in the literature, grouping the methods according to their underlying principles. We then discuss how MCMC simulations can be constructed so that convergence monitoring is simplified. Finally, we discuss some new convergence assessment ideas that are the focus of current work.
Key Words: Markov Chain Monte Carlo; Convergence Diagnosis; Inference.
1. Introduction
Iterative simulations, especially Markov chain Monte Carlo (MCMC) methods, have become increasingly popular in statistical computation, most notably for drawing simulations from Bayesian posterior distributions; see Gilks et al. (1996) and Brooks (1998a), for example. In addition to any implementational difficulties and computing resources required, iterative simulation presents two problems beyond those of traditional stati...
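One common efficiency-oriented summary used alongside such convergence diagnostics is the effective sample size, estimated from the chain's autocorrelations. The crude estimator below, with an arbitrary truncation rule, is illustrative and not taken from the paper.

```python
import random

def effective_sample_size(x, max_lag=200):
    """Crude effective sample size from sample autocorrelations.

    n correlated draws carry roughly n / (1 + 2 * sum_k rho_k) independent
    draws' worth of information.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    tau = 1.0
    for k in range(1, max_lag):
        rho = sum((x[i] - mean) * (x[i + k] - mean)
                  for i in range(n - k)) / ((n - k) * var)
        if rho < 0.05:        # truncate once correlation has died out
            break
        tau += 2 * rho
    return n / tau

rng = random.Random(7)
# Independent draws: effective size close to the nominal size.
iid = [rng.gauss(0, 1) for _ in range(5000)]
ess = effective_sample_size(iid)

# Strongly autocorrelated AR(1) chain: far fewer effective draws.
corr = [0.0]
for _ in range(4999):
    corr.append(0.9 * corr[-1] + rng.gauss(0, 1))
ess_corr = effective_sample_size(corr)
```

The contrast between `ess` and `ess_corr` is the point: two chains of equal length can carry very different amounts of information about the target.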
Variational MCMC
, 2001
Cited by 9 (1 self)
Abstract
We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel and a block Metropolis-Hastings (MH) kernel with a variational approximation as proposal distribution. The MH kernel allows us to locate regions of high probability efficiently. The Metropolis kernel allows us to explore the vicinity of these regions. This algorithm outperforms variational approximations because it yields slightly better estimates of the mean and considerably better estimates of higher moments, such as covariances. It also outperforms standard MCMC algorithms because it locates the regions of high probability quickly, thus speeding up convergence. We also present an adaptive MCMC algorithm that iterates between improving the variational approximation and improving the MCMC approximation. We demonstrate the algorithms on the problem of Bayesian parameter estimation for logistic (sigmoid) belief networks.
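A minimal sketch of such a kernel mixture, assuming a hypothetical Gaussian variational fit that (as the abstract warns) is too narrow for the target: with some probability the sampler makes an independence MH move proposed from the variational approximation, and otherwise a local random-walk Metropolis move.

```python
import math
import random

rng = random.Random(3)

def log_target(x):
    # Toy target: N(0, 1).
    return -0.5 * x * x

# Hypothetical variational approximation q(x): right mean, variance
# underestimated, mimicking the failure mode the paper describes.
VAR_MEAN, VAR_STD = 0.0, 0.6

def log_q(x):
    return -0.5 * ((x - VAR_MEAN) / VAR_STD) ** 2 - math.log(VAR_STD)

def mixture_step(x, p_independence=0.5):
    if rng.random() < p_independence:
        # Independence MH kernel: propose from q, correct for q in the
        # acceptance ratio; jumps straight to high-probability regions.
        y = rng.gauss(VAR_MEAN, VAR_STD)
        log_alpha = (log_target(y) - log_target(x)) - (log_q(y) - log_q(x))
    else:
        # Random-walk Metropolis kernel: explores around the current state,
        # reaching tails the narrow q rarely proposes.
        y = x + rng.gauss(0, 1.0)
        log_alpha = log_target(y) - log_target(x)
    return y if math.log(rng.random()) < log_alpha else x

x, draws = 2.0, []
for _ in range(30000):
    x = mixture_step(x)
    draws.append(x)
```

Because every move is accepted with a proper MH ratio, the mixture kernel still targets the exact posterior even though `q` has the wrong variance.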
Inferring Vascular Structure from 2D and 3D Imagery
, 2001
Cited by 8 (6 self)
Abstract
We describe a method for inferring vascular (tree-like) structures from 2D and 3D imagery. A Bayesian formulation is used to make effective use of prior knowledge of likely tree structures, with the observed intensity profiles being modelled locally as Gaussian. The local feature models are estimated by a combination of a multiresolution, windowed Fourier approach followed by an iterative, minimum mean-square estimation, which is both computationally efficient and robust. A Markov chain Monte Carlo (MCMC) algorithm is employed to produce approximate samples from the posterior distribution given the feature model estimates. We present results of the multiresolution parameter estimation on representative 2D and 3D data, and show preliminary results of our implementation of the MCMC algorithm.
Bayesian Model Discrimination for Multiple Strata Capture-Recapture Data
 Gordon and Breach Science Publ.
, 2001
Cited by 8 (3 self)
Abstract
In this paper we consider the problem of Bayesian model determination in the context of the analysis of multiple-site capture-recapture data. Extending the work of Dupuis (1995), we motivate a range of biologically plausible models and show how the original Gibbs sampling algorithm of Dupuis can be extended to obtain posterior model probabilities through the introduction of reversible jump Markov chain Monte Carlo updates. This model selection procedure improves upon previous analyses in two distinct ways. First, if parameter estimates are of primary interest, then Bayesian model averaging provides a robust estimation technique which properly incorporates model uncertainty in the resulting intervals. Second, by discriminating between competing models, we are able to discern fine structure within the data, e.g., whether or not survival depends upon age, year or location. Such questions are often of primary biological importance and can only be addressed through model comparison techniques.
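The model-averaging step can be illustrated on a toy example. The model names, log marginal likelihoods, and per-model estimates below are invented for illustration; in the paper the model weights would come from reversible jump MCMC visit frequencies rather than explicit marginal likelihoods.

```python
import math

# Hypothetical candidate models for a survival-rate analysis, each with a
# made-up log marginal likelihood and a made-up posterior mean estimate.
models = {
    "constant_survival": {"log_marginal": -10.2, "estimate": 0.71},
    "age_dependent":     {"log_marginal": -9.1,  "estimate": 0.64},
}

# Posterior model probabilities under a uniform model prior, computed
# stably by subtracting the maximum log marginal likelihood.
log_ml = {m: v["log_marginal"] for m, v in models.items()}
norm = max(log_ml.values())
weights = {m: math.exp(v - norm) for m, v in log_ml.items()}
total = sum(weights.values())
post_prob = {m: w / total for m, w in weights.items()}

# Model-averaged estimate: model uncertainty is carried into the answer
# instead of conditioning on a single "best" model.
averaged = sum(post_prob[m] * models[m]["estimate"] for m in models)
```

Intervals built from the averaged posterior are typically wider, and more honest, than those from any single model, which is the first advantage the abstract claims.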