An Introduction to MCMC for Machine Learning
, 2003
Cited by 141 (2 self)
The purpose of this introductory paper is threefold. First, it introduces the Monte Carlo method with emphasis on probabilistic machine learning. Second, it reviews the main building blocks of modern Markov chain Monte Carlo simulation, thereby providing an introduction to the remaining papers of this special issue. Lastly, it discusses new interesting research horizons.
Convergence Assessment for Reversible Jump MCMC Simulations
, 1998
Cited by 10 (0 self)
In this paper we introduce the problem of assessing convergence of reversible jump MCMC algorithms on the basis of simulation output. We discuss the various direct approaches which could be employed, together with their associated drawbacks. Using the example of fitting a graphical Gaussian model via RJMCMC, we show how the simulation output for models which can be parameterised so that parameters of primary interest retain a coherent interpretation throughout the simulation, can be used to assess convergence. In the context of this example, we extend the work of Gelman and Rubin (1992) and Brooks and Gelman (1998), to provide convergence assessment procedures for graphical model determination problems, but which may be applied to any form of model choice problem and, indeed, MCMC simulations more generally.
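For orientation, the Gelman and Rubin (1992) diagnostic that this abstract builds on compares between-chain and within-chain variance for a scalar quantity. The following is a minimal sketch of that basic diagnostic only, not of the reversible jump extension the paper proposes. It assumes m parallel chains of equal length n stored as lists of floats; the function name `potential_scale_reduction` is illustrative and does not come from the paper.

```python
import math
from statistics import mean, variance


def potential_scale_reduction(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains: list of m sequences, each holding n draws of the same parameter.
    Values close to 1 suggest the chains agree; larger values suggest they
    have not yet mixed.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    w = mean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    b = n * variance(chain_means)
    if w == 0:
        raise ValueError("within-chain variance is zero")

    # Pooled estimate of the target variance.
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)
```

Chains started far apart that have not yet met give a value well above 1, while chains sampling the same region give a value near 1.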
Bayesian Methods for Neural Networks
, 1999
Cited by 8 (0 self)
Summary: The application of the Bayesian learning paradigm to neural networks results in a flexible and powerful nonlinear modelling framework that can be used for regression, density estimation, prediction and classification. Within this framework, all sources of uncertainty are expressed and measured by probabilities. This formulation allows for a probabilistic treatment of our a priori knowledge, domain specific knowledge, model selection schemes, parameter estimation methods and noise estimation techniques. Many researchers have contributed towards the development of the Bayesian learning approach for neural networks. This thesis advances this research by proposing several novel extensions in the areas of sequential learning, model selection, optimisation and convergence assessment. The first contribution is a regularisation strategy for sequential learning based on extended Kalman filtering and noise estimation via evidence maximisation. Using the expectation maximisation (EM) algorithm, a similar algorithm is derived for batch learning. Much of the thesis is, however, devoted to Monte Carlo simulation methods. A robust Bayesian method is proposed to estimate,
A Bayesian Approach to Characterizing Uncertainty in Inverse Problems Using Coarse and Fine Scale Information
, 2001
Cited by 8 (3 self)
The Bayesian approach allows one to easily quantify uncertainty, at least in theory. In practice, however, MCMC can be computationally expensive, particularly in complicated inverse problems. Here we present methodology for improving the speed and efficiency of an MCMC analysis by combining runs on different scales. By using a coarser scale, the chain can run faster (particularly when there is an external forward simulator involved in the likelihood evaluation) and better explore the posterior, being less likely to become stuck in local maxima. We discuss methods for linking the coarse chain back to the original fine scale chain of interest. The resulting coupled chain can thus be run more efficiently without sacrificing the accuracy achieved at the finer scale.
Variational MCMC
, 2001
Cited by 8 (0 self)
We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel and a block Metropolis-Hastings (MH) kernel with a variational approximation as proposal distribution. The MH kernel allows one to locate regions of high probability efficiently. The Metropolis kernel allows us to explore the vicinity of these regions. This algorithm outperforms variational approximations because it yields slightly better estimates of the mean and considerably better estimates of higher moments, such as covariances. It also outperforms standard MCMC algorithms because it locates the regions of high probability quickly, thus speeding up convergence. We also present an adaptive MCMC algorithm that iterates between improving the variational approximation and improving the MCMC approximation. We demonstrate the algorithms on the problem of Bayesian parameter estimation for logistic (sigmoid) belief networks.
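The idea of mixing two kernels can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it works in one dimension, and a fixed Gaussian (parameters `q_mu`, `q_sigma`) stands in for the variational approximation used as the independence proposal. The bimodal target, the function names, and all default values are assumptions made for illustration.

```python
import math
import random


def log_normal_pdf(x, mu, sigma):
    """Log density of a univariate Gaussian."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def bimodal_log_target(x):
    """Toy target (assumed): equal mixture of N(-3, 1) and N(3, 1)."""
    a = math.log(0.5) + log_normal_pdf(x, -3.0, 1.0)
    b = math.log(0.5) + log_normal_pdf(x, 3.0, 1.0)
    hi = max(a, b)
    return hi + math.log(math.exp(a - hi) + math.exp(b - hi))


def mixture_kernel_sampler(log_target, q_mu, q_sigma, n_steps,
                           x0=0.0, mix_prob=0.5, rw_scale=0.5, seed=0):
    """Each step applies one of two kernels, chosen at random.

    With probability mix_prob: independence MH with proposal N(q_mu, q_sigma^2),
    a stand-in for a variational approximation.
    Otherwise: random walk Metropolis with step size rw_scale.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        if rng.random() < mix_prob:
            y = rng.gauss(q_mu, q_sigma)
            # The independence proposal is not symmetric, so the ratio
            # includes the proposal densities.
            log_alpha = (log_target(y) - log_target(x)
                         + log_normal_pdf(x, q_mu, q_sigma)
                         - log_normal_pdf(y, q_mu, q_sigma))
        else:
            y = x + rng.gauss(0.0, rw_scale)
            # Symmetric proposal: the ratio reduces to the target ratio.
            log_alpha = log_target(y) - log_target(x)
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = y
        samples.append(x)
    return samples
```

For example, `mixture_kernel_sampler(bimodal_log_target, 0.0, 4.0, 5000)` lets the broad independence proposal jump between modes while the random walk refines the exploration around each one.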
Learning Author-Topic Models from Text Corpora
- ACM TRANSACTIONS ON INFORMATION SYSTEMS
, 2008
Cited by 7 (2 self)
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a two-stage stochastic process. An author is represented by a probability distribution over topics, and each topic is represented as a probability distribution over words. The probability distribution over topics in a multi-author paper is a mixture of the distributions associated with the authors. The topic-word and author-topic distributions are learned from data in an unsupervised manner using a Markov chain Monte Carlo algorithm. We apply the methodology to three large text corpora: 150,000 abstracts from the CiteSeer digital library, 1,740 papers from the Neural Information Processing Systems (NIPS) Conferences, and 121,000 emails from the Enron corporation. We discuss in detail the interpretation of the results discovered by the system including specific topic and author models, ranking of authors by topic and topics by author, parsing of abstracts by topics and authors, and detection of unusual papers by specific authors. Experiments based on perplexity scores for test documents and precision-recall for document retrieval are used to illustrate systematic differences between the proposed author-topic model and a number of alternatives. Extensions to the model, allowing (for example) generalizations of the notion of an author, are also briefly discussed.
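The two-stage generative process described in this abstract can be sketched as follows. This is an illustrative simulation only, not the inference algorithm: it assumes each word's author is drawn uniformly from the document's author list (the abstract says only that the document's topic distribution is a mixture over its authors), and the function name, dictionary layout, and parameters are hypothetical.

```python
import random


def generate_document(authors, author_topic, topic_word, n_words, seed=0):
    """Simulate one document under an author-topic style process.

    authors:      list of author names for this document
    author_topic: dict author -> dict topic -> probability
    topic_word:   dict topic -> dict word -> probability
    """
    rng = random.Random(seed)
    words = []
    for _ in range(n_words):
        # Assumption: pick one of the document's authors uniformly.
        author = rng.choice(authors)
        # Stage 1: draw a topic from that author's topic distribution.
        topics, topic_weights = zip(*author_topic[author].items())
        topic = rng.choices(topics, weights=topic_weights)[0]
        # Stage 2: draw a word from that topic's word distribution.
        vocab, word_weights = zip(*topic_word[topic].items())
        words.append(rng.choices(vocab, weights=word_weights)[0])
    return words
```

Learning runs this process in reverse: given only the words and author lists, the MCMC algorithm mentioned in the abstract estimates the author-topic and topic-word distributions.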
Some Issues in Monitoring Convergence of Iterative Simulations
- In Proceedings of the Section on Statistical Computing. ASA
, 1998
Cited by 7 (0 self)
In this paper, we discuss some recent results and open questions concerning monitoring convergence of iterative simulations. We begin by discussing the various approaches to convergence assessment proposed in the literature, grouping the methods according to their underlying principles. We then discuss how MCMC simulations can be constructed so that convergence monitoring is simplified. Finally, we discuss some new convergence assessment ideas that are the focus of current work.
Key Words: Markov Chain Monte Carlo; Convergence Diagnosis; Inference.
1. Introduction
Iterative simulations, especially Markov chain Monte Carlo (MCMC) methods, have been increasingly popular in statistical computation, most notably for drawing simulations from Bayesian posterior distributions, see Gilks et al (1996) and Brooks (1998a) for example. In addition to any implementational difficulties and computing resources required, iterative simulation presents two problems beyond those of traditional stati...
Inferring Vascular Structure from 2D and 3D Imagery
, 2001
Cited by 6 (6 self)
We describe a method for inferring vascular (tree-like) structures from 2D and 3D imagery. A Bayesian formulation is used to make effective use of prior knowledge of likely tree structures, with the observed data modelled locally by Gaussian intensity profiles. The local feature models are estimated by a combination of a multiresolution, windowed Fourier approach followed by an iterative, minimum mean-square estimation, which is both computationally efficient and robust. A Markov Chain Monte Carlo (MCMC) algorithm is employed to produce approximate samples from the posterior distribution given the feature model estimates. We present results of the multiresolution parameter estimation on representative 2D and 3D data, and show preliminary results of our implementation of the MCMC algorithm.
Association Models For Web Mining
, 2001
Cited by 5 (2 self)
We describe how statistical association models and, specifically, graphical models, can be usefully employed to model web mining data. We describe some methodological problems related to the implementation of discrete graphical models for web mining data. In particular, we discuss model selection procedures.

