Results 1 – 9 of 9
Controlled MCMC for Optimal Sampling
, 2001
Abstract

Cited by 39 (6 self)
In this paper we develop an original and general framework for automatically optimizing the statistical properties of Markov chain Monte Carlo (MCMC) samples, which are typically used to evaluate complex integrals. The Metropolis-Hastings algorithm is the basic building block of classical MCMC methods and requires the choice of a proposal distribution, which usually belongs to a parametric family. The correlation properties together with the exploratory ability of the Markov chain heavily depend on the choice of the proposal distribution. By monitoring the simulated path, our approach allows us to learn "on the fly" the optimal parameters of the proposal distribution for several statistical criteria.

Keywords: Monte Carlo, adaptive MCMC, calibration, stochastic approximation, gradient method, optimal scaling, random walk, Langevin, Gibbs, controlled Markov chain, learning algorithm, reversible jump MCMC.

1. Motivation

1.1. Introduction

Markov chain Monte Carlo (MCMC) is a general strategy for generating samples $x_i$ ($i = 0, 1, \ldots$) from complex high-dimensional distributions $\pi$, say defined on the space $\mathcal{X} \subseteq \mathbb{R}^{n_x}$, from which integrals of the type

$$I(f) = \int_{\mathcal{X}} f(x)\, \pi(x)\, dx$$

can be calculated using the estimator

$$\hat{I}_N(f) = \frac{1}{N+1} \sum_{i=0}^{N} f(x_i),$$

provided that the Markov chain produced is ergodic. The main building block of this class of algorithms is the Metropolis-Hastings (MH) algorithm. It requires the definition of a proposal distribution $q$ whose role is to generate possible transitions for the Markov chain, say from $x$ to $y$, which are then accepted or rejected according to the probability

$$\alpha(x, y) = \min\left\{1, \frac{\pi(y)\, q(y, x)}{\pi(x)\, q(x, y)}\right\}.$$

The simplicity and universality of this algorithm are both its strength and weakness. The choice of ...
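Since the MH acceptance rule is the building block the abstract refers to, a minimal random-walk sketch may help. This is generic illustrative code (a symmetric Gaussian proposal, so the q terms cancel), not the paper's adaptive scheme; the fixed `scale` argument stands in for the proposal parameters the paper learns on the fly.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, scale=1.0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    Because q(y, x) = q(x, y) for a symmetric proposal, the acceptance
    probability alpha(x, y) reduces to min{1, pi(y) / pi(x)}.
    """
    x = x0
    samples = []
    for _ in range(n_samples):
        y = x + random.gauss(0.0, scale)           # propose y ~ q(. | x)
        log_alpha = log_target(y) - log_target(x)  # log pi(y) / pi(x)
        if math.log(random.random()) < log_alpha:  # accept with prob. alpha
            x = y
        samples.append(x)                          # rejected moves repeat x
    return samples

# Estimate I(f) for f(x) = x under a standard normal target pi.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0,
                              n_samples=20000, scale=2.4)
est_mean = sum(samples) / len(samples)
```

The choice `scale=2.4` echoes the optimal-scaling heuristics for Gaussian targets; choosing such parameters automatically, rather than by hand, is exactly what the paper's controlled-MCMC framework addresses.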
An efficient data mining method for learning Bayesian networks using an evolutionary algorithm-based hybrid approach
, 2004
Abstract

Cited by 5 (0 self)
Abstract—Given the explosive growth of data collected from the current business environment, data mining can potentially discover new knowledge to improve managerial decision making. This paper proposes a novel data mining approach that employs an evolutionary algorithm to discover knowledge represented in Bayesian networks. The approach is applied successfully to handle the business problem of finding response models from direct marketing data. Learning Bayesian networks from data is a difficult problem. There are two different approaches to the network learning problem. The first one uses dependency analysis, while the second one searches for good network structures according to a metric. Unfortunately, both approaches have their own drawbacks. Thus, we propose a novel hybrid algorithm of the two approaches, which consists of two phases, namely, the conditional independence (CI) test phase and the search phase. In the CI test phase, dependency analysis is conducted to reduce the size of the search space. In the search phase, good Bayesian network models are generated by using an evolutionary algorithm. A new operator is introduced to further enhance the search effectiveness and efficiency. In a number of experiments and comparisons, the hybrid algorithm outperforms MDLEP, our previous algorithm which uses evolutionary programming (EP) for network learning, and other network learning algorithms. We then apply the approach to two data sets of direct marketing and compare the performance of the evolved Bayesian networks obtained by the new algorithm with those obtained by MDLEP, the logistic regression models, the naïve Bayesian classifiers, and the tree-augmented naïve Bayesian network (TAN) classifiers. In the comparison, the new algorithm outperforms the others.

Index Terms—Bayesian networks, data mining, evolutionary computation, evolutionary programming (EP).
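As a toy illustration of the first phase of this two-phase scheme, the sketch below screens candidate edges with an empirical pairwise mutual-information test. The MI statistic, threshold value, and all names are illustrative assumptions (the abstract does not specify which CI test is used), and the evolutionary search phase over the pruned space is omitted.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete variables."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def prune_search_space(data, threshold=0.01):
    """CI-test phase (phase 1): keep only candidate edges between variable
    pairs that look empirically dependent, shrinking the space the
    evolutionary search (phase 2, not shown) must explore."""
    n_vars = len(data[0])
    cols = [[row[i] for row in data] for i in range(n_vars)]
    return [(i, j) for i, j in combinations(range(n_vars), 2)
            if mutual_information(cols[i], cols[j]) > threshold]

# Variables 0 and 1 are copies of each other; variable 2 is independent.
data = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
candidate_edges = prune_search_space(data)  # -> [(0, 1)]
```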
A Study on the Evolution of Bayesian Network Graph Structures
Abstract

Cited by 1 (0 self)
Abstract. Bayesian Networks (BN) are often sought as useful descriptive and predictive models for the available data. Learning algorithms that try to ascertain automatically the best BN model (graph structure) for some input data are of the greatest interest for practical reasons. In this paper we examine a number of evolutionary programming algorithms for this network induction problem. Our algorithms build on recent advances in the field and are based on selection and various kinds of mutation operators (working at both the directed acyclic graph and essential graph level). A review of related evolutionary work is also provided. We analyze and discuss the merit and computational toll of these EP algorithms in a couple of benchmark tasks. Some general conclusions about the most efficient algorithms and the most appropriate search landscapes are presented.
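To make the DAG-level mutation operators concrete, here is a hedged sketch of one such mutation: toggle a single directed edge and reject moves that would introduce a cycle. The function names and the accept/reject policy are illustrative, not the paper's exact operators.

```python
import random

def has_cycle(n_nodes, edges):
    """Detect a directed cycle with Kahn's algorithm: no topological
    order exists iff the graph has a cycle."""
    indeg = [0] * n_nodes
    for _, v in edges:
        indeg[v] += 1
    queue = [u for u in range(n_nodes) if indeg[u] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    queue.append(b)
    return seen < n_nodes

def mutate_dag(n_nodes, edges):
    """One EP-style mutation on a DAG: pick a random ordered pair of
    nodes and toggle the edge between them, rejecting the move if it
    would create a directed cycle."""
    i, j = random.sample(range(n_nodes), 2)
    new_edges = set(edges)
    if (i, j) in new_edges:
        new_edges.discard((i, j))          # edge deletion is always safe
    else:
        new_edges.add((i, j))              # edge addition needs a check
        if has_cycle(n_nodes, new_edges):
            return set(edges)              # reject: return parent unchanged
    return new_edges
```

Operators working at the essential-graph level, as the abstract also mentions, would mutate equivalence classes of DAGs instead of single DAGs; that refinement is not shown here.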
Prequential Quantum Dynamics
, 2000
Abstract
Importation of methods from statistical physics into machine learning has led to rapid advances in methods for efficient learning of good representations for complex problems. This paper examines the potential for cross-fertilization in the other direction. The Stapp ontology for quantum dynamics can be coupled with evolutionary formulations of subjective decision theory to yield a unified theory of the physical and mental aspects of reality. The resulting ontology fills an acknowledged gap in contemporary physics and at the same time provides a natural bridge connecting the physical to the biological and social sciences. The implications of this ontology for artificial intelligence and quantum computing are explored. An explicitly probabilistic quantum logic is proposed as the foundation for a post-classical theory of computing.

1. INTRODUCTION

The relationship between mind and matter has been a persistent puzzle since the dawn of science. The course of Western thought...
Biomathematics & Statistics Scotland
Abstract
Many biologists believe that biological experimental techniques measuring gene expression with microarrays make it possible to discover the structure and function of genetic networks in cells. A recent research interest is to use the results of these techniques to infer biological facts such as signalling pathways. In this paper, I present and compare several methodologies for exploring and reconstructing genetic networks. Many approaches are applied and compared on three different data sets: synthetic data generated from GeneNetSim, diffuse large B-cell lymphoma gene expression data, and gene expression data from Arabidopsis thaliana plants. I found that genes with transcription function act as hubs in the biological networks of the Arabidopsis thaliana plant. Finally, I built a genetic network for Arabidopsis thaliana.
A Study of Population MCMC for estimating Bayes Factors over Nonlinear ODE Models
"... Thesis submitted in accordance with the requirements of ..."
Population MCMC for Dirichlet Diffusion Trees
Abstract
Dirichlet Diffusion Trees (DDT) [3, 4] are an interesting nonparametric Bayesian model, which defines a nonparametric Bayesian prior over binary trees with an a priori unbounded depth. MCMC is a natural choice for performing inference with DDT. However, MCMC suffers from getting trapped in local optima. This presents a serious problem when performing inference over tree structures, since the resulting solution space is very complex and highly multimodal. In this scenario, the local nature of MCMC moves makes convergence prohibitive. Population MCMC (PopMCMC) [2, 1] has been proposed to try to avoid local optima when using MCMC for inference. PopMCMC runs multiple chains at the same time and exchanges information between the chains in order to propose non-local moves for each chain. In this work, we apply PopMCMC to DDT inference to try to overcome the local optima problem. DDT is a generative model, and data generated from DDT are exchangeable [3]. DDT generates data points sequentially and assumes all data points diffuse from the origin for unit time, say [0, 1], according to Brownian motion, N(x1; 0, tI). For each data point, at time t it diverges from a branch shared by m previous points with probability a(t) dt / m, where a(t) is the predefined divergence function. The way DDT generates data can be represented as a tree. Figure 1 illustrates how DDT generates four points, where the detailed diffusion paths are suppressed and replaced by straight lines between divergence points and data points.
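The sequential divergence mechanism can be sketched by discretising time, as below. The specific divergence function a(t) = c / (1 − t) is a common choice assumed here purely for illustration (the abstract only says a(t) is predefined), and a full tree-building implementation is omitted.

```python
import random

def sample_divergence_time(a, m, dt=1e-3):
    """Sample the time at which a new point diverges from a branch shared
    by m previous points: walk t over [0, 1) in slices of width dt and
    diverge in a given slice with probability a(t) * dt / m."""
    t = 0.0
    while t < 1.0:
        if random.random() < a(t) * dt / m:
            return t
        t += dt
    return 1.0  # reached unit time without diverging

# Illustrative divergence function a(t) = c / (1 - t).
c = 1.0
t_div = sample_divergence_time(lambda t: c / (1.0 - t), m=2)
```

Larger m (more previous points sharing the branch) lowers the per-slice divergence probability, so new points tend to follow popular branches longer, which is what gives the model its clustering behaviour.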
Parallel hierarchical sampling: a general-purpose class of multiple-chains MCMC algorithms
(CRiSM Paper No. 0937v2, www.warwick.ac.uk/go/crism)
, 2009
Abstract
This paper introduces the Parallel Hierarchical Sampler (PHS), a class of Markov chain Monte Carlo algorithms using several interacting chains having the same target distribution but different mixing properties. Unlike any single-chain MCMC algorithm, upon reaching stationarity one of the PHS chains, which we call the “mother” chain, attains exact Monte Carlo sampling of the target distribution of interest. We empirically show that this translates into a dramatic improvement in the sampler's performance with respect to single-chain MCMC algorithms. Convergence of the PHS joint transition kernel is proved and its relationships with single-chain samplers, Parallel Tempering (PT) and variable augmentation algorithms are discussed. We then provide two illustrative examples comparing the accuracy of PHS with ...
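A minimal multiple-chains sketch may clarify the mechanics of interacting chains with different mixing properties. This follows the generic Parallel Tempering exchange move the abstract mentions as a relative, not the PHS "mother-chain" construction itself; the temperature ladder, proposal scale, and target are illustrative assumptions.

```python
import math
import random

def parallel_chains(log_target, betas, n_iters, scale=1.0):
    """Multiple interacting MH chains, one per inverse temperature in betas
    (betas[0] == 1.0 targets pi itself; smaller betas mix more freely).
    Each iteration does a local random-walk update per chain, then
    proposes swapping the states of one adjacent pair of chains."""
    xs = [0.0] * len(betas)
    cold = []
    for _ in range(n_iters):
        for k, beta in enumerate(betas):       # local move in each chain
            y = xs[k] + random.gauss(0.0, scale)
            if math.log(random.random()) < beta * (log_target(y) - log_target(xs[k])):
                xs[k] = y
        k = random.randrange(len(betas) - 1)   # exchange move
        log_r = (betas[k] - betas[k + 1]) * (log_target(xs[k + 1]) - log_target(xs[k]))
        if math.log(random.random()) < log_r:
            xs[k], xs[k + 1] = xs[k + 1], xs[k]
        cold.append(xs[0])                     # record the beta = 1 chain
    return cold

def log_bimodal(x):
    """Well-separated two-component Gaussian mixture (stable log density)."""
    a, b = -0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

cold_samples = parallel_chains(log_bimodal, betas=[1.0, 0.3, 0.1], n_iters=3000)
```

The hot chains cross between modes easily and pass their states down through exchange moves, which is the general mechanism by which multiple-chains samplers escape the local trapping that single-chain MCMC suffers from.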