Results 1–10 of 18
Simulating Normalizing Constants: From Importance Sampling to Bridge Sampling to Path Sampling
 Statistical Science, 13, 163–185
, 1998
Cited by 146 (4 self)
Simulating ratios of normalizing constants via a simple identity: A theoretical exploration
 Statistica Sinica
, 1996
Abstract

Cited by 109 (4 self)
Abstract: Let p_i(w), i = 1, 2, be two densities with common support, where each density is known up to a normalizing constant: p_i(w) = q_i(w)/c_i. We have draws from each density (e.g., via Markov chain Monte Carlo), and we want to use these draws to simulate the ratio of the normalizing constants, c_1/c_2. Such a computational problem is often encountered in likelihood and Bayesian inference, and arises in fields such as physics and genetics. Many methods proposed in the statistical and other literature (e.g., computational physics) for dealing with this problem are based on various special cases of the following simple identity:

    c_1/c_2 = E_2[q_1(w) α(w)] / E_1[q_2(w) α(w)].

Here E_i denotes the expectation with respect to p_i (i = 1, 2), and α is an arbitrary function such that the denominator is nonzero. A main purpose of this paper is to provide a theoretical study of the usefulness of this identity, with focus on (asymptotically) optimal and practical choices of α. Using a simple but informative example, we demonstrate that with sensible (not necessarily optimal) choices of α, we can reduce the simulation error by orders of magnitude when compared to the conventional importance sampling method, which corresponds to α = 1/q_2. We also introduce several generalizations of this identity for handling more complicated settings (e.g., estimating several ratios simultaneously) and pose several open problems that appear to have practical as well as theoretical value. Furthermore, we discuss related theoretical and empirical work.
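The identity above is easy to exercise numerically. The sketch below is an illustration under assumptions not taken from the paper: it uses two one-dimensional Gaussians whose normalizing constants are known in closed form (so the estimate can be checked), draws from each density directly in place of MCMC, and plugs in the geometric bridge α(w) = 1/√(q_1(w) q_2(w)) as one sensible, not necessarily optimal, choice; all means, scales, and sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian densities known only up to their normalizing constants:
#   p_i(w) = q_i(w) / c_i,  q_i(w) = exp(-(w - m_i)^2 / (2 s_i^2)),
# so c_i = s_i * sqrt(2*pi) and the true ratio c_1/c_2 = s_1/s_2 is known.
m1, s1 = 0.0, 1.0
m2, s2 = 1.0, 2.0
q1 = lambda w: np.exp(-((w - m1) ** 2) / (2 * s1**2))
q2 = lambda w: np.exp(-((w - m2) ** 2) / (2 * s2**2))

# Draws from each density; direct Gaussian sampling stands in for MCMC.
n = 100_000
w1 = rng.normal(m1, s1, n)  # draws from p_1
w2 = rng.normal(m2, s2, n)  # draws from p_2

# Geometric bridge alpha(w) = 1 / sqrt(q1(w) q2(w)), plugged into
#   c_1/c_2 = E_2[q_1(w) alpha(w)] / E_1[q_2(w) alpha(w)].
alpha = lambda w: 1.0 / np.sqrt(q1(w) * q2(w))
ratio_hat = np.mean(q1(w2) * alpha(w2)) / np.mean(q2(w1) * alpha(w1))

true_ratio = s1 / s2  # the sqrt(2*pi) factors cancel
print(ratio_hat, true_ratio)
```

Swapping in α = 1/q_2 recovers plain importance sampling from p_1, which is far less stable when the two densities overlap poorly.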
Computing Bayes factors using thermodynamic integration
 Syst Biol
Abstract

Cited by 33 (5 self)
Abstract.—In the Bayesian paradigm, a common method for comparing two models is to compute the Bayes factor, defined as the ratio of their respective marginal likelihoods. In recent phylogenetic work, the numerical evaluation of marginal likelihoods has often been performed using the harmonic mean estimation procedure. In the present article, we propose to employ another method, based on an analogy with statistical physics, called thermodynamic integration. We describe the method, propose an implementation, and show on two analytical examples that this numerical method yields reliable estimates. In contrast, the harmonic mean estimator leads to a strong overestimation of the marginal likelihood, which is all the more pronounced as the model is higher dimensional. As a result, the harmonic mean estimator systematically favors more parameter-rich models, an artefact that might explain some recent puzzling observations, based on harmonic mean estimates, suggesting that Bayes factors tend to overscore complex models. Finally, we apply our method to the comparison of several alternative models of amino-acid replacement. We confirm our previous observations, indicating that modeling pattern heterogeneity across sites tends to yield better models than standard empirical matrices. [Bayes factor; harmonic mean; mixture model; path sampling; phylogeny; thermodynamic integration.]
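Thermodynamic integration of this kind can be illustrated on a toy model where the marginal likelihood is available in closed form. The sketch below is not the authors' phylogenetic implementation: it assumes a conjugate normal model (known variance, normal prior on the mean) so that each power posterior is Gaussian and can be sampled exactly in place of MCMC, and it uses an illustrative β = u^5 temperature schedule; all numerical settings are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy conjugate model with an analytic marginal likelihood to check against:
#   y_i ~ N(theta, sigma^2) with sigma known, prior theta ~ N(0, tau^2).
sigma, tau, n = 1.0, 2.0, 20
y = rng.normal(1.5, sigma, n)

def loglik(theta):
    """Log likelihood of y, vectorized over an array of theta draws."""
    quad = np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - quad / (2 * sigma**2)

# Thermodynamic integration:
#   log m(y) = integral_0^1 E_beta[log L(y | theta)] d(beta),
# where E_beta is over the power posterior prior(theta) * L(y|theta)^beta.
# Every power posterior is Gaussian here, so we sample it exactly; in a
# real problem each would be explored by MCMC.  A schedule concentrated
# near beta = 0, where the integrand changes fastest, reduces bias.
betas = np.linspace(0.0, 1.0, 51) ** 5
means = []
for b in betas:
    prec = 1.0 / tau**2 + b * n / sigma**2
    mu = (b * y.sum() / sigma**2) / prec
    draws = rng.normal(mu, 1.0 / np.sqrt(prec), 20_000)
    means.append(loglik(draws).mean())
means = np.array(means)

# Trapezoidal rule over the (non-uniform) temperature grid.
log_m_ti = np.sum(np.diff(betas) * (means[:-1] + means[1:]) / 2)

# Exact marginal likelihood: y ~ N(0, sigma^2 I + tau^2 11^T).
S = sigma**2 * np.eye(n) + tau**2 * np.ones((n, n))
_, logdet = np.linalg.slogdet(S)
log_m_exact = -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(S, y))
print(log_m_ti, log_m_exact)
```

The harmonic mean estimator for the same model would average 1/L(y|θ) over ordinary posterior draws; its heavy-tailed summands are what produce the overestimation the abstract describes.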
Fully Bayesian Estimation of Gibbs Hyperparameters for Emission Computed Tomography Data
 IEEE Transactions on Medical Imaging
, 1997
Abstract

Cited by 20 (3 self)
In recent years, many investigators have proposed Gibbs prior models to regularize images reconstructed from emission computed tomography data. Unfortunately, hyperparameters used to specify Gibbs priors can greatly influence the degree of regularity imposed by such priors, and as a result, numerous procedures have been proposed to estimate hyperparameter values from observed image data. Many of these procedures attempt to maximize the joint posterior distribution on the image scene. To implement these methods, approximations to the joint posterior densities are required, because the dependence of the Gibbs partition function on the hyperparameter values is unknown. In this paper, we use recent results in Markov Chain Monte Carlo sampling to estimate the relative values of Gibbs partition functions, and using these values, sample from joint posterior distributions on image scenes. This allows for a fully Bayesian procedure which does not fix the hyperparameters at some estimated or spe...
Probabilistic Reasoning and Inference for Systems Biology
, 2007
Abstract

Cited by 1 (0 self)
One of the important challenges in Systems Biology is reasoning and performing hypothesis testing under uncertain conditions, when available knowledge may be incomplete and the experimental data may contain substantial noise. In this thesis we develop methods of probabilistic reasoning and inference that operate consistently within an environment of uncertain knowledge and data. Mechanistic mathematical models are used to describe hypotheses about biological systems. We consider both deductive model-based reasoning and model inference from data. The main contributions are a novel modelling approach using continuous-time Markov chains that enables deductive derivation of model behaviours and their properties, and the application of Bayesian inferential methods to solve the inverse problem of model inference and comparison, given uncertain knowledge and noisy data. In the first part of the thesis, we consider both individual and population
Optimizing Statistical Potentials by a Combination of a Gradient Method and Gibbs Sampling
, 2005
Abstract
The inverse folding problem (i.e., determining the sequence of a protein given its conformation) has received much attention recently. However, it is usually understood from an engineering perspective, in which the aim is to design a sequence that stably folds into a given target structure. In the present work, we propose a reformulation of the problem as one of statistical inference, where the objective is to learn the sequence patterns displayed by natural sequences of known conformation.
NORMALIZING CONSTANTS
Abstract
Abstract. Computing (ratios of) normalizing constants of probability models is a fundamental computational problem for many statistical and scientific studies. Monte Carlo simulation is an effective technique, especially with complex and high-dimensional models. This paper aims to bring to the attention of general statistical audiences some effective methods originating from theoretical physics, and at the same time to explore these methods from a more statistical perspective, through establishing theoretical connections and illustrating their uses with statistical problems. We show that the acceptance ratio method and thermodynamic integration are natural generalizations of importance sampling, which is most familiar to statistical audiences. The former generalizes importance sampling through the use of a single “bridge” density and is thus a case of bridge sampling in the sense of Meng and Wong. Thermodynamic integration, which is also known in the numerical analysis literature as Ogata’s method for high-dimensional integration, corresponds to the use of infinitely many and continuously connected bridges (and thus a “path”). Our path sampling formulation offers more flexibility, and thus potential efficiency, to thermodynamic integration, and the search for optimal paths turns out to have close connections with the Jeffreys prior density and the Rao and Hellinger distances between two densities. We provide an informative theoretical example as well as two empirical examples (involving 17- to 70-dimensional integrations) to illustrate the potential and implementation of path sampling. We also discuss some open problems.
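The path-sampling idea of "infinitely many bridges" can be sketched with the standard geometric path q_t = q_1^t q_2^(1-t), for which the path identity is log(c_1/c_2) = ∫_0^1 E_t[log q_1(w) − log q_2(w)] dt. The sketch below is illustrative, not the paper's examples: it assumes two one-dimensional Gaussians so every intermediate density p_t is itself Gaussian and can be sampled exactly (standing in for MCMC), and all settings are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two unnormalized Gaussians q_i(w) = exp(-(w - m_i)^2 / (2 s_i^2)),
# with c_i = s_i * sqrt(2*pi), so the true log(c_1/c_2) = log(s_1/s_2).
m1, s1 = 0.0, 1.0
m2, s2 = 3.0, 2.0
log_q1 = lambda w: -((w - m1) ** 2) / (2 * s1**2)
log_q2 = lambda w: -((w - m2) ** 2) / (2 * s2**2)

# Geometric path q_t = q_1^t q_2^(1-t).  Differentiating log c_t in t gives
#   log(c_1/c_2) = integral_0^1 E_t[log q_1(w) - log q_2(w)] dt.
# Each intermediate p_t is Gaussian here, so we sample it exactly
# (in an intractable model each p_t would be sampled by MCMC).
ts = np.linspace(0.0, 1.0, 101)
means = []
for t in ts:
    prec = t / s1**2 + (1 - t) / s2**2
    mu = (t * m1 / s1**2 + (1 - t) * m2 / s2**2) / prec
    w = rng.normal(mu, 1.0 / np.sqrt(prec), 20_000)
    means.append(np.mean(log_q1(w) - log_q2(w)))
means = np.array(means)

# Trapezoidal rule along the path.
log_ratio_hat = np.sum(np.diff(ts) * (means[:-1] + means[1:]) / 2)
print(log_ratio_hat, np.log(s1 / s2))
```

Because the path visits densities interpolating between p_1 and p_2, the estimator stays stable even when the two endpoints barely overlap, which is where single-bridge and plain importance sampling degrade.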
CREST–INSEE, and
, 2007
Abstract
The k-nearest-neighbour procedure is a well-known deterministic method used in supervised classification. While it has been superseded by more recent methods developed in machine learning, it remains an essential tool for classifiers. This paper proposes a reassessment of this approach as a statistical technique derived from a proper probabilistic model; in particular, we modify the assessment made in a previous analysis of this method undertaken by Holmes and Adams (2002, 2003), where the underlying probabilistic model is not completely well-defined. Once clear probabilistic bases of the k-nearest-neighbour procedure are established, we proceed to the derivation of practical computational tools to conduct Bayesian inference on the parameters of the corresponding model. In particular, we assess the difficulties inherent to pseudo-likelihood and to path sampling approximations of a missing normalising constant, and propose a perfect sampling strategy to implement a correct MCMC sampler associated with our model. Illustrations of the performance of the corresponding Bayesian classifier are provided for two benchmark datasets, demonstrating in particular the limitations of the pseudo-likelihood approximation in this setup.
BMC Evolutionary Biology
, 2007
Abstract
Phylogenetic review of tonal sound production in whales in relation to sociality