Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions
 Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Cited by 38 (2 self)
Summary. The major implementational problem for reversible jump Markov chain Monte Carlo methods is that there is commonly no natural way to choose jump proposals since there is no Euclidean structure in the parameter space to guide our choice. We consider mechanisms for guiding the choice of proposal. The first group of methods is based on an analysis of acceptance probabilities for jumps. Essentially, these methods involve a Taylor series expansion of the acceptance probability around certain canonical jumps and turn out to have close connections to Langevin algorithms. The second group of methods generalizes the reversible jump algorithm by using the so-called saturated space approach. These allow the chain to retain some degree of memory so that, when proposing to move from a smaller to a larger model, information is borrowed from the last time that the reverse move was performed. The main motivation for this paper is that, in complex problems, the probability that the Markov chain moves between such spaces may be prohibitively small, as the probability mass can be very thinly spread across the space. Therefore, finding reasonable jump proposals becomes extremely important. We illustrate the procedure by using several examples of reversible jump Markov chain Monte Carlo applications including the analysis of autoregressive time series, graphical Gaussian modelling and mixture modelling.
Improved learning of Bayesian networks
 Proc. of the Conf. on Uncertainty in Artificial Intelligence
, 2001
Cited by 37 (6 self)
Two or more Bayesian network structures are Markov equivalent when the corresponding acyclic digraphs encode the same set of conditional independencies. Therefore, the search space of Bayesian network structures may be organized in equivalence classes, where each of them represents a different set of conditional independencies. The collection of sets of conditional independencies obeys a partial order, the so-called “inclusion order.” This paper discusses in depth the role that the inclusion order plays in learning the structure of Bayesian networks. In particular, this role involves the way a learning algorithm traverses the search space. We introduce a condition for traversal operators, the inclusion boundary condition, which, when it is satisfied, guarantees that the search strategy can avoid local maxima. This is proved under the assumptions that the data is sampled from a probability distribution which is faithful to an acyclic digraph, and the length of the sample is unbounded. The previous discussion leads to the design of a new traversal operator and two new learning algorithms in the context of heuristic search and the Markov Chain Monte Carlo method. We carry out a set of experiments with synthetic and real-world data that show empirically the benefit of striving for the inclusion order when learning Bayesian networks from data.
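As a small illustration of Markov equivalence, the sketch below uses the standard graphical characterisation (due to Verma and Pearl): two acyclic digraphs over the same variables are Markov equivalent exactly when they share the same skeleton and the same v-structures. The function names and the edge-list representation are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected edge set of a DAG given as (parent, child) pairs."""
    return {frozenset(e) for e in edges}

def v_structures(edges):
    """Triples (a, c, b) with a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges1, edges2):
    """Assumes both DAGs are defined over the same set of variables."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

# Hypothetical three-variable examples.
chain = [("A", "B"), ("B", "C")]           # A -> B -> C
reversed_chain = [("C", "B"), ("B", "A")]  # A <- B <- C
collider = [("A", "B"), ("C", "B")]        # A -> B <- C

print(markov_equivalent(chain, reversed_chain))
print(markov_equivalent(chain, collider))
```

The two chains encode the same single independence (A and C given B), while the collider encodes a different one (A and C marginally), so only the first pair falls in the same equivalence class.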
Tractable Bayesian Learning of Tree Belief Networks
, 2000
Cited by 37 (1 self)
In this paper we present decomposable priors, a family of priors over structure and parameters of tree belief nets for which Bayesian learning with complete observations is tractable, in the sense that the posterior is also decomposable and can be completely determined analytically in polynomial time. This follows from two main results: First, we show that factored distributions over spanning trees in a graph can be integrated in closed form. Second, we examine priors over tree parameters and show that a set of assumptions similar to (Heckerman et al., 1995) constrain the tree parameter priors to be a compactly parametrized product of Dirichlet distributions. Besides allowing for exact Bayesian learning, these results permit us to formulate a new class of tractable latent variable models in which the likelihood of a data point is computed through an ensemble average over tree structures.
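One classical result that gives a closed form for sums over spanning trees is the weighted matrix-tree theorem: the sum, over all spanning trees, of the product of edge weights equals the determinant of the weighted graph Laplacian with one row and the matching column removed. The sketch below checks this against brute-force enumeration. That this matches the paper's derivation is an assumption; the function names and the weight dictionary are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def determinant(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def spanning_tree_weight_sum(n, weights):
    """Matrix-tree theorem. weights maps (i, j), i < j, to an edge weight; nodes are 0..n-1."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), w in weights.items():
        w = Fraction(w)
        lap[i][i] += w
        lap[j][j] += w
        lap[i][j] -= w
        lap[j][i] -= w
    reduced = [row[1:] for row in lap[1:]]  # drop row 0 and column 0
    return determinant(reduced)

def brute_force_sum(n, weights):
    """Same quantity by enumerating every acyclic set of n - 1 edges."""
    total = Fraction(0)
    for subset in combinations(weights, n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for i, j in subset:
            ri, rj = find(i), find(j)
            if ri == rj:
                acyclic = False
                break
            parent[ri] = rj
        if acyclic:
            product = Fraction(1)
            for edge in subset:
                product *= Fraction(weights[edge])
            total += product
    return total

triangle = {(0, 1): 2, (0, 2): 3, (1, 2): 5}
print(spanning_tree_weight_sum(3, triangle), brute_force_sum(3, triangle))
```

For the triangle, the three spanning trees contribute 2·3 + 2·5 + 3·5 = 31, which is also the determinant of the reduced Laplacian [[7, -5], [-5, 8]].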
Efficient stepwise selection in decomposable models
 In Proc. UAI
, 2001
Cited by 30 (2 self)
In this paper, we present an efficient algorithm for performing stepwise selection in the class of decomposable models. We focus on the forward selection procedure, but we also discuss how backward selection and the combination of the two can be performed efficiently. The main contributions of this paper are (1) a simple characterization for the edges that can be added to a decomposable model while retaining its decomposability and (2) an efficient algorithm for enumerating all such edges for a given decomposable model in O(n^2) time, where n is the number of variables in the model. We also analyze the complexity of the overall stepwise selection procedure (which includes the complexity of enumerating eligible edges as well as the complexity of deciding how to “progress”). We use the KL divergence of the model from the saturated model as our metric, but the results we present here extend to many other metrics as well.
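To make the notion of an "eligible" edge concrete, here is a brute-force sketch, not the paper's O(n^2) algorithm. It relies on the fact that a model is decomposable exactly when its graph is chordal, and tests chordality by repeatedly eliminating a simplicial vertex (one whose remaining neighbours are pairwise adjacent). The function names and the example graph are assumptions made for this illustration.

```python
from itertools import combinations

def is_chordal(nodes, edges):
    """True if the undirected graph admits a perfect elimination ordering."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(nodes)
    while remaining:
        # Look for a vertex whose remaining neighbours form a clique.
        simplicial = next(
            (v for v in sorted(remaining)
             if all(y in adj[x]
                    for x, y in combinations(adj[v] & remaining, 2))),
            None,
        )
        if simplicial is None:
            return False  # a non-empty chordal graph always has a simplicial vertex
        remaining.remove(simplicial)
    return True

def addable_edges(nodes, edges):
    """Missing edges whose addition keeps the graph chordal (brute force)."""
    present = {frozenset(e) for e in edges}
    eligible = []
    for a, b in combinations(sorted(nodes), 2):
        if frozenset((a, b)) in present:
            continue
        if is_chordal(nodes, list(edges) + [(a, b)]):
            eligible.append((a, b))
    return eligible

# Hypothetical example: the path 1 - 2 - 3 - 4.
path_nodes = [1, 2, 3, 4]
path_edges = [(1, 2), (2, 3), (3, 4)]
print(addable_edges(path_nodes, path_edges))
```

On the path, adding (1, 3) or (2, 4) creates a triangle and keeps the graph chordal, whereas adding (1, 4) closes a chordless four-cycle and is therefore not eligible. This check costs far more than the bound quoted in the abstract and is meant only to show what is being enumerated.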
Modeling changing dependency structure in multivariate time series
 In International Conference on Machine Learning
, 2007
Cited by 30 (0 self)
We show how to apply the efficient Bayesian changepoint detection techniques of Fearnhead in the multivariate setting. We model the joint density of vector-valued observations using undirected Gaussian graphical models, whose structure we estimate. We show how we can exactly compute the MAP segmentation, as well as how to draw perfect samples from the posterior over segmentations, simultaneously accounting for uncertainty about the number and location of changepoints, as well as uncertainty about the covariance structure. We illustrate the technique by applying it to financial data and to bee tracking data.
Feature-inclusion stochastic search for Gaussian graphical models
 J. Comp. Graph. Statist
, 2008
Cited by 20 (3 self)
We describe a serial algorithm called feature-inclusion stochastic search, or FINCS, that uses online estimates of edge-inclusion probabilities to guide Bayesian model determination in Gaussian graphical models. FINCS is compared to MCMC, to Metropolis-based search methods, and to the popular lasso; it is found to be superior along a variety of dimensions, leading to better sets of discovered models, greater speed and stability, and reasonable estimates of edge-inclusion probabilities. We illustrate FINCS on an example involving mutual-fund data, where we compare the model-averaged predictive performance of models discovered with FINCS to those discovered by competing methods. Some key words: Covariance selection; Metropolis algorithm; lasso; Bayesian model selection; hyper-inverse Wishart distribution
Convergence Assessment for Reversible Jump MCMC Simulations
, 1998
Cited by 12 (0 self)
In this paper we introduce the problem of assessing convergence of reversible jump MCMC algorithms on the basis of simulation output. We discuss the various direct approaches which could be employed, together with their associated drawbacks. Using the example of fitting a graphical Gaussian model via RJMCMC, we show how the simulation output for models that can be parameterised so that parameters of primary interest retain a coherent interpretation throughout the simulation can be used to assess convergence. In the context of this example, we extend the work of Gelman and Rubin (1992) and Brooks and Gelman (1998) to provide convergence assessment procedures for graphical model determination problems, which may also be applied to any form of model choice problem and, indeed, to MCMC simulations more generally.
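For orientation, the sketch below computes the basic Gelman and Rubin potential scale reduction factor for a single scalar quantity tracked across several parallel chains. It is the plain textbook form without the degrees-of-freedom correction and is not the extended procedure this paper proposes; the function name and the toy chains are assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance

def potential_scale_reduction(chains):
    """Basic Gelman-Rubin statistic for m chains of equal length n."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    # W: average within-chain sample variance (must be non-zero).
    within = mean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the target variance.
    pooled = (n - 1) / n * within + between / n
    return sqrt(pooled / within)

# Hypothetical output: two chains that agree, and two that do not.
agreeing = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagreeing = [[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]]
print(potential_scale_reduction(agreeing))
print(potential_scale_reduction(disagreeing))
```

Values close to 1 suggest that the chains are sampling from a common distribution, while values well above 1 indicate that between-chain variation still dominates. In a reversible jump setting the tracked quantity would have to keep its meaning across models, which is the situation the abstract describes.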
Objective Bayesian model selection in Gaussian graphical models
, 2007
Cited by 12 (3 self)
This paper presents a default model-selection procedure for Gaussian graphical models that involves two new developments. First, we develop a default version of the hyper-inverse Wishart prior for restricted covariance matrices, called the hyper-inverse Wishart g-prior, and show how it corresponds to the implied fractional prior for covariance selection using fractional Bayes factors. Second, we apply a class of priors that automatically handles the problem of multiple hypothesis testing implied by covariance selection. We demonstrate our methods on a variety of simulated examples, concluding with a real example analysing covariation in mutual-fund returns. These studies reveal that the combined use of a multiplicity-correction prior on graphs and fractional Bayes factors for computing marginal likelihoods yields better performance than existing Bayesian methods. Some key words: covariance selection; hyper-inverse Wishart distribution; fractional Bayes factors; Bayesian model selection; multiple hypothesis testing.