## Transdimensional Markov Chains: A Decade of Progress and Future Perspectives (2005)

Citations: 18 (2 self)

### BibTeX

@MISC{SISSON05transdimensionalmarkov,
  author = {Scott A. Sisson},
  title = {Transdimensional Markov Chains: A Decade of Progress and Future Perspectives},
  year = {2005}
}

### Abstract

The last 10 years have witnessed the development of sampling frameworks that permit the construction of Markov chains that simultaneously traverse both parameter and model space. Substantial methodological progress has been made during this period. In this article we present a survey of the current state of the art and evaluate some of the most recent advances in this field. We also discuss future research perspectives in the context of the drive to develop sampling mechanisms with high degrees of both efficiency and automation.

### Citations

983 | Bayes factors - Kass, Raftery - 1995 |

825 | Reversible jump Markov chain Monte Carlo computation and Bayesian model determination - Green - 1995
Citation Context ...lication in signal processing and other Bayesian analyses (Miller et al. 1995; Phillips and Smith 1996 for example), but has in general been superseded by the more accessible reversible jump sampler (Green 1995). In fact, correcting for the time-discretisation approximation via the Metropolis-Hastings acceptance probability template, the jump-diffusion sampler can be shown to result in an implementation of t... |

673 | Inference from iterative simulation using multiple sequences - Gelman, Rubin - 1992
Citation Context ...itoring may provide. These parameters, generally a small subset of the full parameter set, are then monitored using popular fixed-dimensional performance measures (Smith 2001; Brooks and Gelman 1998; Gelman and Rubin 1992; Geweke 1992, are typical), although in many cases this analysis is limited to a single diagnostic. Of course, there is an obvious danger in monitoring only a single diagnostic to evaluate sampler pe... |

440 | On Bayesian analysis of mixtures with an unknown number of components - Richardson, Green - 1997
Citation Context ...y (the majority of fixed-dimensional analyses may be performed using the WinBuGS suite). The implementations most well supported are Gaussian mixture algorithms of varying forms (Cappé et al. 2003; Richardson and Green 1997; Sisson and Fan 2004a) and a number of methods which finesse the trans-dimensional nature of variable selection analyses by integrating out the within-model parameters, θk, prior to the analysis. The... |

406 | Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures and Algorithms - Propp, Wilson - 1996 |

331 | Variable Selection via Gibbs Sampling - George, McCulloch - 1993
Citation Context ... candidate models is large, model selection methods are generally concerned with the maximisation (or minimisation) of model ranking functionals according to a non-deterministic optimisation process (George and McCulloch 1993; Chipman et al. 2001, for example). As a means to automate model selection Brooks, Friel, and King (2003) (see also Andrieu et al. 2000) proposed an extension to the standard simulated annealing fram... |

324 | Marginal Likelihood from the Gibbs Output - Chib - 1995
Citation Context ... Section 2.3. When full model conditionals are not available, trans-dimensionality may still be avoided by adopting any of the product space formulations (Brooks et al. 2003; Godsill 2001; Carlin and Chib 1995). 2.3 Implementation. While the majority of Bayesian analyses are likely to be novel in some aspect, thereby raising the likelihood that custom-written code is required for their implementation, at the... |

273 | Evaluating the accuracy of sampling-based approaches to calculating posterior moments - Geweke - 1992
Citation Context ...ese parameters, generally a small subset of the full parameter set, are then monitored using popular fixed-dimensional performance measures (Smith 2001; Brooks and Gelman 1998; Gelman and Rubin 1992; Geweke 1992, are typical), although in many cases this analysis is limited to a single diagnostic. Of course, there is an obvious danger in monitoring only a single diagnostic to evaluate sampler performance, a... |

231 | Markov chain Monte Carlo convergence diagnostics: A comparative review - Cowles, Carlin - 1996
Citation Context ...in general difficult or impossible to determine; a posteriori convergence diagnostics assess necessary rather than sufficient indicators of chain convergence (see, for example, Mengersen et al. 1999; Cowles and Carlin 1996 for comparative reviews). The trans-dimensional setting generates additional concerns: in particular, how might one assess convergence not only within each of a potentially large number of models, b... |

208 | General methods for monitoring convergence of iterative simulations - Brooks, Gelman - 1998
Citation Context ...ness” which marginal monitoring may provide. These parameters, generally a small subset of the full parameter set, are then monitored using popular fixed-dimensional performance measures (Smith 2001; Brooks and Gelman 1998; Gelman and Rubin 1992; Geweke 1992, are typical), although in many cases this analysis is limited to a single diagnostic. Of course, there is an obvious danger in monitoring only a single diagnostic... |

201 | Representations of knowledge in complex systems (with discussion) - Grenander, Miller - 1994 |

159 | Annealing Markov Chain Monte Carlo With Applications to Ancestral Inference - Geyer, Thompson - 1995 |

153 | Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability - Roberts, Gelman, et al. - 1997
Citation Context ..., and quickly prohibitive. There is therefore a strong argument for continued research into the development of assisted or automated proposal generation, for both standard Metropolis-Hastings methods (Roberts et al. 1997, for example), and for trans-dimensional sampling schemes in particular. Recently, Brooks, Giudici, and Roberts (2003) (see also Ehlers and Brooks 2003) introduced a number of methods to achieve the ... |

149 | Bayesian Model Choice via Markov Chain Monte Carlo - Carlin, Chib - 1995
Citation Context ...suite (see Section 2.3). When full model conditionals are not available, trans-dimensionality may still be avoided by adopting any of the product space formulations (Brooks et al. 2003; Godsill 2001; Carlin and Chib 1995). 2.3 Implementation. While the majority of Bayesian analyses are likely to be novel in some aspect, thereby raising the likelihood that custom-written code is required for their implementation, at the... |

137 | The intrinsic Bayes factor for model selection and prediction - Berger, Pericchi - 1996
Citation Context ...en ignoring the underlying implication of marginal assessment, the issue of parameter selection is magnified when considering that even common parameters may change meaning from one model to another (Berger and Pericchi 1996, for example). This leads naturally to statistics, h, based on fitted and predicted values of observations as the obvious choice in many cases, reducing the problem to the fixed dimension setting (Gr... |

136 | Nonparametric regression using Bayesian variable selection - Smith, Kohn - 1996 |

136 | Bayesian Model Averaging: a tutorial - Hoeting, Madigan, et al. - 1999 |

119 | Efficient Metropolis jumping rules - Gelman, Roberts, et al. - 1996
Citation Context ...iliar analogy can be found in the fixed-dimensional Metropolis-Hastings setting, whereby it is trivial to ensure a near 100% acceptance rate, but at the expense of poor exploration of the state space (Gelman et al. 1996, for example). From this perspective, a combination of both structurally local and more global between-model move-types which do not rely on structural knowledge of the models in order to specify the... |

101 | Marginal Likelihood from Metropolis-Hastings Output - Chib, Jeliazkov - 2001
Citation Context ...he auxiliary random process adopted for transitions between models increases the variability of the estimator. In contrast, individually estimating the marginal model probabilities $m_k(x)$ and $m_{k'}(x)$ (Chib and Jeliazkov 2001; Chib 1995) or their ratio (Mira and Nicholls 2004; Meng and Schilling 2002; Chen and Shao 1997) via independent fixed-dimension simulations is more precise, although impracticalities emerge when the... |

98 | Bayesian Methods for Nonlinear Classification and Regression - Denison, Holmes, et al. - 2002 |

98 | Fractional Bayes Factors for Model Comparisons - O’Hagan - 1995 |

94 | An adaptive Metropolis algorithm - Haario, Saksman, et al. |

93 | Bayesian model averaging: A tutorial (with discussion). Statistical Science 14, 382–401. [A corrected version is available online at www.stat.washington.edu/www/research/online/hoeting1999.pdf] - Hoeting, Madigan, et al. - 1999 |

90 | Modelling Spatial Patterns (with discussion) - Ripley - 1977 |

85 | Variable length Markov chains - Bühlmann, Wyner - 1999
Citation Context ... distribution may be slightly biased, but which is mathematically more flexible than chains based on the full sample-path history. This could conceivably be extended to variable-length Markov chains (Bühlmann and Wyner 1999). While currently centered on the fixed dimensional problem it is easily envisaged that adaptive methods will eventually graduate to the trans-dimensional setting, permitting the construction of betw... |

84 | The practical implementation of Bayesian model selection - Chipman, George, et al. - 2001
Citation Context ... model selection methods are generally concerned with the maximisation (or minimisation) of model ranking functionals according to a non-deterministic optimisation process (George and McCulloch 1993; Chipman et al. 2001, for example). As a means to automate model selection Brooks, Friel, and King (2003) (see also Andrieu et al. 2000) proposed an extension to the standard simulated annealing framework by constructing... |

83 | Simulation procedures and likelihood inference for spatial point processes - Geyer, Møller - 1994
Citation Context ...ted by Preston (1977) and Ripley (1977). Stephens (2000) proposed observing particular trans-dimensional statistical problems in the guise of continuous time abstract marked point processes (see also Geyer and Møller 1994). Finite mixture modelling is one such setting with obvious interpretations for the birth and death of mixture components. Recent work by Cappé et al. (2003) has shown that the sampler of Stephens (2... |

75 | Automatic Bayesian Curve Fitting - Denison, Mallick, et al. - 1998 |

73 | Bayesian Model Comparison via Jump Diffusions - Phillips, Smith - 1996 |

72 | On Adaptive Markov Chain Monte Carlo Algorithms - Atchade, Rosenthal - 2003
Citation Context ...posterior given the manner of adaptation. Care must be taken not to adapt too quickly or inconsistently, or the wrong target distribution may be attained; a result that is all too easily achieved (see Atchade and Rosenthal 2003 for example). Subject to assumptions of uniformly ergodic transition kernels and bounded state spaces, the adaptive algorithm of Haario et al. (2001) which depends on the full history of the chain ca... |

72 | Spatial Birth-and-Death Processes - Preston - 1977 |

71 | Adaptive Markov Chain Monte Carlo Through Regeneration - Gilks, Roberts, et al. - 1998
Citation Context ...he adaptation may be implemented at these times. The dependence on the full chain history is consequently mitigated and the Markovian structure is preserved (Sahu and Zhigljavsky 2003; Gåsemyr 2003; Gilks et al. 1998). Frigessi (2003) suggests there may be scope for development in adopting dth-order Markov chains whose stationary distribution may be slightly biased, but which is mathematically more flexible th... |

65 | BUGS: Bayesian Inference Using Gibbs Sampling, Version 0.4 - Spiegelhalter, Thomas, et al. - 1994
Citation Context ...eworks, would permit routine implementation of trans-dimensional samplers by non-expert practitioners, perhaps via stand-alone software packages such as the popular WinBuGS suite (Gilks et al. 1992; Spiegelhalter et al. 1996b). Recent developments have made great strides in this direction, providing advances in the areas of between model transitions, both in terms of efficiency and constructing generic mappings, the exte... |

60 | Bayesian analysis of mixture models with an unknown number of components: An alternative to reversible jump methods - Stephens
Citation Context ...ns may be useful (Ntzoufras 2002, for example). Similarly, the birth-and-death approach has found some application in model settings that may be more naturally expressed in the point-process setting (Stephens 2000; Cappé et al. 2003), although these tend to be problem specific (see Table 2 for references to specific illustrations). Jump-diffusion methods, however, are more easily conceived in the discrete time... |

57 | Bayesian Curve-Fitting With Free-Knot Splines - DiMatteo, Genovese, et al. - 2001 |

54 | On the Ergodicity Properties of Some Adaptive MCMC Algorithms, technical report - Andrieu, Moulines - 2002
Citation Context ...adaptive algorithm of Haario et al. (2001) which depends on the full history of the chain can be directly shown to yield unbiased Monte Carlo estimates of the expectations of bounded functionals (see Andrieu and Moulines 2002, Atchade and Rosenthal 2003 for results in more general settings). In comparison, if so-called regeneration points exist, such as an independent sample drawn from the ‘hot’ distribution in a simulat... |

52 | Bounds on the $L^2$ spectrum for Markov chains and Markov processes: a generalization of Cheeger’s inequality - Lawler, Sokal - 1988
Citation Context ...ing on a constant proposal parameter vector for all state transitions. It can be shown that for a simple two model case, the above conditions are optimal in terms of the capacitance of the algorithm (Lawler and Sokal 1988). Brooks, Giudici, and Roberts (2003) also propose a second class of models based on augmenting the state space with an auxiliary set of state-dependent variables, vk, so that the state space of π(θk... |

50 | On Bayesian model and variable selection using MCMC - Dellaportas, Forster, et al. - 2002
Citation Context ...ropriate, a number of impracticalities are apparent for even a moderately sized model space. Although some of the computation in sampling the full parameter vector, $\theta^*$, may be avoided (Godsill 2003; Dellaportas et al. 2002; Green and O’Hagan 1998) this approach requires the definition of $p_k(\theta^*_{I_{-k}} \mid \theta^*_{I_k})$, termed pseudo-priors. While their specification is essentially arbitrary in terms of obtaining the desired margin... |

49 | Nonparametric Bayesian data analysis - Muller, Quintana - 2004 |

48 | Optimal predictive model selection - Barbieri, Berger |

48 | Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Applied Statistics - Royston, Altman - 1994
Citation Context ...nvolving an uncountable number of models is in Bayesian non-parametrics where both the number of basis functions, and the functions themselves are free to vary, via fractional polynomial regression (Royston and Altman 1994) or free-knot splines for example. See Denison et al. (2002), DiMatteo et al. (2001), Denison et al. (1998) and Smith and Kohn (1996), for useful instances of Bayesian non-parametric curve fitting. P... |

43 | Bayesian Model Averaging and Model Search Strategies, in Bayesian Statistics 6 - Clyde - 1999
Citation Context ...${}_{k'}\,q(k' \to k)\,m_{k'}(x) \,/\, [\rho_k\,q(k \to k')\,m_k(x)]$, which is independent of both current and proposed parameter states. The algorithm thereby becomes a fixed dimensional sampler over the space of models (see Clyde 1999b, for example). It is (or it shortly will be) possible to implement both of these simplified simulation frameworks in the popular WinBuGS suite (see Section 2.3). When full model conditionals are not... |

41 | Exponential convergence of Langevin diffusions and their discrete approximations - Roberts, Tweedie - 1996
Citation Context ...ment of Brownian motion, and ∇ the vector of partial derivatives. In practice (3) is approximated by a discrete-time version with a Metropolis-Hastings step to preserve the stationary distribution π (Roberts and Tweedie 1996). This method has found some application in signal processing and other Bayesian analyses (Miller et al. 1995; Phillips and Smith 1996 for example), but has in general been superseded by the more acc... |

38 | Efficient construction of reversible jump Markov chain Monte Carlo proposal distributions - Brooks, Giudici, et al. - 2003
Citation Context ...frameworks in the popular WinBuGS suite (see Section 2.3). When full model conditionals are not available, trans-dimensionality may still be avoided by adopting any of the product space formulations (Brooks et al. 2003; Godsill 2001; Carlin and Chib 1995). 2.3 Implementation. While the majority of Bayesian analyses are likely to be novel in some aspect, thereby raising the likelihood that custom-written code is requi... |

36 | Analysis of a Non-Reversible Markov Chain Sampler. The Annals of Applied Probability - Diaconis, Holmes, et al. - 2000 |

32 | Efficient Metropolis Jumping Rules, in Bayesian Statistics 5 - Gelman, Roberts, et al. - 1996 |

31 | Estimating Mixtures of Regressions - Hurn, Justel, et al. - 2003 |

31 | Bayesian Output Analysis Program (BOA), User’s Manual, Version 1.1, technical report - Smith - 2004
Citation Context ...e “effectiveness” which marginal monitoring may provide. These parameters, generally a small subset of the full parameter set, are then monitored using popular fixed-dimensional performance measures (Smith 2001; Brooks and Gelman 1998; Gelman and Rubin 1992; Geweke 1992, are typical), although in many cases this analysis is limited to a single diagnostic. Of course, there is an obvious danger in monitoring ... |

31 | Introduction to general state-space Markov chain theory - Tierney - 1996 |

30 | On the Relationship Between Markov Chain Monte Carlo Methods for Model Uncertainty - Godsill - 2001
Citation Context ...ing the desired marginal distributions, sampler performance is sensitive to their specification, introducing practical problems in terms of efficiency and tractability (see Godsill 2003; Green 2003a; Godsill 2001 for a discussion). However, it is believed that, in contrast to the lack of memory of previously visited states inherent in the reversible jump sampler, in the product space formulations (which conta... |