Transformed and parameter-expanded Gibbs samplers for multilevel linear and generalized linear models
2004
Cited by 8 (4 self)
Hierarchical linear and generalized linear models can be fit using Gibbs samplers and Metropolis algorithms; these models, however, often have many parameters, and convergence of the seemingly most natural Gibbs and Metropolis algorithms can sometimes be slow. We examine solutions that involve reparameterization and over-parameterization. We begin with parameter expansion using working parameters, a strategy developed for the EM algorithm by Meng and van Dyk (1997) and Liu, Rubin, and Wu (1998). This strategy can lead to algorithms that are much less susceptible to becoming stuck near zero values of the variance parameters than are more standard algorithms. Second, we consider a simple rotation of the regression coefficients based on an estimate of their posterior covariance matrix. This leads to a Gibbs algorithm based on updating the transformed parameters one at a time or a Metropolis algorithm with vector jumps; either of these algorithms can perform much better (in terms of total CPU time) than the two standard algorithms: one-at-a-time updating of untransformed parameters or vector updating using a linear regression at each step. We present an innovative evaluation of the algorithms in terms of how quickly they can get away from remote areas of parameter space, along with some more standard evaluation of computation and convergence speeds. We illustrate our methods with examples from our applied work. Our ultimate goal is to develop a fast and reliable method for fitting a hierarchical linear model as easily as one can now fit a non-hierarchical model, and to increase understanding of Gibbs samplers for hierarchical models in general. Keywords: Bayesian computation, blessing of dimensionality, Markov chain Monte Carlo, multilevel modeling, mixed effects models, PX-EM algorithm, random effects regression, redundant
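The rotation idea in this abstract can be illustrated with a minimal sketch. It assumes a toy target that is not taken from the paper: a bivariate normal "posterior" for two coefficients with correlation `r`, and a slightly wrong covariance estimate with correlation `s`. The function name `rotated_gibbs` and all parameter values are hypothetical. The coefficients are transformed by the inverse Cholesky factor of the estimated covariance, updated one at a time in the transformed space, and mapped back.

```python
import math
import random


def rotated_gibbs(r, s, n_draws, seed=0):
    """One-at-a-time Gibbs updates on rotated coordinates xi = A^{-1} beta.

    Toy assumption: beta ~ N(0, [[1, r], [r, 1]]) stands in for the posterior,
    and [[1, s], [s, 1]] is a rough estimate of its covariance with
    lower Cholesky factor A = [[1, 0], [s, c]], c = sqrt(1 - s^2).
    """
    rng = random.Random(seed)
    c = math.sqrt(1.0 - s * s)

    # Covariance of xi = A^{-1} beta under the true target.
    c11 = 1.0
    c12 = (r - s) / c
    c22 = (1.0 - 2.0 * s * r + s * s) / (c * c)

    # Conditional slopes and variances of a bivariate normal with covariance C.
    slope1, var1 = c12 / c22, c11 - c12 * c12 / c22
    slope2, var2 = c12 / c11, c22 - c12 * c12 / c11

    xi1, xi2 = 0.0, 0.0
    draws = []
    for _ in range(n_draws):
        xi1 = rng.gauss(slope1 * xi2, math.sqrt(var1))  # xi1 | xi2
        xi2 = rng.gauss(slope2 * xi1, math.sqrt(var2))  # xi2 | xi1
        # Map back to the original coefficients: beta = A xi.
        draws.append((xi1, s * xi1 + c * xi2))
    return draws
```

With `r = 0.99` and `s = 0.98`, the transformed coordinates are only weakly correlated, so successive draws are nearly independent, whereas one-at-a-time updates on the untransformed coefficients would move very slowly along the narrow ridge.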
Partially Collapsed Gibbs Samplers: Illustrations and Applications, technical report
2008
Cited by 6 (4 self)
Among the computationally intensive methods for fitting complex multilevel models, the Gibbs sampler is especially popular owing to its simplicity and power to effectively generate samples from a high-dimensional probability distribution. The Gibbs sampler, however, is often justifiably criticized for its sometimes slow convergence, especially when it is used to fit highly structured complex models. The recently proposed Partially Collapsed Gibbs (PCG) sampler offers a new strategy for improving the convergence characteristics of a Gibbs sampler. A PCG sampler achieves faster convergence by reducing the conditioning in some or all of the component draws of its parent Gibbs sampler. Although this strategy can significantly improve convergence, it must be implemented with care to be sure that the desired stationary distribution is preserved. In some cases the set of conditional distributions sampled in a PCG sampler may be functionally incompatible and permuting the order of draws can change the stationary distribution of the chain. In this article, we draw an analogy between the PCG sampler and certain efficient EM-type algorithms that helps to explain the computational advantage of PCG samplers and to suggest when they might be used in practice.
To Center or Not To Center: That Is Not The Question
- in progress) Paul Baines 101909 Bayesian Computation in Color-Magnitude Diagrams
2009
Cited by 3 (0 self)
For a broad class of multi-level models, there exist two well-known competing parameterizations, the centered parameterization (CP) and the non-centered parameterization (NCP), for effective MCMC implementation. Much literature has been devoted to the questions of when to use which and how to compromise between them via partial CP/NCP. This paper introduces an alternative strategy for boosting MCMC efficiency via simply interweaving, but not alternating, the two parameterizations. This strategy has the surprising property that failure of both the CP and NCP chains to converge geometrically does not prevent the interweaving algorithm from doing so. It achieves this seemingly magical property by taking advantage of the discordance of the two parameterizations, namely, the sufficiency of CP and the ancillarity of NCP, to substantially reduce the Markovian dependence, especially when the original CP and NCP form a “beauty and beast” pair (i.e., when one chain mixes far more rapidly than the other). The ancillarity-sufficiency reformulation of the CP-NCP dichotomy allows us to borrow insight from the well-known Basu’s theorem on the independence of (complete) sufficient and ancillary statistics, albeit a Bayesian version of Basu’s
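A minimal sketch of one interweaving iteration, under a toy model that is an assumption here rather than something stated in the abstract: y | theta ~ N(theta, 1), theta | mu ~ N(mu, V), with a flat prior on mu. In the centered form the latent variable is theta; in the non-centered form it is eta = theta - mu. The function name `interweaved_sampler` is hypothetical.

```python
import math
import random


def interweaved_sampler(y, V, n_draws, seed=0):
    """Interweave centered (theta) and non-centered (eta) updates of mu.

    Toy model (an assumption for illustration):
        y | theta ~ N(theta, 1), theta | mu ~ N(mu, V), p(mu) flat.
    The marginal posterior of mu is N(y, 1 + V).
    """
    rng = random.Random(seed)
    mu = 0.0
    draws = []
    for _ in range(n_draws):
        # Draw the centered latent variable: theta | mu, y.
        theta = rng.gauss((V * y + mu) / (V + 1.0), math.sqrt(V / (V + 1.0)))
        # Centered update: mu | theta ~ N(theta, V).
        mu = rng.gauss(theta, math.sqrt(V))
        # Switch to the non-centered latent variable without redrawing it.
        eta = theta - mu
        # Non-centered update: mu | eta, y ~ N(y - eta, 1).
        mu = rng.gauss(y - eta, 1.0)
        draws.append(mu)
    return draws
```

The point of the sketch is the middle step: eta is computed from the current theta and mu rather than drawn afresh, which is what distinguishes interweaving from simply alternating two separate samplers.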
Sampling for Bayesian Computation With Large Datasets
2003
Cited by 2 (0 self)
Multilevel models are extremely useful in handling large hierarchical datasets. However, computation can be a challenge, both in storage and in CPU time per iteration of the Gibbs sampler or other Markov chain Monte Carlo algorithms. We propose a computational strategy based on sampling the data, computing separate posterior distributions based on each sample, and then combining these to get a consensus posterior inference. With hierarchical data structures, we perform cluster sampling into subsets with the same structures as the original data. This reduces the number of parameters as well as sample size for each separate model fit. We illustrate with examples from climate modeling and newspaper marketing.
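The abstract does not say how the separate posteriors are combined. One simple possibility, shown here purely as an assumption, is precision weighting of normal posteriors. The sketch uses a normal mean with known variance `sigma2` and a flat prior, fitted separately on disjoint subsets; the helper names `subset_posterior` and `combine` are hypothetical.

```python
def subset_posterior(ys, sigma2):
    """Posterior (mean, variance) of a normal mean under a flat prior,
    given observations ys with known variance sigma2."""
    n = len(ys)
    return sum(ys) / n, sigma2 / n


def combine(posteriors):
    """Precision-weighted combination of normal (mean, variance) pairs.

    This combination rule is an assumption for illustration, not
    necessarily the rule used in the paper.
    """
    precisions = [1.0 / var for _, var in posteriors]
    total = sum(precisions)
    mean = sum(m * p for (m, _), p in zip(posteriors, precisions)) / total
    return mean, 1.0 / total


# Usage: fit each subset separately, then combine.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
subsets = [data[:2], data[2:]]
consensus = combine([subset_posterior(s, 4.0) for s in subsets])
```

In this flat-prior toy case the combined result matches the posterior from the full data exactly. With an informative prior, each subset fit would reuse the prior, so a naive product of subset posteriors would count it more than once.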
Using redundant parameterizations to fit hierarchical models
2007
Cited by 1 (0 self)
Hierarchical linear and generalized linear models can be fit using Gibbs samplers and Metropolis algorithms; these models, however, often have many parameters, and convergence of the seemingly most natural Gibbs and Metropolis algorithms can sometimes be slow. We examine solutions that involve reparameterization and over-parameterization. We begin with parameter expansion using working parameters, a strategy developed for the EM algorithm by Meng and van Dyk (1997) and Liu, Rubin, and Wu (1998). This strategy can lead to algorithms that are much less susceptible to becoming stuck near zero values of the variance parameters than are more standard algorithms. Second, we consider a simple rotation of the regression coefficients based on an estimate of their posterior covariance matrix. This leads to a Gibbs algorithm based on updating the transformed parameters one at a time or a Metropolis algorithm with vector jumps; either of these algorithms can perform much better (in terms of total CPU time) than the two standard algorithms: one-at-a-time updating of untransformed parameters or vector updating using a linear regression at each step. We present an innovative evaluation of the algorithms in terms of how quickly they can get away from remote areas of parameter space, along with some more standard evaluation of computation and convergence speeds. We illustrate our methods with examples from our applied work. Our ultimate goal is to develop a fast and reliable method for fitting a hierarchical linear model as easily as one can now fit a non-hierarchical model, and to increase understanding of Gibbs samplers for hierarchical models in general. Keywords: Bayesian computation, blessing of dimensionality, Markov chain Monte Carlo, multilevel modeling, mixed effects models, PX-EM algorithm, random effects regression, redundant
Marginal Markov Chain Monte Carlo Methods
2008
Cited by 1 (1 self)
Marginal Data Augmentation and Parameter-Expanded Data Augmentation are related methods for improving the convergence properties of the two-step Gibbs sampler known as the Data Augmentation sampler. These methods expand the parameter space with a so-called working parameter that is unidentifiable given the observed data but is identifiable given the so-called augmented data. Although these methods can result in enormous computational gains, their use has been somewhat limited due to the constrained framework they are constructed under and the necessary identification of a working parameter. This article proposes a new prescriptive framework that greatly expands the class of problems that can benefit from the key idea underlying these methods. In particular, we show how working parameters can automatically be introduced into any Gibbs sampler. Since these samplers are typically used in a Bayesian framework, the working parameter requires a prior distribution and the convergence properties of the Markov chain depend on the choice of this prior distribution. Under certain conditions the optimal choice is improper and results in a joint Markov chain on the expanded parameter space that is not positive recurrent. This leads to unexplored technical difficulties when one attempts to exploit the computational advantage in multi-step MCMC samplers, the very chains that might benefit most from this technology. In this article we develop strategies and theory that allow optimal marginal methods to be used in multi-step samplers. We illustrate the potential to dramatically improve the convergence properties of MCMC samplers by applying the marginal Gibbs sampler to a logistic mixed model.
1 Expanding State Spaces in MCMC
Constructing a Markov chain on an expanded state space in the context of Monte Carlo sampling can greatly simplify the required component draws or lead to chains with better mixing properties.
Parameter Expansion and Efficient Inference
2010
This EM review article focuses on parameter expansion, a simple technique introduced in the PX-EM algorithm to make EM converge faster while maintaining its simplicity and stability. The primary objective concerns the connection between parameter expansion and efficient inference. It reviews the statistical interpretation of the PX-EM algorithm, in terms of efficient inference via bias reduction, and further unfolds the PX-EM mystery by looking at PX-EM from different perspectives. In addition, it briefly discusses potential applications of parameter expansion to statistical inference and the broader impact of statistical thinking on understanding and developing other iterative optimization algorithms.

