Bayesian Computation and the Linear Model
, 2009
Cited by 6 (0 self)
This paper is a review of computational strategies for Bayesian shrinkage and variable selection in the linear model. Our focus is less on traditional MCMC methods, which are covered in depth by earlier review papers. Instead, we focus more on recent innovations in stochastic search and adaptive MCMC, along with some comparatively new research on shrinkage priors. One of our conclusions is that true MCMC seems inferior to stochastic search if one’s goal is to discover good models, but that stochastic search can result in biased estimates of variable inclusion probabilities. We also find reasons to question the accuracy of inclusion probabilities generated by traditional MCMC on high-dimensional, nonorthogonal problems, though the matter is far from settled.
Some key words: adaptive MCMC; linear models; shrinkage priors; stochastic search; variable selection
Bayesian generalized double Pareto shrinkage
, 2010
Cited by 5 (1 self)
We propose a generalized double Pareto prior for shrinkage estimation in linear models. The prior can be obtained via a scale mixture of Laplace or normal distributions, while forming a bridge between the Laplace and Normal-Jeffreys' priors. While it has a spike at zero like the Laplace density, it also has a Student-t-like tail behavior. We show strong consistency of the posterior in regression models with a diverging number of parameters, providing a template to be used for other priors in similar settings. Bayesian computation is straightforward via a simple Gibbs sampling algorithm. We also investigate the properties of the maximum a posteriori estimator and reveal connections with some well-established regularization procedures. The performance of the new prior is tested through simulations.
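The scale-mixture claim in this abstract can be checked numerically. The sketch below is an illustration, not code from the paper. It assumes the usual parameterization of the generalized double Pareto density, f(beta) = 1/(2 xi) * (1 + |beta|/(alpha xi))^-(alpha+1), and assumes that the Laplace mixture places a Gamma(shape alpha, rate eta) distribution on the Laplace rate with xi = eta/alpha. Neither detail is stated in the abstract, and the function names are hypothetical.

```python
from math import exp, gamma


def gdp_density(beta, xi=1.0, alpha=1.0):
    """Generalized double Pareto density (assumed parameterization)."""
    return (1.0 / (2.0 * xi)) * (1.0 + abs(beta) / (alpha * xi)) ** (-(alpha + 1.0))


def laplace_gamma_mixture(beta, alpha=1.0, eta=1.0, upper=60.0, steps=60000):
    """Integrate Laplace(beta | rate lam) * Gamma(lam | alpha, rate eta) over lam.

    Uses a plain midpoint rule on (0, upper); 'upper' and 'steps' are
    illustrative settings chosen so that the truncated tail is negligible.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        laplace = 0.5 * lam * exp(-lam * abs(beta))
        mixing = eta ** alpha / gamma(alpha) * lam ** (alpha - 1.0) * exp(-eta * lam)
        total += laplace * mixing * h
    return total


if __name__ == "__main__":
    alpha, eta = 3.0, 2.0
    xi = eta / alpha  # assumed link between the mixture and the density
    for b in (-2.0, 0.0, 0.5, 3.0):
        print(b, gdp_density(b, xi, alpha), laplace_gamma_mixture(b, alpha, eta))
```

Under these assumptions the two columns printed for each value of beta should agree up to quadrature error, which is one way to see the "bridge" between the Laplace density and a heavier-tailed prior.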
Steel (2012): “Mixtures of g-priors for Bayesian model averaging with economic applications”
 Journal of Econometrics
Cited by 4 (0 self)
Abstract. We examine the issue of variable selection in linear regression modeling, where we have a potentially large amount of possible covariates and economic theory offers insufficient guidance on how to select the appropriate subset. In this context, Bayesian Model Averaging presents a formal Bayesian solution to dealing with model uncertainty. Our main interest here is the effect of the prior on the results, such as posterior inclusion probabilities of regressors and predictive performance. We combine a Binomial-Beta prior on model size with a g-prior on the coefficients of each model. In addition, we assign a hyperprior to g, as the choice of g has been found to have a large impact on the results. For the prior on g, we examine the Zellner-Siow prior and a class of Beta shrinkage priors, which covers most choices in the recent literature. We propose a benchmark Beta prior, inspired by earlier findings with fixed g, and show it leads to consistent model selection. The effect of this prior structure on penalties for complexity and lack of fit is described in some detail. Inference is conducted through a Markov chain Monte Carlo sampler over model space and g. We examine the performance of the various priors in the context of simulated and real data. For the latter, we consider two important applications in economics, namely cross-country growth regression and returns to schooling. Recommendations to applied users are provided.
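To make the prior structure in this abstract concrete, the sketch below combines a fixed-g Bayes factor with a Binomial-Beta prior on model size. It is a simplification under stated assumptions, not the paper's sampler: g is held fixed rather than given a hyperprior, the Bayes factor uses the standard closed form for a g-prior with coefficient variance proportional to g (which may differ from the paper's parameterization), the model space is small enough to list in full, and the R-squared values are made-up illustrative numbers.

```python
from math import exp, lgamma, log


def log_bf_fixed_g(r2, n, k, g):
    """Log Bayes factor of a k-regressor model against the intercept-only model.

    Standard fixed-g closed form (assumed parameterization):
    (1 + g)^((n - 1 - k) / 2) / (1 + g (1 - R^2))^((n - 1) / 2).
    """
    return 0.5 * (n - 1 - k) * log(1.0 + g) - 0.5 * (n - 1) * log(1.0 + g * (1.0 - r2))


def log_beta(x, y):
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def log_model_prior(k, p, a=1.0, b=1.0):
    """Log prior of one specific model with k of p regressors.

    Each regressor enters with probability theta, and theta ~ Beta(a, b),
    which induces a Binomial-Beta distribution on model size.
    """
    return log_beta(a + k, b + p - k) - log_beta(a, b)


def posterior_model_probs(models, n, p, g, a=1.0, b=1.0):
    """models maps a label to (k, R^2); returns probabilities over the listed models."""
    logs = {m: log_bf_fixed_g(r2, n, k, g) + log_model_prior(k, p, a, b)
            for m, (k, r2) in models.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {m: exp(v - top) for m, v in logs.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


# Hypothetical two-regressor example: all four models, invented R^2 values.
toy_models = {"null": (0, 0.0), "x1": (1, 0.40), "x2": (1, 0.02), "x1+x2": (2, 0.41)}
toy_probs = posterior_model_probs(toy_models, n=50, p=2, g=50.0)

if __name__ == "__main__":
    for label, prob in toy_probs.items():
        print(label, round(prob, 4))
```

With a = b = 1 the induced prior on model size is uniform, so the complexity penalty in this toy comparison comes from both the (1 + g) term and the number of models sharing each size.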
Sequential Monte Carlo on large binary sampling spaces
 Statist. Comput
, 2011
Cited by 3 (0 self)
A Monte Carlo algorithm is said to be adaptive if it automatically calibrates its current proposal distribution using past simulations. The choice of the parametric family that defines the set of proposal distributions is critical for good performance. In this paper, we present such a parametric family for adaptive sampling on high-dimensional binary spaces. A practical motivation for this problem is variable selection in a linear regression context. We want to sample from a Bayesian posterior distribution on the model space using an appropriate version of Sequential Monte Carlo. Raw versions of Sequential Monte Carlo are easily implemented using binary vectors with independent components. For high-dimensional problems, however, these simple proposals do not yield satisfactory results. The key to an efficient adaptive algorithm lies in binary parametric families which take correlations into account, analogously to the multivariate normal distribution on continuous spaces. We provide a review of models for binary data and make one of them work in the context of Sequential Monte Carlo sampling. Computational studies on real-life data with about a hundred covariates suggest that, on difficult instances, our Sequential Monte Carlo approach clearly outperforms standard techniques based on Markov chain exploration.
Rao-Blackwellization for Bayesian Variable Selection and Model Averaging in Linear and Binary Regression: A Novel Data Augmentation Approach
Cited by 2 (0 self)
Choosing the subset of covariates to use in regression or generalized linear models is a ubiquitous problem. The Bayesian paradigm addresses the problem of model uncertainty by considering models corresponding to all possible subsets of the covariates, where the posterior distribution over models is used to select models or combine them via Bayesian model averaging (BMA). Although conceptually straightforward, BMA is often difficult to implement in practice, since either the number of covariates is too large for enumeration of all subsets, calculations cannot be done analytically, or both. For orthogonal designs with the appropriate choice of prior, the posterior probability of any model can be calculated without having to enumerate the entire model space and scales linearly with the number of predictors, p. In this article we extend this idea to a much broader class of nonorthogonal design matrices. We propose a novel method which augments the observed nonorthogonal design by at most p new rows to obtain a design matrix with orthogonal columns and generate the “missing” response variables in a data augmentation algorithm. We show that our data augmentation approach keeps the original posterior distribution of interest unaltered, and develop methods to construct Rao-Blackwellized estimates of several quantities of interest, including posterior model probabilities of any model, which may not be available from an ordinary Gibbs sampler. Our method can be used for BMA in linear regression and binary regression with nonorthogonal design matrices in conjunction with independent “spike and slab” priors with a continuous prior component that is a Cauchy or other heavy-tailed distribution that may be represented as a scale mixture of normals. We provide simulated and real examples to illustrate the methodology. Supplemental materials for the manuscript are available online.
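The augmentation step described in this abstract can be illustrated with a small sketch: given a design X with p columns, find p extra rows X_a so that the stacked matrix has orthogonal columns. The construction below is one possible choice and is an assumption, not necessarily the authors' method: it targets a Gram matrix d * I, picks d from a row-sum bound so that d * I - X'X is positive definite, and takes X_a as a Cholesky factor. Generating the "missing" responses is not shown, and all function names are hypothetical.

```python
from math import sqrt


def gram(X):
    """Return X^T X for a matrix stored as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]


def cholesky_upper(A):
    """Return upper-triangular U with U^T U = A (A symmetric positive definite)."""
    p = len(A)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = sqrt(s) if i == j else s / L[j][j]
    return [[L[j][i] for j in range(p)] for i in range(p)]  # transpose of L


def augment_to_orthogonal(X, slack=1e-3):
    """Return (X_a, d) with X^T X + X_a^T X_a = d * I.

    d exceeds every absolute row sum of X^T X, so d * I - X^T X is strictly
    diagonally dominant with a positive diagonal and hence positive definite.
    'slack' is an illustrative tuning constant.
    """
    A = gram(X)
    p = len(A)
    d = max(sum(abs(v) for v in row) for row in A) * (1.0 + slack)
    R = [[(d if i == j else 0.0) - A[i][j] for j in range(p)] for i in range(p)]
    return cholesky_upper(R), d


if __name__ == "__main__":
    X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    X_a, d = augment_to_orthogonal(X)
    print(gram(X + X_a))  # off-diagonal entries should be close to zero
```

The sketch adds exactly p rows, which is consistent with the "at most p new rows" in the abstract; a tighter choice of d (for example the largest eigenvalue of X'X) would allow fewer rows at the cost of an eigendecomposition.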
A Note on the Bias . . .
, 2010
In variable selection problems that preclude enumeration of models, stochastic search algorithms, often based on Markov chain Monte Carlo, are commonly used to identify a set of models for model selection or model averaging. Because Monte Carlo frequencies of models are often zero or one in high-dimensional problems, posterior probabilities calculated from the observed marginal likelihoods, renormalized over the sampled models, are often employed. Such estimates are the only recourse in the newer generation of stochastic search algorithms. In this paper, we show that the approach of estimating model probabilities based on renormalization of posterior probabilities over the set of sampled models leads to bias in many quantities of interest and may not reduce mean squared error.
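A toy calculation shows why renormalizing over the sampled models can be biased. The sketch below is illustrative only: the four-model space, its posterior probabilities, and the choice of which models were "sampled" are invented for the example, and the function names are hypothetical.

```python
from math import exp, log


def normalize(log_weights):
    """Turn unnormalized log posterior weights into probabilities over the given models."""
    top = max(log_weights.values())  # subtract the maximum for numerical stability
    weights = {m: exp(v - top) for m, v in log_weights.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def inclusion_probability(model_probs, j):
    """Sum the probabilities of all models that include variable j."""
    return sum(pr for m, pr in model_probs.items() if m[j] == 1)


# Hypothetical full model space for two variables; keys are inclusion indicators.
full_log_weights = {
    (1, 0): log(0.50),
    (1, 1): log(0.30),
    (0, 1): log(0.15),
    (0, 0): log(0.05),
}
true_probs = normalize(full_log_weights)

# Suppose the search visited only the two most probable models.
sampled = {m: full_log_weights[m] for m in [(1, 0), (1, 1)]}
renormalized = normalize(sampled)

if __name__ == "__main__":
    for j in (0, 1):
        print(j, inclusion_probability(true_probs, j),
              inclusion_probability(renormalized, j))
```

In this toy case every sampled model's probability is inflated, because the unvisited mass is redistributed over the visited models, while the inclusion probability of a variable can be pushed up (variable 0) or down (variable 1) depending on which models were missed.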
Bayesian Modeling Using Latent Structures
, 2012
This thesis is devoted to modeling complex data from the Bayesian perspective via constructing priors with latent structures. There are three major contexts in which this is done – strategies for the analysis of dynamic longitudinal data, estimating shape-constrained functions, and identifying subgroups. The methodology is illustrated in three different interdisciplinary contexts: (1) adaptive measurement testing in education; (2) emulation of computer models for vehicle crashworthiness; and (3) subgroup analyses based on biomarkers. Chapter 1 presents an overview of the utilized latent structured priors and an overview of the remainder of the thesis. Chapter 2 is motivated by the problem of analyzing dichotomous longitudinal data observed at variable and irregular time points for adaptive measurement testing in education. One of its main contributions lies in developing a new class of Dynamic Item Response (DIR) models via specifying a novel dynamic structure on the prior of the latent trait. The Bayesian inference for DIR models is undertaken, which permits borrowing strength from different individuals,
Orthogonal Data Augmentation for Bayesian Model Averaging
Choosing the subset of covariates to use in regression or generalized linear models is a ubiquitous problem. The Bayesian paradigm can easily deal with this problem of model uncertainty by considering models corresponding to all possible subsets of the covariates, where the posterior distribution over models is used to select models or combine them in Bayesian model averaging. Although conceptually straightforward, it is often difficult to implement in practice, since either the number of covariates is too large, or calculations cannot be done analytically, or both. For orthogonal designs with the appropriate choice of prior, the posterior probability of any model can be calculated without having to enumerate the entire model space. In this article we propose a novel method, which augments the observed nonorthogonal design by new rows to obtain a design matrix with orthogonal columns. We show that our data augmentation approach keeps the original posterior distribution of interest unaltered, and develop methods to construct Rao-Blackwellized estimates of several quantities of interest, including posterior model probabilities, which may not be available from an ordinary Gibbs sampler. The method can be used for BMA in linear regression with Cauchy or other heavy-tailed priors that may be represented as a scale mixture of normals, as well as binary regression. We provide simulated and real examples to illustrate the methodology. Supplemental materials for the manuscript are available online.
Finite Population Estimators in Stochastic
Monte Carlo algorithms are commonly used to identify a set of models for Bayesian model selection or model averaging. Because empirical frequencies of models are often zero or one in high-dimensional problems, posterior probabilities calculated from the observed marginal likelihoods, renormalized over the sampled models, are often employed. Such estimates are the only recourse in several newer stochastic search algorithms. In this paper, we prove that renormalization of posterior probabilities over the set of sampled models generally leads to bias which may dominate mean squared error. Viewing the model space as a finite population, we propose a new estimator based on a ratio of Horvitz-Thompson estimators which incorporates observed marginal likelihoods, but is approximately unbiased. This is shown to lead to a reduction in mean squared error compared to the empirical or renormalized estimators, with little increase in computational costs.
Political Institutions and Flights to Liquidity
Please do not cite without permission. Thanks to Brendan Nyhan for helpful methodological discussions. I am responsible for all errors.