Results 1-10 of 31
On the effect of prior assumptions in Bayesian model averaging with applications to growth regression
2008
Cited by 30 (3 self)
We consider the problem of variable selection in linear regression models. Bayesian model averaging has become an important tool in empirical settings with large numbers of potential regressors and relatively limited numbers of observations. We examine the effect of a variety of prior assumptions on the inference concerning model size, posterior inclusion probabilities of regressors and on predictive performance. We illustrate these issues in the context of cross-country growth regressions using three datasets with 41 to 67 potential drivers of growth and 72 to 93 observations. Finally, we recommend priors for use in this and related contexts.
Feature-inclusion stochastic search for Gaussian graphical models
J. Comp. Graph. Statist., 2008
Cited by 20 (3 self)
We describe a serial algorithm called feature-inclusion stochastic search, or FINCS, that uses online estimates of edge-inclusion probabilities to guide Bayesian model determination in Gaussian graphical models. FINCS is compared to MCMC, to Metropolis-based search methods, and to the popular lasso; it is found to be superior along a variety of dimensions, leading to better sets of discovered models, greater speed and stability, and reasonable estimates of edge-inclusion probabilities. We illustrate FINCS on an example involving mutual-fund data, where we compare the model-averaged predictive performance of models discovered with FINCS to those discovered by competing methods. Some key words: covariance selection; Metropolis algorithm; lasso; Bayesian model selection; hyper-inverse Wishart distribution.
Objective Bayesian model selection in Gaussian graphical models
2007
Cited by 12 (3 self)
This paper presents a default model-selection procedure for Gaussian graphical models that involves two new developments. First, we develop a default version of the hyper-inverse Wishart prior for restricted covariance matrices, called the hyper-inverse Wishart g-prior, and show how it corresponds to the implied fractional prior for covariance selection using fractional Bayes factors. Second, we apply a class of priors that automatically handles the problem of multiple hypothesis testing implied by covariance selection. We demonstrate our methods on a variety of simulated examples, concluding with a real example analysing covariation in mutual-fund returns. These studies reveal that the combined use of a multiplicity-correction prior on graphs and fractional Bayes factors for computing marginal likelihoods yields better performance than existing Bayesian methods. Some key words: covariance selection; hyper-inverse Wishart distribution; fractional Bayes factors; Bayesian model selection; multiple hypothesis testing.
Nonparametric Bayes conditional distribution modeling with variable selection
Journal of the American Statistical Association, 2009
Cited by 11 (7 self)
This article considers methodology for flexibly characterizing the relationship between a response and multiple predictors. Goals are (1) to estimate the conditional response distribution addressing the distributional changes across the predictor space, and (2) to identify important predictors for the response distribution change both within local regions and globally. We first introduce the probit stick-breaking process (PSBP) as a prior for an uncountable collection of predictor-dependent random probability measures and propose a PSBP mixture (PSBPM) of normal regressions for modeling the conditional distributions. A global variable selection structure is incorporated to discard unimportant predictors, while allowing estimation of posterior inclusion probabilities. Local variable selection is conducted relying on the conditional distribution estimates at different predictor points. An efficient stochastic search sampling algorithm is proposed for posterior computation. The methods are illustrated through simulation and applied to an epidemiologic study.
Bayesian Adaptive Sampling for Variable Selection and Model Averaging
Cited by 9 (4 self)
For the problem of model choice in linear regression, we introduce a Bayesian adaptive sampling algorithm (BAS) that samples models without replacement from the space of models. For problems that permit enumeration of all models, BAS is guaranteed to enumerate the model space in 2^p iterations, where p is the number of potential variables under consideration. For larger problems where sampling is required, we provide conditions under which BAS provides perfect samples without replacement. When the sampling probabilities in the algorithm are the marginal variable inclusion probabilities, BAS may be viewed as sampling models “near” the median probability model of Barbieri and Berger. As marginal inclusion probabilities are not known in advance, we discuss several strategies to estimate adaptively the marginal inclusion probabilities within BAS. We illustrate the performance of the algorithm using simulated and real data and show that BAS can outperform Markov chain Monte Carlo methods. The algorithm is implemented in the R package BAS available at CRAN.
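The quantities named in this abstract (full enumeration of the 2^p models, marginal inclusion probabilities, and the median probability model) can be illustrated with a small sketch. This is not the BAS algorithm and does not use the R package: it simply enumerates every subset for a small p. As assumptions made only for illustration, it uses a uniform prior over models and approximates each marginal likelihood by exp(-BIC/2); the function names `enumerate_models`, `inclusion_probabilities`, and `median_probability_model` are hypothetical.

```python
import itertools
import numpy as np


def enumerate_models(X, y):
    """Enumerate all 2^p column subsets of X and return (models, probs).

    models is a (2^p, p) boolean array; probs are approximate posterior
    model probabilities under a uniform model prior, using exp(-BIC/2)
    as a stand-in for the marginal likelihood (an assumption of this sketch).
    """
    n, p = X.shape
    models, log_weights = [], []
    for mask in itertools.product([False, True], repeat=p):
        cols = [j for j in range(p) if mask[j]]
        # Every model includes an intercept column.
        design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ beta) ** 2))
        bic = n * np.log(rss / n) + design.shape[1] * np.log(n)
        models.append(mask)
        log_weights.append(-0.5 * bic)
    log_weights = np.array(log_weights)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(log_weights - log_weights.max())
    return np.array(models, dtype=bool), weights / weights.sum()


def inclusion_probabilities(models, probs):
    """Marginal inclusion probability of each variable: the total
    probability of the models that contain it."""
    return probs @ models.astype(float)


def median_probability_model(incl):
    """Variables whose marginal inclusion probability is at least 1/2."""
    return incl >= 0.5


# Simulated data: only variables 0 and 2 affect the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(size=200)

models, probs = enumerate_models(X, y)
incl = inclusion_probabilities(models, probs)
print("inclusion probabilities:", np.round(incl, 3))
print("median probability model:", median_probability_model(incl))
```

Enumeration like this is only feasible for small p, which is the regime where the abstract says enumeration is guaranteed; for larger p the abstract's point is that models must be sampled instead.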
VAR forecasting using Bayesian variable selection
Journal of Applied Econometrics, 2012
Local shrinkage rules, Lévy processes, and regularized regression
2010
Cited by 4 (3 self)
We use Lévy processes to generate joint prior distributions, and therefore penalty functions, for a location parameter β = (β1, ..., βp) as p grows large. This generalizes the class of local-global shrinkage rules based on scale mixtures of normals, illuminates new connections among disparate methods, and leads to new results for computing posterior means and modes under a wide class of priors. We extend this framework to large-scale regularized regression problems where p > n, and provide comparisons with other methodologies.
Supplement to “Computational approaches for empirical Bayes methods and Bayesian sensitivity analysis”
2011
Cited by 2 (2 self)
We consider situations in Bayesian analysis where we have a family of priors νh on the parameter θ, where h varies continuously over a space H, and we deal with two related problems. The first involves sensitivity analysis and is stated as follows. Suppose we fix a function f of θ. How do we efficiently estimate the posterior expectation of f(θ) simultaneously for all h in H? The second problem is how do we identify subsets of H which give rise to reasonable choices of νh? We assume that we are able to generate Markov chain samples from the posterior for a finite number of the priors, and we develop a methodology, based on a combination of importance sampling and the use of control variates, for dealing with these two problems. The methodology applies very generally, and we show how it applies in particular to a commonly used model for variable selection in Bayesian linear regression, and give an illustration on the US crime data of Vandaele.
Predicting the present with Bayesian structural time series
Presented at JSM 2012, 2012
Cited by 2 (2 self)
This article describes a system for short term forecasting based on an ensemble prediction that averages over different combinations of predictors. The system combines a structural time series model for the target series with a regression component capturing the contributions of contemporaneous search query data. A spike-and-slab prior on the regression coefficients induces sparsity, dramatically reducing the size of the regression problem. Our system averages over potential contributions from a very large set of models and gives easily digested reports of which coefficients are likely to be important. We illustrate with applications to initial claims for unemployment benefits and to retail sales. Although our exposition focuses on using search engine data to forecast economic time series, the underlying statistical methods can be applied to more general short term forecasting with large numbers of contemporaneous predictors.
Rao-Blackwellization for Bayesian Variable Selection and Model Averaging in Linear and Binary Regression: A Novel Data Augmentation Approach
Cited by 2 (0 self)
Choosing the subset of covariates to use in regression or generalized linear models is a ubiquitous problem. The Bayesian paradigm addresses the problem of model uncertainty by considering models corresponding to all possible subsets of the covariates, where the posterior distribution over models is used to select models or combine them via Bayesian model averaging (BMA). Although conceptually straightforward, BMA is often difficult to implement in practice, since either the number of covariates is too large for enumeration of all subsets, calculations cannot be done analytically, or both. For orthogonal designs with the appropriate choice of prior, the posterior probability of any model can be calculated without having to enumerate the entire model space and scales linearly with the number of predictors, p. In this article we extend this idea to a much broader class of nonorthogonal design matrices. We propose a novel method which augments the observed nonorthogonal design by at most p new rows to obtain a design matrix with orthogonal columns and generate the “missing” response variables in a data augmentation algorithm. We show that our data augmentation approach keeps the original posterior distribution of interest unaltered, and develop methods to construct Rao-Blackwellized estimates of several quantities of interest, including posterior model probabilities of any model, which may not be available from an ordinary Gibbs sampler. Our method can be used for BMA in linear regression and binary regression with nonorthogonal design matrices in conjunction with independent “spike and slab” priors with a continuous prior component that is a Cauchy or other heavy-tailed distribution that may be represented as a scale mixture of normals. We provide simulated and real examples to illustrate the methodology. Supplemental materials for the manuscript are available online.
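For reference, the posterior distribution over models and the model-averaged quantity mentioned in this abstract can be written in standard form. The notation is an assumption introduced here, not taken from the abstract: γ ∈ {0,1}^p indexes the subset of covariates in model M_γ, y is the response, and Δ is any quantity of interest such as a prediction.

```latex
\[
p(M_\gamma \mid y)
  = \frac{p(y \mid M_\gamma)\, p(M_\gamma)}
         {\sum_{\gamma' \in \{0,1\}^p} p(y \mid M_{\gamma'})\, p(M_{\gamma'})},
\qquad
p(\Delta \mid y)
  = \sum_{\gamma \in \{0,1\}^p} p(\Delta \mid M_\gamma, y)\, p(M_\gamma \mid y),
\]
\[
p(\gamma_j = 1 \mid y)
  = \sum_{\gamma \,:\, \gamma_j = 1} p(M_\gamma \mid y).
\]
```

The denominator sums over all 2^p models, which is why enumeration becomes infeasible as p grows and why the abstract stresses cases where model probabilities can be obtained without it.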