Results 1-10 of 89
Bayesian Model Averaging for Linear Regression Models
Journal of the American Statistical Association, 1997
Cited by 184 (13 self)
We consider the problem of accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. A Bayesian solution to this problem involves averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of
Nonparametric regression using Bayesian variable selection
Journal of Econometrics, 1996
Cited by 136 (10 self)
This paper estimates an additive model semiparametrically, while automatically selecting the significant independent variables and the appropriate power transformation of the dependent variable. The nonlinear variables are modeled as regression splines, with significant knots selected from a large number of candidate knots. The estimation is made robust by modeling the errors as a mixture of normals. A Bayesian approach is used to select the significant knots, the power transformation, and to identify outliers using the Gibbs sampler to carry out the computation. Empirical evidence is given that the sampler works well on both simulated and real examples and that in the univariate case it compares favorably with a kernel-weighted local linear smoother. The variable selection algorithm in the paper is substantially faster than previous Bayesian variable selection algorithms. Key words: Additive model; Power transformation; Robust estimation
Approaches for Bayesian variable selection
Statistica Sinica, 1997
Cited by 124 (5 self)
Abstract: This paper describes and compares various hierarchical mixture prior formulations of variable selection uncertainty in normal linear regression models. These include the nonconjugate SSVS formulation of George and McCulloch (1993), as well as conjugate formulations which allow for analytical simplification. Hyperparameter settings which base selection on practical significance, and the implications of using mixtures with point priors are discussed. Computational methods for posterior evaluation and exploration are considered. Rapid updating methods are seen to provide feasible methods for exhaustive evaluation using Gray Code sequencing in moderately sized problems, and fast Markov Chain Monte Carlo exploration in large problems. Estimation of normalization constants is seen to provide improved posterior estimates of individual model probabilities and the total visited probability. Various procedures are illustrated on simulated sample problems and on a real problem concerning the construction of financial index tracking portfolios.
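Editorial note: the Gray Code sequencing mentioned in this abstract can be illustrated with a minimal sketch. It only shows the enumeration order, in which each visited model differs from the previous one by adding or dropping a single predictor, so that a posterior quantity could in principle be updated incrementally rather than recomputed. The function name `gray_code_models` and the bitmask encoding of models are assumptions made for illustration, not taken from the paper, and the paper's actual updating formulas are not reproduced.

```python
def gray_code_models(p):
    """Yield (model, changed) pairs covering all 2**p subsets of p predictors.

    `model` is an integer bitmask (bit j set means predictor j is included).
    `changed` is the index of the single predictor added or dropped relative
    to the previously yielded model (None for the first, empty model).
    """
    prev = 0
    yield prev, None
    for i in range(1, 2 ** p):
        gray = i ^ (i >> 1)                        # standard reflected Gray code
        changed = (gray ^ prev).bit_length() - 1   # the one bit that flipped
        yield gray, changed
        prev = gray


if __name__ == "__main__":
    for model, changed in gray_code_models(3):
        print(format(model, "03b"), changed)
```

Because consecutive models differ in exactly one predictor, an exhaustive pass over a moderately sized model space needs only one-variable updates at each step.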
Multiple Shrinkage and Subset Selection in Wavelets
1997
Cited by 118 (16 self)
This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by using Bayesian hierarchical models, assigning a positive prior probability to the wavelet coefficients being zero. The resulting estimator for the wavelet coefficients is a multiple shrinkage estimator that exhibits a wide variety of nonlinear shrinkage patterns. We discuss fast computational implementations, with a focus on easy-to-compute analytic approximations as well as importance sampling and Markov chain Monte Carlo methods. Multiple shrinkage estimators prove to have excellent mean squared error performance in reconstructing standard test functions. We demonstrate this in simulated test examples, comparing various implementations of multiple shrinkage to commonly used shrinkage rules. Finally, we illustrate our approach with an application to the so-called "glint" data.
Calibration and Empirical Bayes Variable Selection
Biometrika, 1997
Cited by 114 (19 self)
this paper, is that with F = 2 log p. This choice was proposed by Foster & George (1994) where it was called the Risk Inflation Criterion (RIC) because it asymptotically minimises the maximum predictive risk inflation due to selection when X is orthogonal. This choice and its minimax property were also discovered independently by Donoho & Johnstone (1994) in the wavelet regression context, where they refer to it as the universal hard thresholding rule
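Editorial note: a minimal sketch of the F = 2 log p rule described in this abstract, assuming an orthogonal design so that each predictor can be judged by its own squared t-statistic. The function names, the use of t-statistics as input, and the known noise scale `sigma` are illustrative assumptions, not details taken from the paper.

```python
import math


def ric_select(t_stats):
    """Return indices of predictors whose squared t-statistic exceeds 2*log(p).

    Assumes an orthogonal design, so each predictor is tested separately.
    """
    p = len(t_stats)
    threshold = 2.0 * math.log(p)
    return [j for j, t in enumerate(t_stats) if t * t > threshold]


def universal_hard_threshold(coeffs, sigma=1.0):
    """Zero out coefficients no larger than sigma * sqrt(2*log(p)) in magnitude."""
    p = len(coeffs)
    cutoff = sigma * math.sqrt(2.0 * math.log(p))
    return [c if abs(c) > cutoff else 0.0 for c in coeffs]


if __name__ == "__main__":
    stats = [0.5, 3.0, -2.5, 1.0]
    print(ric_select(stats))
    print(universal_hard_threshold(stats))
```

With sigma = 1 the two functions apply the same cutoff, which reflects the correspondence between the selection criterion and the hard thresholding rule noted in the abstract.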
The practical implementation of Bayesian model selection
Institute of Mathematical Statistics, 2001
Cited by 85 (3 self)
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.
Variable Selection and Model Comparison in Regression
1994
Cited by 62 (2 self)
In the specification of linear regression models it is common to indicate a list of candidate variables from which a subset enters the model with nonzero coefficients. In some cases any combination of variables may enter, but in others certain necessary conditions must be satisfied: e.g., in time series applications it is common to allow a lagged variable only if all shorter lags for the same variable also enter. This paper interprets this specification as a mixed continuous-discrete prior distribution for coefficient values. It then utilizes a Gibbs sampler to construct posterior moments. It is shown how this method can incorporate sign constraints and provide posterior probabilities for all possible subsets of regressors. The methods are illustrated using some standard data sets.
Predictive Model Selection
Journal of the Royal Statistical Society, Ser. B, 1995
Cited by 61 (4 self)
this article we propose three criteria that can be used to address model selection. These emphasize observables rather than parameters and are based on a certain Bayesian predictive density. They have a unifying basis that is simple and interpretable, are free of asymptotic definitions, and allow the incorporation of prior information. Moreover, two of these criteria are readily calibrated.
Model Uncertainty in Cross-Country Growth Regressions
Journal of Applied Econometrics, 2001
Cited by 60 (3 self)
We investigate the issue of model uncertainty in cross-country growth regressions using Bayesian Model Averaging (BMA). We find that the posterior probability is spread widely among many models, suggesting the superiority of BMA over choosing any single model. Out-of-sample predictive results support this claim. In contrast to Levine and Renelt (1992), our results broadly support the more ‘optimistic’ conclusion of Sala-i-Martin (1997b), namely that some variables are important regressors for explaining cross-country growth patterns. However, care should be taken in the methodology employed. The approach proposed here is firmly grounded in statistical theory and immediately leads to posterior and predictive inference. Copyright © 2001 John Wiley & Sons, Ltd.
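Editorial note: the generic idea behind model averaging can be sketched as follows. Posterior model probabilities are obtained by normalising prior probability times marginal likelihood, and a quantity of interest is then averaged across models with those weights. This is only the textbook definition under assumed inputs: the function names, the equal default prior, and the supplied log marginal likelihoods are illustrative, and the paper's own priors and computations are not reproduced here.

```python
import math


def posterior_model_probs(log_marglik, prior=None):
    """Normalise prior * marginal likelihood into posterior model probabilities.

    `log_marglik` holds one log marginal likelihood per model (assumed given).
    `prior` defaults to equal prior probability for every model.
    """
    k = len(log_marglik)
    if prior is None:
        prior = [1.0 / k] * k
    log_post = [l + math.log(pr) for l, pr in zip(log_marglik, prior)]
    m = max(log_post)                          # subtract the max for stability
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bma_mean(probs, model_means):
    """Model-averaged point estimate: probability-weighted sum over models."""
    return sum(p * mu for p, mu in zip(probs, model_means))


if __name__ == "__main__":
    probs = posterior_model_probs([-10.0, -10.0, -12.0])
    print(probs, bma_mean(probs, [1.0, 3.0, 0.0]))
```

When the probabilities are spread across many models, as the abstract reports, no single model's estimate dominates the weighted sum.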
Prediction via Orthogonalized Model Mixing
Journal of the American Statistical Association, 1994
Cited by 50 (9 self)
In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Comp...