The practical implementation of Bayesian model selection
Institute of Mathematical Statistics, 2001
Cited by 48 (2 self)
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.
The variable selection problem
Journal of the American Statistical Association, 2000
Cited by 25 (1 self)
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This vignette reviews some of the key developments which have led to the wide variety of approaches for this problem.
Optimal Predictive Model Selection
Ann. Statist., 2002
Cited by 19 (1 self)
Often the goal of model selection is to choose a model for future prediction, and it is natural to measure the accuracy of a future prediction by squared error loss.
Transdimensional Markov Chains: A Decade of Progress and Future Perspectives
Journal of the American Statistical Association, 2005
Cited by 12 (2 self)
The last ten years have witnessed the development of sampling frameworks that permit the construction of Markov chains which simultaneously traverse both parameter and model space. In this time substantial methodological progress has been made. In this article we present a survey of the current state of the art and evaluate some of the most recent advances in this field. We also discuss future research perspectives in the context of the drive to develop sampling mechanisms with high degrees of both efficiency and automation.
Long-Run Performance of Bayesian Model Averaging
Journal of the American Statistical Association, 2003
Cited by 11 (2 self)
Hjort and Claeskens (HC) argue that statistical inference conditional on a single selected model underestimates uncertainty, and that model averaging is the way to remedy this; we strongly agree. They point out that Bayesian model averaging (BMA) has been the dominant approach to this, but argue that its performance has been inadequately studied, and propose an alternative, Frequentist Model Averaging (FMA). We point out, however, that there is a substantial literature on the performance of BMA, consisting of three main threads: general theoretical results, simulation studies, and evaluation of out-of-sample performance. The theoretical results are scattered, and we summarize them. The results have been quite consistent: BMA has tended to outperform competing methods for model selection and taking account of model uncertainty. The theoretical results depend on the assumption that the "practical distribution" over which the performance of methods is assessed is the same as the prior distribution used, and we investigate sensitivity of results to this assumption in a simple normal example; they turn out not to be unduly sensitive.
A Review of Statistical Methods for the Meteorological Adjustment of Tropospheric Ozone
1999
Cited by 11 (1 self)
A variety of statistical methods for meteorological adjustment of ozone have been proposed in the literature over the last decade or so. These can be broadly classified into regression methods, extreme value methods, and space-time methods. We describe and offer a critical review of the approaches, discuss questions of variable selection and trend estimation, and compare selected methods as applied to ozone time series from the Chicago area. Key Words: Regression, extreme values, time series, spatial statistics, environmetrics.
Bayesian Regression With Multivariate Linear Splines
1999
Cited by 9 (0 self)
We present a Bayesian analysis of a piecewise linear model constructed using basis functions which generalises the univariate linear spline to higher dimensions. Prior distributions are adopted on both the number and locations of the splines which leads to a model averaging approach to prediction with predictive distributions that take into account model uncertainty. Conditioning on the data produces a Bayes local linear model with distributions on both predictions and on local linear parameters. The method is spatially adaptive and covariate selection is achieved by using splines of lower dimension than the data. KEYWORDS: Bayesian piecewise linear regression; Bayesian model averaging; nonlinear regression; multivariate splines; local linear regression.
1 Introduction
Many methods exist for modelling the mean regression surface, E(Y | X), of a response variable Y, given a design matrix of covariates or predictors X ∈ R^p. Each method has associated benefits and drawba...
Comparing Bayes model averaging and stacking when model approximation error cannot be ignored
Journal of Machine Learning Research, 2003
Cited by 4 (0 self)
We compare Bayes Model Averaging, BMA, to a non-Bayes form of model averaging called stacking. In stacking, the weights are no longer posterior probabilities of models; they are obtained by a technique based on cross-validation. When the correct data generating model (DGM) is on the list of models under consideration, BMA is never worse than stacking and often is demonstrably better, provided that the noise level is of order commensurate with the coefficients and explanatory variables. Here, however, we focus on the case that the correct DGM is not on the model list and may not be well approximated by the elements on the model list. We give a sequence of computed examples by choosing model lists and DGMs to contrast the risk performance of stacking and BMA. In the first examples, the model lists are chosen to reflect geometric principles that should give good performance. In these cases, stacking typically outperforms BMA, sometimes by a wide margin. In the second set of examples we examine how stacking and BMA perform when the model list includes all subsets of a set of potential predictors. When we standardize the size of terms and coefficients in this setting, we find that BMA outperforms stacking when the deviant terms in the DGM ‘point’ in directions accommodated by the model list but that when the deviant term points outside the model list stacking seems to do better. Overall, our results suggest that stacking has better robustness properties than BMA in the most important settings.
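To make the contrast between the two weighting schemes in this abstract concrete, here is a minimal Python sketch for linear regression models. It is not the paper's procedure. Several choices are assumptions made for illustration: BMA weights are approximated by exp(-BIC/2) under equal prior model probabilities, stacking weights are fitted on leave-one-out predictions under a nonnegativity constraint using projected gradient descent, and all function names (`loo_predictions`, `bma_weights`, `stacking_weights`) are hypothetical.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out predictions for least squares via the hat-matrix identity."""
    H = X @ np.linalg.pinv(X)              # hat matrix X (X'X)^{-1} X'
    residuals = y - H @ y
    # LOO residual is e_i / (1 - h_ii), so the LOO prediction is y_i minus that.
    return y - residuals / (1.0 - np.diag(H))

def bic(X, y):
    """BIC of a Gaussian linear model (up to an additive constant)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + p * np.log(n)

def bma_weights(designs, y):
    """Assumed approximation: posterior model probabilities proportional to exp(-BIC/2)."""
    b = np.array([bic(X, y) for X in designs])
    w = np.exp(-0.5 * (b - b.min()))       # subtract the minimum for numerical stability
    return w / w.sum()

def stacking_weights(designs, y, iters=5000):
    """Nonnegative weights minimising squared error of combined LOO predictions."""
    Z = np.column_stack([loo_predictions(X, y) for X in designs])
    step = 1.0 / np.linalg.eigvalsh(Z.T @ Z).max()   # 1 / Lipschitz constant of the gradient
    w = np.full(Z.shape[1], 1.0 / Z.shape[1])
    for _ in range(iters):
        # Gradient step on 0.5 * ||Z w - y||^2, then projection onto w >= 0.
        w = np.maximum(0.0, w - step * (Z.T @ (Z @ w - y)))
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 60
    x = rng.normal(size=(n, 2))
    y = 1.0 + 2.0 * x[:, 0] + rng.normal(scale=0.5, size=n)
    ones = np.ones((n, 1))
    # Model list: intercept only, intercept + x1, intercept + x1 + x2.
    designs = [ones, np.hstack([ones, x[:, :1]]), np.hstack([ones, x])]
    print("BMA (BIC approx.) weights:", bma_weights(designs, y))
    print("Stacking weights:         ", stacking_weights(designs, y))
```

The BMA weights here depend on each model's in-sample fit and size, whereas the stacking weights depend only on out-of-sample (leave-one-out) predictions; that difference is what the abstract's comparison turns on.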

