## Model Selection and Accounting for Model Uncertainty in Linear Regression Models (1993)

Citations: 49 (6 self)

### BibTeX

@MISC{Raftery93modelselection,

author = {Adrian Raftery and David Madigan and Jennifer Hoeting},

title = {Model Selection and Accounting for Model Uncertainty in Linear Regression Models},

year = {1993}

}

### Abstract

We consider the problems of variable selection and accounting for model uncertainty in linear regression models. Conditioning on a single selected model ignores model uncertainty, and thus leads to the underestimation of uncertainty when making inferences about quantities of interest. The complete Bayesian solution to this problem involves averaging over all possible models when making inferences about quantities of interest. This approach is often not practical. In this paper we offer two alternative approaches. First, we describe a Bayesian model selection algorithm called "Occam's Window" which involves averaging over a reduced set of models. Second, we describe a Markov chain Monte Carlo approach which directly approximates the exact solution. Both of these model averaging procedures provide better predictive performance than any single model which might reasonably have been selected. In the extreme case where there are many candidate predictors but there is no relationship between any of them and the response, standard variable selection procedures often choose some subset of variables that yields a high R² and a highly significant overall F value. We refer to this unfortunate phenomenon as "Freedman's Paradox" (Freedman, 1983). In this situation, Occam's Window usually indicates the null model as the only one to be considered, or else a small number of models including the null model, thus largely resolving the paradox.
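The idea of averaging over a reduced set of models can be illustrated with a rough sketch. This is not the authors' algorithm: it assumes BIC-based approximations to posterior model probabilities in place of the paper's Bayes factors, a uniform prior over models, exhaustive enumeration of subsets (feasible only for a handful of predictors), and a single exclusion rule that drops any model whose approximate posterior probability is more than `ratio` times smaller than that of the best model. The function names, the threshold of 20, and the simulated data are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def bic_for_subset(X, y, subset):
    """BIC of a Gaussian linear model: intercept plus the columns in `subset`."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    k = design.shape[1]  # number of regression coefficients
    return n * np.log(rss / n) + k * np.log(n)


def reduced_set_averaging(X, y, ratio=20.0):
    """Keep models within `ratio` of the best one and renormalize their weights.

    Assumption: exp(-BIC/2) stands in for the marginal likelihood, and every
    model has the same prior probability.
    """
    p = X.shape[1]
    subsets = [s for r in range(p + 1)
               for s in itertools.combinations(range(p), r)]
    bic = np.array([bic_for_subset(X, y, s) for s in subsets])
    weights = np.exp(-0.5 * (bic - bic.min()))  # best model has weight 1
    weights = np.where(weights >= 1.0 / ratio, weights, 0.0)
    weights = weights / weights.sum()
    kept = [(s, w) for s, w in zip(subsets, weights) if w > 0]
    # Posterior inclusion probability of each predictor over the kept models.
    inclusion = np.array([sum(w for s, w in kept if j in s) for j in range(p)])
    return kept, inclusion


# Simulated data (hypothetical): only the first predictor affects the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] + rng.normal(size=200)

kept, inclusion = reduced_set_averaging(X, y)
for subset, weight in kept:
    print(subset, round(float(weight), 3))
print("inclusion probabilities:", np.round(inclusion, 3))
```

In this sketch every retained model contains the first predictor, so its inclusion probability is essentially 1, while the noise predictors receive weight only through the few larger models that survive the cut. Inference about a quantity of interest would then average the per-model estimates using the renormalized weights.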

### Citations

1104 | Crime and Punishment: An Economic Approach - Becker - 1968 |

785 | Applied Regression Analysis - Draper, Smith - 1998 |

Citation Context: ...esolving the paradox. The background literature for our approach includes several areas of research, namely the selection of subsets of predictor variables in linear regression models (Hocking, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and ...

578 | The Theory of Probability - Jeffreys - 1961 |

459 | Applied Linear Statistical Models - Neter, Wasserman, et al. - 1996 |

391 | Bayesian computation via the Gibbs sampler and related Markov chain Monte-Carlo methods - Smith, Roberts - 1993 |

Citation Context: ...hain for $t = 1, \ldots, N$, then under certain regularity conditions, for any function $g(M_i)$ defined on $\mathcal{M}$, the average

$$G = \frac{1}{N} \sum_{t=1}^{N} g(M(t)) \qquad (10)$$

is a simulation-consistent estimate of $E(g(M))$ (Smith and Roberts, 1993). To compute (1) in this fashion set $g(M) = \mathrm{pr}(\Delta \mid M, D)$. To construct the Markov chain we define a neighborhood $\mathrm{nbd}(M)$ for each $M \in \mathcal{M}$ which consists of the model $M$ itself and the set of models w...

373 | Variable selection via Gibbs sampling - George, McCulloch - 1993 |

Citation Context: ...rch, namely the selection of subsets of predictor variables in linear regression models (Hocking, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outline the philosophy underlying our approach,...

321 | Applied Statistical Decision Theory - Raiffa, Schlaifer - 2000 |

Citation Context: ... $\cdots^{1/2} \qquad (4)$

$$\times \left[ \lambda\nu + (Y - X_i \mu_i)^t (I + X_i V_i X_i^t)^{-1} (Y - X_i \mu_i) \right]^{-(\nu + n)/2}$$

where $X_i$ is the design matrix and $V_i$ is the covariance matrix for $\beta$ corresponding to model $M_i$ (Raiffa and Schlaifer, 1961). The Bayes factor for $M_0$ versus $M_1$, the ratio of equation (4) for $i = 0$ and $i = 1$, is then given by

$$B_{01} = \left( \frac{|I + X_1 V_1 X_1^t|}{|I + X_0 V_0 X_0^t|} \right)^{1/2} \qquad (5)$$

$\times \big[ \lambda\nu + (Y - X_0 \mu_0)^t (I + X_0 V$...

296 | Model selection and accounting for model uncertainty in graphical models using Occam’s window - Madigan, Raftery - 1994 |

Citation Context: ... complete Bayesian solution to this problem involves averaging over all possible models when making inferences about quantities of interest. Indeed, this approach provides optimal predictive ability (Madigan and Raftery, 1994). In many applications however, this averaging will not be a practical proposition and here we present two alternative approaches. First we extend the Bayesian graphical model selection algorithm of ...

277 | Subset selection in regression - Miller - 2002 |

Citation Context: ...areas of research, namely the selection of subsets of predictor variables in linear regression models (Hocking, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outline the philoso...

262 | Applied Linear Regression - Weisberg - 1980 |

260 | Bayesian graphical models for discrete data - Madigan, York - 1995 |

Citation Context: ...ing, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outline the philosophy underlying our approach, describe how we selected prior distributions, and outline the two model averaging approa...

221 | Participation in Illegitimate Activities: A Theoretical and Empirical Investigation - Ehrlich - 1973 |

196 | Markov Chains with Stationary Transition Probabilities - Chung - 1960 |

195 | Data Analysis and Regression - Mosteller, Tukey - 1977 |

174 | Rational Decisions - Good - 1952 |

149 | Assessment and Propagation of Model Uncertainty - Draper - 1995 |

Citation Context: ...ucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outline the philosophy underlying our approach, describe how we selected prior distributions, and outline the two model averaging approaches. In Section 3 we provide an examp...

120 | Specification Searches - Leamer - 1978 |

Citation Context: ...ng to a single "best" model and to then make inference as if the selected model were the true model. However, this ignores a major component of uncertainty, namely uncertainty about the model itself (Leamer, 1978; Hodges, 1987; Raftery, 1988, 1993; Moulton, 1991; Draper, 1994). As a consequence, uncertainty about quantities of interest can be underestimated. For striking examples of this see Miller (1984), Regal and Hook (1991), Madigan and York (1993), Raft...

115 | Bayesian variable selection in linear regression (with discussion) - Mitchell, Beauchamp - 1988 |

Citation Context: ...our approach includes several areas of research, namely the selection of subsets of predictor variables in linear regression models (Hocking, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outli...

113 | The Optimum Enforcement of Laws - Stigler - 1970 |

110 | Approximate Bayes factors and accounting for model uncertainty in generalised linear models - Raftery - 1996 |

Citation Context: ...ariables in linear regression models (Hocking, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outline the philosophy underlying our approach, describe how we selected prior distri...

109 | The Analysis and Selection of Variables in Linear Regression - Hocking - 1976 |

Citation Context: ... thus largely resolving the paradox. The background literature for our approach includes several areas of research, namely the selection of subsets of predictor variables in linear regression models (Hocking, 1976; Draper and Smith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan a...

95 | Bayes factors and model uncertainty - Kass, Raftery - 1993 |

Citation Context: ...ith, 1981; Linhart and Zucchini, 1986; Mitchell and Beauchamp, 1988; Miller, 1990; George and McCulloch, 1993) and model uncertainty (Raftery, 1993; Madigan and Raftery, 1994; Madigan and York, 1993; Kass and Raftery, 1993; Draper, 1994). In the next section we outline the philosophy underlying our approach, describe how we selected prior distributions, and outline the two model averaging approaches. In Section 3 we pr...

83 | On assessing prior distributions and Bayesian regression analysis with g-prior distribution - Zellner |

70 | Applied regression analysis. 2nd edition - Draper, Smith - 1981 |

62 | Interactive elicitation of opinion for a normal linear model - Kadane, Dickey, et al. - 1980 |

35 | Uncertainty, Policy Analysis and Statistics - Hodges - 1987 |

Citation Context: ... "best" model and to then make inference as if the selected model were the true model. However, this ignores a major component of uncertainty, namely uncertainty about the model itself (Leamer, 1978; Hodges, 1987; Raftery, 1988, 1993; Moulton, 1991; Draper, 1994). As a consequence, uncertainty about quantities of interest can be underestimated. For striking examples of this see Miller (1984), Regal and Hook (...

27 | A note on screening regression equations - Freedman - 1983 |

Citation Context: ...d variable selection procedures often choose some subset of variables that yields a high R² and a highly significant overall F value. We refer to this unfortunate phenomenon as "Freedman's Paradox" (Freedman, 1983). In this situation, Occam's Window usually indicates the null model as the only one to be considered, or else a small number of models including the null model, thus largely resolving the paradox. K...

26 | Reliability of Subjective Probability Forecasts of Precipitation and Temperature - Murphy, Winkler - 1977 |

Citation Context:

| Method | Variables | ...score |
|---|---|---|
| MC³ model averaging | | 29.3 |
| Occam's Window model averaging | | 31.4 |
| Efroymson | 3, 4, 8, 9, 13, 15 | 39.1 |
| Adjusted R² | 1, 2, 3, 4, 5, 11, 12, 13, 15 | 44.9 |
| Cp | 1, 2, 3, 4, 11, 13, 15 | 45.2 |

(see, for example, Murphy and Winkler, 1977). The calibration plot for the model chosen by Efroymson and for model averaging using Occam's Window is shown in Figure 3. The shaded area in Figure 3 shows where the model averaging strategy produc...

23 | Applied Statistics: Principles and Examples - Cox - 1981 |

19 | A Bayesian Approach to Regression Selection and Estimation, with Application to a Price Index for Radio Services - Moulton - 1991 |

Citation Context: ...rence as if the selected model were the true model. However, this ignores a major component of uncertainty, namely uncertainty about the model itself (Leamer, 1978; Hodges, 1987; Raftery, 1988, 1993; Moulton, 1991; Draper, 1994). As a consequence, uncertainty about quantities of interest can be underestimated. For striking examples of this see Miller (1984), Regal and Hook (1991), Madigan and York (1993), Raft...

17 | Elicitation of Prior Distributions for Variable-Selection Problems in Regression. Annals of Statistics - Garthwaite, Dickey - 1992 |

17 | Approximate Bayes Factors for Generalized Linear Models - Raftery - 1988 |

Citation Context: ...and to then make inference as if the selected model were the true model. However, this ignores a major component of uncertainty, namely uncertainty about the model itself (Leamer, 1978; Hodges, 1987; Raftery, 1988, 1993; Moulton, 1991; Draper, 1994). As a consequence, uncertainty about quantities of interest can be underestimated. For striking examples of this see Miller (1984), Regal and Hook (1991), Madigan ...

17 | Applied Statistical Decision Theory - Raiffa, Schlaifer - 1961 |

15 | Selection of subsets of regression variables (with Discussion) - Miller - 1984 |

15 | Participation in Illegitimate Activities: Ehrlich Revisited, in Deterrence and Incapacitation - Vandaele - 1978 |

12 | The Effects of Model Selection on Confidence Intervals for the Size of a Closed Population. Statistics in Medicine 10 - Regal, Hook - 1991 |

6 | Discussion on sampling and Bayes inference in scientific modeling and robustness - Geisser - 1980 |

Citation Context: ...k, we assess predictive performance using two strategies. The first measure of predictive ability is the logarithmic scoring rule of Good (1952) which is based on the conditional predictive ordinate (Geisser, 1980). Specifically, we measured the predictive ability of an individual model, $M$, with:

$$-\sum_{d \in D^P} \log \mathrm{pr}(d \mid M, D^T).$$

We measured the predictive performance for model averaging with: $-\sum_{d \in D^P}$ ...

5 | Statistics with Stata 3 - Hamilton - 1993 |

4 | Regression Analysis by Example, 2nd edition - Chatterjee, Price - 1991 |

1 | Applied Linear Regression - Weisberg - 1985 |