## Model selection and accounting for model uncertainty in graphical models using Occam's window (1993)

Citations: 293 (46 self)

### BibTeX

@TECHREPORT{Madigan93modelselection,
    author = {David Madigan and Adrian E. Raftery},
    title = {Model selection and accounting for model uncertainty in graphical models using Occam's window},
    institution = {},
    year = {1993}
}

### Abstract

We consider the problem of model selection and accounting for model uncertainty in high-dimensional contingency tables, motivated by expert system applications. The approach most used currently is a stepwise strategy guided by tests based on approximate asymptotic P-values leading to the selection of a single model; inference is then conditional on the selected model. The sampling properties of such a strategy are complex, and the failure to take account of model uncertainty leads to underestimation of uncertainty about quantities of interest. In principle, a panacea is provided by the standard Bayesian formalism which averages the posterior distributions of the quantity of interest under each of the models, weighted by their posterior model probabilities. Furthermore, this approach is optimal in the sense of maximising predictive ability. However, this has not been used in practice because computing the posterior model probabilities is hard and the number of models is very large (often greater than 10^11). We argue that the standard Bayesian formalism is unsatisfactory and we propose an alternative Bayesian approach that, we contend, takes full account of the true model uncertainty by averaging over a much smaller set of models. An efficient search algorithm is developed for finding these models. We consider two classes of graphical models that arise in expert systems: the recursive causal models and the decomposable
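The averaging idea in the abstract can be illustrated with a small Python sketch. It is not the authors' search algorithm: it assumes the log marginal likelihoods of a handful of candidate models are already known, and it uses a simple threshold rule (keep models whose posterior probability is within a factor `cutoff` of the best one) as a stand-in for the "much smaller set of models". The function names, the default `cutoff=20.0`, and the example numbers are all hypothetical.

```python
import math


def posterior_model_probs(log_marginal_liks, log_priors=None):
    """Turn log marginal likelihoods into posterior model probabilities.

    Equal prior model probabilities are assumed when log_priors is None.
    """
    if log_priors is None:
        log_priors = [0.0] * len(log_marginal_liks)
    scores = [lik + prior for lik, prior in zip(log_marginal_liks, log_priors)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def occams_window(probs, cutoff=20.0):
    """Return indices of models within a factor `cutoff` of the best model.

    The threshold rule and the cutoff value are illustrative assumptions.
    """
    best = max(probs)
    return [i for i, p in enumerate(probs) if p * cutoff >= best]


def model_average(estimates, probs, keep):
    """Average per-model estimates over the retained models only,
    with posterior probabilities renormalised over that subset."""
    mass = sum(probs[i] for i in keep)
    return sum(estimates[i] * probs[i] / mass for i in keep)


# Hypothetical inputs: three candidate models and each model's estimate
# of some quantity of interest.
log_liks = [-100.0, -101.0, -110.0]
estimates = [0.2, 0.4, 0.9]

probs = posterior_model_probs(log_liks)
keep = occams_window(probs)
averaged = model_average(estimates, probs, keep)
```

With these inputs the third model falls outside the window, so the averaged estimate is a weighted mix of the first two models only. Averaging over all three models would instead correspond to the standard Bayesian formalism that the abstract describes as impractical when the model space is very large.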

### Citations

1349 | Local computations with probabilities on graphical structures and their application to expert systems (with discussion) - Lauritzen, Spiegelhalter - 1988 |
Citation Context ...e log-linear models introduced by Goodman (1970) and Haberman (1974). This work is motivated by applications in expert systems which use a belief network to represent knowledge and perform inference (Lauritzen and Spiegelhalter, 1988). These are the two model classes that arise in such applications. Potentially the most important advantage of constructing expert systems in this fashion is the system's ability to modify itself as ... |

1140 | A Bayesian Method for the Induction of Probabilistic Networks from Data - Cooper, Herskovits - 1992 |
Citation Context ... large sparse tables mentioned above are avoided by using exact tests when comparing models. 6.2 Model Priors In the examples considered above, the prior model probabilities pr(M) were assumed equal (Cooper and Herskovits, 1992, also assume that models are equally likely a priori). In general this can be unrealistic and may also be expensive and we will want to penalise the search strategy as it moves further away from the ... |

570 | Theory of Probability - Jeffreys - 1961 |

222 | Multivariate Analysis - Bishop, Fienberg, et al. - 1975 |

215 | A Theory of Inferred Causation - Pearl, Verma - 1991 |
Citation Context ...ring, then a directed link from vj to vi is prohibited. In certain applications it may be possible to search over all possible orderings but this will typically not be the case. Pearl's IC-algorithm (Pearl and Verma, 1991) induces directed "causal" structures from data. An ordering of the nodes is not required, but for each pair of nodes vi and vj, the algorithm does involve searching amongst all subsets of V \ {vi, vj} fo... |

206 | Sequential updating of conditional probabilities on directed graphical structures - Spiegelhalter, Lauritzen - 1990 |

199 | Bayesian analysis in expert systems - Spiegelhalter, Dawid, et al. - 1993 |

172 | Rational decisions - Good - 1952 |

152 | Model Selection - Linhart, Zucchini - 1986 |
Citation Context ... Whittaker (1984), Edwards and Havranek (1985) or Fowlkes et al. (1988)). There are also approaches based on information criteria and discrepancy measures (Gokhale and Kullback, 1978; Sakamoto, 1984; Linhart and Zucchini, 1986). A recent review is provided by Upton (1991) who advocates the use of the BIC statistic. The calculation of Bayes factors for contingency table models has been considered by Spiegelhalter and Smith ... |

124 | Hyper-Markov Laws in the Statistical Analysis of Decomposable Graphical Models. The Annals of Statistics - Dawid, Lauritzen - 1993 |

120 | Testing a point null hypothesis: the irreconcilability of significance levels and evidence - Berger, Sellke - 1987 |

110 | Approximate Bayes factors and accounting for model uncertainty in generalised linear models - Raftery - 1996 |

95 | Bayes factors and model uncertainty - Kass, Raftery - 1995 |

76 | Discovering causal structure - Glymour, Scheines, et al. - 2000 |

62 | Metric methods for analyzing partially ranked data - Critchlow - 1985 |

61 | The Analysis of Frequency Data - Haberman - 1974 |

59 | Recursive causal models - Kiiveri, Speed, et al. - 1984 |

56 | On substantive research hypotheses, conditional independence graphs and graphical chain models (with discussion) - Wermuth, Lauritzen - 1990 |

49 | A fast procedure for model search in multidimensional contingency tables - Edwards, Havranek - 1985 |

48 | Probability forecasting - Dawid - 1986 |

45 | Bayes factors for linear and log-linear models with vague prior information - Spiegelhalter, Smith - 1982 |

39 | Choosing models for cross-classifications - Raftery - 1986 |

35 | The multivariate analysis of qualitative data: Interactions among multiple classifications - Goodman - 1970 |

35 | Uncertainty, Policy Analysis and Statistics - Hodges - 1987 |
Citation Context ... large, as was shown by Regal and Hook (1991) in the contingency table context and by Miller (1984) in the regression context. One bad consequence is that it can lead to decisions that are too risky (Hodges, 1987). In principle, the standard Bayesian formalism provides a panacea for all these difficulties. If Δ is the quantity of interest, such as a parameter, a future observation, or the utility of a course of act... |

32 | Decomposition of maximum likelihood in mixed interaction models - Frydenberg, Lauritzen - 1989 |

29 | The analysis of multidimensional contingency tables when some variables are posterior to others: a modified path analysis approach - Goodman - 1973 |

27 | A note on Bayes factors for log-linear contingency table models with vague prior information - Raftery - 1986 |
Citation Context ...ture Plans; college or job. Upton (1991) reports that a model selection procedure based on the AIC criterion (Akaike, 1973) selects [ABCE][CDF][BCD][DEF] while a procedure based on the BIC criterion (Raftery, 1986a) selects the much simpler [A][BE][CE][CF][BD][DE][DF]. Clearly an important difference between these two models is the treatment of A. The Bayesian graphical model selection procedure started from the... |

24 | The Information in Contingency Tables - Gokhale, Kullback - 1978 |
Citation Context ...man (1973), Wermuth (1976), Havranek (1984), Whittaker (1984), Edwards and Havranek (1985) or Fowlkes et al. (1988)). There are also approaches based on information criteria and discrepancy measures (Gokhale and Kullback, 1978; Sakamoto, 1984; Linhart and Zucchini, 1986). A recent review is provided by Upton (1991) who advocates the use of the BIC statistic. The calculation of Bayes factors for contingency table models has... |

23 | Analysis of Multidimensional Contingency Tables by Exact Methods - Kreiner - 1987 |

20 | Bayesian testing of precise hypotheses (with discussion) - Berger, Delampady - 1987 |

19 | A Bayesian Approach to Regression Selection and Estimation, with Application to a Price Index for Radio Services - Moulton - 1991 |

17 | Approximate Bayes Factors for Generalized Linear Models - Raftery - 1988 |

15 | Selection of subsets of regression variables (with discussion) - Miller - 1984 |

15 | Theory of Probability - Jeffreys, H - 1948 |

12 | Evaluating logistic models for large contingency tables - Fowlkes, Freeny, et al. - 1988 |

12 | A fast procedure for model search in multidimensional contingency tables - Edwards, Havranek - 1985 |

12 | The Effects of Model Selection on Confidence Intervals for the Size of a Closed Population. Statistics in Medicine 10 - Regal, Hook - 1991 |

12 | Model search among multiplicative models - Wermuth - 1976 |

8 | Some Properties of the Dirichlet-Multinomial Distribution and its Use - Chaloner, Duncan - 1987 |

7 | Steno: an expert system for medical diagnosis based on graphical models and model - Andersen, Krebs, et al. - 1991 |

7 | Implementing Bayesian methods in forensic science. Paper presented at the Fourth Valencia International Meeting on Bayesian Statistics - Evett - 1991 |

7 | Simultaneous test procedures—some theory of multiple comparisons - Gabriel - 1969 |

6 | Statistical theory: the prequential approach - Dawid - 1984 |

4 | The power function of conditional log-linear model tests - Fenech, Westfall - 1988 |

4 | The exploratory analysis of survey data using log-linear models. The Statist - Upton - 1991 |
Citation Context ...ll consequent orderings produced the single model shown in Figure 6. The selected model is similar to the model selected by Upton's BIC procedure. The model selected by AIC clearly over-fits the data (Upton, 1991). It is of interest to note the direction of the link from D to F. Both Upton (1991) and Fowlkes et al. (1988) treat D as a response variable and Upton's path diagram shows a directed link from F to... |

2 | Hierarchical mixed interaction models - Edwards - 1990 |

2 | Fitting all possible decomposable and graphical models to multiway contingency tables - Whittaker - 1984 |

2 | A fast procedure for model search in multidimensional contingency tables - Edwards, T - 1985 |

2 | Choosing models for cross-classifications - Raftery - 1986 |
Citation Context ...ture Plans; college or job. Upton (1991) reports that a model selection procedure based on the AIC criterion (Akaike, 1973) selects [ABCE][CDF][BCD][DEF] while a procedure based on the BIC criterion (Raftery, 1986a) selects the much simpler [A][BE][CE][CF][BD][DE][DF]. Clearly an important difference between these two models is the treatment of A. The Bayesian graphical model selection procedure started from the... |

1 | Guided and unguided methods for the selection of models for a set of T multidimensional contingency tables - Goodman - 1973 |