## Posterior Predictive Assessment of Model Fitness Via Realized Discrepancies (1996)

Venue: Statistica Sinica

Citations: 223 (32 self)

### BibTeX

    @ARTICLE{Gelman96posteriorpredictive,
        author = {Andrew Gelman and Xiao-Li Meng and Hal Stern},
        title = {Posterior Predictive Assessment of Model Fitness Via Realized Discrepancies},
        journal = {Statistica Sinica},
        year = {1996},
        pages = {733--807}
    }

### Abstract

This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior predictive assessment, in a framework that also includes conditioning on auxiliary statistics. The Bayesian formulation facilitates the construction and calculation of a meaningful reference distribution not only for any (classical) statistic, but also for any parameter-dependent “statistic” or discrepancy. The latter allows us to propose the realized discrepancy assessment of model fitness, which directly measures the true discrepancy between data and the posited model, for any aspect of the model which we want to explore. The computation required for the realized discrepancy assessment is a straightforward byproduct of the posterior simulation used for the original Bayesian analysis. We illustrate with three applied examples. The first example, which serves mainly to motivate the work, illustrates the difficulty of classical tests in assessing the fitness of a Poisson model to a positron emission tomography image that is constrained to be nonnegative. The second and third examples illustrate the details of the posterior predictive approach in two problems: estimation in a model with inequality constraints on the parameters, and estimation in a mixture model. In all three examples, standard test statistics (either a $\chi^2$ or a likelihood ratio) are not pivotal: the difficulty is not just how to compute the reference distribution for the test, but that in the classical framework no such distribution exists, independent of the unknown model parameters.

Key words and phrases: Bayesian p-value, $\chi^2$ test, discrepancy, graphical assessment, mixture model, model criticism, posterior predictive p-value, prior predictive
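The abstract's claim that the realized discrepancy assessment falls out of the posterior simulation can be illustrated with a minimal sketch. It is not one of the paper's three examples: it assumes a toy model $y_i \sim N(\theta, 1)$ with a flat prior on $\theta$ (so the posterior is $N(\bar{y}, 1/n)$) and the discrepancy $D(y; \theta) = \sum_i (y_i - \theta)^2$. The function names and the two small datasets are hypothetical, chosen only for illustration. For each posterior draw of $\theta$, a replicate dataset is simulated and the realized discrepancy of the replicate is compared with that of the observed data; the p-value estimate is the fraction of draws in which the replicate is at least as discrepant.

```python
import random


def realized_discrepancy(y, theta):
    """D(y; theta): sum of squared residuals for the assumed model y_i ~ N(theta, 1)."""
    return sum((yi - theta) ** 2 for yi in y)


def posterior_predictive_pvalue(y, n_draws=2000, seed=0):
    """Estimate P(D(y_rep; theta) >= D(y; theta) | y) by simulation.

    Assumes a flat prior on theta, so the posterior is N(mean(y), 1/n).
    """
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    posterior_sd = (1.0 / n) ** 0.5
    exceed = 0
    for _ in range(n_draws):
        theta = rng.gauss(ybar, posterior_sd)              # one posterior draw
        y_rep = [rng.gauss(theta, 1.0) for _ in range(n)]  # replicate given theta
        if realized_discrepancy(y_rep, theta) >= realized_discrepancy(y, theta):
            exceed += 1
    return exceed / n_draws


# Hypothetical data: one set roughly consistent with unit variance, one far too spread out.
well_fit = [-1.5, -1.0, -0.5, -0.3, 0.0, 0.2, 0.5, 0.9, 1.2, 1.6]
overdispersed = [-4.5, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.5, 0.5]

if __name__ == "__main__":
    print("well-fit data p-value:     ", posterior_predictive_pvalue(well_fit))
    print("overdispersed data p-value:", posterior_predictive_pvalue(overdispersed))
```

In this sketch a p-value near 0 flags data that are more dispersed than the unit-variance model allows, while a moderate value gives no evidence of misfit for this particular discrepancy. Because the discrepancy depends on $\theta$, it is evaluated at the same posterior draw for both the observed and the replicated data.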

### Citations

1598 | Bayesian data analysis - Gelman, Carlin, et al. - 1995 |
Citation Context: ...ould be a super-model that incorporates various alternative models, as in Leamer, 1978, Madigan and Raftery, 1994, and Draper, 1995), to the available data. As we illustrate here and elsewhere (e.g., Gelman, Carlin, Stern, and Rubin, 1995), it is entirely possible to construct sensible discrepancy variables to detect the lack of fit of a single model, in the absence of explicit alternative models. We disagree with the opinion that one...

1075 | Multiple Imputation for Nonresponse in Surveys - Rubin - 1987 |
Citation Context: ...$\theta$) and $D(y^{\mathrm{rep}}; \theta)$. We also note that "double parametric bootstrap" or various Bayesian bootstrap methods can sometimes be used to obtain approximations to posterior predictive distributions (e.g., Rubin, 1987, Ch. 4; Tsay, 1992). Simulating reference distributions for $D_{\min}$ and $D_{\mathrm{avg}}$ is more complicated because one must minimize or average over $\theta$ when evaluating their values. To compute $D_{\min}$, one needs ...

874 | Inference from iterative simulation using multiple sequences - Gelman, Rubin - 1992 |

747 | Statistical Analysis of Finite Mixture Distributions - Titterington, Smith, et al. - 1985 |
Citation Context: ...en (high scores on all variables). It is well known that the usual asymptotic reference distribution for the likelihood ratio test (the $\chi^2$ distribution) is not appropriate for mixture models (e.g., Titterington, Smith, and Makov, 1985). At one level, this is a model selection problem (i.e., choosing the number of classes) for which a complete Bayesian analysis, incorporating the uncertainty in the number of classes, could be carri...

686 | The calculation of posterior distributions by data augmentation (with discussion) - Tanner, Wong - 1987 |

304 | Model selection and accounting for model uncertainty in graphical models using Occam’s window - Madigan, Raftery - 1994 |
Citation Context: ...mphasize that the posterior predictive approach is suitable for assessing the fitness of a single model (which could be a super-model that incorporates various alternative models, as in Leamer, 1978, Madigan and Raftery, 1994, and Draper, 1995), to the available data. As we illustrate here and elsewhere (e.g., Gelman, Carlin, Stern, and Rubin, 1995), it is entirely possible to construct sensible discrepancy variables to d...

260 | Specification searches: Ad hoc inference with nonexperimental data - Leamer - 1978 |
Citation Context: ...pectives. We emphasize that the posterior predictive approach is suitable for assessing the fitness of a single model (which could be a super-model that incorporates various alternative models, as in Leamer, 1978, Madigan and Raftery, 1994, and Draper, 1995), to the available data. As we illustrate here and elsewhere (e.g., Gelman, Carlin, Stern, and Rubin, 1995), it is entirely possible to construct sensible...

137 | Bayesianly justifiable and relevant frequency calculations for the applied statistician - Rubin - 1984 |

135 | Assessment and Propagation of Model Uncertainty (with discussion) - Draper - 1995 |
Citation Context: ...dictive approach is suitable for assessing the fitness of a single model (which could be a super-model that incorporates various alternative models, as in Leamer, 1978, Madigan and Raftery, 1994, and Draper, 1995), to the available data. As we illustrate here and elsewhere (e.g., Gelman, Carlin, Stern, and Rubin, 1995), it is entirely possible to construct sensible discrepancy variables to detect the lack of ...

122 | Sampling and Bayes’ inference in scientific modeling and robustness - Box - 1980 |

107 | Model Determination using Predictive Distributions with Implementation via Sampling-Based Methods (with discussion) - Gelfand, Dey, et al. - 1992 |

80 | Posterior predictive p-values - Meng - 1994 |
Citation Context: ...H; y) $d\theta$; (5) which is the classical p-value of (3) averaged over the posterior distribution of $\theta$. This is the p-value defined by Rubin (1984), which we term the posterior predictive p-value (also see Meng, 1994) to contrast it with the prior predictive p-value of Box (1980); see Section 4.1 for discussion. Clearly, the sampling and posterior predictive reference distributions of $T(y^{\mathrm{rep}})$ are identical whe...

72 | On the distribution of the likelihood ratio - Chernoff - 1954 |
Citation Context: ...easy to implement, because their reference distributions are known, at least approximately. Useful approximations to distributions of test statistics are possible for some problems (see, for example, Chernoff, 1954, concerning extensions of the linear model), but are not always available (see, for example, McCullagh, 1985, 1986, concerning the difficulty of finding distributions of classical goodness-of-fit tes...

61 | On Substantive Research Hypotheses, Conditional Independence Graphs and Graphical Chain Models - Wermuth, Lauritzen - 1990 |
Citation Context: ...ause $\theta$ is unknown, but assumed to have the same value that generated the current data $y$, we simulate from its posterior distribution given $y$. Figure 1a is a conditional independence graph (see, e.g., Wermuth and Lauritzen, 1990) that displays the dependence relations between $y$, $\theta$, and $y^{\mathrm{rep}}$. Given $\theta$, the data $y$ and the replicate $y^{\mathrm{rep}}$ are independent, and both represent possible datasets resulting from the given value of $\theta$...

50 | A Bayesian approach to outlier detection and residual analysis - Chaloner, Brant - 1988 |

44 | The $\chi^2$ test of goodness of fit - Cochran - 1952 |
Citation Context: ...kson, 1980); we consider the minimum $\chi^2$ in our presentation, but similar results could be obtained using the MLE.) Thus $X^2_{\min}(y)$ is approximately pivotal with a $\chi^2_{n-k}$ distribution (e.g., Cochran, 1952). Consequently, the posterior predictive p-value can be approximated by $P(\chi^2_{n-k} \ge X^2_{\min}(y))$. Furthermore, if $\theta$ is given a diffuse uniform prior distribution in the subspace defined by a lin...

44 | Estimation in parallel randomized experiments - Rubin - 1981 |

36 | The use of the concept of a future observation in goodness-of-fit problems - Guttman - 1967 |

21 | Minimum Chi-Square, not Maximum Likelihood! - Berkson - 1980 |
Citation Context: ...-fit test statistic. (The classical $\chi^2$ test is sometimes evaluated at the maximum likelihood estimate (MLE) and sometimes at the minimum-$\chi^2$ estimate, a distinction of some controversy (see, e.g., Berkson, 1980); we consider the minimum $\chi^2$ in our presentation, but similar results could be obtained using the MLE.) Thus $X^2_{\min}(y)$ is approximately pivotal with a $\chi^2_{n-k}$ distribution (e.g., Cochran, 1...

20 | Generalized p-values in significance testing of hypotheses in the presence of nuisance parameters - Tsui, Weerahandi - 1989 |

20 | Bayesian model monitoring - West - 1986 |

18 | Model checking via parametric bootstraps in time series analysis - Tsay - 1992 |
Citation Context: .... We also note that "double parametric bootstrap" or various Bayesian bootstrap methods can sometimes be used to obtain approximations to posterior predictive distributions (e.g., Rubin, 1987, Ch. 4; Tsay, 1992). Simulating reference distributions for $D_{\min}$ and $D_{\mathrm{avg}}$ is more complicated because one must minimize or average over $\theta$ when evaluating their values. To compute $D_{\min}$, one needs to determine, for e...

15 | Testing in latent class models using a posterior predictive check distribution - Rubin, Stern - 1994 |

11 | The conditional distribution of goodness-of-fit statistics for discrete data - McCullagh - 1986 |

10 | Bayesian residual analysis in the presence of censoring - Chaloner - 1991 |

10 | Robustness of maximum likelihood estimates for multi-step predictions: the exponential smoothing case - Tiao, Xu - 1993 |

9 | The analysis of repeated-measures data on schizophrenic reaction times using mixture models - Belin, Rubin - 1995 |

9 | Bayesian model-building by pure thought: some principles and examples, Statistica Sinica 6 - Gelman - 1996 |

6 | A Simple Monte Carlo Approach to Bayesian Graduation. Transactions of the Society of Actuaries 44:55–76 - Carlin - 1992 |
Citation Context: ...amples 3.1 Fitting an increasing, convex mortality rate function. For a simple real-life example, we reanalyze the data of Broffitt (1988), who presents a problem in the estimation of mortality rates (Carlin, 1992, provides another Bayesian analysis of these data). For each age, $t$, from 35 to 64 years, inclusive, Table 1 gives $N_t$, the number of people insured under a certain policy and $y_t$, the number of in...

6 | On the asymptotic distribution of Pearson’s statistic in linear exponential family models - McCullagh - 1985 |
Citation Context: ...tions to distributions of test statistics are possible for some problems (see, for example, Chernoff, 1954, concerning extensions of the linear model), but are not always available (see, for example, McCullagh, 1985, 1986, concerning the difficulty of finding distributions of classical goodness-of-fit tests in generalized linear models). The classical approach relying on known, or approximately known, reference ...

3 | Increasing and increasing convex Bayesian graduation - Broffitt - 1988 |
Citation Context: ...tribution for $\theta$. Since we were willing to use the MLE, we use a uniform prior distribution, under the constraint of increasing convexity. (The uniform distribution is also chosen here for simplicity; Broffitt, 1988, and Carlin, 1992, apply various forms of the gamma prior distribution.) Samples from the posterior distribution are generated by simulating a random walk through the space of permissible values of $\theta$...

3 | Topics in Image Reconstruction for Emission Tomography - Gelman - 1990 |

2 | Using mixture models in temperament research - Stern, Arcus, et al. - 1995 |

1 | Statistical analysis of a medical imaging experiment - Gelman - 1992 |

1 | Bayesian model checking using tail area probabilities - L, Stern - 1992 |

1 | A Bayesian approach to model comparison in reliability via predictive simulation - Upadhyay, Smith - 1993 |