## Predictive model assessment for count data (2007)

Citations: | 13 - 1 self |

### BibTeX

@TECHREPORT{Czado07predictivemodel,

author = {Claudia Czado and Tilmann Gneiting and Leonhard Held},

title = {Predictive model assessment for count data},

institution = {},

year = {2007}

}

### Abstract

Summary. We discuss tools for the evaluation of probabilistic forecasts and the critique of statistical models for ordered discrete data. Our proposals include a non-randomized version of the probability integral transform, marginal calibration diagrams and proper scoring rules, such as the predictive deviance. In case studies, we critique count regression models for patent data, and assess the predictive performance of Bayesian age-period-cohort models for larynx cancer counts in Germany.
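The tools named in the abstract can be illustrated with a minimal Python sketch. It assumes Poisson predictive distributions purely for illustration, and all function names are hypothetical. The logarithmic score and the ranked probability score follow the definitions quoted in the citation contexts on this page (equations 7 and 10). The conditional form used for the non-randomized PIT, linear between $P_{x-1}$ and $P_x$, is an assumption based on the usual statement of that transform, and the infinite sum in the ranked probability score is truncated at an arbitrary `k_max`.

```python
import math

def poisson_pmf(k, mu):
    """Probability mass p_k of a Poisson forecast with mean mu > 0."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def poisson_cdf(k, mu):
    """Cumulative probability P_k = p_0 + ... + p_k, with P_{-1} = 0."""
    return sum(poisson_pmf(j, mu) for j in range(k + 1))

def log_score(x, mu):
    """Logarithmic score -log p_x; smaller values are better."""
    return -math.log(poisson_pmf(x, mu))

def ranked_probability_score(x, mu, k_max=200):
    """Sum over k of {P_k - 1(x <= k)}^2, truncated at k_max (assumption)."""
    total, cdf = 0.0, 0.0
    for k in range(k_max + 1):
        cdf += poisson_pmf(k, mu)
        total += (cdf - (1.0 if x <= k else 0.0)) ** 2
    return total

def pit_cdf(u, x, mu):
    """Non-randomized PIT: 0 below P_{x-1}, 1 above P_x, linear in between."""
    lo, hi = poisson_cdf(x - 1, mu), poisson_cdf(x, mu)
    if u <= lo:
        return 0.0
    if u >= hi:
        return 1.0
    return (u - lo) / (hi - lo)

def pit_histogram(counts, means, bins=10):
    """Bin heights of the mean PIT over all forecast-observation pairs."""
    n = len(counts)

    def mean_cdf(u):
        return sum(pit_cdf(u, x, m) for x, m in zip(counts, means)) / n

    return [mean_cdf((j + 1) / bins) - mean_cdf(j / bins) for j in range(bins)]
```

For example, `pit_histogram([0, 1, 2, 5], [1.0, 1.0, 2.0, 3.0])` returns ten bin heights that sum to one; roughly equal heights would be consistent with calibrated forecasts, while a U shape would point to underdispersed predictive distributions.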

### Citations

2196 | Handbook of Mathematical Functions - Abramowitz, Stegun - 1965 |

1581 |
Generalized linear models
- Nelder, Wedderburn
- 1972
Citation Context ...02, p. 587). If the predictive distribution is a member of a one-parameter exponential family, such as the binomial or Poisson, the standardizing term is routinely taken to be the saturated deviance (McCullagh and Nelder, 1989, pp. 33-34; Knorr-Held and Rainer, 2001, p. 114; Spiegelhalter et al., 2002, p. 606; Clements et al., 2005, p. 581). However, when the predictive distributions come from possibly distinct parametric ... |

366 |
Modern applied statistics with S-PLUS
- Venables, Ripley
- 1997
Citation Context ...shows non-randomized PIT histograms based on the leave-one-out predictive distributions, using Poisson and negative binomial count regression models fitted with the R functions glm() and glm.nb() (Venables and Ripley, 1997, Section 7.4). The PIT histogram for the Poisson case indicates under-dispersion of the Poisson regression model. The histogram for the negative binomial case does not show any lack of model fit. Fig... |

312 | Econometric Models for Count Data with an Application to the Patents-R&D - Hausman, Hall, et al. - 1984 |

265 | Evaluating Density Forecasts with an Application to Financial Risk Management. International Economic Review 39:863–883 - Diebold, Gunther, et al. - 1998 |

181 |
Statistical theory: The prequential approach (with discussion)
- Dawid
- 1984
Citation Context ...ik.tu-muenchen.de † email: tilmann@stat.washington.edu ‡ email: leonhard.held@ifspm.unizh.ch ... probabilistic in nature, taking the form of probability distributions over future quantities and events (Dawid, 1984). Here, we consider the evaluation of probabilistic forecasts, or predictive distributions, for count data, as they occur in a wide range of epidemiological, ecological, environmental, climatological... |

169 | Strictly proper scoring rules, prediction and estimation
- Gneiting, Raftery
- 2004
Citation Context ...nsform (PIT) that is tailored to count data, and the marginal calibration diagram. Section 3 discusses the use of scoring rules as omnibus performance measures. We stress the importance of propriety (Gneiting and Raftery, 2007), note examples, relate to classical measures of predictive performance, and identify the predictive deviance as a variant of the proper logarithmic score. Section 4 turns to a cross-validation stu... |

167 |
Rational decisions
- Good
- 1952
Citation Context ...garithmic score is defined as $\mathrm{logs}(P, x) = -\log p_x$ (7). This is the only proper scoring rule that depends on the predictive distribution $P$ only through the probability mass $p_x$ at the observed count (Good, 1952). The associated expected score or generalized entropy function is the classical Shannon entropy. There is a close relationship between the logarithmic score and the predictive deviance, defined as d... |

160 | Bayesian measures of model complexity and fit (with discussion) - Spiegelhalter, Best, et al. - 2002 |

141 |
Elicitation of Personal Probabilities and Expectations
- Savage
- 1971
Citation Context ...is regular if $s(P, x)$ is finite, except possibly that $s(P, x) = \infty$ if $p_x = 0$. Let $\mathcal{P}$ denote the class of probability measures on the set of the nonnegative integers. The Savage representation theorem (Savage, 1971; Gneiting and Raftery, 2007) states that a regular scoring rule $s$ for count data is proper if and only if $s(P, x) = h(P) - \sum_{k=0}^{\infty} h'_k(P) \, p_k + h'_x(P)$, where $h : \mathcal{P} \to \mathbb{R}$ is a concave function and h ... |

138 |
Bayesian computation and stochastic systems (with discussion)
- Besag, Green, et al.
- 1995
Citation Context ...re poor.” Here we use non-parametric smoothing priors within a hierarchical Bayesian framework, for which model-based extrapolation of period and cohort effects for future periods is straightforward (Besag et al., 1995). This choice has the additional advantage that adjustments for overdispersion are easy to make. Inference and prediction based on Markov chain Monte Carlo techniques is done as described in Knorr-He... |

124 |
Bayesianly justifiable and relevant frequency calculations for the applied statistician
- Rubin
- 1984
Citation Context ...ive distributions subject to calibration. Calibration refers to the statistical consistency between the probabilistic forecasts and the observations, and its assessment requires frequentist thinking (Rubin, 1984). Gneiting et al. (2007) distinguish various modes of calibration and propose tools for the assessment of calibration and sharpness for probabilistic forecasts of continuous variables. Here, we adapt... |

99 |
The statistical evaluation of medical tests for classification and prediction
- Pepe
- 2004
Citation Context ... Armstrong and Moolgavkar, 2005). To this date, statistical methods for the assessment of predictive performance have been studied primarily from biomedical, meteorological and economic perspectives (Pepe, 2003; Jolliffe and Stephenson, 2003; Clements, 2005), focusing on predictions of dichotomous events or real-valued continuous variables. Here, we consider the hybrid case of count data, in which methods d... |

68 | Model choice: A minimum posterior predictive loss approach - GELFAND, GHOSH - 1998 |

60 | Interpretation of rank histograms for verifying ensemble forecasts
- Hamill
- 2001
Citation Context ...Gunther and Tay, 1998; Gneiting et al., 2007). The PIT histogram is typically used informally as a diagnostic tool; formal tests can also be employed though they require care in their interpretation (Hamill, 2001; Jolliffe, 2007). Deviations from uniformity hint at reasons for forecast failures and model deficiencies. U-shaped histograms indicate underdispersed predictive distributions, hump or inverse-U shap... |

46 | Probabilistic forecasts, calibration and sharpness
- Gneiting, Balabdaoui, et al.
- 2007
Citation Context ...y plotting the empirical CDF of a set of PIT values and comparing to the identity function, or by plotting the histogram of the PIT values and checking for uniformity (Diebold, Gunther and Tay, 1998; Gneiting et al., 2007). The PIT histogram is typically used informally as a diagnostic tool; formal tests can also be employed though they require care in their interpretation (Hamill, 2001; Jolliffe, 2007). Deviations fr... |

36 |
A scoring system for probability forecasts of ranked categories
- Epstein
- 1969
Citation Context ...$+ \|p\|^2$ (8) and $\mathrm{sphs}(P, x) = -\dfrac{p_x}{\|p\|}$ (9), respectively. Wecker (1989) proposed the use of the quadratic score in the assessment of time series predictions of counts. The ranked probability score (Epstein, 1969) was originally introduced for ranked categorical data. It is easily adapted to count data, by defining $\mathrm{rps}(P, x) = \sum_{k=0}^{\infty} \{P_k - 1(x \le k)\}^2$ (10). Equation (14) in Gneiting and Raftery (2007) implie... |

33 | Probabilistic quantitative precipitation forecasting using Bayesian model averaging. Mon - Sloughter, Raftery, et al. - 2007 |

30 |
Negative binomial and mixed Poisson regression
- Lawless
- 1987
Citation Context ...on relative to a Poisson regression model (Dean and Lawless, 1989; Winkelmann, 2005). Various alternatives have been suggested to accommodate this, such as negative binomial and mixed Poisson models (Lawless, 1987). In this section, we investigate whether the non-randomized PIT histogram, the marginal calibration diagram and proper scoring rules are effective tools for model criticism (O’Hagan, 2003) in this c... |

27 | Bayesian prediction of spatial count data using generalised linear mixed models
- Christensen, Waagepetersen
- 2002
Citation Context ...obabilistic forecasts, or predictive distributions, for count data, as they occur in a wide range of epidemiological, ecological, environmental, climatological, demographic and economic applications (Christensen and Waagepetersen, 2002; Gotway and Wolfinger, 2003; McCabe and Martin, 2005; Elsner and Jagger, 2006; Frühwirth-Schnatter and Wagner, 2006; Nelson and Leroux, 2006). Our focus is on the low count situation in which continu... |

20 |
Diagnostic check of non-standard time series models
- Smith
- 1985
Citation Context ...fically, if $P$ is the predictive distribution, $x \sim P$ is a random count and $v$ is standard uniform and independent of $x$, then $u = P_{x-1} + v(P_x - P_{x-1})$ for $x \ge 1$ (1) and $u = v P_0$ for $x = 0$ (2) is standard uniform (Smith, 1985, pp. 286–287; Frühwirth-Schnatter, 1996, p. 297; Liesenfeld, Nolte and Pohlmeier, 2006, pp. 819–820). For time series data one typically considers one-step (or k-step) ahead predictions, based on a ti... |

18 | Modelling count data with overdispersion and spatial effects - GSCHLÖßL, CZADO - 2006 |

17 |
Tests for detecting overdispersion in Poisson regression models
- Dean, Lawless
- 1989
Citation Context ...or, is the same for all three forecasts. 4. Case study: Model critique for count regression. Count data often show substantial extra variation or overdispersion relative to a Poisson regression model (Dean and Lawless, 1989; Winkelmann, 2005). Various alternatives have been suggested to accommodate this, such as negative binomial and mixed Poisson models (Lawless, 1987). In this section, we investigate whether the non-r... |

16 | Using age, period and cohort models to estimate future mortality rates - Osmond - 1985 |

15 |
Evaluating Econometric Forecasts Of Economic And Financial Variables, Palgrave
- Clements
- 2005
Citation Context ...ate, statistical methods for the assessment of predictive performance have been studied primarily from biomedical, meteorological and economic perspectives (Pepe, 2003; Jolliffe and Stephenson, 2003; Clements, 2005), focusing on predictions of dichotomous events or real-valued continuous variables. Here, we consider the hybrid case of count data, in which methods developed for either type of situation continue ... |

15 |
Bayesian predictions of low count time series
- McCabe, Martin
- 2005
Citation Context ...a, as they occur in a wide range of epidemiological, ecological, environmental, climatological, demographic and economic applications (Christensen and Waagepetersen, 2002; Gotway and Wolfinger, 2003; McCabe and Martin, 2005; Elsner and Jagger, 2006; Frühwirth-Schnatter and Wagner, 2006; Nelson and Leroux, 2006). Our focus is on the low count situation in which continuum approximations fail; however, our results apply to... |

14 | Identification and Estimation of Age-Period-Cohort Models in the Analysis of Discrete Archival Data. Sociological Methodology 10 - Fienberg, Mason - 1979 |

12 | Coherent dispersion criteria for optimal experimental design - Dawid, Sebastiani - 1999 |

12 | The estimation of age, period and cohort effects for vital rates - Holford - 1983 |

11 |
Prediction models for annual U.S. hurricane counts
- Elsner, Jagger
- 2006
Citation Context ...e range of epidemiological, ecological, environmental, climatological, demographic and economic applications (Christensen and Waagepetersen, 2002; Gotway and Wolfinger, 2003; McCabe and Martin, 2005; Elsner and Jagger, 2006; Frühwirth-Schnatter and Wagner, 2006; Nelson and Leroux, 2006). Our focus is on the low count situation in which continuum approximations fail; however, our results apply to high counts and rates as... |

10 | Zero-inflated generalized Poisson models with regression effects on the mean, dispersion and zero-inflation level applied to patent outsourcing rates. Stat. Modelling 7 - Czado, Erhardt, et al. - 2007 |

10 | Probabilistic quantitative precipitation forecasts for river forecasting - Krzysztofowicz, Drake - 1992 |

8 |
Auxiliary Mixture Sampling for parameter-driven Models of time Series of Counts with Application to State Space Modelling
- Fruehwirth-Schnatter, Wagner
- 2006
Citation Context ...l, ecological, environmental, climatological, demographic and economic applications (Christensen and Waagepetersen, 2002; Gotway and Wolfinger, 2003; McCabe and Martin, 2005; Elsner and Jagger, 2006; Frühwirth-Schnatter and Wagner, 2006; Nelson and Leroux, 2006). Our focus is on the low count situation in which continuum approximations fail; however, our results apply to high counts and rates as well, as they occur routinely in epid... |

7 |
Scoring probabilistic forecasts: The importance of being proper
- Bröcker, Smith
- 2007
Citation Context ... be strictly proper. If $s(Q, Q) \le s(P, Q)$ for all $P$ and $Q$, the scoring rule is said to be proper. Propriety is an essential property of a scoring rule that encourages honest and coherent predictions (Bröcker and Smith, 2007; Gneiting and Raftery, 2007). Strict propriety ensures that both calibration and sharpness are being addressed. A scoring rule $s$ for count data is regular if $s(P, x)$ is finite, except possibly that s... |

6 | Application of Markov chain Monte Carlo methods to projecting cancer incidence and mortality - Bray |

6 |
Modelling Financial Transaction Price Movements: A Dynamic Integer Count Data Model
- Liesenfeld, Nolte, et al.
- 2005
Citation Context ... score, $\mathrm{nses}(P, x) = \left( \dfrac{x - \mu_P}{\sigma_P} \right)^2$ (12), where $\mu_P$ and $\sigma_P^2$ denote the mean and the variance of $P$, ought to be approximately one when averaged over the predictions (Carroll and Cressie, 1997, p. 52; Liesenfeld et al., 2006, pp. 811, 818). Gotway and Wolfinger (2003, p. 1423) call the mean normalized squared error score the average empirical-to-model variability ratio, arguing also that it should be close to one. ... |

5 |
Spatial modeling of snow water equivalent using covariances estimated from spatial and geomorphic attributes
- Carroll, Cressie
- 1997
Citation Context ...sidual or normalized squared error score, $\mathrm{nses}(P, x) = \left( \dfrac{x - \mu_P}{\sigma_P} \right)^2$ (12), where $\mu_P$ and $\sigma_P^2$ denote the mean and the variance of $P$, ought to be approximately one when averaged over the predictions (Carroll and Cressie, 1997, p. 52; Liesenfeld et al., 2006, pp. 811, 818). Gotway and Wolfinger (2003, p. 1423) call the mean normalized squared error score the average empirical-to-model variability ratio, arguing also that i... |

5 | Analysis of Patent Data – A Mixed-Poisson-Regression-Model Approach - Wang, Cockburn, et al. - 1998 |

4 |
Recursive residuals and model diagnostics for normal and non–normal state space models
- Frühwirth–Schnatter
- 1996
Citation Context ...ctive distribution, $x \sim P$ is a random count and $v$ is standard uniform and independent of $x$, then $u = P_{x-1} + v(P_x - P_{x-1})$ for $x \ge 1$ (1) and $u = v P_0$ for $x = 0$ (2) is standard uniform (Smith, 1985, pp. 286–287; Frühwirth-Schnatter, 1996, p. 297; Liesenfeld, Nolte and Pohlmeier, 2006, pp. 819–820). For time series data one typically considers one-step (or k-step) ahead predictions, based on a time series model fitted on past and curre... |

4 |
Spatial prediction of counts and rates
- Gotway, Wolfinger
- 2003
Citation Context ...distributions, for count data, as they occur in a wide range of epidemiological, ecological, environmental, climatological, demographic and economic applications (Christensen and Waagepetersen, 2002; Gotway and Wolfinger, 2003; McCabe and Martin, 2005; Elsner and Jagger, 2006; Frühwirth-Schnatter and Wagner, 2006; Nelson and Leroux, 2006). Our focus is on the low count situation in which continuum approximations fail; howe... |

4 |
Uncertainty and inference for verification measures
- Jolliffe
- 2007
Citation Context ...y, 1998; Gneiting et al., 2007). The PIT histogram is typically used informally as a diagnostic tool; formal tests can also be employed though they require care in their interpretation (Hamill, 2001; Jolliffe, 2007). Deviations from uniformity hint at reasons for forecast failures and model deficiencies. U-shaped histograms indicate underdispersed predictive distributions, hump or inverse-U shaped histograms po... |

4 | HSSS model criticism - O’Hagan - 2003 |

3 |
Lung cancer rate predictions using generalized additive models
- Clements, Armstrong, et al.
- 2005
Citation Context ...ial or Poisson, the standardizing term is routinely taken to be the saturated deviance (McCullagh and Nelder, 1989, pp. 33-34; Knorr-Held and Rainer, 2001, p. 114; Spiegelhalter et al., 2002, p. 606; Clements et al., 2005, p. 581). However, when the predictive distributions come from possibly distinct parametric or non-parametric families, it is vital that the standardizing terms in the deviance are common (Spiegelhal... |

3 | A new marked point process model for the federal funds rate target: Methodology and forecast evaluation, forthcoming Journal of Economic Dynamics and Control - Kehrle, K - 2008 |

3 |
Projections of lung cancer mortality in West Germany: A case study in Bayesian prediction
- Knorr-Held, Rainer
- 2001
Citation Context ...). Our focus is on the low count situation in which continuum approximations fail; however, our results apply to high counts and rates as well, as they occur routinely in epidemiological projections (Knorr-Held and Rainer, 2001; Clements, Armstrong and Moolgavkar, 2005). To this date, statistical methods for the assessment of predictive performance have been studied primarily from biomedical, meteorological and economic per... |

2 |
Bayesian projections: What are the effects of excluding data from younger age groups
- Baker, Bray
- 2005
Citation Context ...ect cancer incidence and mortality rates. Data from younger age groups (typically age < 30 years) for which rates are low are often excluded from the analysis. However, a recent empirical comparison (Baker and Bray, 2005) based on data from Hungary suggests that age-specific predictions based on full data are more accurate. A natural question arises here in how to quantify the quality of the predictive distributions.... |

2 |
Forecast Verification: A Practitioner’s Guide in Atmospheric Science
- Jolliffe, Stephenson
- 2003
Citation Context ...nd Moolgavkar, 2005). To this date, statistical methods for the assessment of predictive performance have been studied primarily from biomedical, meteorological and economic perspectives (Pepe, 2003; Jolliffe and Stephenson, 2003; Clements, 2005), focusing on predictions of dichotomous events or real-valued continuous variables. Here, we consider the hybrid case of count data, in which methods developed for either type of sit... |

2 |
Spatial models for autocorrelated count data
- Nelson, Leroux
- 2006
Citation Context ...ogical, demographic and economic applications (Christensen and Waagepetersen, 2002; Gotway and Wolfinger, 2003; McCabe and Martin, 2005; Elsner and Jagger, 2006; Frühwirth-Schnatter and Wagner, 2006; Nelson and Leroux, 2006). Our focus is on the low count situation in which continuum approximations fail; however, our results apply to high counts and rates as well, as they occur routinely in epidemiological projections (... |

1 | Re: “Bayesian projections: What are the effects of excluding data from younger age groups” - Clements, Hakulinen, et al. - 2006 |

1 | The R&D master file documentation - Hall, Cummins, et al. - 1988 |