## Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review 133 (2005)

Citations: | 79 - 29 self |

### BibTeX

@ARTICLE{Raftery05usingbayesian,
    author = {Adrian E. Raftery and Tilmann Gneiting and Fadoua Balabdaoui and Michael Polakowski},
    title = {Using Bayesian model averaging to calibrate forecast ensembles},
    journal = {Monthly Weather Review},
    volume = {133},
    year = {2005}
}

### Abstract

Ensembles used for probabilistic weather forecasting often exhibit a spread-error correlation, but they tend to be underdispersive. This paper proposes a statistical method for postprocessing ensembles based on Bayesian model averaging (BMA), which is a standard method for combining predictive distributions from different sources. The BMA predictive probability density function (PDF) of any quantity of interest is a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts and reflect the models' relative contributions to predictive skill over the training period. The BMA weights can be used to assess the usefulness of ensemble members, and this can be used as a basis for selecting ensemble members; this can be useful given the cost of running large ensembles. The BMA PDF can be represented as an unweighted ensemble of any desired size, by simulating from the BMA predictive distribution. The BMA predictive variance can be decomposed into two components, one corresponding to the between-forecast variability, and the second to the within-forecast variability. Predictive PDFs or intervals based solely on the ensemble spread incorporate the first component but not the second. Thus BMA provides a theoretical explanation of the tendency of ensembles to exhibit a spread-error correlation but yet
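The quantities named in the abstract can be illustrated with a minimal sketch. It assumes normal kernels with one common standard deviation `sd`, which the abstract does not specify; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bma_pdf(y, forecasts, weights, sd):
    """Weighted average of PDFs centered on the bias-corrected forecasts."""
    return sum(w * normal_pdf(y, f, sd) for w, f in zip(weights, forecasts))


def bma_variance(forecasts, weights, sd):
    """Return the (between-forecast, within-forecast) variance components."""
    mean = sum(w * f for w, f in zip(weights, forecasts))
    between = sum(w * (f - mean) ** 2 for w, f in zip(weights, forecasts))
    # With a common sd (assumption) and weights summing to 1,
    # the within-forecast component reduces to sd squared.
    within = sd ** 2
    return between, within


def bma_sample(n, forecasts, weights, sd, rng):
    """Draw an unweighted ensemble of size n from the BMA predictive PDF."""
    centers = rng.choices(forecasts, weights=weights, k=n)
    return [rng.gauss(c, sd) for c in centers]


if __name__ == "__main__":
    # Hypothetical bias-corrected forecasts and weights (illustration only).
    forecasts = [10.0, 12.0, 11.0]
    weights = [0.5, 0.3, 0.2]
    between, within = bma_variance(forecasts, weights, sd=1.5)
    print("between:", between, "within:", within)
    print(bma_sample(5, forecasts, weights, 1.5, random.Random(0)))
```

An interval built from the ensemble spread alone corresponds to the `between` term only, which is why it tends to be too narrow when the `within` term is not small.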

### Citations

8932 | Maximum likelihood from incomplete data via the EM algorithm
- Dempster, Laird, et al.
- 1977
Citation Context ...d it is complex to maximize numerically using direct nonlinear maximization methods such as Newton–Raphson and its variants. Instead, we maximize it using the expectation-maximization (EM) algorithm (Dempster et al. 1977; McLachlan and Krishnan 1997). The EM algorithm is a method for finding the maximum likelihood estimator when the problem can be recast in terms of unobserved quantities such that, if we knew what th... |

1146 | Bayes factors
- Kass, Raftery
- 1995
Citation Context ...ons, and the typical approach, that of conditioning on a single model deemed to be “best”, ignores this source of uncertainty, thus underestimating uncertainty. Bayesian model averaging (Leamer 1978; Kass and Raftery 1995; Hoeting, Madigan, Raftery, and Volinsky 1999) overcomes this problem by conditioning, not on a single “best” model, but on the entire ensemble of statistical models first considered. In the case of ... |

1084 | Finite mixture models
- McLachlan, Peel
- 2000
Citation Context ...ave to be actual data that are missing; instead, they are often latent or unobserved quantities, knowledge of which would simplify the estimation problem. The BMA model (2) is a finite mixture model (McLachlan and Peel 2000). Here we introduce “missing data” z_kst where z_kst = 1 if ensemble member k is the best forecast for verification place s and time t, and z_kst = 0 otherwise. For each (s, t), only one of {z_1st, . . .... |

1063 | The EM Algorithm and Extensions
- McLachlan, Krishnan
- 1997
Citation Context ...ally using direct nonlinear maximization methods such as Newton-Raphson and its variants. Instead, we maximize it using the expectation-maximization, or EM algorithm (Dempster, Laird, and Rubin 1977; McLachlan and Krishnan 1997). The EM algorithm is a method for finding the maximum likelihood estimator when the problem can be recast in terms of “missing data” such that, if we knew the missing data, the estimation problem wo... |
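The EM idea in the citation contexts above (latent indicators of which member is "best", alternated with re-estimation of the parameters) can be sketched as follows. This is a hedged illustration rather than the authors' code: it uses the textbook EM updates for a normal mixture whose component means are fixed at the member forecasts, assumes a single common standard deviation, and the name `em_bma` and its arguments are hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_bma(forecasts, obs, n_iter=200, tol=1e-8):
    """Estimate mixture weights and a common sd by EM.

    forecasts: one list of K bias-corrected member forecasts per training case.
    obs: the verifying observation for each training case.
    """
    n, k_members = len(obs), len(forecasts[0])
    weights = [1.0 / k_members] * k_members
    # Start from the pooled mean squared forecast error.
    var = sum((y - f) ** 2 for fs, y in zip(forecasts, obs) for f in fs) / (n * k_members)
    prev_ll = -math.inf
    for _ in range(n_iter):
        sd = math.sqrt(var)
        # E-step: posterior probability that member k is "best" for each case.
        z, ll = [], 0.0
        for fs, y in zip(forecasts, obs):
            dens = [w * normal_pdf(y, f, sd) for w, f in zip(weights, fs)]
            total = sum(dens)
            ll += math.log(total)
            z.append([d / total for d in dens])
        # M-step: weights are average memberships; variance is the
        # membership-weighted mean squared error.
        weights = [sum(row[k] for row in z) / n for k in range(k_members)]
        var = sum(row[k] * (y - fs[k]) ** 2
                  for row, fs, y in zip(z, forecasts, obs)
                  for k in range(k_members)) / n
        if ll - prev_ll < tol:  # log-likelihood has stopped improving
            break
        prev_ll = ll
    return weights, math.sqrt(var)
```

Each iteration cannot decrease the log-likelihood, so the stopping rule only needs to watch for small improvements. The sketch does not guard against numerical underflow when every member is far from the observation.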

951 | Deterministic Nonperiodic Flow - Lorenz - 1963 |

287 | Data assimilation using an ensemble Kalman filter technique
- Houtekamer, Mitchell
- 1998
Citation Context ...ensembles of entire fields that reproduce the spatial correlation of the error field have been proposed for the situation where just one numerical weather prediction model and initialization is used (Houtekamer and Mitchell 1998; Houtekamer and Mitchell 2001; Gel, Raftery, and Gneiting 2003a). Such methods could be combined with the present proposal to produce multimodel and/or multianalysis ensembles that reproduce spatial ... |

270 | On the mathematical foundations of theoretical statistics
- Fisher
- 1922
Citation Context ...the corresponding verification. Here we will take the forecast horizon to be fixed; in practice we will estimate different models for each forecast horizon. We estimate θ by maximum likelihood (Fisher 1922) from the training data. The likelihood function is defined as the probability of the training data given θ, viewed as a function of θ. The maximum likelihood estimator is the value of θ that maximiz... |

267 | Statistical Methods in the Atmospheric Sciences
- Wilks
- 1995
Citation Context ...ias correction method. In our experiment we used a very simple linear bias correction method. Model output statistics (MOS) is the dominant approach to bias correction, and may give improved results (Wilks 1995). Approaches based on spatial and temporal neighborhoods have also been proposed, for example Eckel and Mass (2003) and Gel, Raftery, and Gneiting (2003b). Note that to be useful in our context, bias... |

170 | Rational decisions - Good - 1952 |

160 | Ensemble forecasting at NMC: The generation of perturbations
- Toth, Kalnay
- 1993
Citation Context ...is run several times with different initial conditions or model physics (Epstein 1969; Leith 1974). Ensembles based on global models have been found useful for medium-range probabilistic forecasting (Toth and Kalnay 1993; Molteni, Buizza, Palmer, and Petroliagis 1996; Houtekamer and Derome 1995; Hamill, Snyder, and Morss 2000). Typically the ensemble mean outperforms all or most of the individual ensemble members, an... |

147 | A sequential ensemble Kalman filter for atmospheric data assimilation
- Houtekamer, Mitchell
- 2001
Citation Context ...t reproduce the spatial correlation of the error field have been proposed for the situation where just one numerical weather prediction model and initialization is used (Houtekamer and Mitchell 1998; Houtekamer and Mitchell 2001; Gel, Raftery, and Gneiting 2003a). Such methods could be combined with the present proposal to produce multimodel and/or multianalysis ensembles that reproduce spatial correlation of error fields by... |

118 | Specification Searches
- Leamer
- 1978
Citation Context ...wing conclusions, and the typical approach, that of conditioning on a single model deemed to be “best”, ignores this source of uncertainty, thus underestimating uncertainty. Bayesian model averaging (Leamer 1978; Kass and Raftery 1995; Hoeting, Madigan, Raftery, and Volinsky 1999) overcomes this problem by conditioning, not on a single “best” model, but on the entire ensemble of statistical models first cons... |

105 | Bayesian model averaging: a tutorial [with discussion]. Statistical Science 14, 382–417. [A version where the number of misprints has been significantly reduced is available at http://www.stat.washington.edu/raftery
- Hoeting, Madigan, et al.
- 1999
Citation Context ...proach, that of conditioning on a single model deemed to be “best,” ignores this source of uncertainty, thus underestimating uncertainty. Bayesian model averaging (Leamer 1978; Kass and Raftery 1995; Hoeting et al. 1999) overcomes this problem by conditioning, not on a single “best” model, but on the entire ensemble of statistical models first considered. In the case of a quantity y to be forecast on the basis of tr... |

102 | The use of Model Output Statistics (MOS) in objective weather forecasting
- Glahn, Lowry
- 1972
Citation Context ...sts have not yet been bias corrected, estimation of a_k and b_k can be viewed as a very simple bias-correction process, and it can also be considered as a very simple form of model output statistics (Glahn and Lowry 1972; Carter et al. 1989). Note that we retain the a_k and b_k in (3) even if the forecasts have been bias corrected. We estimate w_k, k = 1, ..., K, and σ by maximum likelihood (Fisher 1922) from the train... |
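The "very simple bias-correction process" mentioned in this context can be illustrated with an ordinary least-squares fit of observations on one member's forecasts. This sketch assumes the correction has the linear form a + b·f, which the excerpt implies but does not spell out; the function names are hypothetical.

```python
def fit_linear_bias_correction(forecasts, obs):
    """Least-squares intercept a and slope b for obs ≈ a + b * forecast."""
    n = len(obs)
    mean_f = sum(forecasts) / n
    mean_y = sum(obs) / n
    sxx = sum((f - mean_f) ** 2 for f in forecasts)
    sxy = sum((f - mean_f) * (y - mean_y) for f, y in zip(forecasts, obs))
    b = sxy / sxx          # slope; requires forecasts that are not all equal
    a = mean_y - b * mean_f  # intercept
    return a, b


def bias_correct(forecast, a, b):
    """Apply the fitted linear correction to a new raw forecast."""
    return a + b * forecast
```

In this reading, one such pair (a, b) would be fitted per ensemble member over the training period, and the corrected forecasts then serve as the centers of the BMA component PDFs.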

69 | Stochastic representation of model uncertainties in the ECMWF ensemble prediction system - Buizza, Miller, et al. - 1999 |

64 | Interpretation of rank histograms for verifying ensemble forecasts - Hamill - 2001 |

63 | Verification of Eta-RSM short-range ensemble forecasts
- Hamill, Colucci
- 1997
Citation Context ...thus not calibrated. Here we focus on short-range mesoscale forecasting. Several authors have studied the use of a synoptic ensemble, the 15-member NCEP Eta-RSM ensemble, for short-range forecasting (Hamill and Colucci 1997; Hamill and Colucci 1998; Stensrud, Brooks, Du, Tracton, and Rogers 1999). As was the case for medium-range forecasting, the ensemble mean was more skillful for short-range forecasting than the indiv... |

62 | A method for producing and evaluating probabilistic forecasts from ensemble model integrations
- Anderson
- 1996
Citation Context ...ic Northwest, was 0.18 for temperature and 0.42 for sea level pressure; both correlations were positive and the latter was highly statistically significant. However, the verification rank histograms (Anderson 1996; Talagrand et al. 1997; Hamill 2001) for the same data, shown in Fig. 2, show the ensemble to be underdispersive and hence uncalibrated. In this case, the ensemble range based on five members would c... |

52 | Evaluation of probabilistic prediction systems
- Talagrand, Strauss, et al.
- 1997
Citation Context ...as 0.18 for temperature and 0.42 for sea level pressure; both correlations were positive and the latter was highly statistically significant. However, the verification rank histograms (Anderson 1996; Talagrand et al. 1997; Hamill 2001) for the same data, shown in Fig. 2, show the ensemble to be underdispersive and hence uncalibrated. In this case, the ensemble range based on five members would contain 4/6, or 66.7%, o... |
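The verification rank histogram referred to in these contexts can be computed with a few lines. This is a minimal sketch with a hypothetical function name; it assumes continuous-valued data, so ties between an observation and a member are not treated specially.

```python
def rank_histogram(ensembles, obs):
    """Count how often the observation falls in each of the K + 1 rank bins.

    ensembles: one list of K member forecasts per verification case.
    obs: the verifying observation for each case.
    """
    k_members = len(ensembles[0])
    counts = [0] * (k_members + 1)
    for members, y in zip(ensembles, obs):
        # Rank = number of members strictly below the observation.
        rank = sum(1 for m in members if m < y)
        counts[rank] += 1
    return counts
```

With five members there are six bins. A calibrated ensemble fills them about equally, so the four inner bins (observation inside the ensemble range) hold about 4/6 of the cases, which is the figure quoted in the context. An underdispersive ensemble piles counts into the two outer bins.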

47 | Stochastic dynamic prediction
- Epstein
- 1969
Citation Context ...1. Introduction. The dominant approach to probabilistic weather forecasting has been the use of ensembles in which a model is run several times with different initial conditions or model physics (Epstein 1969; Leith 1974). Ensembles based on global models have been found useful for medium-range probabilistic forecasting (Toth and Kalnay 1993; Molteni, Buizza, Palmer, and Petroliagis 1996; Houtekamer and D... |

39 | Initial results of a mesoscale short-range ensemble forecast system over the Pacific Northwest
- Grimit, Mass
- 2002
Citation Context ...endar days. For comparability, the same verifications were used in evaluating all the training periods, so the verifications for the first 100 days were not used. For some days the data were missing (Grimit and Mass 2002), so that the effective number of days in ... [Figure: map panels with Longitude and Latitude axes] ... |

39 | A Comparison of Probabilistic Forecasts from Bred - Hamill, Snyder, et al. - 2000 |

39 | Objective verification of the SAMEX ’98 ensemble forecasts - Hou, Kalnay, et al. - 2001 |

38 | Bayesian model selection in structural equation models
- Raftery
- 1993
Citation Context ...ad-Skill Relationship The BMA predictive variance of $y_{st}$ given the ensemble of forecasts can be written as $\mathrm{Var}(y_{st} \mid \tilde{f}_{1st}, \ldots, \tilde{f}_{Kst}) = \sum_{k=1}^{K} w_k \left( \tilde{f}_{kst} - \sum_{i=1}^{K} w_i \tilde{f}_{ist} \right)^2 + \sum_{k=1}^{K} w_k \sigma_k^2$ (Raftery 1993). The right-hand side has two terms, the first of which summarizes between-forecast spread, and the second measures the expected uncertainty conditional on one of the forecasts being best. We can summ... |

38 | Evaluating probabilistic forecasts using information theory - Roulston, Smith - 2002 |

35 | Decomposition of the continuous ranked probability score for ensemble prediction systems. Weather and Forecasting - Hersbach - 2000 |

32 | A Model Fitting Analysis of Daily Rainfall Data - Stern, Coe - 1984 |

32 | Potential forecast skill of ensemble prediction and spread and skill distributions of the ECMWF ensemble prediction system
- Buizza
- 1997
Citation Context ...lobal ensemble (Toth et al. 2001; Eckel and Walters 1998), the Canadian Ensemble Prediction System (Pellerin et al. 2003), and the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble (Buizza 1997; Buizza et al. 1999; Hersbach et al. 2000; Scherrer et al. 2004); for an overview see Buizza et al. (2005). Here we focus on short-range mesoscale forecasting. Several authors have studied the use of... |

31 | Methods for ensemble prediction
- Houtekamer, Derome
- 1995
Citation Context ...cs (Epstein 1969; Leith 1974). Ensembles based on global models have been found useful for medium-range probabilistic forecasting (Toth and Kalnay 1993; Molteni, Buizza, Palmer, and Petroliagis 1996; Houtekamer and Derome 1995; Hamill, Snyder, and Morss 2000). Typically the ensemble mean outperforms all or most of the individual ensemble members, and in some studies a spread-skill relationship has been observed, in which t... |

30 | Using ensembles for short-range forecasting
- Stensrud, Brooks, et al.
- 1999
Citation Context ...eral authors have studied the use of a synoptic ensemble, the 15-member NCEP Eta–Regional Spectral Model (RSM) ensemble, for short-range forecasting (Hamill and Colucci 1997; Hamill and Colucci 1998; Stensrud et al. 1999). As was the case for medium-range forecasting, the ensemble mean was more skillful for short-range forecasting than the individual ensemble members, but the spread–skill relationship was weak. The f... |

30 | Improved weather and seasonal climate forecasts from multimodel superensemble
- Krishnamurti
- 1999
Citation Context ... our experiments. It has also been proposed that forecasts be combined using multiple linear regression to produce a single deterministic forecast or “superensemble” (Van den Dool and Rukhovets 1994; Krishnamurti et al. 1999; Kharin and Zwiers 2002). It seems likely that BMA and regression would give similar forecasts. However, one difference is that the weights in BMA are constrained to be positive, whereas those in reg... |

27 | Calibrated probabilistic quantitative precipitation forecasts based on the MRF ensemble
- Eckel, Walters
- 1998
Citation Context ...d thus not calibrated. Both spread-error correlations and underdispersion have been observed in the National Centers for Environmental Prediction (NCEP) operational global ensemble (Toth et al. 2001; Eckel and Walters 1998), the Canadian Ensemble Prediction System (Pellerin et al. 2003), and the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble (Buizza 1997; Buizza et al. 1999; Hersbach et al. 2000; S... |

27 | A comparison of the ECMWF, MSC, and NCEP global ensemble prediction systems - Houtekamer, Toth, et al. - 2005 |

23 | Statistical forecasts based on the National Meteorological Center’s numerical weather prediction system
- Carter, Dallavalle, et al.
- 1989
Citation Context ...bias corrected, estimation of a_k and b_k can be viewed as a very simple bias-correction process, and it can also be considered as a very simple form of model output statistics (Glahn and Lowry 1972; Carter et al. 1989). Note that we retain the a_k and b_k in (3) even if the forecasts have been bias corrected. We estimate w_k, k = 1, ..., K, and σ by maximum likelihood (Fisher 1922) from the training data. The likeli... |

20 | Evaluation of Eta-RSM ensemble probabilistic precipitation forecasts
- Hamill, Colucci
- 1998
Citation Context ... we focus on short-range mesoscale forecasting. Several authors have studied the use of a synoptic ensemble, the 15-member NCEP Eta-RSM ensemble, for short-range forecasting (Hamill and Colucci 1997; Hamill and Colucci 1998; Stensrud, Brooks, Du, Tracton, and Rogers 1999). As was the case for medium-range forecasting, the ensemble mean was more skillful for short-range forecasting than the individual ensemble members, b... |

17 | Ensemble re-forecasting: improving medium range forecast skill using retrospective forecasts - Whitaker, Wei - 2004 |

16 | Global and local skill forecasts
- Houtekamer
- 1993
Citation Context ...ensembles of entire fields that reproduce the spatial correlation of the error field have been proposed for the situation where just one numerical weather prediction model and initialization is used (Houtekamer 1993; Houtekamer and Mitchell 1998, 2001; Gel et al. 2004). Such methods could be combined with the present proposal to produce multimodel and/or multianalysis ensembles that reproduce spatial correlation... |

15 | Climate predictions with multimodel ensembles - Kharin, Zwiers - 2002 |

11 | Long Run Performance of Bayesian Model Averaging
- Raftery, Zheng
- 2003
Citation Context ... models, weighted by their posterior model probabilities. BMA possesses a range of theoretical optimality properties and has shown good performance in a variety of simulated and real data situations (Raftery and Zheng 2003). We now extend BMA from statistical models to dynamical models. The basic idea is that for any given forecast there is a “best” model, but we do not know what it is, and our uncertainty about the ... |

11 | Generalized Linear Models (2nd ed.)
- McCullagh, Nelder
- 1989
Citation Context ...distribution, and it may be necessary to augment these with a component representing a positive probability of being equal to zero. This can be done within the framework of generalized linear models (McCullagh and Nelder 1989), and one example of how to model precipitation in this way was given by Stern and Coe (1984). One way to improve the performance of this method is to bias correct the forecasts before applying BMA. ... |

10 | Interpretation of Seasonal Climate Forecasts Using Brier - Stefanova, Krishnamurti - 2001 |

9 | Calibrated probabilistic mesoscale weather field forecasting: The geostatistical output perturbation (GOP) method (with discussion and rejoinder) - Gel, Raftery, et al. - 2004 |

9 | The use of ensembles to identify forecasts with small and large uncertainty - Toth, Zhu, et al. - 2001 |

8 | Theoretical skill of Monte-Carlo forecasts
- Leith
- 1974
Citation Context ...uction The dominant approach to probabilistic weather forecasting has been the use of ensembles in which a model is run several times with different initial conditions or model physics (Epstein 1969; Leith 1974). Ensembles based on global models have been found useful for medium-range probabilistic forecasting (Toth and Kalnay 1993; Molteni, Buizza, Palmer, and Petroliagis 1996; Houtekamer and Derome 1995; ... |

8 | The ECMWF ensemble system: Methodology and validation
- Molteni, Buizza, et al.
- 1996
Citation Context ...) as a way of implementing the general framework presented by Epstein (1969). Ensembles based on global models have been found useful for medium-range probabilistic forecasting (Toth and Kalnay 1993; Molteni et al. 1996; Houtekamer and Derome 1995; Hamill et al. 2000). Typically the ensemble mean outperforms all or most of the individual ensemble members, and in some studies a spread-error correlation has been obser... |

8 | On the Weights for an Ensemble-Averaged 6–10-Day Forecast - Van den Dool, Rukhovets - 1994 |

8 | Smoothing forecast ensembles with fitted probability distributions - Wilks - 2002 |

7 | Increasing the horizontal resolution of ensemble forecasts at CMC. Non - Pellerin, Lefaivre, et al. - 2003 |

7 | Using ensembles for short-range forecasting - Brooks, Du, et al. - 1999 |

6 | Verifying probabilistic forecasts: Calibration and sharpness
- Gneiting, Raftery, et al.
- 2003
Citation Context ...er on average than those obtained from climatology. Clearly, the sharper the better. We adopt the principle that the goal of probabilistic forecasting is to maximize sharpness subject to calibration (Gneiting et al. 2003). To achieve this, we propose a statistical approach to postprocessing ensemble forecasts, based on Bayesian model averaging (BMA). This is a standard approach to inference in the presence of multipl... |