Results 1-10 of 43
Strictly Proper Scoring Rules, Prediction, and Estimation
, 2007

Cited by 182 (17 self)
Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he or she issues the probabilistic forecast F, rather than G ≠ F. It is strictly proper if the maximum is unique. In prediction problems, proper scoring rules encourage the forecaster to make careful assessments and to be honest. In estimation problems, strictly proper scoring rules provide attractive loss and utility functions that can be tailored to the problem at hand. This article reviews and develops the theory of proper scoring rules on general probability spaces, and proposes and discusses examples thereof. Proper scoring rules derive from convex functions and relate to information measures, entropy functions, and Bregman divergences. In the case of categorical variables, we prove a rigorous version of the Savage representation. Examples of scoring rules for probabilistic forecasts in the form of predictive densities include the logarithmic, spherical, pseudospherical, and quadratic scores. The continuous ranked probability score applies to probabilistic forecasts that take the form of predictive cumulative distribution functions. It generalizes the absolute error and forms a special case of a new and very general type of score, the energy score. Like many other scoring rules, the energy score admits a kernel representation in terms of negative definite functions, with links to inequalities of Hoeffding type, in both univariate and multivariate settings. Proper scoring rules for quantile and interval forecasts are also discussed. We relate proper scoring rules to Bayes factors and to cross-validation, and propose a novel form of cross-validation known as random-fold cross-validation.
A case study on probabilistic weather forecasts in the North American Pacific Northwest illustrates the importance of propriety. We note optimum score approaches to point and quantile
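The kernel representation of the continuous ranked probability score (CRPS) mentioned in this abstract can be illustrated with a small sketch. It assumes the forecast is given as a finite sample (an ensemble) and uses the negatively oriented convention, CRPS = E|X - y| - 0.5 E|X - X'|, so that lower is better; the abstract's own convention has the forecaster maximize the score. The function name `crps_sample` is hypothetical.

```python
def crps_sample(samples, y):
    """CRPS of the empirical distribution of `samples` at observation `y`.

    Uses the kernel form E|X - y| - 0.5 * E|X - X'|, where X and X' are
    independent draws from the forecast. Negatively oriented: lower is better.
    """
    n = len(samples)
    # Mean absolute distance between forecast draws and the observation.
    term1 = sum(abs(x - y) for x in samples) / n
    # Mean absolute distance between pairs of forecast draws.
    term2 = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term1 - 0.5 * term2


if __name__ == "__main__":
    # A one-member "ensemble" is a point forecast: CRPS equals absolute error.
    print(crps_sample([3.0], 5.0))
    # A two-member ensemble straddling the observation.
    print(crps_sample([0.0, 2.0], 1.0))
```

With a single sample the second term vanishes, which is the sense in which the CRPS generalizes the absolute error.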
Using Bayesian model averaging to calibrate forecast ensembles. Monthly Weather Review 133
, 2005

Cited by 82 (30 self)
Ensembles used for probabilistic weather forecasting often exhibit a spread-error correlation, but they tend to be underdispersive. This paper proposes a statistical method for postprocessing ensembles based on Bayesian model averaging (BMA), which is a standard method for combining predictive distributions from different sources. The BMA predictive probability density function (PDF) of any quantity of interest is a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts and reflect the models' relative contributions to predictive skill over the training period. The BMA weights can be used to assess the usefulness of ensemble members, and this can be used as a basis for selecting ensemble members; this can be useful given the cost of running large ensembles. The BMA PDF can be represented as an unweighted ensemble of any desired size, by simulating from the BMA predictive distribution. The BMA predictive variance can be decomposed into two components, one corresponding to the between-forecast variability, and the second to the within-forecast variability. Predictive PDFs or intervals based solely on the ensemble spread incorporate the first component but not the second. Thus BMA provides a theoretical explanation of the tendency of ensembles to exhibit a spread-error correlation but yet
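The variance decomposition and the simulation step described in this abstract can be sketched as follows. This is an illustration under assumptions not stated in the abstract: the component PDFs are taken to be Gaussian with a common standard deviation `sigma`, the weights sum to one, and the forecasts are already bias-corrected. The function names are hypothetical.

```python
import random


def bma_mean_and_variance(weights, forecasts, sigma):
    """Return (mean, between-forecast variance, within-forecast variance).

    Assumes a mixture of Gaussians centered on `forecasts` with common
    standard deviation `sigma` and mixture weights `weights` summing to 1.
    The total predictive variance is the sum of the two variance terms.
    """
    mean = sum(w * f for w, f in zip(weights, forecasts))
    # Spread of the component centers around the mixture mean.
    between = sum(w * (f - mean) ** 2 for w, f in zip(weights, forecasts))
    # Spread inside each component (identical for all under this assumption).
    within = sigma ** 2
    return mean, between, within


def sample_bma(weights, forecasts, sigma, size, rng=random):
    """Draw an unweighted ensemble of `size` members from the mixture."""
    # Pick a component per member according to the weights, then perturb it.
    centers = rng.choices(forecasts, weights=weights, k=size)
    return [rng.gauss(c, sigma) for c in centers]


if __name__ == "__main__":
    mean, between, within = bma_mean_and_variance([0.5, 0.5], [1.0, 3.0], 1.0)
    print(mean, between, within, between + within)
    print(sample_bma([0.5, 0.5], [1.0, 3.0], 1.0, size=5))
```

An interval built only from the spread of the component centers would reflect `between` alone, which is the underdispersion the abstract describes.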
Probabilistic forecasts, calibration and sharpness
 Journal of the Royal Statistical Society Series B
, 2007

Cited by 53 (16 self)
Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with cross-validation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
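The probability integral transform (PIT) histogram named in this abstract can be sketched in a few lines. The example assumes Gaussian predictive distributions given as (mean, standard deviation) pairs, which is an assumption made here for illustration; the function names are hypothetical. A roughly flat histogram is consistent with probabilistic calibration, a U shape suggests underdispersion, and a hump suggests overdispersion.

```python
import math


def normal_cdf(y, mu, sigma):
    """CDF of a Gaussian N(mu, sigma^2) evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def pit_values(forecasts, observations):
    """PIT value F_i(y_i) for each (mu, sigma) forecast and observation."""
    return [normal_cdf(y, mu, sigma)
            for (mu, sigma), y in zip(forecasts, observations)]


def pit_histogram(pits, bins=10):
    """Count PIT values in `bins` equal-width bins on [0, 1]."""
    counts = [0] * bins
    for u in pits:
        # A PIT value of exactly 1.0 goes into the last bin.
        counts[min(int(u * bins), bins - 1)] += 1
    return counts


if __name__ == "__main__":
    forecasts = [(0.0, 1.0), (1.0, 2.0), (-0.5, 0.5)]
    observations = [0.3, -1.0, 0.1]
    print(pit_histogram(pit_values(forecasts, observations), bins=5))
```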
Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation
 Monthly Weather Review
, 2005

Cited by 38 (10 self)
Ensemble prediction systems typically show positive spread-error correlation, but they are subject to forecast bias and underdispersion, and therefore uncalibrated. This work proposes the use of ensemble model output statistics (EMOS), an easy-to-implement postprocessing technique that addresses both forecast bias and underdispersion and takes account of the spread-skill relationship. The technique is based on multiple linear regression and akin to the superensemble approach that has traditionally been used for deterministic-style forecasts. The EMOS technique yields probabilistic forecasts that take the form of Gaussian predictive probability density functions (PDFs) for continuous weather variables, and can be applied to gridded model output. The EMOS predictive mean is an optimal, bias-corrected weighted average of the ensemble member forecasts, with coefficients that are constrained to be nonnegative and associated with the member model skill. The EMOS predictive mean provides a highly accurate deterministic-style forecast. The EMOS predictive variance is a linear function of the ensemble spread. For fitting the EMOS coefficients, the method of minimum CRPS estimation is introduced.
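The objective behind minimum CRPS estimation can be sketched under stated assumptions. The Gaussian predictive distribution has mean a + b_1 X_1 + ... + b_K X_K and variance c + d S^2, where S^2 is the ensemble variance; the closed-form CRPS of a Gaussian is the standard expression sigma * [z (2 Phi(z) - 1) + 2 phi(z) - 1/sqrt(pi)] with z = (y - mu) / sigma. The use of the population variance for S^2, the parameter layout, and the function names are assumptions made here; an optimizer of the reader's choice would minimize `mean_crps` over the coefficients.

```python
import math
import statistics


def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) at observation y (lower is better)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def mean_crps(a, b, c, d, ensembles, observations):
    """Average CRPS of the EMOS-style Gaussian forecasts over a training set.

    a: intercept; b: one nonnegative coefficient per ensemble member;
    c, d: nonnegative variance coefficients (variance = c + d * S^2).
    """
    total = 0.0
    for members, y in zip(ensembles, observations):
        mu = a + sum(bk * xk for bk, xk in zip(b, members))
        # Assumption: S^2 is the population variance of the ensemble members.
        var = c + d * statistics.pvariance(members)
        total += crps_gaussian(mu, math.sqrt(var), y)
    return total / len(observations)


if __name__ == "__main__":
    ensembles = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
    observations = [3.0, 4.0]  # the ensemble mean is biased low by 1
    b = [1.0 / 3.0] * 3
    print(mean_crps(0.0, b, 1.0, 0.0, ensembles, observations))
    print(mean_crps(1.0, b, 1.0, 0.0, ensembles, observations))
```

In the toy data the intercept a = 1 removes the bias, so the second value is the smaller of the two.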
du Preez, “Application-independent evaluation of speaker detection
 Computer Speech and Language
, 2006

Cited by 34 (2 self)
We present a Bayesian analysis of the evaluation of speaker detection performance. We use expectation of utility to confirm that likelihood-ratio is both an optimum and application-independent form of output for speaker detection systems. We point out that the problem of likelihood-ratio calculation is equivalent to the problem of optimization of decision thresholds. It is shown that the decision cost that is used in the existing NIST evaluations effectively forms a utility (a proper scoring rule) for the evaluation of the quality of likelihood-ratio presentation. As an alternative, a logarithmic utility (a strictly proper scoring rule) is proposed. Finally, an information-theoretic interpretation of the expected logarithmic utility is given. It is hoped that this analysis and the proposed evaluation method will promote the use of likelihood-ratio detector output rather than decision output.
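The idea of scoring likelihood-ratio output with a logarithmic utility can be illustrated with a generic sketch. This is not the paper's exact evaluation measure: it simply converts a likelihood ratio and an assumed target prior into a posterior by Bayes' rule and applies the logarithmic score, then checks numerically that the expected score is maximized by reporting the true probability. All function names and the prior value are hypothetical.

```python
import math


def posterior_from_lr(lr, prior):
    """Posterior probability of the target hypothesis from a likelihood ratio."""
    odds = lr * prior / (1.0 - prior)  # posterior odds = LR * prior odds
    return odds / (1.0 + odds)


def log_score(posterior, is_target):
    """Logarithmic score of a stated posterior (higher is better)."""
    return math.log(posterior if is_target else 1.0 - posterior)


def expected_log_score(p_true, q_reported):
    """Expected logarithmic score when the true probability is p_true."""
    return (p_true * math.log(q_reported)
            + (1.0 - p_true) * math.log(1.0 - q_reported))


if __name__ == "__main__":
    post = posterior_from_lr(lr=4.0, prior=0.5)
    print(post, log_score(post, is_target=True))
    # Strict propriety: honest reporting beats shading in either direction.
    for q in (0.6, 0.7, 0.8):
        print(q, expected_log_score(0.7, q))
```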
Calibrated probabilistic forecasting at the Stateline wind energy center: The regime-switching space-time (RST) method
 Journal of the American Statistical Association
, 2004

Cited by 22 (10 self)
With the global proliferation of wind power, accurate short-term forecasts of wind resources at wind energy sites are becoming paramount. Regime-switching space-time (RST) models merge meteorological and statistical expertise to obtain accurate and calibrated, fully probabilistic forecasts of wind speed and wind power. The model formulation is parsimonious, yet takes account of all the salient features of wind speed: alternating atmospheric regimes, temporal and spatial correlation, diurnal and seasonal nonstationarity, conditional heteroscedasticity, and non-Gaussianity. The RST method identifies forecast regimes at the wind energy site and fits a conditional predictive model for each regime. Geographically dispersed meteorological observations in the vicinity of the wind farm are used as off-site predictors. The RST technique was applied to 2-hour-ahead forecasts of hourly average wind speed at the Stateline wind farm in the US Pacific Northwest. In July 2003, for instance, the RST forecasts had root-mean-square error (RMSE) 28.6% less than the persistence forecasts. For each month in the test period, the RST forecasts had lower RMSE than forecasts using state-of-the-art vector time series techniques. The RST method provides probabilistic forecasts in the form of
Geostatistical Space-Time Models, Stationarity, Separability and Full Symmetry

Cited by 21 (4 self)
Geostatistical approaches to modeling spatiotemporal data rely on parametric covariance models and rather stringent assumptions, such as stationarity, separability and full symmetry. This paper reviews recent advances in the literature on space-time covariance functions in light of the aforementioned notions, which are illustrated using wind data from Ireland. Experiments with time-forward kriging predictors suggest that the use of more complex and more realistic covariance models results in improved predictive performance.
D (2006) Model error in weather and climate forecasting. In: Palmer T, Hagedorn R (eds) Predictability of weather and climate. Cambridge University Press, Cambridge Anderson JL (2001) An ensemble adjustment Kalman filter for data assimilation. Mon Weather
 Cliffs, NJ Bengtsson T, Snyder C, Nychka D
, 1999

Cited by 16 (4 self)
“As if someone were to buy several copies of the morning newspaper to assure himself that what it said was true” Ludwig Wittgenstein
Improvement of ensemble reliability with a new dressing kernel
, 2005

Cited by 15 (0 self)
A new method of combining dynamical and statistical ensembles for the purpose of improving ensemble reliability for underdispersive ensembles is introduced. The method involves adding independent sets of N random four-dimensional ‘dressing’ perturbations to each of the K members of a dynamical ensemble forecast to obtain an N × K dressed ensemble. The new method mathematically constrains the stochastic process used to generate the statistical dressing perturbations so that it removes seasonally averaged errors in the second moment measures for originally underdispersive ensembles. A random-number generator experiment and an experiment with the ensemble transform Kalman filter (ETKF) ensemble generation scheme show that the previously proposed ‘best-member’ dressing method fails to reliably predict the second moment of the distribution of forecast errors, whereas the new dressing method reliably predicts this second moment. After being dressed with the second moment constraint method, the ETKF ensemble is more skilful than the undressed ensemble. The ETKF ensemble postprocessed with the new dressing method is applied for probabilistic forecasts of cooling degree-days (CDD) for Boston. It is shown that the new kernel’s ability to account for temporally correlated forecast errors results in ensemble forecasts of CDDs with reliable spread, whereas the best-member method leads to an underdispersive ensemble of CDD forecasts.
2011: Using a stochastic kinetic energy backscatter scheme to improve MOGREPS probabilistic forecast skill.
 Mon. Wea. Rev.

Cited by 8 (3 self)
Understanding model error in state-of-the-art numerical weather prediction models and representing its impact on flow-dependent predictability remains a complex and mostly unsolved problem. Here, a spectral stochastic kinetic energy backscatter scheme is used to simulate upscale-propagating errors caused by unresolved subgrid-scale processes. For this purpose, stochastic streamfunction perturbations are generated by autoregressive processes in spectral space and injected into regions where numerical integration schemes and parameterizations in the model lead to excessive systematic kinetic energy loss. It is demonstrated how output from coarse-grained high-resolution models can be used to inform the parameters of such a scheme. The performance of the spectral backscatter scheme is evaluated in the ensemble prediction system of the European Centre for Medium-Range Weather Forecasts. Its implementation in conjunction with reduced initial perturbations results in a better spread–error relationship, more realistic kinetic-energy spectra, a better representation of forecast-error growth, improved flow-dependent predictability, improved rainfall forecasts, and better probabilistic skill. The improvement is most pronounced in the tropics and for large-anomaly events. It is found that whereas a simplified scheme assuming a constant dissipation rate already has some positive impact, the best results are obtained for flow-dependent formulations of the unresolved processes.