Probabilistic forecasts, calibration and sharpness
 Journal of the Royal Statistical Society Series B
, 2007
Cited by 38 (15 self)
Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with cross-validation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
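As a rough illustration of the calibration and sharpness checks this abstract mentions, the following Python sketch computes probability integral transform (PIT) values and an average prediction-interval width for two Gaussian forecasters. The simulated data, the function names (`pit_values`, `pit_histogram`, `mean_width`) and the choice of a 90% interval are assumptions made for this example, not the paper's code or its wind speed case study.

```python
import random
from statistics import NormalDist

def pit_values(forecasts, observations):
    """PIT value: the predictive CDF evaluated at the realized observation."""
    return [f.cdf(y) for f, y in zip(forecasts, observations)]

def pit_histogram(pits, bins=10):
    """Relative frequency of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * bins
    for u in pits:
        counts[min(int(u * bins), bins - 1)] += 1
    return [c / len(pits) for c in counts]

def mean_width(forecasts, level=0.9):
    """Average width of the central prediction interval (smaller = sharper)."""
    lo, hi = (1 - level) / 2, (1 + level) / 2
    return sum(f.inv_cdf(hi) - f.inv_cdf(lo) for f in forecasts) / len(forecasts)

# Simulated setting (an assumption): y_t ~ N(mu_t, 1) with mu_t ~ N(0, 1).
rng = random.Random(0)
n = 5000
means = [rng.gauss(0, 1) for _ in range(n)]
obs = [rng.gauss(m, 1) for m in means]

ideal = [NormalDist(m, 1) for m in means]        # knows mu_t
climatological = [NormalDist(0, 2 ** 0.5)] * n   # marginal law N(0, 2)

ideal_hist = pit_histogram(pit_values(ideal, obs))
clim_hist = pit_histogram(pit_values(climatological, obs))
ideal_width = mean_width(ideal)
clim_width = mean_width(climatological)

print("ideal PIT histogram:", [round(h, 3) for h in ideal_hist])
print("clim. PIT histogram:", [round(h, 3) for h in clim_hist])
print("mean 90% interval width:", round(ideal_width, 2), "vs", round(clim_width, 2))
```

In this simulation both forecasters should give a roughly flat PIT histogram, so the histogram alone does not separate them; the narrower intervals of the first forecaster are what the sharpness comparison picks up.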
Defensive Forecasting
Cited by 13 (12 self)
We consider how to make probability forecasts of binary labels. Our main mathematical result is that for any continuous gambling strategy used for detecting disagreement between the forecasts and the actual labels, there exists a forecasting strategy whose forecasts are ideal as far as this gambling strategy is concerned. A forecasting strategy obtained in this way from a gambling strategy demonstrating a strong law of large numbers is simplified and studied empirically.
A geometric proof of calibration
 hal-00773218, version 1 - 12
, 2013
Cited by 9 (6 self)
We provide yet another proof of the existence of calibrated forecasters; it has two merits. First, it is valid for an arbitrary finite number of outcomes. Second, it is short and simple and it follows from a direct application of Blackwell’s approachability theorem to a carefully chosen vector-valued payoff function and convex target set. Our proof captures the essence of existing proofs based on approachability (e.g., the proof by Foster [5] in the case of binary outcomes) and highlights the intrinsic connection between approachability and calibration.
Regret minimization in repeated matrix games with variable stage duration
, 2006
Cited by 9 (6 self)
Regret minimization in repeated matrix games has been extensively studied ever since Hannan’s (1957) seminal paper. Several classes of no-regret strategies now exist; such strategies secure a long-term average payoff as high as could be obtained by the fixed action that is best, in hindsight, against the observed action sequence of the opponent. We consider an extension of this framework to repeated games with variable stage duration, where the duration of each stage may depend on actions of both players, and the performance measure of interest is the average payoff per unit time. We start by showing that no-regret strategies, in the above sense, do not exist in general. Consequently, we consider two classes of adaptive strategies, one based on Blackwell’s approachability theorem and the other on calibrated play, and examine their performance guarantees. We further provide sufficient conditions for existence of no-regret strategies in this model. JEL Classification: C73; C44.
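To make the no-regret notion in this abstract concrete, here is a minimal Python sketch of the exponentially weighted (Hedge) strategy in the standard setting where every stage has the same duration. It is a swapped-in textbook technique, not the approachability-based or calibrated strategies of the paper, and it does not model variable stage durations. The payoff matrix, the opponent sequence and the name `hedge_average_payoff` are assumptions chosen for illustration.

```python
import math

def hedge_average_payoff(payoff, opponent_actions, eta):
    """Run exponential weights against a fixed opponent sequence.

    payoff[i][j] is the row player's payoff in [0, 1] for actions (i, j).
    Returns (average expected payoff of the mixed strategy,
             average payoff of the best fixed action in hindsight).
    """
    n = len(payoff)
    cum = [0.0] * n          # cumulative payoff of each fixed row action
    total = 0.0
    for j in opponent_actions:
        top = max(cum)       # subtract the max for numerical stability
        weights = [math.exp(eta * (c - top)) for c in cum]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        total += sum(p * payoff[i][j] for i, p in enumerate(probs))
        for i in range(n):
            cum[i] += payoff[i][j]
    rounds = len(opponent_actions)
    return total / rounds, max(cum) / rounds

# Hypothetical 2x2 game: the row player earns 1 by matching the column action.
payoff = [[1.0, 0.0], [0.0, 1.0]]
T = 2000
opponent = [0 if t % 10 < 7 else 1 for t in range(T)]   # plays action 0 70% of the time
eta = math.sqrt(8 * math.log(len(payoff)) / T)          # standard tuning for known T

avg, best = hedge_average_payoff(payoff, opponent, eta)
regret_per_round = best - avg
print(f"average payoff {avg:.3f}, best fixed action {best:.3f}, "
      f"regret per round {regret_per_round:.4f}")
```

With this tuning and payoffs in [0, 1], the per-round regret is bounded by sqrt(ln n / (2T)), so it vanishes as T grows; the abstract's point is that no analogous guarantee holds in general once stage durations depend on the actions.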
Testing Multiple Forecasters
 ECONOMETRICA
, 2007
Cited by 9 (1 self)
We consider a cross-calibration test of predictions by multiple potential experts in a stochastic environment. This test checks whether each expert is calibrated conditional on the predictions made by other experts. We show that this test is good in the sense that a true expert—one informed of the true distribution of the process—is guaranteed to pass the test no matter what the other potential experts do, and false experts will fail the test on all but a small (category one) set of true distributions. Furthermore, even when there is no true expert present, a test similar to cross-calibration cannot be simultaneously manipulated by multiple false experts, but at the cost of failing some true experts.
Online regression competitive with reproducing kernel Hilbert spaces
, 2005
Cited by 6 (3 self)
We consider the problem of online prediction of real-valued labels of new objects. The prediction algorithm’s performance is measured by the squared deviation of the predictions from the actual labels. No probabilistic assumptions are made about the way the labels and objects are generated. Instead, we are given a benchmark class of prediction rules some of which are hoped to produce good predictions. We show that for a wide range of infinite-dimensional benchmark classes one can construct a prediction algorithm whose cumulative loss over the first N examples does not exceed the cumulative loss of any prediction rule in the class plus O(√N). Our proof technique is based on the recently developed method of defensive forecasting.
Combining Probability Forecasts
, 2008
Cited by 3 (0 self)
Linear pooling is by far the most popular method for combining probability forecasts. However, any nontrivial weighted average of two or more distinct, calibrated probability forecasts is necessarily uncalibrated and lacks sharpness. In view of this, linear pooling requires recalibration, even in the ideal case in which the individual forecasts are calibrated. Toward this end, we propose a beta-transformed linear opinion pool (BLP) for the aggregation of probability forecasts from distinct, calibrated or uncalibrated sources. The BLP method fits an optimal nonlinearly recalibrated forecast combination, by compositing a beta transform and the traditional linear opinion pool. The technique is illustrated in a simulation example and in a case study on statistical and National Weather Service probability of precipitation forecasts.
Blackwell Approachability and No-Regret Learning are Equivalent
Cited by 2 (1 self)
We consider the celebrated Blackwell Approachability Theorem for two-player games with vector payoffs. Blackwell himself previously showed that the theorem implies the existence of a “no-regret” algorithm for a simple online learning problem. We show that this relationship is in fact much stronger, that Blackwell’s result is equivalent to, in a very strong sense, the problem of regret minimization for Online Linear Optimization. We show that any algorithm for one such problem can be efficiently converted into an algorithm for the other. We provide one novel application of this reduction: the first efficient algorithm for calibrated forecasting.
No Manipulation Results for Non-Bayesian Tests
, 2005
In Dekel and Feinberg (2004) we suggested a test for discovering whether a potential expert is informed of the distribution of a stochastic process. This category test requires predicting a “small” – category I – set of outcomes. In this paper we show that there is a randomized category test that cannot be manipulated, i.e. such that no matter how the potential expert randomizes his prediction, there will be realizations where he will fail to pass the test with probability 1. The set of outcomes where he fails can be made large – a category II set – under the continuum hypothesis. Moreover, these results hold for the finite approximations of the category tests where the nonexpert is failed in finite time and the expert is failed with small probability. JEL Classification: K9 1