Results 1 - 10 of 124
Implementing approximate Bayesian inference for latent Gaussian models using integrated nested Laplace approximations: A manual for the inla-program
, 2008
Cited by 79 (16 self)
Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalised) linear models, (generalised) additive models, smoothing-spline models, state-space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes, geostatistical and geoadditive models. In this paper we consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form due to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, both in terms of convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations
Probabilistic forecasts, calibration and sharpness
 Journal of the Royal Statistical Society Series B
, 2007
Cited by 38 (15 self)
Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with cross-validation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
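The probability integral transform (PIT) histogram named in this abstract can be sketched as follows. This is an illustration only, not code from the paper: it assumes Gaussian predictive distributions and synthetic observations, and the names `normal_cdf`, `pit_values` and `pit_histogram` are hypothetical.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), used here as the predictive CDF (an assumption)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def pit_values(observations, forecasts):
    """PIT value F_t(y_t) for each observation y_t and forecast (mu_t, sigma_t)."""
    return [normal_cdf(y, mu, sigma) for y, (mu, sigma) in zip(observations, forecasts)]

def pit_histogram(pits, bins=10):
    """Relative frequency of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * bins
    for p in pits:
        counts[min(int(p * bins), bins - 1)] += 1
    return [c / len(pits) for c in counts]

rng = random.Random(0)
# Synthetic truth: y_t ~ N(mu_t, 1).
mus = [rng.uniform(-5.0, 5.0) for _ in range(20000)]
obs = [rng.gauss(mu, 1.0) for mu in mus]

# Forecaster A issues the true distribution; forecaster B is too wide (sigma = 2).
calibrated = pit_histogram(pit_values(obs, [(mu, 1.0) for mu in mus]))
overdispersed = pit_histogram(pit_values(obs, [(mu, 2.0) for mu in mus]))

print([round(f, 3) for f in calibrated])
print([round(f, 3) for f in overdispersed])
```

Under these assumptions the first histogram is close to flat, as expected for a probabilistically calibrated forecaster, while the second is hump-shaped, which signals predictive distributions that are too wide and therefore less sharp than they could be.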
Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications
 Manuscript, available at www-stat.wharton.upenn.edu/~buja
, 2005
Cited by 33 (1 self)
What are the natural loss functions or fitting criteria for binary class probability estimation? This question has a simple answer: so-called “proper scoring rules”, that is, functions that score probability estimates in view of data in a Fisher-consistent manner. Proper scoring rules comprise most loss functions currently in use: log-loss, squared error loss, boosting loss, and as limiting cases cost-weighted misclassification losses. Proper scoring rules have a rich structure:
• Every proper scoring rule is a mixture (limit of sums) of cost-weighted misclassification losses. The mixture is specified by a weight function (or measure) that describes which misclassification cost weights are most emphasized by the proper scoring rule.
• Proper scoring rules permit Fisher scoring and Iteratively Reweighted LS algorithms for model fitting. The weights are derived from a link function and the above weight function.
• Proper scoring rules are in a 1-1 correspondence with information measures for tree-based classification.
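A small numerical check can make the "proper" property concrete. The sketch below is not from the manuscript: it assumes a binary outcome with true probability 0.3 and searches a grid of reported probabilities for the one that minimizes expected loss. Absolute error is included as a standard example of a loss that is not proper. All function names are hypothetical.

```python
import math

def log_loss(y, q):
    """Negative log-likelihood of outcome y in {0, 1} under reported probability q."""
    return -math.log(q) if y == 1 else -math.log(1.0 - q)

def squared_loss(y, q):
    """Squared error (Brier-type) loss."""
    return (y - q) ** 2

def absolute_loss(y, q):
    """Absolute error, shown for contrast; it is not a proper scoring rule."""
    return abs(y - q)

def expected_loss(loss, p, q):
    """Expected loss of reporting q when the true event probability is p."""
    return p * loss(1, q) + (1.0 - p) * loss(0, q)

def best_report(loss, p, grid_size=999):
    """Grid point in (0, 1) that minimizes the expected loss."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda q: expected_loss(loss, p, q))

p_true = 0.3
best_log = best_report(log_loss, p_true)
best_sq = best_report(squared_loss, p_true)
best_abs = best_report(absolute_loss, p_true)

print(best_log, best_sq, best_abs)
```

Log-loss and squared error are both minimized by reporting the true probability, whereas absolute error is minimized at the edge of the grid, so it rewards overconfident reports.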
A new understanding of prediction markets via no-regret learning
 In ACM EC
, 2010
Cited by 30 (10 self)
We explore the striking mathematical connections that exist between market scoring rules, cost function based prediction markets, and no-regret learning. We first show that any cost function based prediction market can be interpreted as an algorithm for the commonly studied problem of learning from expert advice by equating the set of outcomes on which bets are placed in the market with the set of experts in the learning setting, and equating trades made in the market with losses observed by the learning algorithm. If the loss of the market organizer is bounded, this bound can be used to derive an O(√T) regret bound for the corresponding learning algorithm. We then show that the class of markets with convex cost functions exactly corresponds to the class of Follow the Regularized Leader learning algorithms, with the choice of a cost function in the market corresponding to the choice of a regularizer in the learning problem. Finally, we show an equivalence between market scoring rules and prediction markets with convex cost functions. This implies both that any market scoring rule can be implemented as a cost function based market maker, and that market scoring rules can be interpreted naturally as Follow the Regularized Leader algorithms. These connections provide new insight into how it is that commonly studied markets, such as the Logarithmic Market Scoring Rule, can aggregate opinions into accurate estimates of the likelihood of future events.
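As an illustration of a cost function based market maker, here is a minimal sketch of the Logarithmic Market Scoring Rule in its usual cost-function form, C(q) = b log Σ exp(q_i / b). It is not code from the paper. The liquidity parameter `b`, the three-outcome market, and all function names are assumptions chosen for the example.

```python
import math

def lmsr_cost(q, b):
    """LMSR cost C(q) = b * log(sum_i exp(q_i / b)), computed stably."""
    m = max(q)
    return m + b * math.log(sum(math.exp((qi - m) / b) for qi in q))

def lmsr_prices(q, b):
    """Instantaneous prices: the gradient of C, a softmax of q / b."""
    m = max(q)
    weights = [math.exp((qi - m) / b) for qi in q]
    total = sum(weights)
    return [w / total for w in weights]

def trade_cost(q, delta, b):
    """Amount a trader pays to move outstanding shares from q to q + delta."""
    new_q = [qi + di for qi, di in zip(q, delta)]
    return lmsr_cost(new_q, b) - lmsr_cost(q, b)

b = 10.0                       # liquidity parameter (assumed value)
q0 = [0.0, 0.0, 0.0]           # no shares sold yet, three outcomes
initial_prices = lmsr_prices(q0, b)

buy = [5.0, 0.0, 0.0]          # buy 5 shares of outcome 0
cost_of_buying = trade_cost(q0, buy, b)
prices_after = lmsr_prices(buy, b)

# Market maker's loss if outcome 0 occurs after heavy buying of outcome 0:
# payout on outcome 0 minus the money collected from traders.
heavy = [1000.0, 0.0, 0.0]
organizer_loss = heavy[0] - (lmsr_cost(heavy, b) - lmsr_cost(q0, b))

print(initial_prices, cost_of_buying, prices_after, organizer_loss)
```

The prices always sum to one and can be read as a probability estimate. The organizer's loss never exceeds b log n for n outcomes, which is the kind of bounded loss the abstract refers to, and the softmax form of the prices matches the weights used by exponential-weights learners.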
Comparing and evaluating Bayesian predictive distributions of asset returns
 International Journal of Forecasting, forthcoming. http://www.biz.uiowa.edu/faculty/jgeweke/papers/paperD/paper.pdf
, 2009
Cited by 22 (1 self)
Abstract: Bayesian inference in a time series model provides exact, out-of-sample predictive distributions that fully and coherently incorporate parameter uncertainty. This study compares and evaluates Bayesian predictive distributions from alternative models, using as an illustration five alternative models of asset returns applied to daily S&P 500 returns from 1972 through 2005. The comparison exercise uses predictive likelihoods and is inherently Bayesian. The evaluation exercise uses the probability integral transform and is inherently frequentist. The illustration shows that the two approaches can be complementary, each identifying strengths and weaknesses in models that are not evident using the other.
JEL classification: C11, C53
Key words: forecasting, GARCH, inverse probability transform, Markov mixture, predictive likelihood, S&P 500 returns, stochastic volatility
The authors gratefully acknowledge financial support from NSF grant SBR-0720547. The views expressed here are the authors’ and not necessarily those of the Federal Reserve Bank of Atlanta or the Federal Reserve System. Any remaining errors are the authors’ responsibility.
Calibrated probabilistic forecasting at the Stateline wind energy center: The regime-switching space-time (RST) method
 Journal of the American Statistical Association
, 2004
Cited by 19 (10 self)
With the global proliferation of wind power, accurate short-term forecasts of wind resources at wind energy sites are becoming paramount. Regime-switching space-time (RST) models merge meteorological and statistical expertise to obtain accurate and calibrated, fully probabilistic forecasts of wind speed and wind power. The model formulation is parsimonious, yet takes account of all the salient features of wind speed: alternating atmospheric regimes, temporal and spatial correlation, diurnal and seasonal nonstationarity, conditional heteroscedasticity, and non-Gaussianity. The RST method identifies forecast regimes at the wind energy site and fits a conditional predictive model for each regime. Geographically dispersed meteorological observations in the vicinity of the wind farm are used as off-site predictors. The RST technique was applied to 2-hour-ahead forecasts of hourly average wind speed at the Stateline wind farm in the US Pacific Northwest. In July 2003, for instance, the RST forecasts had root-mean-square error (RMSE) 28.6% less than the persistence forecasts. For each month in the test period, the RST forecasts had lower RMSE than forecasts using state-of-the-art vector time series techniques. The RST method provides probabilistic forecasts in the form of
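The comparison against persistence reported in this abstract is a relative RMSE reduction. The sketch below shows how such a figure is computed. It does not implement the RST method: the data are a synthetic AR(1) series, the "model" forecast is simply the AR(1) conditional mean, and every name and parameter value is an assumption for illustration.

```python
import math
import random

def rmse(forecasts, observations):
    """Root-mean-square error between paired forecasts and observations."""
    n = len(observations)
    return math.sqrt(sum((f - y) ** 2 for f, y in zip(forecasts, observations)) / n)

rng = random.Random(1)
phi = 0.8                      # AR(1) coefficient (assumed)
series = [0.0]
for _ in range(5000):
    series.append(phi * series[-1] + rng.gauss(0.0, 1.0))

horizon = 2                    # forecast two steps ahead
origins = series[:-horizon]    # value known when the forecast is issued
targets = series[horizon:]     # value being forecast

persistence = origins                              # forecast = last observed value
model = [phi ** horizon * y for y in origins]      # AR(1) conditional mean

rmse_persistence = rmse(persistence, targets)
rmse_model = rmse(model, targets)
improvement = 1.0 - rmse_model / rmse_persistence  # relative RMSE reduction

print(round(rmse_persistence, 3), round(rmse_model, 3), round(100 * improvement, 1))
```

A statement such as "RMSE 28.6% less than the persistence forecasts" corresponds to `improvement` equal to 0.286 in this notation.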
Eliciting Properties of Probability Distributions
 In Proceedings of the ninth ACM conference on electronic commerce
, 2008
Cited by 18 (4 self)
We investigate the problem of incentivizing an expert to truthfully reveal probabilistic information about a random event. Probabilistic information consists of one or more properties, which are any real-valued functions of the distribution, such as the mean and variance. Not all properties can be elicited truthfully. We provide a simple characterization of elicitable properties, and describe the general form of the associated payment functions that induce truthful revelation. We then consider sets of properties, and observe that all properties can be inferred from sets of elicitable properties. This suggests the concept of elicitation complexity for a property, the size of the smallest set implying the property.
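The idea of inferring a property from a set of elicitable ones can be illustrated with the variance. The sketch below is not from the paper: it assumes a small discrete distribution, uses negative squared error as the payment function for the mean and for the second moment, and recovers the variance from those two reports. All names are hypothetical.

```python
def expected_payment(dist, payment, report):
    """Expected payment for a report when outcomes follow dist = [(x, prob), ...]."""
    return sum(p * payment(report, x) for x, p in dist)

def best_report(dist, payment, grid):
    """Report on the grid that maximizes the expert's expected payment."""
    return max(grid, key=lambda r: expected_payment(dist, payment, r))

def mean_payment(report, x):
    """Negative squared error against the outcome: truthful for the mean."""
    return -(report - x) ** 2

def second_moment_payment(report, x):
    """Negative squared error against x^2: truthful for the second moment."""
    return -(report - x * x) ** 2

# Assumed distribution: mean 1.7, second moment 5.3, variance 2.41.
dist = [(0.0, 0.2), (1.0, 0.5), (4.0, 0.3)]
grid = [i / 100 for i in range(0, 2001)]   # candidate reports 0.00 .. 20.00

m1 = best_report(dist, mean_payment, grid)
m2 = best_report(dist, second_moment_payment, grid)
variance = m2 - m1 ** 2                    # inferred, not elicited directly

print(m1, m2, round(variance, 4))
```

Here two elicitable properties are enough to imply the variance, which matches the notion of elicitation complexity as the size of the smallest such set.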
Information, Divergence and Risk for Binary Experiments
 Journal of Machine Learning Research
, 2009
Cited by 17 (6 self)
We unify f-divergences, Bregman divergences, surrogate regret bounds, proper scoring rules, cost curves, ROC curves and statistical information. We do this by systematically studying integral and variational representations of these various objects and in so doing identify their primitives which all are related to cost-sensitive binary classification. As well as developing relationships between generative and discriminative views of learning, the new machinery leads to tight and more general surrogate regret bounds and generalised Pinsker inequalities relating f-divergences to variational divergence. The new viewpoint also illuminates existing algorithms: it provides a new derivation of Support Vector Machines in terms of divergences and relates Maximum Mean Discrepancy to Fisher Linear Discriminants.
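For orientation, the classical Pinsker inequality, which the generalised versions in this paper extend, can be checked numerically. The sketch below is not from the paper: it assumes two discrete distributions with strictly positive probabilities and uses the convention I_f(P, Q) = Σ q_i f(p_i / q_i). The function names are hypothetical.

```python
import math

def f_divergence(p, q, f):
    """I_f(P, Q) = sum_i q_i * f(p_i / q_i), assuming every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def kl_generator(t):
    """f(t) = t log t gives the Kullback-Leibler divergence KL(P || Q) in nats."""
    return t * math.log(t) if t > 0 else 0.0

def variational_generator(t):
    """f(t) = |t - 1| gives the variational divergence sum_i |p_i - q_i|."""
    return abs(t - 1.0)

p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

kl = f_divergence(p, q, kl_generator)
v = f_divergence(p, q, variational_generator)

# Classical Pinsker inequality: KL(P || Q) >= V(P, Q)^2 / 2.
pinsker_lower_bound = 0.5 * v ** 2

print(round(kl, 4), round(v, 4), round(pinsker_lower_bound, 4))
```

For these distributions the KL divergence is about 0.224 and the bound is 0.18, so the inequality holds with some slack.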
Geostatistical Space-Time Models, Stationarity, Separability and Full Symmetry
Cited by 15 (3 self)
Geostatistical approaches to modeling spatiotemporal data rely on parametric covariance models and rather stringent assumptions, such as stationarity, separability and full symmetry. This paper reviews recent advances in the literature on space-time covariance functions in light of the aforementioned notions, which are illustrated using wind data from Ireland. Experiments with time-forward kriging predictors suggest that the use of more complex and more realistic covariance models results in improved predictive performance.