Results 1 - 10
of
107
A tutorial on learning with Bayesian networks
- Learning in Graphical Models
, 1995
"... A companion set of lecture slides is available at ..."
Abstract
-
Cited by 710 (4 self)
- Add to MetaCart
A companion set of lecture slides is available at
How to Use Expert Advice
- JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY
, 1997
"... We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the ..."
Abstract
-
Cited by 267 (60 self)
- Add to MetaCart
We analyze algorithms that predict a binary value by combining the predictions of several prediction strategies, called experts. Our analysis is for worst-case situations, i.e., we make no assumptions about the way the sequence of bits to be predicted is generated. We measure the performance of the algorithm by the difference between the expected number of mistakes it makes on the bit sequence and the expected number of mistakes made by the best expert on this sequence, where the expectation is taken with respect to the randomization in the predictions. We show that the minimum achievable difference is on the order of the square root of the number of mistakes of the best expert, and we give efficient algorithms that achieve this. Our upper and lower bounds have matching leading constants in most cases. We then show howthis leads to certain kinds of pattern recognition/learning algorithms with performance bounds that improve on the best results currently known in this context. We also compare our analysis to the case in which log loss is used instead of the expected number of mistakes.
Adaptive Game Playing Using Multiplicative Weights
"... this paper, we present a simple algorithm for solving this problem, and give a simple analysis of the algorithm. The bounds we obtain are not asymptotic and hold for any finite number of rounds. The algorithm and its analysis are based directly on the "on-line prediction" methods of Littlestone and ..."
Abstract
-
Cited by 106 (14 self)
- Add to MetaCart
this paper, we present a simple algorithm for solving this problem, and give a simple analysis of the algorithm. The bounds we obtain are not asymptotic and hold for any finite number of rounds. The algorithm and its analysis are based directly on the "on-line prediction" methods of Littlestone and Warmuth [24]. The analysis of this algorithm yields a new (as far as we know) and simple proof of von Neumann's minmax theorem, as well as a provable method of approximately solving a game. We also give more refined variants of the algorithm for this purpose, and we show that one of these is optimal in a very strong sense. The paper is organized as follows. In Section 2 we define the mathematical setup and notation. In Section 3 we introduce the basic multiplicative weights algorithm whose average performance is guaranteed to be almost as good as that of the best fixed mixed strategy. In Section 4 we outline the relationship between our work and some of the extensive existing work on the use of multiplicative weights algorithms for on-line prediction. In Section 5 we show how the algorithm can be used to give a simple proof of Von-Neumann's min-max theorem. In Section 6 we give a version of the algorithm whose distributions are guaranteed to converge to an optimal mixed strategy. We note the possible application of this algorithm to solving linear programming problems and reference other work that have used multiplicative weights to this end. Finally, in Section 7 we show that the convergence rate of the second version of the algorithm is asymptotically optimal. 2 Playing repeated games
A Game of Prediction with Expert Advice
- Journal of Computer and System Sciences
, 1997
"... We consider the following problem. At each point of discrete time the learner must make a prediction; he is given the predictions made by a pool of experts. Each prediction and the outcome, which is disclosed after the learner has made his prediction, determine the incurred loss. It is known that, u ..."
Abstract
-
Cited by 86 (6 self)
- Add to MetaCart
We consider the following problem. At each point of discrete time the learner must make a prediction; he is given the predictions made by a pool of experts. Each prediction and the outcome, which is disclosed after the learner has made his prediction, determine the incurred loss. It is known that, under weak regularity, the learner can ensure that his cumulative loss never exceeds cL+ a ln n, where c and a are some constants, n is the size of the pool, and L is the cumulative loss incurred by the best expert in the pool. We find the set of those pairs (c; a) for which this is true.
Strictly Proper Scoring Rules, Prediction, and Estimation
, 2007
"... Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he ..."
Abstract
-
Cited by 86 (13 self)
- Add to MetaCart
Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he or she issues the probabilistic forecast F, rather than G ̸ = F. It is strictly proper if the maximum is unique. In prediction problems, proper scoring rules encourage the forecaster to make careful assessments and to be honest. In estimation problems, strictly proper scoring rules provide attractive loss and utility functions that can be tailored to the problem at hand. This article reviews and develops the theory of proper scoring rules on general probability spaces, and proposes and discusses examples thereof. Proper scoring rules derive from convex functions and relate to information measures, entropy functions, and Bregman divergences. In the case of categorical variables, we prove a rigorous version of the Savage representation. Examples of scoring rules for probabilistic forecasts in the form of predictive densities include the logarithmic, spherical, pseudospherical, and quadratic scores. The continuous ranked probability score applies to probabilistic forecasts that take the form of predictive cumulative distribution functions. It generalizes the absolute error and forms a special case of a new and very general type of score, the energy score. Like many other scoring rules, the energy score admits a kernel representation in terms of negative definite functions, with links to inequalities of Hoeffding type, in both univariate and multivariate settings. Proper scoring rules for quantile and interval forecasts are also discussed. We relate proper scoring rules to Bayes factors and to cross-validation, and propose a novel form of cross-validation known as random-fold cross-validation. A case study on probabilistic weather forecasts in the North American Pacific Northwest illustrates the importance of propriety. We note optimum score approaches to point and quantile
Assessment and Propagation of Model Uncertainty
, 1995
"... this paper I discuss a Bayesian approach to solving this problem that has long been available in principle but is only now becoming routinely feasible, by virtue of recent computational advances, and examine its implementation in examples that involve forecasting the price of oil and estimating the ..."
Abstract
-
Cited by 79 (0 self)
- Add to MetaCart
this paper I discuss a Bayesian approach to solving this problem that has long been available in principle but is only now becoming routinely feasible, by virtue of recent computational advances, and examine its implementation in examples that involve forecasting the price of oil and estimating the chance of catastrophic failure of the U.S. Space Shuttle.
Benchmark Priors for Bayesian Model Averaging
- FORTHCOMING IN THE JOURNAL OF ECONOMETRICS
, 2001
"... In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequ ..."
Abstract
-
Cited by 61 (3 self)
- Add to MetaCart
In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequences. Here we focus on the practically relevant situation where we need to entertain a (large) number of sampling models and we have (or wish to use) little or no subjective prior information. We aim at providing an “automatic” or “benchmark” prior structure that can be used in such cases. We focus on the Normal linear regression model with uncertainty in the choice of regressors. We propose a partly noninformative prior structure related to a Natural Conjugate g-prior specification, where the amount of subjective information requested from the user is limited to the choice of a single scalar hyperparameter g0j. The consequences of different choices for g0j are examined. We investigate theoretical properties, such as consistency of the implied Bayesian procedure. Links with classical information criteria are provided. More importantly, we examine the finite sample implications of several choices of g0j in a simulation study. The use of the MC3 algorithm of Madigan and York (1995), combined with efficient coding in Fortran, makes it feasible to conduct large simulations. In addition to posterior criteria, we shall also compare the predictive performance of different priors. A classic example concerning the economics of crime will also be provided and contrasted with results in the literature. The main findings of the paper will lead us to propose a “benchmark” prior specification in a linear regression context with model uncertainty.
Dynamic Conditional Independence Models And Markov Chain Monte Carlo Methods
- Journal of the American Statistical Association
, 1997
"... In dynamic statistical modeling situations, observations arise sequentially, causing the model to expand by progressive incorporation of new data items and new unknown parameters. For example, in clinical monitoring, new patient-specific parameters are introduced with each new patient. Markov chain ..."
Abstract
-
Cited by 57 (0 self)
- Add to MetaCart
In dynamic statistical modeling situations, observations arise sequentially, causing the model to expand by progressive incorporation of new data items and new unknown parameters. For example, in clinical monitoring, new patient-specific parameters are introduced with each new patient. Markov chain Monte Carlo (MCMC) might be used for posterior inference, but would need to be redone at each expansion stage. Thus such methods are often too slow for real-time implementation. By combining MCMC with importance-resampling, we show how real-time posterior updating can be effected. The proposed dynamic sampling algorithms utilize posterior samples from previous expansion stages, and exploit conditional independence between groups of parameters to allow samples of parameters no longer of interest to be discarded, such as when a patient dies or is discharged. We apply the methods to monitoring of heart transplant recipients during infection from cytomegalovirus. KEY WORDS : Bayesian Inference, ...
Asymptotic calibration
- Biometrika
, 1998
"... Can we forecast the probability of an arbitrary sequence of events happening so that the stated probability of an event happening is close to its empirical probability? We can view this prediction problem as a game played against nature, where at the beginning of the game Nature picks a data sequenc ..."
Abstract
-
Cited by 45 (4 self)
- Add to MetaCart
Can we forecast the probability of an arbitrary sequence of events happening so that the stated probability of an event happening is close to its empirical probability? We can view this prediction problem as a game played against nature, where at the beginning of the game Nature picks a data sequence and the forecaster picks a forecasting algorithm. If the forecaster is not allowed to randomize, then Nature win; there will always be data for which the forecaster does poorly. This paper shows that, if the forecaster can randomize, the forecaster wins in the sense that the forecasted probabilities and the empirical probabilities can be made arbitrarily close to each other.
Competitive on-line statistics
- International Statistical Review
, 1999
"... A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive on-line algorithms, has arisen over the last decade in computer science (to a large degree, under the influence of Dawid’s prequential sta ..."
Abstract
-
Cited by 39 (7 self)
- Add to MetaCart
A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive on-line algorithms, has arisen over the last decade in computer science (to a large degree, under the influence of Dawid’s prequential statistics). In this approach, which we call “competitive on-line statistics”, it is not assumed that data are generated by some stochastic mechanism; the bounds derived for the performance of competitive on-line statistical procedures are guaranteed to hold (and not just hold with high probability or on the average). This paper reviews some results in this area; the new material in it includes the proofs for the performance of the Aggregating Algorithm in the problem of linear regression with square loss. Keywords: Bayes’s rule, competitive on-line algorithms, linear regression, prequential statistics, worst-case analysis.

