Universal prediction
IEEE Transactions on Information Theory, 1998
Cited by 136 (11 self)
Abstract: This paper consists of an overview of universal prediction from an information-theoretic perspective. Special attention is given to the notion of probability assignment under the self-information loss function, which is directly related to the theory of universal data compression. Both the probabilistic setting and the deterministic setting of the universal prediction problem are described, with emphasis on the analogy and the differences between results in the two settings. Index Terms: Bayes envelope, entropy, finite-state machine, linear prediction, loss function, probability assignment, redundancy-capacity, stochastic complexity, universal coding, universal prediction.
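To make "probability assignment under the self-information loss" concrete, the following is a minimal sketch and not taken from the paper. It assumes a binary alphabet and the add-one (Laplace) sequential assigner, chosen here only for illustration; the function name is hypothetical. The cumulative loss it returns is the ideal code length, in bits, that this assigner would give the sequence.

```python
import math

def laplace_log_loss(bits):
    """Cumulative self-information loss, in bits, of the add-one (Laplace)
    sequential probability assignment on a binary sequence (assumed setup)."""
    ones = 0
    total_loss = 0.0
    for t, b in enumerate(bits):
        # Probability assigned to the symbol 1 before the t-th symbol is seen.
        p_one = (ones + 1) / (t + 2)
        p = p_one if b == 1 else 1.0 - p_one
        # Self-information loss of the symbol that actually occurred.
        total_loss += -math.log2(p)
        ones += b
    return total_loss
```

For the sequence 1, 0, 1, 1 the assigned probabilities multiply to 1/20, so the loss is log2(20) bits.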
Dynamic Conditional Independence Models And Markov Chain Monte Carlo Methods
Journal of the American Statistical Association, 1997
Cited by 71 (0 self)
In dynamic statistical modeling situations, observations arise sequentially, causing the model to expand by progressive incorporation of new data items and new unknown parameters. For example, in clinical monitoring, new patient-specific parameters are introduced with each new patient. Markov chain Monte Carlo (MCMC) might be used for posterior inference, but would need to be redone at each expansion stage. Thus such methods are often too slow for real-time implementation. By combining MCMC with importance resampling, we show how real-time posterior updating can be effected. The proposed dynamic sampling algorithms utilize posterior samples from previous expansion stages, and exploit conditional independence between groups of parameters to allow samples of parameters no longer of interest to be discarded, such as when a patient dies or is discharged. We apply the methods to monitoring of heart transplant recipients during infection from cytomegalovirus. KEY WORDS: Bayesian Inference, ...
Competitive online statistics
International Statistical Review, 1999
Cited by 63 (10 self)
A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive online algorithms, has arisen over the last decade in computer science (to a large degree, under the influence of Dawid’s prequential statistics). In this approach, which we call “competitive online statistics”, it is not assumed that data are generated by some stochastic mechanism; the bounds derived for the performance of competitive online statistical procedures are guaranteed to hold (and not just hold with high probability or on the average). This paper reviews some results in this area; the new material in it includes the proofs for the performance of the Aggregating Algorithm in the problem of linear regression with square loss. Keywords: Bayes’s rule, competitive online algorithms, linear regression, prequential statistics, worst-case analysis.
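The abstract does not state the prediction rule, so the following is only a one-dimensional sketch under an assumption: it uses the commonly cited form of the Aggregating Algorithm for regression (often called the Vovk-Azoury-Warmuth forecaster), in which the current input enters the regularised second-moment term before its outcome is seen. The function name and the regularisation constant `a` are hypothetical.

```python
def aar_predict(history, x_new, a=1.0):
    """One-dimensional sketch of an Aggregating-Algorithm-style forecast
    for linear regression with square loss (assumed form, see lead-in).

    history: list of (x, y) pairs already observed.
    x_new:   input whose outcome is to be predicted.
    a:       positive regularisation constant (hypothetical default).
    """
    # Sum of y * x over past rounds only (the new outcome is unknown).
    b = sum(x * y for x, y in history)
    # Regularised second moment, including the new input x_new.
    second_moment = a + sum(x * x for x, _ in history) + x_new * x_new
    return b * x_new / second_moment
```

With one past observation (x = 1, y = 1), a = 1 and a new input x = 1, this sketch predicts 1/3, whereas ordinary ridge regression with the same constant would predict 1/2; the extra shrinkage comes from counting the new input in the denominator.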
Prequential Probability: Principles and Properties
1997
Cited by 33 (2 self)
In this paper we first illustrate the above considerations for a variety of appealing criteria, and then, in an attempt to understand this behaviour, introduce a new game-theoretic framework for Probability Theory, the 'prequential framework', which is particularly suited for the study of such problems.
Estimating the integrated likelihood via posterior simulation using the harmonic mean identity
Bayesian Statistics, 2007
Cited by 24 (2 self)
The integrated likelihood (also called the marginal likelihood or the normalizing constant) is a central quantity in Bayesian model selection and model averaging. It is defined as the integral over the parameter space of the likelihood times the prior density. The Bayes factor for model comparison and Bayesian testing is a ratio of integrated likelihoods, and the model weights in Bayesian model averaging are proportional to the integrated likelihoods. We consider the estimation of the integrated likelihood from posterior simulation output, aiming at a generic method that uses only the likelihoods from the posterior simulation iterations. The key is the harmonic mean identity, which says that the reciprocal of the integrated likelihood is equal to the posterior harmonic mean of the likelihood. The simplest estimator based on the identity is thus the harmonic mean of the likelihoods. While this is an unbiased and simulation-consistent estimator, its reciprocal can have infinite variance and so it is unstable in general. We describe two methods for stabilizing the harmonic mean estimator. In the first one, the parameter space is reduced in such a way that the modified estimator involves a harmonic mean of heavier-tailed densities, thus resulting in a finite variance estimator. The resulting ...
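The simplest estimator described in the abstract, the harmonic mean of the likelihoods at the posterior draws, can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the computation is done on the log scale with a log-sum-exp step as a numerical convenience assumed here, and neither of the stabilizing methods from the paper is implemented.

```python
import math

def log_harmonic_mean_estimate(log_likelihoods):
    """Log of the plain harmonic mean estimate of the integrated likelihood.

    log_likelihoods: log-likelihood values evaluated at posterior draws.
    """
    n = len(log_likelihoods)
    # Work with log(1 / L_i) = -log L_i to avoid overflow or underflow.
    neg = [-ll for ll in log_likelihoods]
    m = max(neg)
    # log of the average reciprocal likelihood, via log-sum-exp.
    log_mean_reciprocal = m + math.log(sum(math.exp(v - m) for v in neg)) - math.log(n)
    # The harmonic mean is the reciprocal of the average reciprocal.
    return -log_mean_reciprocal
```

For likelihood values 1, 2 and 4 the harmonic mean is 3 / (1 + 1/2 + 1/4) = 12/7. The smallest likelihoods dominate the sum of reciprocals, which is one way to see why the estimator is unstable in general.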
On supervised selection of Bayesian networks
In UAI99, 1999
Cited by 18 (6 self)
Given a set of possible models (e.g., Bayesian network structures) and a data sample, in the unsupervised model selection problem the task is to choose the most accurate model with respect to the domain joint probability distribution. In contrast to this, in supervised model selection it is a priori known that the chosen model will be used in the future for prediction tasks involving more "focused" predictive distributions. Although focused predictive distributions can be produced from the joint probability distribution by marginalization, in practice the best model in the unsupervised sense does not necessarily perform well in supervised domains. In particular, the standard marginal likelihood score is a criterion for the unsupervised task and, although it is also frequently used for supervised model selection, does not perform well in such tasks. In this paper we study the performance of the marginal likelihood score empirically in supervised Bayesian network selection tasks by using a large number of publicly available classification data sets, and compare the results to those obtained by alternative model selection criteria, including empirical cross-validation methods, an approximation of a supervised marginal likelihood measure, and a supervised version of Dawid's prequential (predictive sequential) principle. The results demonstrate that the marginal likelihood score does not perform well for supervised model selection, while the best results are obtained by using Dawid's prequential approach.
Asymptotics and the theory of inference
2003
Cited by 16 (7 self)
Asymptotic analysis has always been very useful for deriving distributions in statistics in cases where the exact distribution is unavailable. More importantly, asymptotic analysis can also provide insight into the inference process itself, suggesting what information is available and how this information may be extracted. The development of likelihood inference over the past twenty-some years provides an illustration of the interplay between techniques of approximation and statistical theory.
Coverage Probability Bias, Objective Bayes and the Likelihood Principle
Biometrika, 1999
Cited by 13 (0 self)
In this paper, the discussion focuses on the case of a single real parameter.
A Nonmanipulable Test
Annals of Statistics, 2009
Cited by 8 (1 self)
A test is said to control for type I error if it is unlikely to reject the data-generating process. However, if it is possible to produce stochastic processes at random such that, for all possible future realizations of the data, the selected process is unlikely to be rejected, then the test is said to be manipulable. So, a manipulable test has essentially no capacity to reject a strategic expert. Many tests proposed in the existing literature, including calibration tests, control for type I error but are manipulable. We construct a test that controls for type I error and is nonmanipulable.
Comparing Prequential Model Selection Criteria in Supervised Learning of Mixture Models
Proceedings of the Eighth International Conference on Artificial Intelligence and Statistics, 2001
Cited by 6 (3 self)
In this paper we study prequential model selection criteria in supervised learning domains. The main problem with this approach is that the criterion is sensitive to the order in which the data is processed. We discuss several approaches for addressing the ordering problem, and compare empirically their performance in real-world supervised model selection tasks. The empirical results demonstrate that with the prequential approach it is quite easy to find predictive models that are significantly more accurate classifiers than the models found by the standard unsupervised marginal likelihood criterion. The results also suggest that averaging over random orderings may be a more sensible strategy for solving the ordering problem than trying to find the ordering optimizing the prequential model selection criterion.
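The idea of averaging a prequential score over random orderings can be sketched as below. Everything concrete here is an assumption made for illustration and is not the mixture-model setup of the paper: the predictor is a hypothetical toy that predicts binary labels from add-one smoothed counts, with old counts multiplied by `decay` at each update. With `decay=1.0` the toy score does not depend on the order; with `decay` below 1 it generally does.

```python
import math
import random

def prequential_log_score(labels, decay=1.0):
    """Sum of log predictive probabilities, each label being predicted
    from the labels that precede it (toy binary predictor, assumed)."""
    counts = [0.0, 0.0]
    score = 0.0
    for y in labels:
        # Predict first, using only the data seen so far.
        p = (counts[y] + 1.0) / (counts[0] + counts[1] + 2.0)
        score += math.log(p)
        # Then update: down-weight old counts and record the new label.
        counts = [c * decay for c in counts]
        counts[y] += 1.0
    return score

def averaged_prequential_score(labels, n_orderings=100, decay=1.0, seed=0):
    """Average the prequential log score over random permutations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_orderings):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        total += prequential_log_score(shuffled, decay)
    return total / n_orderings
```

Averaging replaces the search for a single best ordering with a Monte Carlo estimate whose cost grows linearly in `n_orderings`.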