Game Theory, Maximum Entropy, Minimum Discrepancy And Robust Bayesian Decision Theory
- Annals of Statistics, 2004
Cited by 53 (3 self)
this paper appeared in the Proceedings of the 2002 IEEE Information Theory Workshop [see Grünwald and Dawid (2002)]
Mutual information, Fisher information and population coding
- Neural Computation, 1998
Cited by 44 (3 self)
In the context of parameter estimation and model selection, it is only quite recently that a direct link between the Fisher information and information theoretic quantities has been exhibited. We give an interpretation of this link within the standard framework of information theory. We show that in the context of population coding, the mutual information between the activity of a large array of neurons and a stimulus to which the neurons are tuned is naturally related to the Fisher information. In the light of this result we consider the optimization of the tuning curve parameters in the case of neurons responding to a stimulus represented by an angular variable. To appear in Neural Computation, Vol. 10, Issue 7, published by the MIT Press. (Laboratory associated with C.N.R.S. (U.R.A. 1306), ENS, and Universities Paris VI and Paris VII.) 1 Introduction. A natural framework to study how neurons communicate, or transmit information, in the nervous system is information theory (see e...
Competitive on-line statistics
- International Statistical Review, 1999
Cited by 39 (7 self)
A radically new approach to statistical modelling, which combines mathematical techniques of Bayesian statistics with the philosophy of the theory of competitive on-line algorithms, has arisen over the last decade in computer science (to a large degree, under the influence of Dawid’s prequential statistics). In this approach, which we call “competitive on-line statistics”, it is not assumed that data are generated by some stochastic mechanism; the bounds derived for the performance of competitive on-line statistical procedures are guaranteed to hold (and not just hold with high probability or on the average). This paper reviews some results in this area; the new material in it includes the proofs for the performance of the Aggregating Algorithm in the problem of linear regression with square loss. Keywords: Bayes’s rule, competitive on-line algorithms, linear regression, prequential statistics, worst-case analysis.
An empirical study of minimum description length model selection with infinite parametric complexity
- Journal of Mathematical Psychology, 2006
Cited by 7 (0 self)
Parametric complexity is a central concept in Minimum Description Length (MDL) model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection as based on NML and Bayesian inference based on Jeffreys’ prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find interestingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well but it is questionable whether the results validate its theoretical motivation. A Bayesian marginal distribution with Jeffreys’ prior can still be used if one sacrifices the first observation to make a proper posterior; this approach turns out to be most dependable.
Online Prediction with Experts under a Log-scoring Rule - Online Expert Prediction
Cited by 2 (1 self)
$\ldots(x) = p(x \mid \theta)$ is a stochastic process. This means that if we write $X = X^n = (X_1, \ldots, X_n)$ to mean the random variable with outcomes $x = x^n = (x_1, \ldots, x_n)$, then the density $p(x^n \mid \theta)$ for $n$ is the result of integrating $p(x^{n+1} \mid \theta)$ over $x_{n+1}$. Write $J(\theta)$ to mean the Jeffreys prior on the parameter space, assuming it exists, and let $w$ be another prior density on the parameter space. We use $J(\theta)$ as a dominating measure for other priors unless stated otherwise. Denote by ...
An empirical study of MDL model selection with infinite parametric complexity
- J. Mathematical Psychology, 2006
Cited by 2 (1 self)
Parametric complexity is a central concept in MDL model selection. In practice it often turns out to be infinite, even for quite simple models such as the Poisson and Geometric families. In such cases, MDL model selection as based on NML and Bayesian inference based on Jeffreys’ prior cannot be used. Several ways to resolve this problem have been proposed. We conduct experiments to compare and evaluate their behaviour on small sample sizes. We find interestingly poor behaviour for the plug-in predictive code; a restricted NML model performs quite well but it is questionable whether the results validate its theoretical motivation. The Bayesian model with the improper Jeffreys’ prior is the most dependable.
Discussion of the Papers by Rissanen, and by Wallace and Dowe
Cited by 1 (0 self)
to Yang and Barron [1], and earlier work due to Barron and Cover [2], Bethel and Shumway [3], are efforts to provide general results for collections of classes that are recognized to have common properties---typically the dependence of the penalty term on $n$, the sample size. In particular, one may consider an Akaike information criterion or AIC-class, see [4], of MSPs which contains the AIC and its equivalent formulations such as Mallows' $C_p$ and cross-validation, see [5]. Equivalently, the AIC class of MSPs can be defined as the MSPs that satisfy the same optimality criterion from prediction as the AIC itself does, see [6, 7]. Second, one may consider a Bayes information criterion, or BIC-class of MSPs which contains the BIC and other equivalent formulations such as the posterior quantities it approximates, see [8]. Perhaps the optimality criterion satisfied by the BIC, see [9], or the posterior probabilities, can be used to define this class. In some cases, the MDL is similar to
Information Optimality and Bayesian Modeling
Cited by 1 (0 self)
The general approach of treating a statistical problem as one of information processing led to the Bayesian method of moments, reference priors, minimal information likelihoods, and stochastic complexity. These techniques rest on quantities that have physical interpretations from information theory. Current work includes: the role of prediction, the emergence of data dependent priors, the role of information measures in model selection, and the use of conditional mutual information to incorporate partial information. Key words: entropy, Bayesian method of moments, reference priors, stochastic complexity, data dependent priors
Partial Information Reference Priors
- 2000
Suppose $X_1, \ldots, X_n$ are IID $p(\cdot \mid \theta, \psi)$ where $(\theta, \psi) \in \mathbb{R}^d$ is distributed according to the prior density $w(\cdot)$. For estimators $S_n = S(X^n)$ and $T_n = T(X^n)$ assumed to be consistent for some function of $\theta$ and asymptotically normal, we examine the conditional Shannon mutual information (CSMI) between $\Theta$ and $T_n$ given $\Psi$ and $S_n$, $I(\Theta; T_n \mid \Psi, S_n)$. It is seen there are several important special cases of this CSMI. We establish an asymptotic formula for it and identify the resulting noninformative reference prior. As a consequence, we develop the notion of data dependent priors and a calibration for how close an estimator is to sufficiency.
Heterogeneity, Selection and Wealth Dynamics
The market selection hypothesis states that, among expected utility maximizers, competitive markets select for agents with correct beliefs. In some economies this holds, while in others it fails. It holds in complete market economies with a common discount factor and bounded aggregate consumption. It can fail when markets are incomplete, when consumption grows too quickly, or when discount factors and beliefs are correlated. These insights have implications for the analysis of the heterogeneous agent stochastic dynamic general equilibrium models common in finance and macroeconomics. “The trading floor is a jungle,” he went on, “and the guy you end up working for is your jungle leader. Whether you succeed here or not depends on knowing how to survive in the jungle.” Lewis (1989, pp. 39–40).

