Results 1–10 of 37
Information-theoretic asymptotics of Bayes methods
IEEE Transactions on Information Theory, 1990
Abstract
Cited by 125 (12 self)
In the absence of knowledge of the true density function, Bayesian models take the joint density function for a sequence of n random variables to be an average of densities with respect to a prior. We examine the relative entropy distance D_n between the true density and the Bayesian density and show that the asymptotic distance is (d/2) log n + c, where d is the dimension of the parameter vector. Therefore, the relative entropy rate D_n/n converges to zero at rate (log n)/n. The constant c, which we explicitly identify, depends only on the prior density function and the Fisher information matrix evaluated at the true parameter value. Consequences are given for density estimation, universal data compression, composite hypothesis testing, and stock-market portfolio selection.
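For a concrete feel for the (d/2) log n + c asymptotics, the sketch below (our illustration, not from the paper) computes D_n exactly for a d = 1 Bernoulli(θ) model under the uniform prior, and compares D_n − (1/2) log n against the Clarke–Barron constant c = −(1/2) log(2πe) + (1/2) log(1/(θ(1−θ))) determined by the prior density and the Fisher information I(θ) = 1/(θ(1−θ)).

```python
import math

def log_mixture(k, n):
    # log m(x^n) for a Bernoulli sequence with k ones under the uniform
    # (Beta(1,1)) prior: m = k! (n-k)! / (n+1)!
    return math.lgamma(k + 1) + math.lgamma(n - k + 1) - math.lgamma(n + 2)

def bayes_redundancy(theta, n):
    # D_n = E_theta[ log p(X^n | theta) - log m(X^n) ], an exact sum over k
    total = 0.0
    for k in range(n + 1):
        logp = k * math.log(theta) + (n - k) * math.log(1 - theta)
        logw = (math.lgamma(n + 1) - math.lgamma(k + 1)
                - math.lgamma(n - k + 1) + logp)   # log P(K = k | theta)
        total += math.exp(logw) * (logp - log_mixture(k, n))
    return total

theta = 0.3
c = -0.5 * math.log(2 * math.pi * math.e) + 0.5 * math.log(1 / (theta * (1 - theta)))
print("predicted constant c:", c)
for n in (10, 100, 1000):
    print(n, bayes_redundancy(theta, n) - 0.5 * math.log(n))
```

As n grows, the printed differences settle toward the predicted constant, matching the (1/2) log n growth of D_n.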
Decisionmetrics: a decision-based approach to econometric modelling
Journal of Econometrics, 2007
Abstract
Cited by 19 (0 self)
In many applications it is necessary to use a simple and therefore highly misspecified econometric model as the basis for decision-making. We propose an approach to developing a possibly misspecified econometric model that will be used as the beliefs of an objective expected utility maximiser. A discrepancy between model and ‘truth’ is introduced that is interpretable as a measure of the model’s value for this decision-maker. Our decision-based approach utilises this discrepancy in estimation, selection, inference and evaluation of parametric or semiparametric models. The methods proposed nest quasi-likelihood methods as a special case that arises when model value is measured by the Kullback–Leibler information discrepancy, and also provide an econometric approach for developing parametric decision rules (e.g. technical trading rules) with desirable properties. The approach is illustrated and applied in the context of a CARA investor’s decision problem, for which analytical, simulation and empirical results suggest it is very effective.
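A toy stand-in for the idea (ours, not the paper's procedure): a CARA investor fits a Gaussian working model to heavy-tailed returns. The quasi-likelihood (MLE) route estimates the variance by the sample variance; a decision-based route instead picks the variance parameter by the realised utility of the position it recommends. All names and numbers below are illustrative.

```python
import math
import random
import statistics

random.seed(0)

# Heavy-tailed "true" returns (scaled Student-t with 3 df), so the Gaussian
# working model below is deliberately misspecified.
def draw_return():
    z = random.gauss(0, 1)
    chi2_3 = random.gammavariate(1.5, 2)   # chi-squared with 3 df
    return 0.05 + 0.3 * z / math.sqrt(chi2_3 / 3)

returns = [draw_return() for _ in range(5000)]
a = 2.0                                    # CARA risk aversion (illustrative)
mu = statistics.mean(returns)
sigma2_mle = statistics.pvariance(returns)  # quasi-likelihood (MLE) fit

def realised_utility(sigma2):
    # Average CARA utility -exp(-a w r) of the position w = mu / (a sigma2)
    # that the Gaussian model with variance sigma2 recommends.
    w = mu / (a * sigma2)
    return statistics.fmean(-math.exp(-a * w * r) for r in returns)

# Decision-based fit: choose the variance parameter by the utility it
# delivers, over a grid that includes the MLE value.
grid = [sigma2_mle * (1 + 0.1 * j) for j in range(40)]
sigma2_dec = max(grid, key=realised_utility)
print(sigma2_mle, sigma2_dec)
print(realised_utility(sigma2_mle), realised_utility(sigma2_dec))
```

By construction the decision-based choice never does worse in-sample than the MLE value; with heavy tails it typically selects a larger effective variance, i.e. a more cautious position.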
Bayesian and Frequentist Approaches to Parametric Predictive Inference
Bayesian Statistics, J. M. Bernardo, J. O. Berger, A. P. Dawid, A. F. M. Smith (eds.), 1998
Improved minimax predictive densities under Kullback–Leibler loss
Ann. Statist., 2006
Abstract
Cited by 11 (0 self)
Let X | µ ∼ N_p(µ, v_x I) and Y | µ ∼ N_p(µ, v_y I) be independent p-dimensional multivariate normal vectors with common unknown mean µ. Based on only observing X = x, we consider the problem of obtaining a predictive density p̂(y | x) for Y that is close to p(y | µ) as measured by expected Kullback–Leibler loss. A natural procedure for this problem is the (formal) Bayes predictive density p̂_U(y | x) under the uniform prior π_U(µ) ≡ 1, which is best invariant and minimax. We show that any Bayes predictive density will be minimax if it is obtained by a prior yielding a marginal that is superharmonic or whose square root is superharmonic. This yields wide classes of minimax procedures that dominate p̂_U(y | x), including Bayes predictive densities under superharmonic priors. Fundamental similarities and differences with the parallel theory of estimating a multivariate normal mean under quadratic loss are described.
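In this setting the uniform-prior predictive density has the known closed form p̂_U(y | x) = N_p(x, (v_x + v_y) I), which dominates the naive plug-in density N_p(x, v_y I). The sketch below (our illustration; µ, p, v_x, v_y are arbitrary choices) estimates the expected KL loss of both by Monte Carlo.

```python
import math
import random

random.seed(1)
p, vx, vy = 5, 1.0, 1.0
mu = [0.7] * p   # true mean, unknown to the statistician (illustrative)

def kl_loss(x, s):
    # KL( N_p(mu, vy I) || N_p(x, s I) ) for spherical covariances
    d2 = sum((m - xi) ** 2 for m, xi in zip(mu, x))
    return 0.5 * (p * math.log(s / vy) + p * vy / s + d2 / s - p)

def risk(s, trials=20000):
    # Monte Carlo expected KL loss of the predictive density N_p(x, s I),
    # averaging over x ~ N_p(mu, vx I)
    tot = 0.0
    for _ in range(trials):
        x = [m + math.sqrt(vx) * random.gauss(0, 1) for m in mu]
        tot += kl_loss(x, s)
    return tot / trials

print("plug-in N(x, vy I)      :", risk(vy))
print("Bayes   N(x, (vx+vy) I) :", risk(vx + vy))
```

Analytically the risks are (p/2)(v_x/v_y) for the plug-in and (p/2) log 2 here, so the Bayes predictive density is strictly better; the superharmonic-prior densities of the paper improve on it further.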
Applications of Lindley Information Measure to the Design of Clinical Experiments
Aspects of Uncertainty, 1994
Abstract
Cited by 9 (3 self)
In this paper we consider applications of Lindley's information measure to the design of clinical experiments. We review the decision-theoretic foundations underlying the use of Lindley information, and discuss its role in constructing utility functions suitable for clinical applications. We derive and interpret general first-order conditions for the optimality of a design. We discuss examples: choosing the optimal fixed sample size of a clinical trial, and choosing the optimal follow-up time for patients in a survival analysis. We give special attention to the design of multicenter clinical trials.

Research of D. A. Berry supported in part by the US Public Health Service under grant HS 0647501. Research of Giovanni Parmigiani and ISDS computing environment supported in part by NSF under grant DMS-9305699. We are thankful to Chengchang Li, Peter Muller, Saurabh Mukhopadhyay and Dalene Stangl for helpful discussions.

1. INTRODUCTION
From the point of view of decision making, information is anything that enables us to make a better decision, that is, a decision with a higher expected utility. For example, an experiment that, irrespective of the outcome, will lead to the same decision that we would make prior to observing it, has no information content. Conversely, experiments able to lead to different decisions are potentially of benefit. The expected change in utility can actually be used as a quantitative measure of the worth of an experiment in any given situation. This idea is about as old as Bayesian statistics (see Ramsey, 1990) and is discussed by Raiffa and Schlaifer (1961) and DeGroot (1984). The well-known measure of information proposed by Lindley (1956) is the object of investigation in this paper. It can be seen as a very important special case of this general ap...
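A minimal toy version of the fixed-sample-size example (ours, far simpler than the paper's clinical setting): for n Bernoulli observations under a uniform prior, Lindley's preposterior expected information equals the mutual information I(θ; K) = H(K) − E_θ[H(K | θ)], and an optimal n trades that gain against an assumed per-observation cost.

```python
import math

def binomial_entropy(n, th):
    # Shannon entropy (nats) of Binomial(n, th)
    if th <= 0.0 or th >= 1.0:
        return 0.0
    lognf = math.lgamma(n + 1)
    h = 0.0
    for k in range(n + 1):
        logp = (lognf - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                + k * math.log(th) + (n - k) * math.log(1 - th))
        h -= math.exp(logp) * logp
    return h

def expected_information(n, grid=100):
    # Lindley's preposterior expected information for n Bernoulli trials
    # under a uniform prior: I(theta; K) = H(K) - E_theta[H(K | theta)],
    # where H(K) = log(n + 1) because the marginal of K is uniform.
    avg_cond = sum(binomial_entropy(n, (j + 0.5) / grid)
                   for j in range(grid)) / grid
    return math.log(n + 1) - avg_cond

cost = 0.01   # utility cost per observation (illustrative)
best_n = max(range(1, 81), key=lambda n: expected_information(n) - cost * n)
print("optimal sample size:", best_n)
print("expected information at optimum:", expected_information(best_n))
```

Because the information grows roughly like (1/2) log n, its marginal gain ≈ 1/(2n) falls below the cost near n = 1/(2·cost), so the search settles around n ≈ 50 here.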
Asymptotic Redundancies for Universal Quantum Coding
Abstract
Cited by 8 (3 self)
Clarke and Barron have recently shown that the Jeffreys invariant prior of Bayesian theory yields the common asymptotic (minimax and maximin) redundancy of universal data compression in a parametric setting. We seek a possible analogue of this result for the two-level quantum systems. We restrict our considerations to prior probability distributions belonging to a certain one-parameter family, q_u, −1 < u < 1. Within this setting, we are able to compute exact redundancy formulas, for which we find the asymptotic limits. We compare our quantum asymptotic redundancy formulas to those derived by naively applying the (non-quantum) counterparts of Clarke and Barron, and find certain common features. Our results are based on formulas we obtain for the eigenvalues and eigenvectors of 2^n × 2^n (Bayesian density) matrices ρ_n(u). These matrices are the weighted averages (with respect to q_u) of all possible tensor products of n identical 2 × 2 density matrices, representing the two-level quantum systems. We propose a form of universal coding for the situation in which the density matrix describing an ensemble of quantum signal states is unknown. A sequence of n signals would be projected onto the dominant eigenspaces of ρ_n(u).
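The "weighted average of tensor products" structure can be checked numerically for n = 2 with a rotation-invariant toy prior (Bloch vectors uniform on the unit ball, not the paper's q_u family). For such a prior the averaged 4×4 matrix is (1/4)(I⊗I + t Σ_i σ_i⊗σ_i) with t = E[r²]/3 = 1/5, whose eigenvalues are the three-fold "triplet" value (1 + t)/4 and the "singlet" value (1 − 3t)/4. The sketch below (ours) Monte Carlo-averages ρ ⊗ ρ and checks two entries of that analytic form.

```python
import random

random.seed(2)

# Pauli matrices as 2x2 nested lists of complex numbers
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(A, B):
    # Kronecker (tensor) product of two 2x2 matrices -> 4x4
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def random_rho():
    # Qubit density matrix (I + r . sigma)/2 with Bloch vector r drawn
    # uniformly from the unit ball (a rotation-invariant toy prior)
    while True:
        r = [random.uniform(-1, 1) for _ in range(3)]
        if r[0] ** 2 + r[1] ** 2 + r[2] ** 2 <= 1:
            break
    return [[0.5 * (I2[i][j] + r[0] * SX[i][j] + r[1] * SY[i][j]
                    + r[2] * SZ[i][j]) for j in range(2)] for i in range(2)]

# Monte Carlo average of rho (x) rho: the n = 2 "Bayesian density matrix"
samples = 20000
avg = [[0.0] * 4 for _ in range(4)]
for _ in range(samples):
    rr = kron(random_rho(), random_rho.__call__() if False else random_rho())
    for i in range(4):
        for j in range(4):
            avg[i][j] += rr[i][j] / samples

t = 1 / 5
print("triplet eigenvalue (1+t)/4 =", (1 + t) / 4)   # 0.3
print("singlet eigenvalue (1-3t)/4 =", (1 - 3 * t) / 4)  # 0.1
print("MC entry (0,0), analytic (1+t)/4:", avg[0][0])
print("MC entry (1,2), analytic t/2:", avg[1][2])
```

Note one subtlety deliberately exposed by this toy: the paper averages tensor products of n *identical* density matrices, i.e. ρ ⊗ ρ with the *same* draw in both slots; using independent draws instead would give the uncorrelated product (E ρ) ⊗ (E ρ) and lose the singlet/triplet splitting. The line building `rr` should therefore reuse a single draw:

```python
rho = random_rho()
rr = kron(rho, rho)
```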
Information geometry, Bayesian inference, ideal estimates and error decomposition
1998
Abstract
Cited by 7 (1 self)
In statistics it is necessary to study the relation among many probability distributions. Information geometry elucidates the geometric structure on the space of all distributions. When combined with Bayesian decision theory, it leads to the new concept of “ideal estimates”. They uniquely exist in the space of finite measures, and are generally sufficient statistics. The optimal estimate on any model is given by projecting the ideal estimate onto that model. An error decomposition theorem splits the error of an estimate into the sum of statistical error and approximation error. These can be expanded to yield higher-order asymptotics. Furthermore, the ideal estimates under certain uniform priors, invariantly defined in information geometry, correspond to various optimal non-Bayesian estimates, such as the MLE.
On improved predictive density estimation with parametric constraints
Electronic Journal of Statistics, 2011
Abstract
Cited by 4 (0 self)
CIRJE Discussion Papers can be downloaded without charge from: