Results 1-10 of 109
Bayesian Data Analysis, 1995
Cited by 2132 (59 self)
I actually own a copy of Harold Jeffreys’s Theory of Probability but have only read small bits of it, most recently over a decade ago to confirm that, indeed, Jeffreys was not too proud to use a classical chi-squared p-value when he wanted to check the misfit of a model to data (Gelman, Meng and Stern, 2006). I do, however, feel that it is important to understand where our probability models come from, and I welcome the opportunity to use the present article by Robert, Chopin and Rousseau as a platform for further discussion of foundational issues. In this brief discussion I will argue the following: (1) in thinking about prior distributions, we should go beyond Jeffreys’s principles and move toward weakly informative priors; (2) it is natural for those of us who work in social and computational sciences to favor complex models, contra Jeffreys’s preference for simplicity; and (3) a key generalization of Jeffreys’s ideas is to explicitly include model checking in the process of data analysis.
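Point (1) above can be illustrated with a toy sketch (a hypothetical example, not from the article): with completely separated data, the maximum-likelihood slope of a logistic regression diverges to infinity, while a weakly informative Normal(0, 2.5&sup2;) prior keeps the MAP estimate finite and reasonable. The data and prior scale below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Completely separated toy data: sign(x) predicts y perfectly,
# so the unpenalized ML estimate of the slope diverges.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

def neg_log_posterior(b, prior_sd=2.5):
    logits = b * x
    # Bernoulli log-likelihood under a logistic link
    loglik = np.sum(y * logits - np.log1p(np.exp(logits)))
    # weakly informative Normal(0, prior_sd^2) prior on the slope
    logprior = -0.5 * (b / prior_sd) ** 2
    return -(loglik + logprior)

map_fit = minimize_scalar(neg_log_posterior, bounds=(-50, 50), method="bounded")
print(f"MAP slope with weakly informative prior: {map_fit.x:.2f}")
```

Without the `logprior` term, the same minimizer runs to the boundary of the search interval; the weak prior is enough to regularize the estimate without dominating the data.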
From Laplace To Supernova SN 1987A: Bayesian Inference In Astrophysics, 1990
Cited by 68 (2 self)
The Bayesian approach to probability theory is presented as an alternative to the currently used long-run relative frequency approach, which does not offer clear, compelling criteria for the design of statistical methods. Bayesian probability theory offers unique and demonstrably optimal solutions to well-posed statistical problems, and is historically the original approach to statistics. The reasons for earlier rejection of Bayesian methods are discussed, and it is noted that the work of Cox, Jaynes, and others answers earlier objections, giving Bayesian inference a firm logical and mathematical foundation as the correct mathematical language for quantifying uncertainty. The Bayesian approaches to parameter estimation and model comparison are outlined and illustrated by application to a simple problem based on the Gaussian distribution. As further illustrations of the Bayesian paradigm, Bayesian solutions to two interesting astrophysical problems are outlined: the measurement of wea...
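The kind of Gaussian example the abstract mentions can be sketched in a few lines (illustrative numbers, not the paper's): estimating an unknown mean with known noise standard deviation under a conjugate Normal prior, where the posterior is available in closed form.

```python
import numpy as np

# Conjugate Bayesian estimation of a Gaussian mean mu with known noise sd sigma
# and prior mu ~ Normal(mu0, tau0^2). All numbers are illustrative assumptions.
rng = np.random.default_rng(42)
sigma, mu_true = 1.0, 3.0
data = rng.normal(mu_true, sigma, size=20)

mu0, tau0 = 0.0, 10.0                       # broad prior
n, ybar = data.size, data.mean()
# Standard conjugate update: precisions add, means combine precision-weighted
post_var = 1.0 / (1.0 / tau0**2 + n / sigma**2)
post_mean = post_var * (mu0 / tau0**2 + n * ybar / sigma**2)
print(f"posterior: N({post_mean:.3f}, {np.sqrt(post_var):.3f}^2)")
```

The posterior mean lands between the prior mean and the sample mean, weighted by their precisions, and the posterior variance is smaller than either source alone, which is the optimality the abstract alludes to in this simple setting.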
Provenance of correlations in psychological data. Psychonomic Bulletin & Review, 2005
Maximum Entropy MIMO Wireless Channel Models with Limited Information. In Proc. MATHMOD Conference on Mathematical Modeling, 2006
Cited by 13 (7 self)
In this contribution, models of wireless channels are derived from the maximum entropy principle for several cases where only limited information about the propagation environment is available. First, analytical models are derived for the cases where certain parameters (channel energy, average energy, spatial correlation matrix) are known deterministically. Frequently, these parameters are unknown (typically because the received energy or the spatial correlation varies with the user position), but are still known to represent meaningful system characteristics. In these cases, analytical channel models are derived by assigning entropy-maximizing distributions to these parameters and marginalizing them out. For the MIMO case with spatial correlation, we show that the distribution of the covariance matrices is conveniently handled through its eigenvalues. The entropy-maximizing distribution of the covariance matrix is shown to be a Wishart distribution. Furthermore, the corresponding probability density function of the channel matrix is shown to be described analytically by a function of the channel Frobenius norm. This technique can provide channel models incorporating the effect of shadow fading and spatial correlation between antennas without the need to assume explicit values for these parameters. The results are compared in terms of mutual information to the classical i.i.d. Gaussian model.
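The construction can be sketched numerically (parameter values here are illustrative assumptions, not the paper's): draw a spatial covariance from a Wishart distribution, color an i.i.d. complex Gaussian channel with it, and compare mutual information against the uncorrelated model.

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(1)
nt = nr = 4                                   # transmit / receive antennas
snr = 10.0                                    # illustrative SNR (linear)

# Covariance drawn from a Wishart distribution, scaled so E[R] = I
df = nt + 2
R = wishart.rvs(df=df, scale=np.eye(nt) / df, random_state=1)
L = np.linalg.cholesky(R)

# i.i.d. complex Gaussian channel, then transmit-side spatial coloring
G = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)
H = G @ L.conj().T

def mutual_info(H):
    # log2 det(I + (snr/nt) H H^H), bits/s/Hz for equal power allocation
    _, logdet = np.linalg.slogdet(np.eye(nr) + (snr / nt) * H @ H.conj().T)
    return logdet / np.log(2)

print(f"correlated: {mutual_info(H):.2f} bits/s/Hz, i.i.d.: {mutual_info(G):.2f}")
```

This is only the sampling side of the paper's argument; the analytical marginalization over the covariance (through its eigenvalues) is what yields the closed-form channel density in the Frobenius norm.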
Computational methods for Bayesian model choice. In MaxEnt 2009 Proceedings, American Institute of Physics, 2009
Cited by 12 (5 self)
In this note, we briefly survey some recent approaches to the approximation of the Bayes factor used in Bayesian hypothesis testing and in Bayesian model choice. In particular, we reassess importance sampling, harmonic mean sampling, and nested sampling from a unified perspective.
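Two of the surveyed estimators can be compared on a toy model where the evidence is known exactly (this example is an assumption of mine, not from the note): for y ~ N(&theta;, 1) with prior &theta; ~ N(0, 1), the evidence is Z = N(y; 0, 2).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
y, n_draws = 1.0, 100_000
z_true = norm.pdf(y, 0.0, np.sqrt(2.0))       # exact evidence for this model

# Importance sampling with the prior as proposal: Z ~ mean of the likelihood
theta_prior = rng.normal(0.0, 1.0, n_draws)
z_is = norm.pdf(y, theta_prior, 1.0).mean()

# Harmonic mean over posterior draws (posterior is N(y/2, 1/2) here);
# notoriously unstable, as 1/likelihood can have infinite variance
theta_post = rng.normal(y / 2, np.sqrt(0.5), n_draws)
z_hm = 1.0 / (1.0 / norm.pdf(y, theta_post, 1.0)).mean()

print(f"true {z_true:.4f}, importance {z_is:.4f}, harmonic mean {z_hm:.4f}")
```

Even in this conjugate toy case the harmonic mean estimator's heavy-tailed weights make it far less reliable than prior-based importance sampling, which is part of what motivates the unified reassessment (and alternatives such as nested sampling, omitted here for brevity).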
Partially adaptive estimation via the maximum entropy densities. Econom. J., 2005
Cited by 10 (3 self)
Adaptive estimation is frequently used when the error distribution is non-normal. We propose a partially adaptive estimator based on the maximum entropy estimate of the error distribution. Under the conditions specified in McDonald and Newey (1988), the proposed estimator is asymptotically normal and efficient for the slope parameters. We investigate the finite-sample performance of the proposed method and compare it with existing methods. We also apply the estimator to real-world data.
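The first stage of such a procedure, maximum entropy estimation of the error density from moment constraints, can be sketched via its convex dual (a grid-based illustration under my own assumptions: standardized beta-distributed stand-in residuals and the first four moments; the paper's second stage would then maximize the regression likelihood under this fitted density).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

rng = np.random.default_rng(3)
resid = rng.beta(2, 5, size=2000)             # stand-in for OLS residuals
resid = (resid - resid.mean()) / resid.std()  # standardized, mildly skewed

xs = np.linspace(-8.0, 8.0, 2001)
dx = xs[1] - xs[0]
T = np.vstack([xs, xs**2, xs**3, xs**4])      # sufficient statistics T(x)
m_hat = np.array([np.mean(resid**k) for k in (1, 2, 3, 4)])

def dual(lam):
    # Maxent density p(x) ~ exp(lam . T(x)); minimize the convex dual
    # log Z(lam) - lam . m_hat, whose gradient is E_p[T] - m_hat.
    logw = lam @ T
    logz = logsumexp(logw) + np.log(dx)
    p = np.exp(logw - logsumexp(logw))         # discrete probabilities on grid
    return logz - lam @ m_hat, T @ p - m_hat

res = minimize(dual, np.array([0.0, -0.5, 0.0, 0.0]), jac=True, method="BFGS")
p_grid = np.exp(res.x @ T - (logsumexp(res.x @ T) + np.log(dx)))
```

At the optimum the fitted density's moments match the sample moments, which is exactly the maximum entropy condition; `p_grid` is then available as an estimated error density for likelihood-based re-estimation of the slopes.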
Comparison of maximum entropy and higher-order entropy estimators. Journal of Econometrics 107 (2002) 195
Cited by 9 (0 self)
We show that the generalized maximum entropy (GME) is the only estimation method that is consistent with a set of five axioms. The GME estimator can be nested, using a single parameter α, into two more general classes of estimators: GMEα estimators. Each of these GMEα estimators violates one of the five basic axioms. However, small-sample simulations demonstrate that certain members of these GMEα classes of estimators may outperform the GME estimation rule.
An Efficient Robust Concept Exploration Method and Sequential Exploratory Experimental Design, 2004
The Ising decoder: reading out the activity of large neural ensembles. Journal of Computational Neuroscience, 2012
Cited by 8 (0 self)
The Ising model has recently received much attention for the statistical description of neural spike train data. In this paper, we propose and demonstrate its use for building decoders capable of predicting, on a millisecond timescale, the stimulus represented by a pattern of neural activity. After fitting to a training dataset, the Ising decoder can be applied “online” for instantaneous decoding of test data. While such models can be fit exactly using Boltzmann learning, this approach rapidly becomes computationally intractable as neural ensemble size increases. We show that several approaches, including the Thouless-Anderson-Palmer (TAP) mean field approach from statistical physics and the recently developed Minimum Probability Flow Learning (MPFL) algorithm, can be used for rapid inference of model parameters in large-scale neural ensembles. Use of the Ising model for decoding, unlike other problems such as functional connectivity estimation, requires estimation of the partition function. As this involves summation over all possible responses, this step can be limiting. Mean field approaches avoid this problem by providing an analytical expression for the partition function. We demonstrate these decoding techniques by applying them to simulated neural ensemble responses from a mouse visual cortex model, finding an improvement in decoder performance for a model with heterogeneous as opposed to homogeneous neural tuning and response properties. Our results demonstrate the practicality of using the Ising model to read out, or decode, spatial patterns of activity comprising many hundreds of neurons.
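For a small ensemble the partition function can still be computed by exact enumeration, which makes a minimal decoding sketch possible (parameters below are made-up illustrations; the paper's mean-field machinery replaces the enumeration step at scale).

```python
import numpy as np
from itertools import product
from scipy.special import logsumexp

rng = np.random.default_rng(7)
n = 8
# All 2^n spin patterns of a small ensemble (intractable for large n)
states = np.array(list(product([-1.0, 1.0], repeat=n)))

def log_probs(h, J):
    # Exact log P(s) = h.s + 0.5 s^T J s - log Z, with Z by enumeration
    logw = states @ h + 0.5 * np.einsum("si,ij,sj->s", states, J, states)
    return logw - logsumexp(logw)

# Two hypothetical stimuli, each with its own fitted Ising parameters
h_a, h_b = rng.normal(0, 0.5, n), rng.normal(0, 0.5, n)
J = rng.normal(0, 0.1, (n, n)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
lp_a, lp_b = log_probs(h_a, J), log_probs(h_b, J)

# Decode: sample a response from stimulus A's model, pick the likelier stimulus
idx = rng.choice(len(states), p=np.exp(lp_a))
decoded = "A" if lp_a[idx] > lp_b[idx] else "B"
print(f"decoded stimulus: {decoded}")
```

In the independent case (J = 0) the enumerated partition function reduces to the analytic product of 2 cosh(h_i), which gives a convenient sanity check on the normalization.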
Resurrecting Logical Probability. Erkenntnis, 2001
Cited by 8 (0 self)
The logical interpretation of probability, or “objective Bayesianism” – the theory that (some) probabilities are strictly logical degrees of partial implication – is defended. The main argument against it is that it requires the assignment of prior probabilities, and that any attempt to determine them by symmetry via a “principle of insufficient reason” inevitably leads to paradox. Three replies are advanced: that priors are imprecise or of little weight, so that disagreement about them does not matter, within limits; that it is possible to distinguish reasonable from unreasonable priors on logical grounds; and that in real cases disagreement about priors can usually be explained by differences in the background information. It is argued also that proponents of alternative conceptions of probability, such as frequentists, Bayesians and Popperians, are unable to avoid committing themselves to the basic principles of logical probability.