Results 1–10 of 30
Severe Testing as a Basic Concept in a Neyman–Pearson Philosophy of Induction
 BRITISH JOURNAL FOR THE PHILOSOPHY OF SCIENCE
, 2006
Abstract
Cited by 35 (14 self)
Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm—type I and II errors, significance levels, power, confidence levels—they have been the subject of philosophical controversy and debate for over 60 years. Both current and longstanding problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test's (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We argue that the relevance of error probabilities is to ensure that only statistical hypotheses that have passed severe or probative tests are inferred from the data. The severity criterion supplies a meta-statistical principle for evaluating proposed statistical inferences, avoiding classic fallacies from tests that are overly sensitive, as well as those not sensitive enough to particular errors and discrepancies.
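As a concrete illustration of the severity criterion (a sketch with invented numbers, not taken from the paper): for a one-sided test of a normal mean with known standard deviation, the severity with which data warrant the inference μ > μ1 is the probability that the test would have produced a result no larger than the one observed, were μ exactly μ1.

```python
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def severity(xbar, mu1, sigma, n):
    """Severity for the inference mu > mu1 after observing sample mean xbar,
    in a one-sided test of a normal mean with known sigma:
    SEV(mu > mu1) = P(Xbar <= xbar; mu = mu1)."""
    return norm_cdf((xbar - mu1) * sqrt(n) / sigma)

# Hypothetical numbers: n = 100, sigma = 2, observed mean 0.4.
# "mu > 0.2" passes with severity about 0.84; "mu > 0.4" only with 0.5,
# so the data give poor grounds for inferring a discrepancy that large.
print(round(severity(0.4, 0.2, 2.0, 100), 2))  # 0.84
print(round(severity(0.4, 0.4, 2.0, 100), 2))  # 0.5
```

This also shows the two fallacies the abstract mentions: a very sensitive test (large n) can reject while only tiny discrepancies pass severely, and an insensitive test can fail to license even modest ones.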
Multilevel linear modelling for FMRI group analysis using Bayesian inference
 Neuroimage
, 2004
Abstract
Cited by 33 (6 self)
Functional magnetic resonance imaging studies often involve the acquisition of data from multiple sessions and/or multiple subjects. A hierarchical approach can be taken to modelling such data, with a general linear model (GLM) at each level of the hierarchy introducing different random-effects variance components. Inference on these models is non-trivial, with frequentist solutions being unavailable; a solution is to use a Bayesian framework. One important ingredient in this is the choice of prior on the variance components and top-level regression parameters. Due to the typically small numbers of sessions or subjects in neuroimaging, the choice of prior is critical. To alleviate this problem, we introduce to neuroimaging modelling the approach of reference priors, which drives the choice of prior such that it is non-informative in an information-theoretic sense. We propose two inference techniques at the top level for multilevel hierarchies (a fast approach and a slower, more accurate approach). We also demonstrate that we can infer on the top level of multilevel hierarchies by inferring on the levels of the hierarchy separately and passing summary statistics of a non-central multivariate t distribution between them.
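The summary-statistics idea can be sketched as a precision-weighted combination of subject-level effect estimates, with weights that include both the within-subject estimation variance and a between-subject variance component. This is only an illustration of the two-level structure; the paper's reference priors and the passing of non-central t summaries are not shown, and all numbers and names below are hypothetical.

```python
def group_level(betas, var_within, var_between):
    """Combine per-subject effect estimates at the group level using
    precision weights 1 / (within-subject variance + between-subject
    variance). Returns the group mean effect and its standard error."""
    weights = [1.0 / (vw + var_between) for vw in var_within]
    wsum = sum(weights)
    mean = sum(w * b for w, b in zip(weights, betas)) / wsum
    se = (1.0 / wsum) ** 0.5
    return mean, se

betas = [2.0, 1.5, 2.5, 1.0]        # per-subject effect estimates (hypothetical)
var_within = [0.2, 0.4, 0.3, 0.5]   # per-subject estimation variances
mean, se = group_level(betas, var_within, var_between=0.25)
print(round(mean, 3), round(se, 3))
```

With few subjects, the estimate of `var_between` is very noisy, which is exactly why the paper's choice of prior on the variance components matters.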
On the Dirichlet Prior and Bayesian Regularization
 In Advances in Neural Information Processing Systems 15
, 2002
Abstract
Cited by 22 (2 self)
A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. In this paper we examine how Bayesian regularization using a Dirichlet prior over the model parameters affects the learned model structure in a domain with discrete variables. Surprisingly, a weak prior in the sense of smaller equivalent sample size leads to a strong regularization of the model structure (sparse graph) given a sufficiently large data set. In particular, the empty graph is obtained in the limit of a vanishing strength of prior belief. This is diametrically opposite to what one may expect in this limit, namely the complete graph from an (unregularized) maximum likelihood estimate. Since the prior affects the parameters as expected, the prior strength balances a trade-off between regularizing the parameters or the structure of the model. We demonstrate the benefits of optimizing this trade-off in the sense of predictive accuracy.
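The effect can be reproduced in a minimal sketch: score a single candidate edge X → Y between two binary variables by the Dirichlet marginal likelihood with a BDeu-style uniform prior of equivalent sample size `ess`. The counts, function names, and the particular split of the data below are my own illustrative assumptions, not the paper's experiments.

```python
from math import lgamma

def dirichlet_marglik(counts, alphas):
    """Log marginal likelihood of multinomial counts under a Dirichlet prior."""
    a, n = sum(alphas), sum(counts)
    out = lgamma(a) - lgamma(a + n)
    for c, al in zip(counts, alphas):
        out += lgamma(al + c) - lgamma(al)
    return out

def score_edge(n, ess):
    """Log-score difference between modelling Y with parent X and Y alone,
    under a BDeu-style uniform Dirichlet prior with equivalent sample size
    ess, on hypothetical counts with a moderate X-Y dependence.
    Positive values favour keeping the edge."""
    # hypothetical joint counts n_xy in proportions 0.3 / 0.2 / 0.2 / 0.3
    n00, n01, n10, n11 = (3 * n) // 10, (2 * n) // 10, (2 * n) // 10, (3 * n) // 10
    empty = dirichlet_marglik([n00 + n10, n01 + n11], [ess / 2, ess / 2])
    with_parent = (dirichlet_marglik([n00, n01], [ess / 4, ess / 4]) +
                   dirichlet_marglik([n10, n11], [ess / 4, ess / 4]))
    return with_parent - empty

# A smaller equivalent sample size regularizes the structure more strongly;
# in the vanishing-ess limit even a real dependence is pruned away.
print(score_edge(1000, 10.0) > score_edge(1000, 0.1))
print(score_edge(1000, 1e-9) < 0)
```

This matches the abstract's point: the structure penalty grows without bound as the prior strength vanishes, so the empty graph wins in that limit despite the maximum-likelihood fit favouring the complete graph.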
Representation Dependence in Probabilistic Inference
 JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH
, 2004
Abstract
Cited by 20 (1 self)
Non-deductive reasoning systems are often representation dependent: representing the same situation in two different ways may cause such a system to return two different answers. Some have viewed
Bayesian information criterion for censored survival models
 Biometrics
Abstract
Cited by 15 (2 self)
We investigate the Bayesian Information Criterion (BIC) for variable selection in models for censored survival data. Kass and Wasserman (1995) showed that BIC provides a close approximation to the Bayes factor when a unit-information prior on the parameter space is used. We propose a revision of the penalty term in BIC so that it is defined in terms of the number of uncensored events instead of the number of observations. For the simplest censored data model, that of exponential distributions of survival times (i.e. a constant hazard rate), this revision results in a better approximation to the exact Bayes factor based on a conjugate unit-information prior. In the Cox proportional hazards regression model, we propose defining BIC in terms of the maximized partial likelihood. Using the number of deaths rather than the number of individuals in the BIC penalty term corresponds to a more realistic prior on the parameter space, and is shown to improve predictive performance for assessing stroke risk in the Cardiovascular Health Study.
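The proposed revision amounts to one change in the usual formula: BIC = −2 log PL + k log d, where PL is the maximized partial likelihood, k the number of covariates, and d the number of uncensored events rather than the sample size n. A minimal sketch, with all numbers hypothetical:

```python
from math import log

def bic_censored(log_partial_lik, n_params, n_events):
    """Revised BIC for censored survival models: the penalty uses the
    number of uncensored events d instead of the sample size n,
    BIC = -2 * log PL + k * log(d)."""
    return -2.0 * log_partial_lik + n_params * log(n_events)

# Hypothetical study: 500 subjects, 120 deaths, 3 covariates.
standard = -2.0 * (-350.0) + 3 * log(500)  # classical n-based penalty
revised = bic_censored(-350.0, 3, 120)     # event-based penalty
print(revised < standard)  # True: fewer events => smaller penalty
```

With heavy censoring (d much smaller than n), the revised penalty is markedly lighter, which is what brings the approximation closer to the exact Bayes factor in the exponential case.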
Default priors for Bayesian and frequentist inference
 J. Royal Statist. Soc. B
, 2010
Abstract
Cited by 12 (4 self)
We investigate the choice of default prior for use with likelihood to facilitate Bayesian and frequentist inference. Such a prior is a density or relative density that weights an observed likelihood function, leading to the elimination of parameters not of interest and accordingly providing a density-type assessment for a parameter of interest. For regular models with independent coordinates we develop a second-order prior for the full parameter based on an approximate location relation from near a parameter value to near the observed data point; this derives directly from the coordinate distribution functions and is closely linked to the original Bayes approach. We then develop a modified prior that is targeted on a component parameter of interest and avoids the marginalization paradoxes of Dawid, Stone and Zidek (1973); this uses some extensions of Welch–Peers theory that modify the Jeffreys prior, and builds more generally on the approximate location property. A third type of prior is then developed that targets a vector interest parameter in the presence of a vector nuisance parameter and is based more directly on the original Jeffreys approach. Examples are given to clarify the computation of the priors and the flexibility of the approach.
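The Jeffreys prior that these proposals modify is itself easy to exhibit. For a single Bernoulli(θ) trial the Fisher information is I(θ) = 1/(θ(1 − θ)), so the Jeffreys prior π(θ) ∝ √I(θ) is a Beta(1/2, 1/2) density up to normalization. This is a sketch of the standard starting point only, not of the paper's second-order or Welch–Peers-extended priors:

```python
from math import sqrt

def fisher_info_bernoulli(theta):
    """Expected Fisher information for one Bernoulli(theta) trial:
    I(theta) = 1 / (theta * (1 - theta))."""
    return 1.0 / (theta * (1.0 - theta))

def jeffreys_prior(theta):
    """Jeffreys prior pi(theta) proportional to sqrt(I(theta));
    for the Bernoulli model this is Beta(1/2, 1/2) up to a constant."""
    return sqrt(fisher_info_bernoulli(theta))

# The prior puts more mass near 0 and 1, where the data are most informative:
print(jeffreys_prior(0.05) > jeffreys_prior(0.5))  # True
```

A virtue of this construction, which the paper's priors preserve in extended form, is invariance under reparameterization of θ.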
Constructing a Logic of Plausible Inference: a Guide To Cox's Theorem
 International Journal of Approximate Reasoning
, 2003
Abstract
Cited by 10 (0 self)
Cox's Theorem provides a theoretical basis for using probability theory as a general logic of plausible inference. The theorem states that any system for plausible reasoning that satisfies certain qualitative requirements intended to ensure consistency with classical deductive logic and correspondence with commonsense reasoning is isomorphic to probability theory. However, the requirements used to obtain this result have been the subject of much debate. We review Cox's Theorem, discussing its requirements, the intuition and reasoning behind these, and the most important objections, and finish with an abbreviated proof of the theorem.
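The core of the argument can be written as a pair of functional equations; this is the standard textbook formulation of Cox's requirements, with notation that is mine rather than the paper's:

```latex
% Requirement: the plausibility of a conjunction depends only on the
% two plausibilities shown, via some function F:
\[(A \land B \mid C) = F\!\big[(A \mid B \land C),\, (B \mid C)\big].\]
% Associativity of conjunction, (A \land B) \land C = A \land (B \land C),
% then forces F to satisfy the associativity equation
\[F\big[F(x, y),\, z\big] = F\big[x,\, F(y, z)\big],\]
% whose well-behaved solutions reduce, after a monotone rescaling w,
% to the product rule of probability theory:
\[w(A \land B \mid C) = w(A \mid B \land C)\, w(B \mid C).\]
```

Much of the debate the abstract mentions concerns exactly which regularity conditions ("well-behaved" above, e.g. continuity or density of the plausibility values) are needed for this reduction to go through.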
Information Geometry and Prior Selection
, 2002
Abstract
Cited by 8 (6 self)
In this contribution we study the problem of prior selection arising in Bayesian inference. There is an extensive literature on the construction of non-informative priors, and the subject seems far from being definitively solved [1]. Here we revisit this subject with differential-geometry tools and propose to construct the prior in a Bayesian decision-theoretic framework. We show how constructing the prior by projection is the best way to take the restrictions of the model into account. For instance, we apply this procedure to curved parametric families, where ignorance is expressed directly by the relative geometry of the restricted model within the wider model containing it.
The Behrens–Fisher problem revisited: A Bayes–frequentist synthesis
, 2001
Abstract
Cited by 7 (0 self)
The Behrens–Fisher problem concerns inference for the difference between the means of two normal populations whose ratio of variances is unknown. In this situation, Fisher's fiducial interval differs markedly from the Neyman–Pearson confidence interval. A prior proposed by Jeffreys leads to a credible interval that is equivalent to Fisher's solution, but carries a different interpretation. The authors propose an alternative prior leading to a credible interval whose asymptotic coverage probability matches the frequentist coverage probability more accurately than Jeffreys' interval. Their simulation results indicate excellent matching even in small samples.
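The setting can be made concrete with the usual t-type statistic for the difference of means, together with the Welch–Satterthwaite approximate degrees of freedom. This is a standard frequentist approximation used for reference here, not the fiducial solution or the paper's matching prior, and the sample numbers are invented:

```python
from math import sqrt

def welch_statistic(xbar1, s1, n1, xbar2, s2, n2):
    """Behrens-Fisher setting: two normal samples with unknown and possibly
    unequal variances. Returns the t-type statistic for the difference of
    means and the Welch-Satterthwaite approximate degrees of freedom."""
    se2_1, se2_2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (xbar1 - xbar2) / sqrt(se2_1 + se2_2)
    df = (se2_1 + se2_2) ** 2 / (se2_1 ** 2 / (n1 - 1) + se2_2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples: means 5.2 vs 4.1, sds 1.1 vs 2.3, sizes 8 vs 12.
t, df = welch_statistic(5.2, 1.1, 8, 4.1, 2.3, 12)
print(round(t, 2), round(df, 1))
```

The difficulty the abstract refers to is that no exact pivot with a fixed distribution exists here, which is why the fiducial, confidence, and Bayesian-matching approaches all give (slightly) different intervals.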
The maximum entropy on the mean method, noise and sensitivity
 Maximum Entropy and Bayesian Methods; Mohammad-Djafari et al. / International Journal of Mass Spectrometry
, 1994
Abstract
Cited by 7 (3 self)
In this paper we address the problem of building convenient criteria to solve linear and noisy inverse problems of the form y = Ax + n. Our approach is based on the specification of constraints on the solution x through its belonging to a given convex set C. The solution is chosen as the mean of the distribution which is the closest to a reference measure on C with respect to the Kullback divergence, or cross-entropy. This is therefore called the Maximum Entropy on the Mean Method (MEMM). This problem is shown to be equivalent to the convex one x = arg min_x F(x) subject to y = Ax (in the noiseless case). Many classical criteria are found to be particular solutions with different reference measures. But except for some measures, these primal criteria have no explicit expression. Nevertheless, taking advantage of a dual formulation of the problem, the MEMM enables us to compute a solution in such cases. This indicates that such criteria could hardly have been derived without the MEMM. In order to integrate the presence of additive noise into the MEMM scheme, the object and noise are sought simultaneously in an appropriate convex set C′. The MEMM then gives a criterion of the form x = arg min_x F(x) + G(y, Ax), where F and G are convex, without constraints. The functional G is related to the prior distribution of the noise, and may be used to account for specific noise distributions. Using the regularity of the criterion, the sensitivity
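A minimal instance of the unconstrained criterion x = arg min F(x) + G(y, Ax): taking both F and G quadratic (the case of a Gaussian reference measure and Gaussian noise) gives a ridge-type regularized inversion, solved here by plain gradient descent. All names and numbers are illustrative; this is one particular solution of the MEMM family, not the general dual machinery.

```python
def memm_quadratic(A, y, lam=0.5, sigma2=1.0, steps=5000, lr=0.01):
    """Minimise F(x) + G(y, Ax) with quadratic F(x) = lam/2 * ||x||^2
    (Gaussian reference measure) and Gaussian-noise misfit
    G(y, Ax) = ||y - Ax||^2 / (2 * sigma2), by gradient descent.
    A is a list of rows; the minimiser solves (A'A/sigma2 + lam I)x = A'y/sigma2."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        # residual r = Ax - y
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        for j in range(n):
            grad = sum(A[i][j] * r[i] for i in range(m)) / sigma2 + lam * x[j]
            x[j] -= lr * grad
    return x

A = [[1.0, 0.0], [1.0, 1.0]]
y = [1.0, 2.0]
x = memm_quadratic(A, y)
print([round(v, 3) for v in x])  # [0.909, 0.727]
```

Other reference measures yield other classical criteria (e.g. entropy-like F for positive objects), which is the abstract's point that many known criteria are particular MEMM solutions.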