Bayes Factors, 1995
Cited by 717 (65 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
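The definition in this abstract can be illustrated with a small worked case. The sketch below is not taken from the paper: it assumes a binomial experiment, a point null `p = p0`, and a uniform prior on `p` under the alternative, and the function names are hypothetical. It shows that with prior probability one-half on the null, the posterior odds equal the Bayes factor.

```python
from math import comb


def bayes_factor_binomial(k: int, n: int, p0: float = 0.5) -> float:
    """Bayes factor B01 for H0: p = p0 against H1: p ~ Uniform(0, 1).

    Assumed setup for illustration only: k successes in n Bernoulli trials.
    """
    # Marginal likelihood under H0 is the binomial pmf evaluated at p0.
    m0 = comb(n, k) * p0**k * (1 - p0) ** (n - k)
    # Under H1 the binomial pmf integrated against a uniform prior is 1 / (n + 1).
    m1 = 1.0 / (n + 1)
    return m0 / m1


def posterior_prob_null(b01: float, prior_null: float = 0.5) -> float:
    """Convert a Bayes factor and a prior probability into P(H0 | data)."""
    prior_odds = prior_null / (1 - prior_null)
    posterior_odds = b01 * prior_odds  # equals b01 when prior_null is 0.5
    return posterior_odds / (1 + posterior_odds)


if __name__ == "__main__":
    b01 = bayes_factor_binomial(k=5, n=10)
    print(b01, posterior_prob_null(b01))
```

With 5 successes in 10 trials the data mildly favor the point null; changing `prior_null` rescales the posterior odds but leaves the Bayes factor itself unchanged.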
Hypothesis Testing and Model Selection Via Posterior Simulation
In Practical Markov Chain, 1995
Cited by 21 (1 self)
Introduction. To motivate the methods described in this chapter, consider the following inference problem in astronomy (Soubiran, 1993). Until fairly recently, it has been believed that the Galaxy consists of two stellar populations, the disk and the halo. More recently, it has been hypothesized that there are in fact three stellar populations, the old (or thin) disk, the thick disk, and the halo, distinguished by their spatial distributions, their velocities, and their metallicities. These hypotheses have different implications for theories of the formation of the Galaxy. Some of the evidence for deciding whether there are two or three populations is shown in Figure 1, which shows radial and rotational velocities for n = 2,370 stars. A natural model for this situation is a mixture model with J components, namely $y_i = \sum_{j=1}^{J} \rho_j \ldots$
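As a rough illustration of what a J-component mixture model evaluates, the sketch below computes the log-likelihood of a finite mixture. The abstract's equation is cut off, so the component family is an assumption: univariate normal components are used here for simplicity, whereas the stellar data are bivariate. All names and parameter values are hypothetical.

```python
import math


def normal_pdf(y: float, mean: float, sd: float) -> float:
    """Density of a univariate normal distribution (assumed component family)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_log_likelihood(data, weights, means, sds) -> float:
    """Log-likelihood of data under a J-component normal mixture.

    weights play the role of the mixing proportions and must sum to one.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixing proportions must sum to 1")
    total = 0.0
    for y in data:
        # Mixture density: weighted sum of the component densities at y.
        density = sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))
        total += math.log(density)
    return total


if __name__ == "__main__":
    velocities = [-1.2, 0.3, 0.8, 4.9, 5.4]  # made-up values, not the stellar data
    two = mixture_log_likelihood(velocities, [0.6, 0.4], [0.0, 5.0], [1.0, 1.0])
    one = mixture_log_likelihood(velocities, [1.0], [2.0], [3.0])
    print(two, one)
```

Comparing fits with two and three components on the likelihood alone always favors the larger model, which is why a comparison of this kind calls for a criterion that accounts for model complexity.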
Model Selection for Generalized Linear Models via GLIB, with Application to Epidemiology, 1993
Cited by 11 (5 self)
Epidemiological studies for assessing risk factors often use logistic regression, log-linear models, or other generalized linear models. They involve many decisions, including the choice and coding of risk factors and control variables. It is common practice to select independent variables using a series of significance tests and to choose the way variables are coded somewhat arbitrarily. The overall properties of such a procedure are not well understood, and conditioning on a single model ignores model uncertainty, leading to underestimation of uncertainty about quantities of interest (QUOIs). We describe a Bayesian modeling strategy that formalizes the model selection process and propagates model uncertainty through to inference about QUOIs. Each possible combination of modeling decisions defines a different model, and the models are compared using Bayes factors. Inference about a QUOI is based on an average of its posterior distributions under the individual models, weighted by thei...
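The averaging step described in this abstract can be sketched in a few lines. The abstract is truncated before it names the weights, so the sketch assumes they are posterior model probabilities obtained from Bayes factors against a common reference model. The function names and numbers are hypothetical, and only a posterior mean is averaged, not a full posterior distribution.

```python
def posterior_model_probs(bayes_factors, prior_probs=None):
    """Posterior model probabilities from Bayes factors against a common reference model.

    bayes_factors[k] is assumed to be B_k0; equal prior probabilities are used
    when prior_probs is not given.
    """
    n = len(bayes_factors)
    if prior_probs is None:
        prior_probs = [1.0 / n] * n
    unnormalized = [b * p for b, p in zip(bayes_factors, prior_probs)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def model_averaged_mean(posterior_means, model_probs):
    """Average a quantity's per-model posterior means, weighted by model probability."""
    return sum(m * p for m, p in zip(posterior_means, model_probs))


if __name__ == "__main__":
    probs = posterior_model_probs([1.0, 3.0])  # second model favored 3 to 1
    print(probs, model_averaged_mean([2.0, 4.0], probs))
```

Conditioning on the single best model would report 4.0 here; the averaged value of 3.5 reflects the remaining weight on the other model.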
Comparing Bayesian Model Class Selection Criteria by Discrete Finite Mixtures
Information, Statistics and Induction in Science, pages 364-374, Proceedings of the ISIS'96 Conference, 1996
Cited by 9 (5 self)
We investigate the problem of computing the posterior probability of a model class, given a data sample and a prior distribution for possible parameter settings. By a model class we mean a group of models which all share the same parametric form. In general this posterior may be very hard to compute for high-dimensional parameter spaces, which is usually the case with real-world applications. In the literature several methods for computing the posterior approximately have been proposed, but the quality of the approximations may depend heavily on the size of the available data sample. In this work we are interested in testing how well the approximate methods perform in real-world problem domains. In order to conduct such a study, we have chosen the model family of finite mixture distributions. With certain assumptions, we are able to derive the model class posterior analytically for this model family. We report a series of model class selection experiments on real-world data sets, w...
On the Accuracy of Stochastic Complexity Approximations
In A. Gammerman (Ed.), Causal
Cited by 3 (3 self)
Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as determining model complexity or performing predictive inference. Unfortunately, when the data has missing information, computing the stochastic complexity requires marginalizing (integrating) over the missing data, which even in the discrete case amounts to computing a sum with an exponential number of terms. Therefore, in most cases the stochastic complexity measure has to be approximated. In this paper we investigate empirically the performance of some of the most common stochastic complexity approximations in an attempt to understand their small-sample behavior in the incomplete data framework. In earlier empirical evaluations the problem of not knowing the actual stochastic complexity for incomplete data was circumvented either by us...
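The exponential sum mentioned in this abstract can be made concrete with a toy case. The sketch below assumes, purely for illustration, that stochastic complexity is taken as the negative log marginal likelihood under a Beta-Bernoulli model of one binary variable. The abstract does not specify the model family, and all names here are hypothetical. Missing entries are written as `None`, and the sum runs over all 2**k completions.

```python
from itertools import product
from math import exp, lgamma, log2


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def marginal_likelihood(bits, a: float = 1.0, b: float = 1.0) -> float:
    """Beta-Bernoulli marginal likelihood of a fully observed binary sequence."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return exp(log_beta(a + ones, b + zeros) - log_beta(a, b))


def marginal_with_missing(data, a: float = 1.0, b: float = 1.0) -> float:
    """Brute-force marginalization: sum over every completion of the None entries."""
    missing = [i for i, x in enumerate(data) if x is None]
    total = 0.0
    for fill in product((0, 1), repeat=len(missing)):  # 2**k terms for k missing values
        completed = list(data)
        for index, value in zip(missing, fill):
            completed[index] = value
        total += marginal_likelihood(completed, a, b)
    return total


def code_length_bits(data, a: float = 1.0, b: float = 1.0) -> float:
    """Negative log2 marginal likelihood, used here as a stand-in for stochastic complexity."""
    return -log2(marginal_with_missing(data, a, b))


if __name__ == "__main__":
    print(code_length_bits([1, None, 0, None, 1]))
```

In this one-variable toy case the sum collapses to the marginal likelihood of the observed entries alone, which gives a convenient check on the brute-force loop. With several dependent variables such a shortcut is generally not available, and the number of terms grows as 2**k.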

