Results 1-10 of 19
Bayes Factors
, 1995
Cited by 1567 (72 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
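The relationship described in this abstract can be written out explicitly. The symbols below ($H_0$ for the null, $H_1$ for the alternative, $D$ for the data, $B_{01}$ for the Bayes factor) are notation assumed here for illustration and do not come from the listing.

```latex
\[
\underbrace{\frac{P(H_0 \mid D)}{P(H_1 \mid D)}}_{\text{posterior odds}}
\;=\;
\underbrace{\frac{P(D \mid H_0)}{P(D \mid H_1)}}_{B_{01}}
\times
\underbrace{\frac{P(H_0)}{P(H_1)}}_{\text{prior odds}}
\]
% With P(H_0) = P(H_1) = 1/2 the prior odds equal 1,
% so the posterior odds of the null reduce to B_{01}.
```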
Benchmark Priors for Bayesian Model Averaging
 FORTHCOMING IN THE JOURNAL OF ECONOMETRICS
, 2001
Cited by 150 (5 self)
In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequences. Here we focus on the practically relevant situation where we need to entertain a (large) number of sampling models and we have (or wish to use) little or no subjective prior information. We aim at providing an “automatic” or “benchmark” prior structure that can be used in such cases. We focus on the Normal linear regression model with uncertainty in the choice of regressors. We propose a partly noninformative prior structure related to a Natural Conjugate g-prior specification, where the amount of subjective information requested from the user is limited to the choice of a single scalar hyperparameter $g_{0j}$. The consequences of different choices for $g_{0j}$ are examined. We investigate theoretical properties, such as consistency of the implied Bayesian procedure. Links with classical information criteria are provided. More importantly, we examine the finite sample implications of several choices of $g_{0j}$ in a simulation study. The use of the MC$^3$ algorithm of Madigan and York (1995), combined with efficient coding in Fortran, makes it feasible to conduct large simulations. In addition to posterior criteria, we shall also compare the predictive performance of different priors. A classic example concerning the economics of crime will also be provided and contrasted with results in the literature. The main findings of the paper will lead us to propose a “benchmark” prior specification in a linear regression context with model uncertainty.
Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models
, 1993
Cited by 134 (28 self)
Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GLIM; this appears to be quite accurate. A reference set of proper priors is suggested, both to represent the situation where there is not much prior information, and to assess the sensitivity of the results to the prior distribution. The methods can be used when the dispersion parameter is unknown, when there is overdispersion, to compare link functions, and to compare error distributions and variance functions. The methods can be used to implement the Bayesian approach to accounting for model uncertainty. I describe an application to inference about relative risks in the presence of control factors where model uncertainty is large and important. Software to implement the
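For orientation, the generic Laplace approximation to a marginal likelihood has the form below. This is the standard textbook version, not the specific new approximation the abstract proposes; the symbols ($M$ for a model, $\theta$ for its $d$-dimensional parameter, $\tilde\theta$ for the posterior mode, $\tilde\Sigma$ for the inverse of the negative Hessian of the log integrand at the mode) are assumed here for illustration.

```latex
\[
P(D \mid M)
= \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta
\;\approx\;
(2\pi)^{d/2}\, \lvert \tilde\Sigma \rvert^{1/2}\,
P(D \mid \tilde\theta, M)\, P(\tilde\theta \mid M)
\]
% An approximate Bayes factor for two models is the ratio of
% two such approximations, one per model.
```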
Bayes factors and model uncertainty
 DEPARTMENT OF STATISTICS, UNIVERSITY OF WASHINGTON
, 1993
Cited by 107 (6 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications. The points we emphasize are: from Jeffreys's Bayesian point of view, the purpose of hypothesis testing is to evaluate the evidence in favor of a scientific theory; Bayes factors offer a way of evaluating evidence in favor of a null hypothesis; Bayes factors provide a way of incorporating external information into the evaluation of evidence about a hypothesis; Bayes factors are very general, and do not require alternative models to be nested; several techniques are available for computing Bayes factors, including asymptotic approximations which are easy to compute using the output from standard packages that maximize likelihoods; in "nonstandard" statistical models that do not satisfy common regularity conditions, it can be technically simpler to calculate Bayes factors than to derive non-Bayesian significance
Bayesian model selection in structural equation models
, 1993
Cited by 45 (10 self)
A Bayesian approach to model selection for structural equation models is outlined. This enables us to compare individual models, nested or nonnested, and also to search through the (perhaps vast) set of possible models for the best ones. The approach selects several models rather than just one, when appropriate, and so enables us to take account, both informally and formally, of uncertainty about model structure when making inferences about quantities of interest. The approach tends to select simpler models than strategies based on multiple P-value-based tests. It may thus help to overcome the criticism of structural
Earthquake likelihood model testing
 Seismological Research Letters
, 2007
Cited by 23 (4 self)
The Regional Earthquake Likelihood Models (RELM) project aims to produce and evaluate alternate models of earthquake potential (probability per unit volume, magnitude, and time) for California. Based on differing assumptions, these models are produced both to test the validity of their assumptions and explore which models should be incorporated in seismic hazard and risk evaluation. Tests based on physical and geological criteria are useful but here we focus on statistical methods using future earthquake data only. We envision two evaluations: a self-consistency test, and comparison of every pair of models for relative consistency. Both tests are based on the likelihood ratio method, and both would be fully prospective (that is, the models are not adjusted to fit the test data). To be tested, each model must assign a probability or probability density to any possible event within a specified region of space, time, and magnitude. For our tests the models must use a common format: earthquake rates in specified "bins" with location, magnitude, time and in some cases focal mechanism limits.
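A minimal sketch of how a likelihood ratio between two binned rate forecasts might be computed. The abstract does not state the count distribution, so independent Poisson counts per bin are assumed here, and the function names are hypothetical.

```python
import math


def poisson_log_likelihood(rates, counts):
    """Joint log-likelihood of observed bin counts.

    Assumption: counts in different bins are independent Poisson
    variables with the forecast rates as their means (rates > 0).
    """
    return sum(
        -lam + n * math.log(lam) - math.lgamma(n + 1)
        for lam, n in zip(rates, counts)
    )


def log_likelihood_ratio(rates_a, rates_b, counts):
    """Log-likelihood of model A minus that of model B on the same bins."""
    return poisson_log_likelihood(rates_a, counts) - poisson_log_likelihood(
        rates_b, counts
    )


# Hypothetical three-bin example: positive values favor model A.
observed = [0, 1, 2]
model_a = [0.1, 1.0, 2.0]
model_b = [1.0, 1.0, 1.0]
ratio = log_likelihood_ratio(model_a, model_b, observed)
```

Because both forecasts are scored on data collected after they were fixed, the comparison stays prospective in the sense the abstract describes.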
A comparison of scientific and engineering criteria for Bayesian model selection
, 1996
Cited by 22 (0 self)
Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial. Keywords: model selection, model averaging, Bayesian selection criteria
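The difference between averaging over models and selecting one by posterior probability can be illustrated with a toy case. The sketch below is not the paper's Bayesian-network experiment: it assumes two hypothetical coin models (a fair coin, and an unknown bias with a uniform prior) with equal prior model probabilities, and it covers only the posterior-probability criterion, not the engineering criterion.

```python
import math


def log_marginal_fair(heads, tails):
    """Log marginal likelihood of a sequence under a fixed bias of 0.5."""
    return (heads + tails) * math.log(0.5)


def log_marginal_uniform(heads, tails):
    """Log marginal likelihood under a uniform prior on the bias.

    Integrating theta^heads * (1 - theta)^tails over [0, 1] gives the
    Beta function B(heads + 1, tails + 1).
    """
    return (
        math.lgamma(heads + 1)
        + math.lgamma(tails + 1)
        - math.lgamma(heads + tails + 2)
    )


def compare(heads, tails):
    """Return (posterior model probabilities, averaged prediction,
    prediction of the single most probable model) for the next toss."""
    log_m = [log_marginal_fair(heads, tails), log_marginal_uniform(heads, tails)]
    shift = max(log_m)  # subtract the maximum for numerical stability
    weights = [math.exp(value - shift) for value in log_m]
    total = sum(weights)
    posterior = [w / total for w in weights]  # equal model priors assumed

    # Probability that the next toss is heads under each model.
    predictions = [0.5, (heads + 1) / (heads + tails + 2)]

    averaged = sum(p * q for p, q in zip(posterior, predictions))
    selected = predictions[posterior.index(max(posterior))]
    return posterior, averaged, selected


posterior, averaged, selected = compare(heads=7, tails=3)
```

With 7 heads and 3 tails the fair-coin model is the more probable one, so selection predicts 0.5 while the average lies between the two models' predictions.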
Model selection in electromagnetic source analysis with an application to VEF’s
 IEEE Transactions on Biomedical Engineering
, 2002
Cited by 7 (4 self)
In electromagnetic source analysis it is necessary to determine how many sources are required to describe the EEG or MEG adequately. Model selection procedures (MSP’s, or goodness of fit procedures) give an estimate of the required number of sources. Existing and new MSP’s are evaluated in different source and noise settings: two sources which are close or distant, and noise which is uncorrelated or correlated. The commonly used MSP residual variance is seen to be ineffective, that is, it often selects too many sources. Alternatives like the adjusted Hotelling’s test, Bayes information criterion, and the Wald test on source amplitudes are seen to be effective. The adjusted Hotelling’s test is recommended if a conservative approach is taken, and MSP’s such as Bayes information criterion or the Wald test on source amplitudes are recommended if a more liberal approach is desirable. The MSP’s are applied to empirical data (visual evoked fields).
Information and Posterior Probability Criteria for Model Selection in Local Likelihood Estimation
 J Amer. Stat. Ass
, 1998
Cited by 6 (0 self)
In this paper we propose a modification to the methods used to motivate many information and posterior probability criteria for the weighted likelihood case. We derive weighted versions for two of the most widely known criteria, namely the AIC and BIC. Via a simple modification, the criteria are also made useful for window span selection. The usefulness of the weighted version of these criteria is demonstrated through a simulation study and an application to three data sets.
KEY WORDS: Information Criteria; Posterior Probability Criteria; Model Selection; Local Likelihood.
1. INTRODUCTION
Local regression has become a popular method for smoothing scatterplots and for nonparametric regression in general. It has proven to be a useful tool in finding structure in datasets (Cleveland and Devlin 1988). Local regression estimation is a method for smoothing scatterplots $(x_i, y_i)$, $i = 1, \ldots, n$, in which the fitted value at $x_0$ is the value of a polynomial fit to the data using weighted least squares, where the weight given to $(x_i, y_i)$ is related to the distance between $x_i$ and $x_0$. Stone (1977) shows that estimates obtained using the local regression methods have desirable theoretical properties. Recently, Fan (1993) has studied minimax properties of local linear regression. Tibshirani and Hastie (1987) extend the ideas of local regression to a local likelihood procedure. This procedure is designed for nonparametric regression modeling in situations where weighted least squares is inappropriate as an estimation method, for example binary data. Local regression may be viewed as a special case of local likelihood estimation. Tibshirani and Hastie (1987), Staniswalis (1989), and Loader (1999) apply local likelihood estimation to several types of data where local regressio...
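The local regression estimate described in the introduction above can be sketched for the degree-one case. The excerpt only says that the weights depend on distance, so the tricube kernel, the bandwidth parameter, and the function names below are assumptions made for illustration.

```python
def tricube(u):
    """Tricube kernel weight (an assumed choice): zero outside |u| < 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def local_linear_fit(xs, ys, x0, bandwidth):
    """Fitted value at x0 from a weighted least squares line.

    Minimizes sum_i w_i * (y_i - a - b * (x_i - x0))**2, where
    w_i = tricube((x_i - x0) / bandwidth), and returns the intercept a,
    which is the value of the fitted line at x0.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = tricube(d / bandwidth)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    denom = s0 * s2 - s1 * s1
    if denom == 0.0:
        raise ValueError("need at least two distinct x values inside the window")
    # Closed-form solution of the 2x2 weighted normal equations.
    return (s2 * t0 - s1 * t1) / denom


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
fitted = local_linear_fit(xs, ys, x0=2.5, bandwidth=2.0)
```

A local linear fit reproduces exactly linear data, which makes the straight-line example a convenient sanity check; the bandwidth plays the role of the window span that the weighted criteria in the abstract are meant to select.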
On the Accuracy of Stochastic Complexity Approximations
 IN A. GAMMERMAN (ED.), CAUSAL
, 1997
Cited by 5 (3 self)
Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as determining model complexity, or performing predictive inference. Unfortunately, for cases where the data has missing information, computing the stochastic complexity requires marginalizing (integrating) over the missing data, which, even in the discrete data case, results in computing a sum with an exponential number of terms. Therefore in most cases the stochastic complexity measure has to be approximated. In this paper we will investigate empirically the performance of some of the most common stochastic complexity approximations in an attempt to understand their small sample behavior in the incomplete data framework. In earlier empirical evaluations the problem of not knowing the actual stochastic complexity for incomplete data was circumvented either by us...