Bayes Factors
1995
Abstract

Cited by 973 (70 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
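A minimal sketch of the relationship stated in this abstract: the posterior odds of the null equal the Bayes factor times the prior odds, so the two coincide when the prior probability on the null is one-half. The function names are illustrative assumptions and do not come from the paper.

```python
def posterior_odds(bayes_factor, prior_prob_null=0.5):
    """Posterior odds of the null: Bayes factor times prior odds."""
    prior_odds = prior_prob_null / (1.0 - prior_prob_null)
    return bayes_factor * prior_odds


def posterior_prob(bayes_factor, prior_prob_null=0.5):
    """Posterior probability of the null implied by the posterior odds."""
    odds = posterior_odds(bayes_factor, prior_prob_null)
    return odds / (1.0 + odds)
```

With the default prior of one-half, `posterior_odds(bf)` simply returns `bf`, which is the special case the abstract describes.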
Bayesian Model Selection in Social Research (with Discussion by Andrew Gelman & Donald B. Rubin, and Robert M. Hauser, and a Rejoinder)
 SOCIOLOGICAL METHODOLOGY 1995, EDITED BY PETER V. MARSDEN, CAMBRIDGE, MASS.: BLACKWELLS.
1995
Abstract

Cited by 251 (19 self)
It is argued that P-values and the tests based upon them give unsatisfactory results, especially in large samples. It is shown that, in regression, when there are many candidate independent variables, standard variable selection procedures can give very misleading results. Also, by selecting a single model, they ignore model uncertainty and so underestimate the uncertainty about quantities of interest. The Bayesian approach to hypothesis testing, model selection and accounting for model uncertainty is presented. Implementing this is straightforward using the simple and accurate BIC approximation, and can be done using the output from standard software. Specific results are presented for most of the types of model commonly used in sociology. It is shown that this approach overcomes the difficulties with P-values and standard model selection procedures based on them. It also allows easy comparison of non-nested models, and permits the quantification of the evidence for a null hypothesis...
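The BIC approximation mentioned in this abstract can be sketched as follows. This assumes the common form BIC = -2 log L + k log n and equal prior model probabilities; the paper may use a different but related parameterisation, and the function names are illustrative assumptions.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC in the common form -2 log L + k log n (an assumed convention)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def approx_bayes_factor(bic_a, bic_b):
    """Approximate Bayes factor for model A against model B from their BICs."""
    return math.exp((bic_b - bic_a) / 2.0)


def posterior_model_probs(bics):
    """Approximate posterior model probabilities, assuming equal prior weights."""
    best = min(bics)  # subtract the minimum to avoid underflow in exp
    weights = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]
```

Because only the maximised log-likelihood, the parameter count and the sample size are needed, the inputs can be read off the output of standard software, and the models compared need not be nested.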
Multiple Shrinkage and Subset Selection in Wavelets
1997
Abstract

Cited by 118 (16 self)
This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by using Bayesian hierarchical models, assigning a positive prior probability to the wavelet coefficients being zero. The resulting estimator for the wavelet coefficients is a multiple shrinkage estimator that exhibits a wide variety of nonlinear shrinkage patterns. We discuss fast computational implementations, with a focus on easy-to-compute analytic approximations as well as importance sampling and Markov chain Monte Carlo methods. Multiple shrinkage estimators prove to have excellent mean squared error performance in reconstructing standard test functions. We demonstrate this in simulated test examples, comparing various implementations of multiple shrinkage to commonly used shrinkage rules. Finally, we illustrate our approach with an application to the so-called "glint" data.
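To illustrate how a positive prior probability at zero produces nonlinear shrinkage, here is a simplified sketch for a single coefficient. It assumes a point mass at zero mixed with a zero-mean normal prior and known Gaussian noise variance; this is a textbook simplification, not the hierarchical model of the paper, and all names and parameters are hypothetical.

```python
import math


def normal_pdf(x, variance):
    """Density of a zero-mean normal with the given variance."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def shrink(d, noise_var, slab_var, prob_nonzero):
    """Posterior mean of one coefficient given its noisy observation d.

    Assumed prior: zero with probability 1 - prob_nonzero, otherwise
    N(0, slab_var). Assumed noise: d = coefficient + N(0, noise_var).
    Not hardened against underflow for very large |d|.
    """
    slab = prob_nonzero * normal_pdf(d, noise_var + slab_var)
    spike = (1.0 - prob_nonzero) * normal_pdf(d, noise_var)
    post_nonzero = slab / (slab + spike)            # posterior P(coefficient != 0)
    linear_factor = slab_var / (noise_var + slab_var)  # shrinkage if nonzero
    return post_nonzero * linear_factor * d
```

Small observations are pulled strongly towards zero because the posterior probability of a nonzero coefficient is low, while large observations are shrunk almost linearly.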
Benchmark Priors for Bayesian Model Averaging
 FORTHCOMING IN THE JOURNAL OF ECONOMETRICS
2001
Abstract

Cited by 94 (5 self)
In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequences. Here we focus on the practically relevant situation where we need to entertain a (large) number of sampling models and we have (or wish to use) little or no subjective prior information. We aim at providing an “automatic” or “benchmark” prior structure that can be used in such cases. We focus on the Normal linear regression model with uncertainty in the choice of regressors. We propose a partly noninformative prior structure related to a Natural Conjugate g-prior specification, where the amount of subjective information requested from the user is limited to the choice of a single scalar hyperparameter g0j. The consequences of different choices for g0j are examined. We investigate theoretical properties, such as consistency of the implied Bayesian procedure. Links with classical information criteria are provided. More importantly, we examine the finite sample implications of several choices of g0j in a simulation study. The use of the MC3 algorithm of Madigan and York (1995), combined with efficient coding in Fortran, makes it feasible to conduct large simulations. In addition to posterior criteria, we shall also compare the predictive performance of different priors. A classic example concerning the economics of crime will also be provided and contrasted with results in the literature. The main findings of the paper will lead us to propose a “benchmark” prior specification in a linear regression context with model uncertainty.
The practical implementation of Bayesian model selection
 Institute of Mathematical Statistics
2001
Abstract

Cited by 84 (3 self)
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.
Model Selection for Probabilistic Clustering Using Cross-Validated Likelihood
 Statistics and Computing
1998
Abstract

Cited by 65 (4 self)
Cross-validated likelihood is investigated as a tool for automatically determining the appropriate number of components (given the data) in finite mixture modelling, particularly in the context of model-based probabilistic clustering. The conceptual framework for the cross-validation approach to model selection is direct in the sense that models are judged directly on their out-of-sample predictive performance. The method is applied to a well-known clustering problem in the atmospheric science literature using historical records of upper atmosphere geopotential height in the Northern hemisphere. Cross-validated likelihood provides strong evidence for three clusters in the data set, providing an objective confirmation of earlier results derived using non-probabilistic clustering techniques. 1 Introduction Cross-validation is a well-known technique in supervised learning to select a model from a family of candidate models. Examples include selecting the best classification tree using cr...
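The idea of judging the number of mixture components by held-out likelihood can be sketched as below. The sketch assumes one-dimensional Gaussian components, a plain EM fit with a deterministic quantile start, and simple k-fold splitting on data that is already in random order; none of these choices is taken from the paper, and all names are hypothetical.

```python
import math


def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def fit_mixture(data, k, iters=50):
    """Fit a 1-D Gaussian mixture with k components by EM."""
    xs = sorted(data)
    n = len(xs)
    centre = sum(xs) / n
    var_all = sum((x - centre) ** 2 for x in xs) / n
    floor = max(1e-6 * var_all, 1e-12)  # keeps variances away from zero
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]  # quantile start
    variances = [max(var_all, floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            logs = [math.log(weights[j]) + normal_logpdf(x, means[j], variances[j])
                    for j in range(k)]
            total = logsumexp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: weighted updates; the small constant keeps weights positive.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj, floor)
    return weights, means, variances


def mixture_loglik(model, data):
    """Total log-likelihood of data under a fitted mixture."""
    weights, means, variances = model
    return sum(
        logsumexp([math.log(w) + normal_logpdf(x, m, v)
                   for w, m, v in zip(weights, means, variances)])
        for x in data)


def cv_loglik(data, k, folds=5):
    """Sum of held-out log-likelihoods over the folds for a k-component fit."""
    total = 0.0
    for f in range(folds):
        train = [x for i, x in enumerate(data) if i % folds != f]
        test = [x for i, x in enumerate(data) if i % folds == f]
        total += mixture_loglik(fit_mixture(train, k), test)
    return total
```

Calling `cv_loglik(data, k)` for a range of `k` and preferring the value with the highest held-out score mirrors the direct, predictive comparison the abstract describes.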
Prediction via Orthogonalized Model Mixing
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
1994
Abstract

Cited by 50 (9 self)
In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Comp...
Bayesian model averaging
 STAT.SCI
1999
Abstract

Cited by 41 (0 self)
Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This approach ignores the uncertainty in model selection, leading to overconfident inferences and decisions that are more risky than one thinks they are. Bayesian model averaging (BMA) provides a coherent mechanism for accounting for this model uncertainty. Several methods for implementing BMA have recently emerged. We discuss these methods and present a number of examples. In these examples, BMA provides improved out-of-sample predictive performance. We also provide a catalogue of
Issues in Bayesian Analysis of Neural Network Models
1998
Abstract

Cited by 31 (0 self)
This paper discusses these issues, exploring the potential of Bayesian ideas in the analysis of NN models. Buntine and Weigend (1991) and MacKay (1992) have provided frameworks for their Bayesian analysis based on Gaussian approximations, and Neal (1993) has applied hybrid Monte Carlo ideas. Ripley (1993) and Cheng and Titterington (1994) have dwelt on the power of these ideas, especially as far as interpretation and architecture selection are concerned. See MacKay (1995) for a recent review. From a statistical modeling point of view, NNs are a special instance of mixture models. Many issues about posterior multimodality and computational strategies in NN modeling are of relevance in the wider class of mixture models. Related recent references in the Bayesian literature on mixture models include Diebolt and Robert (1994), Escobar and West (1994), Robert and Mengersen (1995), Roeder and Wasserman (1995), West (1994), West and Cao (1993), West, Muller and Escobar (1994), and West and Turner (1994). We concentrate on approximation problems, though many of our suggestions can be translated to other areas. For those problems, NNs are viewed as highly nonlinear (semiparametric) approximators, where parameters are typically estimated by least squares. Applications of interest for practitioners include nonlinear regression, stochastic optimisation and regression metamodels for simulation output. The main issue we address here is how to undertake a Bayesian analysis of an NN model, and the uses we may make of it. Our contributions include: an evaluation of computational approaches to Bayesian analysis of NN models, including a novel Markov chain Monte Carlo scheme; a suggestion of a scheme for handling a variable architecture model and a scheme for combining NN models with more ...
Bayesian Model Averaging in proportional hazard models: Assessing the risk of a stroke
 Applied Statistics
1997
Abstract

Cited by 28 (5 self)
Evaluating the risk of stroke is important in reducing the incidence of this devastating disease. Here, we apply Bayesian model averaging to variable selection in Cox proportional hazard models in the context of the Cardiovascular Health Study, a comprehensive investigation into the risk factors for stroke. We introduce a technique based on the leaps and bounds algorithm which efficiently locates and fits the best models in the very large model space and thereby extends all-subsets regression to Cox models. For each independent variable considered, the method provides the posterior probability that it belongs in the model. This is more directly interpretable than the corresponding P-values, and also more valid in that it takes account of model uncertainty. P-values from models preferred by stepwise methods tend to overstate the evidence for the predictive value of a variable. In our data, Bayesian model averaging predictively outperforms standard model selection methods for assessing
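The per-variable posterior probability described in this abstract can be sketched as a sum of posterior model probabilities over the models that contain the variable. The sketch assumes the model probabilities have already been computed by some other means; the function name and the variable names in the usage note are hypothetical.

```python
def inclusion_probabilities(model_probs):
    """Posterior probability that each variable belongs in the model.

    model_probs maps a frozenset of variable names (one model) to that
    model's posterior probability; values are renormalised to sum to one.
    """
    total = sum(model_probs.values())
    result = {}
    for variables, prob in model_probs.items():
        for name in variables:
            result[name] = result.get(name, 0.0) + prob / total
    return result
```

For example, with three candidate models `{}`, `{"age"}` and `{"age", "sbp"}` carrying probabilities 0.1, 0.5 and 0.4, the variable `"age"` appears in models whose probabilities sum to 0.9.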