Variable selection and Bayesian model averaging in case-control studies
, 1998
Cited by 17 (7 self)
Covariate and confounder selection in case-control studies is most commonly carried out using either a two-step method or a stepwise variable selection method in logistic regression. Inference is then carried out conditionally on the selected model, but this ignores the model uncertainty implicit in the variable selection process, and so underestimates uncertainty about relative risks. We report on a simulation study designed to be similar to actual case-control studies. This shows that p-values computed after variable selection can greatly overstate the strength of conclusions. For example, for our simulated case-control studies with 1,000 subjects, of variables declared to be "significant" with p-values between .01 and .05, only 49% actually were risk factors when stepwise variable selection was used. We propose Bayesian model averaging as a formal way of taking account of model uncertainty in case-control studies. This yields an easily interpreted summary, the posterior probability that a variable is a risk factor, and our simulation study indicates this to be reasonably well calibrated in the situations simulated. The methods are applied and compared
Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data
, 2005
Bayesian Variable Selection for Proportional Hazards Models
, 1996
Cited by 12 (1 self)
The authors consider the problem of Bayesian variable selection for proportional hazards regression models with right censored data. They propose a semi-parametric approach in which a nonparametric prior is specified for the baseline hazard rate and a fully parametric prior is specified for the regression coefficients. For the baseline hazard, they use a discrete gamma process prior, and for the regression coefficients and the model space, they propose a semi-automatic parametric informative prior specification that focuses on the observables rather than the parameters. To implement the methodology, they propose a Markov chain Monte Carlo method to compute the posterior model probabilities. Examples using simulated and real data are given to demonstrate the methodology.
Long-Run Performance of Bayesian Model Averaging
- Journal of the American Statistical Association
, 2003
Cited by 11 (2 self)
Hjort and Claeskens (HC) argue that statistical inference conditional on a single selected model underestimates uncertainty, and that model averaging is the way to remedy this; we strongly agree. They point out that Bayesian model averaging (BMA) has been the dominant approach to this, but argue that its performance has been inadequately studied, and propose an alternative, Frequentist Model Averaging (FMA). We point out, however, that there is a substantial literature on the performance of BMA, consisting of three main threads: general theoretical results, simulation studies, and evaluation of out-of-sample performance. The theoretical results are scattered, and we summarize them. The results have been quite consistent: BMA has tended to outperform competing methods for model selection and taking account of model uncertainty. The theoretical results depend on the assumption that the "practical distribution" over which the performance of methods is assessed is the same as the prior distribution used, and we investigate sensitivity of results to this assumption in a simple normal example; they turn out not to be unduly sensitive.
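For readers unfamiliar with the mechanics of Bayesian model averaging, the following is a minimal sketch, not the procedure used in any of the papers listed here. It assumes Gaussian linear models, equal prior probability for every subset of predictors, and exp(-BIC/2) as a rough stand-in for each model's marginal likelihood. The function names (`bic_linear`, `bma_inclusion_probs`) and the simulated data are hypothetical, chosen only for illustration.

```python
import itertools
import numpy as np


def bic_linear(y, X):
    """BIC of a Gaussian linear model fitted by least squares.

    X is the full design matrix, including the intercept column.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # -2 * maximized log-likelihood (up to a constant shared by all models)
    # plus the BIC penalty k * log(n).
    return n * np.log(rss / n) + k * np.log(n)


def bma_inclusion_probs(y, X):
    """Approximate posterior inclusion probability of each column of X.

    Assumption (illustrative only): a uniform prior over all 2^p subsets,
    with exp(-BIC/2) approximating each marginal likelihood.
    """
    n, p = X.shape
    intercept = np.ones((n, 1))
    subsets, bics = [], []
    for r in range(p + 1):
        for subset in itertools.combinations(range(p), r):
            design = np.hstack([intercept, X[:, list(subset)]])
            subsets.append(subset)
            bics.append(bic_linear(y, design))
    bics = np.array(bics)
    # Subtract the minimum BIC before exponentiating for numerical stability.
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    # Inclusion probability = total weight of the models containing the variable.
    incl = np.zeros(p)
    for subset, w in zip(subsets, weights):
        for j in subset:
            incl[j] += w
    return incl, subsets, weights


# Simulated data (hypothetical): only the first predictor affects y.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
y = 1.5 * X[:, 0] + rng.normal(size=n)

incl, subsets, weights = bma_inclusion_probs(y, X)
print(np.round(incl, 3))
```

Enumerating all subsets is feasible only for a handful of predictors; with many candidate variables, a search or sampling strategy over the model space would be needed instead.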
Bayesian Partitioning for Estimating Disease Risk
, 1999
Cited by 10 (0 self)
This paper presents a Bayesian nonparametric approach for the analysis of spatial count data. It extends the Bayesian partition methodology of Holmes, Denison and Mallick (1999) to handle data that involve counts. A demonstration involving incidence rates of leukemia in New York state is used to highlight the methodology. The model allows us to make probability statements on the incidence rates around point sources without making any parametric assumptions about the nature of the influence between the sources and the surrounding location. Keywords: Bayesian computation; Leukemia incidence data; Markov chain Monte Carlo (MCMC); Point source; Spatial count data; Voronoi tessellation. 1 Introduction The analysis of spatial data is of great importance to researchers in many fields, with the greatest interest shown by environmental engineers (e.g. studying the extent of pollution from a source) and medical statisticians (e.g. estimating spatially-varying disease incidence). A good review...
Bayes Factors and BIC -- Comment on “A Critique of the Bayesian Information Criterion for Model Selection”
, 1999
Cited by 10 (5 self)
I would like to thank David L. Weakliem (1999 [this issue]) for a thought-provoking discussion of the basis of the Bayesian information criterion (BIC). We may be in closer agreement than one might think from reading his article. When writing about Bayesian model selection for social researchers, I focused on the BIC approximation on the grounds that it is easily implemented and often reasonable, and simplifies the exposition of an already technical topic. As Weakliem says, BIC corresponds to one of many possible priors, although I will argue that this prior is such as to make BIC appropriate for baseline reference use and reporting, albeit not necessarily always appropriate for drawing final conclusions. When writing about the same subject for statistical journals, however, I have paid considerable attention to the choice of priors for Bayes factors. I thank Weakliem for bringing this subtle but important topic to the attention of sociologists. In 1986, I proposed replacing P values by Bayes factors as the basis for hypothesis testing and model selection in social research, and I suggested BIC as a simple and convenient, albeit crude, approximation. Since then, a great deal has been learned about Bayes factors in general, and about BIC in particular. Weakliem seems to agree that the Bayes factor framework is a useful one for hypothesis testing and model selection; his concern is with how the Bayes factors are to be evaluated. Weakliem makes two main points about the BIC approximation. The first is that BIC yields an approximation to Bayes factors that corresponds closely to a particular prior (the unit information prior) on
Enhancing the Predictive Performance of Bayesian Graphical Models
- Communications in Statistics – Theory and Methods
, 1995
Cited by 7 (4 self)
Both knowledge-based systems and statistical models are typically concerned with making predictions about future observables. Here we focus on assessment of predictive performance and provide two techniques for improving the predictive performance of Bayesian graphical models. First, we present Bayesian model averaging, a technique for accounting for model uncertainty. Second, we describe a technique for eliciting a prior distribution for competing models from domain experts. We explore the predictive performance of both techniques in the context of a urological diagnostic problem. KEYWORDS: Prediction; Bayesian graphical model; Bayesian network; Decomposable model; Model uncertainty; Elicitation. 1 Introduction Both statistical methods and knowledge-based systems are typically concerned with combining information from various sources to make inferences about prospective measurements. Inevitably, to combine information, we must make modeling assumptions. It follows that we should car...
Orthogonalizations and Prior Distributions for Orthogonalized Model Mixing
- In Modelling and Prediction
, 1996
Cited by 6 (3 self)
Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in linear models include Markov chain Monte Carlo methods as well as importance sampling. Clyde, DeSimone and Parmigiani (1996) developed an importance sampling strategy based on expressing the space of predictors in terms of an orthogonal basis. This leads both to a better identified problem and to simple approximations to the posterior model probabilities. Such approximations can be used to construct efficient importance samplers. For brevity, we call this strategy orthogonalized model mixing. Two key elements of orthogonalized model mixing are: a) the orthogonalization method and b) the prior probability distributions assigned to the models and the coefficients. In this paper we consider in further detail the specification of these t...
Bayesian covariance selection in generalized linear mixed models
- Biometrics
, 2006
Cited by 6 (3 self)
SUMMARY. The generalized linear mixed model (GLMM), which extends the generalized linear model (GLM) to incorporate random effects characterizing heterogeneity among subjects, is widely used in analyzing correlated and longitudinal data. Although there is often interest in identifying the subset of predictors that have random effects, random effects selection can be challenging, particularly when outcome distributions are non-normal. This article proposes a fully Bayesian approach to the problem of simultaneous selection of fixed and random effects in GLMMs. Integrating out the random effects induces a covariance structure on the multivariate outcome data, and an important problem which we also consider is that of covariance selection. Our approach relies on variable selection-type mixture priors for the components in a special LDU decomposition of the random effects covariance. A stochastic search MCMC algorithm is developed, which relies on Gibbs sampling, with Taylor series expansions used to approximate intractable integrals. Simulated data examples are presented for different exponential family distributions, and the approach is applied to discrete survival data from a time-to-pregnancy study.

