Bayes Factors and BIC: Comment on “A Critique of the Bayesian Information Criterion for Model Selection”
I would like to thank David L. Weakliem (1999 [this issue]) for a thought-provoking discussion of the basis of the Bayesian information criterion (BIC). We may be in closer agreement than one might think from reading his article. When writing about Bayesian model selection for social researchers, I focused on the BIC approximation on the grounds that it is easily implemented and often reasonable, and that it simplifies the exposition of an already technical topic. As Weakliem says, BIC corresponds to one of many possible priors, although I will argue that this prior is such as to make BIC appropriate for baseline reference use and reporting, albeit not necessarily always appropriate for drawing final conclusions. When writing about the same subject for statistical journals, however, I have paid considerable attention to the choice of priors for Bayes factors. I thank Weakliem for bringing this subtle but important topic to the attention of sociologists.

In 1986, I proposed replacing P values by Bayes factors as the basis for hypothesis testing and model selection in social research, and I suggested BIC as a simple and convenient, albeit crude, approximation. Since then, a great deal has been learned about Bayes factors in general, and about BIC in particular. Weakliem seems to agree that the Bayes factor framework is a useful one for hypothesis testing and model selection; his concern is with how the Bayes factors are to be evaluated.

Weakliem makes two main points about the BIC approximation. The first is that BIC yields an approximation to Bayes factors that corresponds closely to a particular prior (the unit information prior) on
Logicist Statistics I. Models and Modeling
Statistical Science
Abstract. Arguments are presented to support increased emphasis on logical aspects of formal methods of analysis, depending on probability in the sense of R. A. Fisher. Formulating probabilistic models that convey uncertain knowledge of objective phenomena and using such models for inductive reasoning are central activities of individuals that introduce limited but necessary subjectivity into science. Statistical models are classified into overlapping types called here empirical, stochastic and predictive, all drawing on a common mathematical theory of probability, and all facilitating statements with logical and epistemic content. Contexts in which these ideas are intended to apply are discussed via three major examples.

Key words and phrases: Logicism and proceduralism; specificity of analysis; formal subjective probability; complementarity; subjective and objective; formal and informal; empirical, stochastic and predictive models; U.S. national census; screening for chronic disease; global climate change.
Bayes Factors and BIC: Comment on Weakliem
Weakliem agrees that Bayes factors are useful for model selection and hypothesis testing. He reminds us that the simple and convenient BIC approximation corresponds most closely to one particular prior on the parameter space, the unit information prior, and points out that researchers may have different prior information or opinions. Clearly a prior that represents the available information should be used, although the unit information prior often seems reasonable in the absence of strong prior information. It seems that, among the Bayes factors likely to be used in practice, BIC is conservative in the sense of tending to provide less evidence for additional parameters or "effects". Thus if a Bayes factor based on additional prior information favors an effect, but BIC does not, the prior information is playing a crucial role and this should be made clear when the research is reported. BIC may well have a role as a baseline reference analysis to be provided in routine reporting of research results, perhaps along with Bayes factors based on other priors. In Weakliem's 2 × 2 table examples, BIC and Bayes factors based on Weakliem's preferred priors lead to similar substantive conclusions, but both differ from those based on P values. When there is additional prior information, the technology now exists to express it as