Results 1–10 of 17
Bayes Factors
1995
Cited by 981 (70 self)
Abstract
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology, and psychology.
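The posterior-odds definition above can be made concrete with a toy binomial test. This example is my own illustration, not one of the paper's applications: H0 fixes the success probability at one-half, H1 puts a uniform prior on it, and the Bayes factor is the ratio of marginal likelihoods.

```python
from math import comb

def bayes_factor_binomial(k, n):
    """BF01 for H0: p = 1/2 versus H1: p ~ Uniform(0, 1), given k successes
    in n Bernoulli trials. A hypothetical toy test, not from the paper."""
    m0 = comb(n, k) * 0.5 ** n  # marginal likelihood under the point null
    m1 = 1.0 / (n + 1)          # integral of Binomial(k | n, p) dp over [0, 1]
    return m0 / m1

# With prior probability one-half on the null, the prior odds are 1, so the
# posterior odds of H0 equal the Bayes factor itself.
bf = bayes_factor_binomial(50, 100)  # data sitting exactly at the null
```

Here 50 successes in 100 trials give a Bayes factor of about 8 in favor of the point null, while lopsided data (say 90 of 100) drive it toward zero.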
The variable selection problem
Journal of the American Statistical Association, 2000
Cited by 39 (2 self)
Abstract
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. This vignette reviews some of the key developments which have led to the wide variety of approaches for this problem.
Performance Prediction for Exponential Language Models
Cited by 10 (3 self)
Abstract
We investigate the task of performance prediction for language models belonging to the exponential family. First, we attempt to empirically discover a formula for predicting test set cross-entropy for n-gram language models. We build models over varying domains, data set sizes, and n-gram orders, and perform linear regression to see whether we can model test set performance as a simple function of training set performance and various model statistics. Remarkably, we find a simple relationship that predicts test set performance with a correlation of 0.9997. We analyze why this relationship holds and show that it holds for other exponential language models as well, including class-based models and minimum discrimination information models. Finally, we discuss how this relationship can be applied to improve language model performance.
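The regression step described above can be sketched as follows. The numbers are invented placeholders rather than the paper's measurements, and the particular predictors (training cross-entropy plus a model-size statistic) are only my guess at the kind of statistics involved.

```python
import numpy as np

# Hypothetical measurements for a handful of models (made-up values):
train_ce  = np.array([5.1, 4.8, 4.5, 4.2, 4.0])   # training-set cross-entropy
size_stat = np.array([0.2, 0.35, 0.5, 0.7, 0.9])  # e.g. parameters per training event
test_ce   = np.array([5.3, 5.15, 5.0, 4.9, 4.85]) # test-set cross-entropy

# Linear regression: test_ce ~ intercept + train_ce + size_stat
A = np.column_stack([np.ones_like(train_ce), train_ce, size_stat])
coef, *_ = np.linalg.lstsq(A, test_ce, rcond=None)
pred = A @ coef
corr = np.corrcoef(pred, test_ce)[0, 1]  # goodness of the linear fit
```

A correlation near 1 on held-out models is what would license using the fitted formula as a performance predictor.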
Factorized asymptotic Bayesian inference for mixture models
In AISTATS, 2012
Cited by 2 (1 self)
Abstract
This paper proposes a novel Bayesian approximation inference method for mixture modeling. Our key idea is to factorize the marginal log-likelihood using a variational distribution over latent variables. An asymptotic approximation, a factorized information criterion (FIC), is obtained by applying the Laplace method to each of the factorized components. In order to evaluate FIC, we propose factorized asymptotic Bayesian inference (FAB), which maximizes an asymptotically consistent lower bound of FIC. FIC and FAB have several desirable properties: 1) asymptotic consistency with the marginal log-likelihood, 2) automatic component selection on the basis of an intrinsic shrinkage mechanism, and 3) parameter identifiability in mixture modeling. Experimental results show that FAB outperforms state-of-the-art VB methods.
Schwarz, Wallace, and Rissanen: Intertwining Themes in Theories of Model Selection
2000
Cited by 1 (0 self)
Abstract
Investigators interested in model order estimation have tended to divide themselves into widely separated camps; this survey of the contributions of Schwarz, Wallace, Rissanen, and their coworkers attempts to build bridges between the various viewpoints, illuminating connections which may have previously gone unnoticed and clarifying misconceptions which seem to have propagated in the applied literature. Our tour begins with Schwarz's approximation of Bayesian integrals via Laplace's method. We then introduce the concepts underlying Rissanen's minimum description length principle via a Bayesian scenario with a known prior; this provides the groundwork for understanding his more complex non-Bayesian MDL which employs a "universal" encoding of the integers. Rissanen's method of parameter truncation is contrasted with that employed in various versions of Wallace's minimum message length criteria.
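Dropping O(1) terms, Schwarz's Laplace-method approximation of the Bayesian integral yields the familiar criterion BIC = -2 log L̂ + k log n. The following toy comparison of a fixed-mean and a free-mean Gaussian model is my own sketch, not an example from the survey.

```python
import numpy as np

def bic(log_lik, k, n):
    """Schwarz's criterion: -2 * maximized log-likelihood + k * log(n)."""
    return -2.0 * log_lik + k * np.log(n)

def gaussian_profile_loglik(x, mu):
    """Gaussian log-likelihood at mean mu, with the variance profiled out."""
    n = len(x)
    s2 = np.mean((x - mu) ** 2)  # MLE of the variance given this mean
    return -0.5 * n * (np.log(2 * np.pi * s2) + 1.0)

rng = np.random.default_rng(1)
x = rng.standard_normal(500) + 2.0  # synthetic data whose true mean is 2
bic_null = bic(gaussian_profile_loglik(x, 0.0), k=1, n=500)       # mean pinned at 0
bic_free = bic(gaussian_profile_loglik(x, x.mean()), k=2, n=500)  # mean estimated
# With the true mean far from 0, the free-mean model wins decisively.
```

Lower BIC is better; the k log n term is exactly the penalty the Laplace approximation leaves behind after the likelihood term.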
Asymptotics for the GIC in Model Selection
Abstract
It is known that the C_p method, which selects a model by minimizing the sum of squared residuals plus 2 times the model dimension, is asymptotically valid only when there is no fixed-dimension correct model, and that the GIC method, which selects a model by minimizing the sum of squared residuals plus λ times the model dimension, is asymptotically valid when there are fixed-dimension correct models in the class of models to be selected. However, the behavior of the GIC is not clear when there is no fixed-dimension correct model. Also, when there are fixed-dimension correct models, how to choose λ is still an unsolved problem. In the first part of this paper we provide an asymptotic justification for the GIC in the case where there is no fixed-dimension correct model, using a loss function different from the customary squared error loss. The second part of this paper contains a result showing that λ in the GIC should be chosen as a function of the signal-to-noise ratio if the signal-to-noise ra...
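The contrast between the two rules comes down to the size of the penalty multiplier λ. The sketch below uses made-up RSS values and treats the noise variance as known; it is my illustration of the selection rule, not code from the paper.

```python
import numpy as np

def select_dim(rss_by_dim, lam, sigma2=1.0):
    """Pick the model dimension minimizing RSS + lam * sigma2 * dim.
    lam = 2 gives the C_p-style rule; a general (often larger) lam is the GIC."""
    dims = np.arange(1, len(rss_by_dim) + 1)
    scores = np.asarray(rss_by_dim, dtype=float) + lam * sigma2 * dims
    return int(dims[np.argmin(scores)])

rss = [50.0, 20.0, 15.0, 14.5]       # hypothetical residual sums of squares
dim_cp  = select_dim(rss, lam=2.0)   # lighter penalty -> larger model chosen
dim_gic = select_dim(rss, lam=8.0)   # heavier penalty -> smaller model chosen
```

With these numbers the C_p-style rule picks dimension 3 while the heavier GIC penalty picks dimension 2, which is the behavioral gap the abstract is about.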
Master Thesis
1991
Abstract
this paper. … in encoding y using q(y) is −ln q(y) + ln p_l(y | x(y)) = ln …
unknown title
2003
Abstract
The authors are indebted to Yulia Veld-Merkoulova and Monique de Jong for their excellent research assistance
THE DIVIDEND AND SHARE REPURCHASE POLICIES OF CANADIAN FIRMS: EMPIRICAL EVIDENCE BASED ON A NEW RESEARCH DESIGN (ISSN 0924-7815)
2000
Abstract
Ronald van Dijk is a senior research analyst at ING Investment Management in The Hague.