Results 1-10 of 48
Bayes Factors
, 1995
Cited by 981 (70 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
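The definition in this abstract can be written out as a short worked identity. The notation ($D$ for the data, $H_0$ and $H_1$ for the null and alternative, $B_{01}$ for the Bayes factor in favor of the null) is assumed here for illustration and does not come from the listing. By Bayes' theorem the posterior odds equal the Bayes factor times the prior odds, so a prior probability of one-half on the null makes the prior odds 1 and the posterior odds coincide with the Bayes factor:

```latex
\[
\frac{P(H_0 \mid D)}{P(H_1 \mid D)}
  = \underbrace{\frac{P(D \mid H_0)}{P(D \mid H_1)}}_{B_{01}}
    \times \frac{P(H_0)}{P(H_1)},
\qquad
P(H_0) = P(H_1) = \tfrac{1}{2}
\;\Longrightarrow\;
\frac{P(H_0 \mid D)}{P(H_1 \mid D)} = B_{01}.
\]
```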
Operations for Learning with Graphical Models
 Journal of Artificial Intelligence Research
, 1994
Cited by 249 (12 self)
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided, including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feed-forward networks, and learning Gaussian and discrete Bayesian networks from data. The paper conclu...
Model Selection and the Principle of Minimum Description Length
 Journal of the American Statistical Association
, 1998
Cited by 145 (5 self)
This paper reviews the principle of Minimum Description Length (MDL) for problems of model selection. By viewing statistical modeling as a means of generating descriptions of observed data, the MDL framework discriminates between competing models based on the complexity of each description. This approach began with Kolmogorov's theory of algorithmic complexity, matured in the literature on information theory, and has recently received renewed interest within the statistics community. In the pages that follow, we review both the practical as well as the theoretical aspects of MDL as a tool for model selection, emphasizing the rich connections between information theory and statistics. At the boundary between these two disciplines, we find many interesting interpretations of popular frequentist and Bayesian procedures. As we will see, MDL provides an objective umbrella under which rather disparate approaches to statistical modeling can coexist and be compared. We illustrate th...
A Reference Bayesian Test for Nested Hypotheses And its Relationship to the Schwarz Criterion
 Journal of the American Statistical Association
, 1994
Cited by 125 (4 self)
To compute a Bayes factor for testing $H_0: \psi = \psi_0$ in the presence of a nuisance parameter $\beta$, priors under the null and alternative hypotheses must be chosen. As in Bayesian estimation, an important problem has been to define automatic or "reference" methods for determining priors based only on the structure of the model. In this paper we apply the heuristic device of taking the amount of information in the prior on $\psi$ equal to the amount of information in a single observation. Then, after transforming $\beta$ to be "null orthogonal" to $\psi$, we take the marginal priors on $\beta$ to be equal under the null and alternative hypotheses. Doing so, and taking the prior on $\psi$ to be Normal, we find that the log of the Bayes factor may be approximated by the Schwarz criterion with an error of order $O(n^{-1/2})$, rather than the usual error of order $O(1)$. This result suggests the Schwarz criterion should provide sensible approximate solutions to Bayesian testing problems, at least when the hypothese...
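For readers unfamiliar with the Schwarz criterion named in this abstract, here is a minimal Python sketch of its usual form: the difference in maximized log-likelihoods plus half the difference in parameter counts times log n. The function names, the sign convention (positive values favor the null), and the toy data are assumptions made for illustration. The toy test also drops the nuisance parameter and fixes the variance at 1, which is simpler than the setting the abstract describes.

```python
import math


def schwarz_criterion(loglik_null, loglik_alt, dim_null, dim_alt, n):
    """Schwarz approximation to the log Bayes factor in favor of the null.

    loglik_null, loglik_alt: maximized log-likelihoods under each hypothesis.
    dim_null, dim_alt: number of free parameters under each hypothesis.
    n: sample size.
    """
    return loglik_null - loglik_alt + 0.5 * (dim_alt - dim_null) * math.log(n)


def normal_loglik(data, mu, sigma=1.0):
    """Log-likelihood of i.i.d. Normal(mu, sigma^2) observations."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in data
    )


# Toy data (hypothetical): test H0: mu = 0 against H1: mu free, variance known.
data = [0.3, -0.1, 0.4, 0.2, -0.2, 0.1, 0.5, 0.0]
n = len(data)
xbar = sum(data) / n  # maximum-likelihood estimate of mu under H1

s = schwarz_criterion(
    loglik_null=normal_loglik(data, 0.0),
    loglik_alt=normal_loglik(data, xbar),
    dim_null=0,
    dim_alt=1,
    n=n,
)
print(f"Schwarz criterion (approx. log Bayes factor for H0): {s:.4f}")
```

In this toy case the log-likelihood difference reduces to -n * xbar^2 / 2, so the criterion is positive (favoring the null) whenever that term is smaller in magnitude than the penalty 0.5 * log(n).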
Calibration and Empirical Bayes Variable Selection
 Biometrika
, 1997
Cited by 114 (19 self)
... this paper, is that with F = 2 log p. This choice was proposed by Foster & George (1994), where it was called the Risk Inflation Criterion (RIC) because it asymptotically minimises the maximum predictive risk inflation due to selection when X is orthogonal. This choice and its minimax property were also discovered independently by Donoho & Johnstone (1994) in the wavelet regression context, where they refer to it as the universal hard thresholding rule.
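The thresholding reading of F = 2 log p can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's procedure: it assumes orthogonal predictors, a known error variance, and standardized t-statistics, in which case a penalty of F per included variable amounts to keeping a variable when its squared t-statistic exceeds F. The function name `ric_select` and the input values are hypothetical.

```python
import math


def ric_select(t_stats):
    """Return indices of predictors whose squared t-statistic exceeds 2 log p.

    t_stats: standardized coefficient estimates, one per candidate predictor,
    assumed to come from an orthogonal design with known error variance.
    """
    p = len(t_stats)
    threshold = 2.0 * math.log(p)  # the RIC penalty F = 2 log p
    return [i for i, t in enumerate(t_stats) if t * t > threshold]


# Hypothetical t-statistics for p = 8 candidate predictors.
t_stats = [0.5, -3.1, 1.2, 2.0, -0.3, 4.4, 1.9, -2.2]
selected = ric_select(t_stats)
print(selected)
```

With p = 8 the threshold on t squared is 2 log 8, roughly 4.16, so a t-statistic of 2.0 is dropped while one of -2.2 is kept.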
The Impact of Jumps in Volatility and Returns
 Journal of Finance
, 2002
Cited by 113 (7 self)
This paper examines a class of continuous-time models with stochastic volatility that incorporate jumps in returns and volatility. We develop a likelihood-based estimation strategy and provide estimates of model parameters, spot volatility, jump times and jump sizes using S&P 500 and Nasdaq 100 index returns. Estimates of jump times, jump sizes and volatility are particularly useful for identifying the effects of these factors during periods of market stress, such as those in 1987, 1997 and 1998.
Benchmark Priors for Bayesian Model Averaging
 Forthcoming in the Journal of Econometrics
, 2001
Cited by 94 (5 self)
In contrast to a posterior analysis given a particular sampling model, posterior model probabilities in the context of model uncertainty are typically rather sensitive to the specification of the prior. In particular, “diffuse” priors on model-specific parameters can lead to quite unexpected consequences. Here we focus on the practically relevant situation where we need to entertain a (large) number of sampling models and we have (or wish to use) little or no subjective prior information. We aim at providing an “automatic” or “benchmark” prior structure that can be used in such cases. We focus on the Normal linear regression model with uncertainty in the choice of regressors. We propose a partly noninformative prior structure related to a Natural Conjugate g-prior specification, where the amount of subjective information requested from the user is limited to the choice of a single scalar hyperparameter $g_{0j}$. The consequences of different choices for $g_{0j}$ are examined. We investigate theoretical properties, such as consistency of the implied Bayesian procedure. Links with classical information criteria are provided. More importantly, we examine the finite sample implications of several choices of $g_{0j}$ in a simulation study. The use of the MC3 algorithm of Madigan and York (1995), combined with efficient coding in Fortran, makes it feasible to conduct large simulations. In addition to posterior criteria, we shall also compare the predictive performance of different priors. A classic example concerning the economics of crime will also be provided and contrasted with results in the literature. The main findings of the paper will lead us to propose a “benchmark” prior specification in a linear regression context with model uncertainty.
Bayes factors and model uncertainty
 Department of Statistics, University of Washington
, 1993
Cited by 89 (6 self)
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications. The points we emphasize are: from Jeffreys's Bayesian point of view, the purpose of hypothesis testing is to evaluate the evidence in favor of a scientific theory; Bayes factors offer a way of evaluating evidence in favor of a null hypothesis; Bayes factors provide a way of incorporating external information into the evaluation of evidence about a hypothesis; Bayes factors are very general, and do not require alternative models to be nested; several techniques are available for computing Bayes factors, including asymptotic approximations which are easy to compute using the output from standard packages that maximize likelihoods; in "nonstandard" statistical models that do not satisfy common regularity conditions, it can be technically simpler to calculate Bayes factors than to derive non-Bayesian significance
Predictive Model Selection
 Journal of the Royal Statistical Society, Ser. B
, 1995
Cited by 61 (4 self)
... this article we propose three criteria that can be used to address model selection. These emphasize observables rather than parameters and are based on a certain Bayesian predictive density. They have a unifying basis that is simple and interpretable, are free of asymptotic definitions, and allow the incorporation of prior information. Moreover, two of these criteria are readily calibrated.
Frailty Correlated Default
, 2008
Cited by 33 (2 self)
This paper shows that the probability of extreme default losses on portfolios of U.S. corporate debt is much greater than would be estimated under the standard assumption that default correlation arises only from exposure to observable risk factors. At the high confidence levels at which bank loan portfolio and CDO default losses are typically measured for economic-capital and rating purposes, our empirical results indicate that conventionally based estimates are downward biased by a full order of magnitude on test portfolios. Our estimates are based on U.S. public nonfinancial firms existing between 1979 and 2004. We find strong evidence for the presence of common latent factors, even when controlling for observable factors that provide the most accurate available model of firm-by-firm default probabilities.
∗ We are grateful for financial support from Moody’s Corporation and Morgan Stanley, and for research assistance from Sabri Oncu and Vineet Bhagwat. We are also grateful for remarks from Torben Andersen, André Lucas, Richard Cantor, Stav Gaon, Tyler Shumway, and especially Michael Johannes. This revision is much improved because of suggestions by a referee, an associate editor, and Campbell Harvey. We are thankful to Moody’s and to Ed Altman for generous assistance with data. Duffie is at The Graduate School of Business, Stanford University. Eckner and Horel are at Merrill Lynch. Saita is at Lehman