Results 1 - 10
of
125
Bayes Factors
, 1995
"... In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null ..."
Abstract
-
Cited by 717 (65 self)
- Add to MetaCart
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P -values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications in genetics, sports, ecology, sociology and psychology.
Operations for Learning with Graphical Models
- Journal of Artificial Intelligence Research
, 1994
"... This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models ..."
Abstract
-
Cited by 214 (13 self)
- Add to MetaCart
This paper is a multidisciplinary review of empirical, statistical learning from a graphical model perspective. Well-known examples of graphical models include Bayesian networks, directed graphs representing a Markov chain, and undirected networks representing a Markov field. These graphical models are extended to model data analysis and empirical learning using the notation of plates. Graphical operations for simplifying and manipulating a problem are provided including decomposition, differentiation, and the manipulation of probability models from the exponential family. Two standard algorithm schemas for learning are reviewed in a graphical framework: Gibbs sampling and the expectation maximization algorithm. Using these operations and schemas, some popular algorithms can be synthesized from their graphical specification. This includes versions of linear regression, techniques for feed-forward networks, and learning Gaussian and discrete Bayesian networks from data. The paper conclu...
Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments
- IN BAYESIAN STATISTICS
, 1992
"... Data augmentation and Gibbs sampling are two closely related, sampling-based approaches to the calculation of posterior moments. The fact that each produces a sample whose constituents are neither independent nor identically distributed complicates the assessment of convergence and numerical accurac ..."
Abstract
-
Cited by 176 (11 self)
- Add to MetaCart
Data augmentation and Gibbs sampling are two closely related, sampling-based approaches to the calculation of posterior moments. The fact that each produces a sample whose constituents are neither independent nor identically distributed complicates the assessment of convergence and numerical accuracy of the approximations to the expected value of functions of interest under the posterior. In this paper methods from spectral analysis are used to evaluate numerical accuracy formally and construct diagnostics for convergence. These methods are illustrated in the normal linear model with informative priors, and in the Tobit-censored regression model.
A Monte Carlo Approach to Nonnormal and Nonlinear State-Space Modeling
, 1992
"... this article then is to develop methodology for modeling the nonnormality of the ut, the vt, or both. A second departure from the model specification ( 1 ) is to allow for unknown variances in the state or observational equation, as well as for unknown parameters in the transition matrices Ft and Ht ..."
Abstract
-
Cited by 98 (9 self)
- Add to MetaCart
this article then is to develop methodology for modeling the nonnormality of the ut, the vt, or both. A second departure from the model specification ( 1 ) is to allow for unknown variances in the state or observational equation, as well as for unknown parameters in the transition matrices Ft and Ht. As a third generalization we allow for nonlinear model structures; that is, X t = ft(Xt-l) q- Ut, and Yt = ht(xt) + vt, t = 1, ..., n, (2) whereft( ) and ht(. ) are given, but perhaps also depend on some unknown parameters. The experimenter may wish to entertain a variety of error distributions. Our goal throughout the article is an analysis for general state-space models that does not resort to convenient assumptions at the expense of model adequacy
Information-theoretic asymptotics of Bayes methods
- IEEE Transactions on Information Theory
, 1990
"... Abstract-In the absence of knowledge of the true density function, Bayesian models take the joint density function for a sequence of n random variables to be an average of densities with respect to a prior. We examine the relative entropy distance D,, between the true density and the Bayesian densit ..."
Abstract
-
Cited by 92 (7 self)
- Add to MetaCart
Abstract-In the absence of knowledge of the true density function, Bayesian models take the joint density function for a sequence of n random variables to be an average of densities with respect to a prior. We examine the relative entropy distance D,, between the true density and the Bayesian density and show that the asymptotic distance is (d/2Xlogn)+ c, where d is the dimension of the parameter vector. Therefore, the relative entropy rate D,,/n converges to zero at rate (logn)/n. The constant c, which we explicitly identify, depends only on the prior density function and the Fisher information matrix evaluated at the true parameter value. Consequences are given for density estima-tion, universal data compression, composite hypothesis testing, and stock-market portfolio selection. 1.
Approximate Bayes Factors and Accounting for Model Uncertainty in Generalized Linear Models
, 1993
"... Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GUM; this appears to be quite accurate. A reference set of proper priors ..."
Abstract
-
Cited by 79 (28 self)
- Add to MetaCart
Ways of obtaining approximate Bayes factors for generalized linear models are described, based on the Laplace method for integrals. I propose a new approximation which uses only the output of standard computer programs such as GUM; this appears to be quite accurate. A reference set of proper priors is suggested, both to represent the situation where there is not much prior information, and to assess the sensitivity of the results to the prior distribution. The methods can be used when the dispersion parameter is unknown, when there is overdispersion, to compare link functions, and to compare error distributions and variance functions. The methods can be used to implement the Bayesian approach to accounting for model uncertainty. I describe an application to inference about relative risks in the presence of control factors where model uncertainty is large and important. Software to implement the
Bayes factors and model uncertainty
- DEPARTMENT OF STATISTICS, UNIVERSITY OFWASHINGTON
, 1993
"... In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null ..."
Abstract
-
Cited by 70 (6 self)
- Add to MetaCart
In a 1935 paper, and in his book Theory of Probability, Jeffreys developed a methodology for quantifying the evidence in favor of a scientific theory. The centerpiece was a number, now called the Bayes factor, which is the posterior odds of the null hypothesis when the prior probability on the null is one-half. Although there has been much discussion of Bayesian hypothesis testing in the context of criticism of P-values, less attention has been given to the Bayes factor as a practical tool of applied statistics. In this paper we review and discuss the uses of Bayes factors in the context of five scientific applications. The points we emphasize are:- from Jeffreys's Bayesian point of view, the purpose of hypothesis testing is to evaluate the evidence in favor of a scientific theory;- Bayes factors offer a way of evaluating evidence in favor ofa null hypothesis;- Bayes factors provide a way of incorporating external information into the evaluation of evidence about a hypothesis;- Bayes factors are very general, and do not require alternative models to be nested;- several techniques are available for computing Bayes factors, including asymptotic approximations which are easy to compute using the output from standard packages that maximize likelihoods;- in "non-standard " statistical models that do not satisfy common regularity conditions, it can be technically simpler to calculate Bayes factors than to derive non-Bayesian significance
Detecting Features in Spatial Point Processes with . . .
, 1995
"... We consider the problem of detecting features in spatial point processes in the presence of substantial clutter. One example is the detection of mine elds using reconnaissance aircraft images that erroneously identify many objects that are not mines. Another is the detection of seismic faults on the ..."
Abstract
-
Cited by 68 (32 self)
- Add to MetaCart
We consider the problem of detecting features in spatial point processes in the presence of substantial clutter. One example is the detection of mine elds using reconnaissance aircraft images that erroneously identify many objects that are not mines. Another is the detection of seismic faults on the basis of earthquake catalogs: earthquakes tend to be clustered close to the faults, but there are many that are farther away. Our solution uses model-based clustering based on a mixture model for the process, in which features are assumed to generate points according to highly linear multivariate normal densities, and the clutter arises according to a spatial Poisson process. Very nonlinear features are represented by several highly linear multivariate normal densities, giving a piecewise linear representation. The model is estimated in two stages. In the rst stage, hierarchical model-based clustering is used to provide a rst estimate of the features. In the second stage, this clustering is re ned using the EM algorithm. The number of features is found using an approximation to the posterior probability of each number of features. For the minefield
Bayesian Methods for Hidden Markov Models -- Recursive Computing in the 21st Century
- JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 2002
"... Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) use ..."
Abstract
-
Cited by 52 (8 self)
- Add to MetaCart
Markov chain Monte Carlo (MCMC) sampling strategies can be used to simulate hidden Markov model (HMM) parameters from their posterior distribution given observed data. Some MCMC methods (for computing likelihood, conditional probabilities of hidden states, and the most likely sequence of states) used in practice can be improved by incorporating established recursive algorithms. The most important is a set of forward-backward recursions calculating conditional distributions of the hidden states given observed data and model parameters. We show how to use the recursive algorithms in an MCMC context and demonstrate mathematical and empirical results showing a Gibbs sampler using the forward-backward recursions mixes more rapidly than another sampler often used for HMM's. We introduce an augmented variables technique for obtaining unique state labels in HMM's and finite mixture models. We show how recursive computing allows statistically efficient use of MCMC output when estimating the hidden states. We directly calculate the posterior distribution of the hidden chain's state space size by MCMC, circumventing asymptotic arguments underlying the Bayesian information criterion, which is shown to be inappropriate for a frequently analyzed data set in the HMM literature. The use of log-likelihood for assessing MCMC convergence is illustrated, and posterior predictive checks are used to investigate application specific questions of model adequacy.
The practical implementation of Bayesian model selection
- Institute of Mathematical Statistics
, 2001
"... In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is r ..."
Abstract
-
Cited by 48 (2 self)
- Add to MetaCart
In principle, the Bayesian approach to model selection is straightforward. Prior probability distributions are used to describe the uncertainty surrounding all unknowns. After observing the data, the posterior distribution provides a coherent post data summary of the remaining uncertainty which is relevant for model selection. However, the practical implementation of this approach often requires carefully tailored priors and novel posterior calculation methods. In this article, we illustrate some of the fundamental practical issues that arise for two different model selection problems: the variable selection problem for the linear model and the CART model selection problem.

