Prior distributions for variance parameters in hierarchical models. Bayesian Anal 1:515–534 (0)

by A Gelman

Results 1 - 10 of 55

Bayesian Data Analysis

by Andrew Gelman, Christian Robert, Nicolas Chopin, Judith Rousseau, 1995
Abstract - Cited by 887 (44 self)
I actually own a copy of Harold Jeffreys’s Theory of Probability but have only read small bits of it, most recently over a decade ago to confirm that, indeed, Jeffreys was not too proud to use a classical chi-squared p-value when he wanted to check the misfit of a model to data (Gelman, Meng and Stern, 2006). I do, however, feel that it is important to understand where our probability models come from, and I welcome the opportunity to use the present article by Robert, Chopin and Rousseau as a platform for further discussion of foundational issues. In this brief discussion I will argue the following: (1) in thinking about prior distributions, we should go beyond Jeffreys’s principles and move toward weakly informative priors; (2) it is natural for those of us who work in social and computational sciences to favor complex models, contra Jeffreys’s preference for simplicity; and (3) a key generalization of Jeffreys’s ideas is to explicitly include model checking in the process of data analysis.

Modeling changing dependency structure in multivariate time series

by Xiang Xuan, Kevin Murphy - In International Conference on Machine Learning, 2007
Abstract - Cited by 23 (0 self)
We show how to apply the efficient Bayesian changepoint detection techniques of Fearnhead in the multivariate setting. We model the joint density of vector-valued observations using undirected Gaussian graphical models, whose structure we estimate. We show how we can exactly compute the MAP segmentation, as well as how to draw perfect samples from the posterior over segmentations, simultaneously accounting for uncertainty about the number and location of changepoints, as well as uncertainty about the covariance structure. We illustrate the technique by applying it to financial data and to bee tracking data.

Transformed and parameter-expanded Gibbs samplers for multilevel linear and generalized linear models

by Andrew Gelman, David A. Van Dyk, Zaiying Huang, W. John Boscardin, 2004
Abstract - Cited by 8 (4 self)
Hierarchical linear and generalized linear models can be fit using Gibbs samplers and Metropolis algorithms; these models, however, often have many parameters, and convergence of the seemingly most natural Gibbs and Metropolis algorithms can sometimes be slow. We examine solutions that involve reparameterization and over-parameterization. We begin with parameter expansion using working parameters, a strategy developed for the EM algorithm by Meng and van Dyk (1997) and Liu, Rubin, and Wu (1998). This strategy can lead to algorithms that are much less susceptible to becoming stuck near zero values of the variance parameters than are more standard algorithms. Second, we consider a simple rotation of the regression coefficients based on an estimate of their posterior covariance matrix. This leads to a Gibbs algorithm based on updating the transformed parameters one at a time or a Metropolis algorithm with vector jumps; either of these algorithms can perform much better (in terms of total CPU time) than the two standard algorithms: one-at-a-time updating of untransformed parameters or vector updating using a linear regression at each step. We present an innovative evaluation of the algorithms in terms of how quickly they can get away from remote areas of parameter space, along with some more standard evaluation of computation and convergence speeds. We illustrate our methods with examples from our applied work. Our ultimate goal is to develop a fast and reliable method for fitting a hierarchical linear model as easily as one can now fit a non-hierarchical model, and to increase understanding of Gibbs samplers for hierarchical models in general. Keywords: Bayesian computation, blessing of dimensionality, Markov chain Monte Carlo, multilevel modeling, mixed effects models, PX-EM algorithm, random effects regression, redundant

Handling sparsity via the horseshoe

by Carlos M. Carvalho, Nicholas G. Polson, James G. Scott - Journal of Machine Learning Research, W&CP
Abstract - Cited by 5 (1 self)
This paper presents a general, fully Bayesian framework for sparse supervised-learning problems based on the horseshoe prior. The horseshoe prior is a member of the family of multivariate scale mixtures of normals, and is therefore closely related to widely used approaches for sparse Bayesian learning, including, among others, Laplacian priors (e.g. the LASSO) and Student-t priors (e.g. the relevance vector machine). The advantages of the horseshoe are its robustness at handling unknown sparsity and large outlying signals. These properties are justified theoretically via a representation theorem and accompanied by comprehensive empirical experiments that compare its performance to benchmark alternatives.

Default prior distributions and efficient posterior computation in Bayesian factor analysis

by Joyee Ghosh, David B. Dunson - Journal of Computational and Graphical Statistics, 2009
Abstract - Cited by 5 (1 self)
Factor analytic models are widely used in social sciences. These models have also proven useful for sparse modeling of the covariance structure in multidimensional data. Normal prior distributions for factor loadings and inverse gamma prior distributions for residual variances are a popular choice because of their conditionally conjugate form. However, such prior distributions require elicitation of many hyperparameters and tend to result in poorly behaved Gibbs samplers. In addition, one must choose an informative specification, as high variance prior distributions face problems due to impropriety of the posterior distribution. This article proposes a default, heavy tailed prior distribution specification, which is induced through parameter expansion while facilitating efficient posterior computation. We also develop an approach to allow uncertainty in the number of factors. The methods are illustrated through simulated examples and epidemiology and toxicology applications.

Default Priors and Efficient Posterior Computation in Bayesian Factor Analysis

by Joyee Ghosh, David B. Dunson
Abstract - Cited by 4 (2 self)
Factor analytic models are widely used in social sciences. These models have also proven useful for sparse modeling of the covariance structure in multidimensional data. Normal priors for factor loadings and inverse gamma priors for residual variances are a popular choice because of their conditionally conjugate form. However, such priors require elicitation of many hyperparameters and tend to result in poorly behaved Gibbs samplers. In addition, one must choose an informative specification, as high variance priors face problems due to impropriety of the posterior. This article proposes a default, heavy tailed prior specification, which is induced through parameter expansion while facilitating efficient posterior computation. We also develop an approach to allow uncertainty in the number of factors. The methods are illustrated through simulated examples and epidemiology and toxicology applications.

Why we (usually) don’t have to worry about multiple comparisons

by Andrew Gelman, Jennifer Hill, Masanao Yajima, 2008
Abstract - Cited by 4 (2 self)
The problem of multiple comparisons can disappear when viewed from a Bayesian perspective. We propose building multilevel models in the settings where multiple comparisons arise. These address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern. Multilevel models perform partial pooling (shifting estimates toward each other), whereas classical procedures typically keep the centers of intervals stationary, adjusting for multiple comparisons by making the intervals wider (or, equivalently, adjusting the p-values corresponding to intervals of fixed width). Multilevel estimates make comparisons more conservative, in the sense that intervals for comparisons are more likely to include zero; as a result, those comparisons that are made with confidence are more likely to be valid.

Struggles with Survey Weighting and Regression Modeling

by Andrew Gelman - Statistical Science, 2007
Abstract - Cited by 4 (1 self)
The general principles of Bayesian data analysis imply that models for survey responses should be constructed conditional on all variables that affect the probability of inclusion and nonresponse, which are also the variables used in survey weighting and clustering. However, such models can quickly become very complicated, with potentially thousands of poststratification cells. It is then a challenge to develop general families of multilevel probability models that yield reasonable Bayesian inferences. We discuss in the context of several ongoing public health and social surveys. This work is currently open-ended, and we conclude with thoughts on how research could proceed to solve these problems. Key words and phrases: Multilevel modeling, poststratification, sam-

Bayesian analysis of matrix normal graphical models

by Hao Wang, Mike West - Biometrika, 2009
Abstract - Cited by 4 (3 self)
We develop Bayesian analysis of matrix-variate normal data with conditional independence graphical structuring of the characterising variance matrix parameters. This leads to fully Bayesian analysis of matrix normal graphical models, including discussion of novel prior specifications, the resulting problems of posterior computation addressed using Markov chain Monte Carlo methods, and graphical model assessment that involves approximate evaluation of marginal likelihood functions under specified graphical models. Modelling and inference for spatial/image data via a novel class of Markov random fields that arise as natural examples of matrix normal graphical models is discussed. This is complemented by the development of a broad class of dynamic models for matrix-variate time series within which stochastic elements defining time series errors and structural changes over time are subject to graphical model structuring. Three examples illustrate these developments and highlight questions of graphical model uncertainty and comparison in matrix data contexts.

Multilevel (hierarchical) modeling: what it can and can’t do

by Andrew Gelman, 2005
Abstract - Cited by 3 (0 self)
Multilevel (hierarchical) modeling is a generalization of linear and generalized linear modeling in which regression coefficients are themselves given a model, whose parameters are also estimated from data. We illustrate the strengths and limitations of multilevel modeling through an example of the prediction of home radon levels in U.S. counties. The multilevel model is highly effective for predictions at both levels of the model but could easily be misinterpreted for causal inference.