Results 1–10 of 12
Reversible jump Markov chain Monte Carlo computation and Bayesian model determination
Biometrika, 1995
"... Markov chain Monte Carlo methods for Bayesian computation have until recently been restricted to problems where the joint distribution of all variables has a density with respect to some xed standard underlying measure. They have therefore not been available for application to Bayesian model determi ..."
Abstract

Cited by 827 (19 self)
 Add to MetaCart
Markov chain Monte Carlo methods for Bayesian computation have until recently been restricted to problems where the joint distribution of all variables has a density with respect to some fixed standard underlying measure. They have therefore not been available for application to Bayesian model determination, where the dimensionality of the parameter vector is typically not fixed. This article proposes a new framework for the construction of reversible Markov chain samplers that jump between parameter subspaces of differing dimensionality, which is flexible and entirely constructive. It should therefore have wide applicability in model determination problems. The methodology is illustrated with applications to multiple changepoint analysis in one and two dimensions, and to a Bayesian comparison of binomial experiments.
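For orientation, here is a minimal sketch of a reversible-jump sampler on a toy two-model problem (illustrative only, not Green's own example): M1 fixes the mean at 0, M2 adds one parameter mu with prior N(0, tau^2); the birth move draws mu from a proposal N(0, sigma_q^2), so the dimension-matching Jacobian is 1, and equal prior model probabilities are assumed so they cancel.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(0.4, 1.0, size=50)                 # synthetic data
tau, sigma_q, n_iter = 2.0, 1.0, 20000

def loglik(mu):
    return -0.5 * np.sum((y - mu) ** 2)

def log_norm_pdf(x, sd):
    return -0.5 * (x / sd) ** 2 - np.log(sd * np.sqrt(2.0 * np.pi))

k, mu, visits = 1, None, {1: 0, 2: 0}
for _ in range(n_iter):
    if k == 1:
        if rng.uniform() < 0.5:                   # propose birth w.p. 1/2
            mu_new = rng.normal(0.0, sigma_q)
            log_a = (loglik(mu_new) + log_norm_pdf(mu_new, tau)
                     - loglik(0.0) - log_norm_pdf(mu_new, sigma_q))
            if np.log(rng.uniform()) < log_a:
                k, mu = 2, mu_new
    else:
        if rng.uniform() < 0.5:                   # propose death w.p. 1/2
            log_a = (loglik(0.0) + log_norm_pdf(mu, sigma_q)
                     - loglik(mu) - log_norm_pdf(mu, tau))
            if np.log(rng.uniform()) < log_a:
                k, mu = 1, None
        else:                                     # within-model random walk
            mu_prop = mu + rng.normal(0.0, 0.2)
            log_a = (loglik(mu_prop) + log_norm_pdf(mu_prop, tau)
                     - loglik(mu) - log_norm_pdf(mu, tau))
            if np.log(rng.uniform()) < log_a:
                mu = mu_prop
    visits[k] += 1

print({m: v / n_iter for m, v in visits.items()})  # posterior model frequencies
```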
Multiple Shrinkage and Subset Selection in Wavelets
1997
"... This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by u ..."
Abstract

Cited by 118 (16 self)
 Add to MetaCart
This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by using Bayesian hierarchical models, assigning a positive prior probability to the wavelet coefficients being zero. The resulting estimator for the wavelet coefficients is a multiple shrinkage estimator that exhibits a wide variety of nonlinear shrinkage patterns. We discuss fast computational implementations, with a focus on easy-to-compute analytic approximations as well as importance sampling and Markov chain Monte Carlo methods. Multiple shrinkage estimators prove to have excellent mean squared error performance in reconstructing standard test functions. We demonstrate this in simulated test examples, comparing various implementations of multiple shrinkage to commonly used shrinkage rules. Finally, we illustrate our approach with an application to the so-called "glint" data.
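A minimal sketch of the kind of shrinkage rule this describes, for a single coefficient with known noise variance (the spike-and-slab hyperparameters here are hypothetical, not the paper's exact hierarchy): the posterior mean mixes "keep" and "kill" nonlinearly in the observed coefficient.

```python
import numpy as np
from scipy.stats import norm

# Observed wavelet coefficient d | theta ~ N(theta, s2); prior theta = 0 with
# probability 1 - p, and theta ~ N(0, t2) with probability p (illustrative values).
def shrink(d, p=0.5, s2=1.0, t2=4.0):
    f_spike = norm.pdf(d, 0.0, np.sqrt(s2))            # marginal of d if theta = 0
    f_slab = norm.pdf(d, 0.0, np.sqrt(s2 + t2))        # marginal of d under the slab
    w = p * f_slab / (p * f_slab + (1 - p) * f_spike)  # P(theta != 0 | d)
    return w * (t2 / (t2 + s2)) * d                    # posterior mean of theta

# Small coefficients are pulled hard toward zero; large ones are nearly kept:
print(shrink(np.array([0.3, 1.0, 5.0])))
```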
Prediction via Orthogonalized Model Mixing
Journal of the American Statistical Association, 1994
"... In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in ter ..."
Abstract

Cited by 50 (9 self)
 Add to MetaCart
In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Comp...
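A rough sketch of the orthogonalization idea (a conjugate setup with known error variance is assumed here, and all constants are illustrative): after a QR decomposition of the design matrix, the projected data are independent across columns, so per-column inclusion probabilities factorize and subsets can be sampled directly rather than via a Markov chain.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = X @ np.array([2.0, 0.0, -1.5, 0.0, 0.0]) + rng.normal(size=n)

Q, _ = np.linalg.qr(X)                  # orthogonal basis for the predictor space
z = Q.T @ y                             # data projected onto each basis column
sigma2, tau2, prior_incl = 1.0, 25.0, 0.5

f1 = norm.pdf(z, 0.0, np.sqrt(sigma2 + tau2))   # column active (normal slab)
f0 = norm.pdf(z, 0.0, np.sqrt(sigma2))          # column inactive
w = prior_incl * f1 / (prior_incl * f1 + (1 - prior_incl) * f0)

# Sample candidate models directly: gamma_j ~ Bernoulli(w_j), independently.
models = rng.uniform(size=(1000, p)) < w
print(np.round(w, 3))                   # approximate per-column inclusion probs
```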
Orthogonalizations and Prior Distributions for Orthogonalized Model Mixing
In Modelling and Prediction, 1996
"... Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in linear models include Markov chain Monte Ca ..."
Abstract

Cited by 7 (3 self)
 Add to MetaCart
Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in linear models include Markov chain Monte Carlo methods as well as importance sampling. Clyde, DeSimone and Parmigiani (1996) developed an importance sampling strategy based on expressing the space of predictors in terms of an orthogonal basis. This leads both to a better identified problem and to simple approximations to the posterior model probabilities. Such approximations can be used to construct efficient importance samplers. For brevity, we call this strategy orthogonalized model mixing. Two key elements of orthogonalized model mixing are: a) the orthogonalization method and b) the prior probability distributions assigned to the models and the coefficients. In this paper we consider in further detail the specification of these t...
Distribution of eigenvalues and eigenvectors of Wishart matrix when the population eigenvalues are infinitely dispersed
2002
"... We consider the asymptotic joint distribution of the eigenvalues and eigenvectors of Wishart matrix when the population eigenvalues become infinitely dispersed. We show that the normalized sample eigenvalues and the relevant elements of the sample eigenvectors are asymptotically all mutually indepen ..."
Abstract

Cited by 6 (5 self)
 Add to MetaCart
We consider the asymptotic joint distribution of the eigenvalues and eigenvectors of a Wishart matrix when the population eigenvalues become infinitely dispersed. We show that the normalized sample eigenvalues and the relevant elements of the sample eigenvectors are asymptotically all mutually independently distributed. The limiting distributions of the normalized sample eigenvalues are chi-squared distributions with varying degrees of freedom, and the distribution of the relevant elements of the eigenvectors is the standard normal distribution. As an application of this result, we investigate tail minimaxity in the estimation of the population covariance matrix of a Wishart distribution with respect to Stein's loss function and the quadratic loss function. Under mild regularity conditions, we show that the behavior of a broad class of minimax estimators is identical when the sample eigenvalues become infinitely dispersed. Keywords and phrases: asymptotic distribution, covariance matrix, minimax estimator, quadratic loss, singular parameter, Stein's loss, tail minimaxity.
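A quick Monte Carlo check of the stated limit (the specific degrees of freedom n - j + 1 used below are an assumption for illustration; see the paper for the exact statement): with a widely dispersed population spectrum, the normalized sample eigenvalues l_j / lambda_j should be roughly chi-squared.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 20, 3, 5000
lam = np.array([1e8, 1e4, 1.0])                 # "infinitely" dispersed spectrum
root = np.sqrt(lam)

ratios = np.empty((reps, p))
for r in range(reps):
    Xmat = rng.normal(size=(n, p)) * root       # rows ~ N_p(0, diag(lam))
    W = Xmat.T @ Xmat                           # W ~ Wishart_p(n, diag(lam))
    ratios[r] = np.sort(np.linalg.eigvalsh(W))[::-1] / lam

# Compare empirical means of l_j / lambda_j with chi^2_{n-j+1} means.
print(ratios.mean(axis=0), [n - j + 1 for j in range(1, p + 1)])
```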
Improved minimax predictive densities under Kullback–Leibler loss
Ann. Statist., 2006
"... Let Xµ ∼ Np(µ, vxI)and Y µ ∼ Np(µ, vyI)be independent pdimensional multivariate normal vectors with common unknown mean µ. Based on only observing X = x, we consider the problem of obtaining a predictive density ˆp(yx) for Y that is close to p(yµ) as measured by expected Kullback–Leibler loss. ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
Let X | µ ∼ N_p(µ, v_x I) and Y | µ ∼ N_p(µ, v_y I) be independent p-dimensional multivariate normal vectors with common unknown mean µ. Based on only observing X = x, we consider the problem of obtaining a predictive density p̂(y | x) for Y that is close to p(y | µ) as measured by expected Kullback–Leibler loss. A natural procedure for this problem is the (formal) Bayes predictive density p̂_U(y | x) under the uniform prior π_U(µ) ≡ 1, which is best invariant and minimax. We show that any Bayes predictive density will be minimax if it is obtained by a prior yielding a marginal that is superharmonic or whose square root is superharmonic. This yields wide classes of minimax procedures that dominate p̂_U(y | x), including Bayes predictive densities under superharmonic priors. Fundamental similarities and differences with the parallel theory of estimating a multivariate normal mean under quadratic loss are described.
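For orientation, a sketch of the standard objects behind this setup (textbook forms under the stated model, not quoted from the paper): the Bayes predictive density, its closed form under the uniform prior, and the Kullback–Leibler risk.

```latex
% Bayes predictive density under a prior \pi, and the uniform-prior case:
\hat{p}_{\pi}(y \mid x) = \int p(y \mid \mu)\, \pi(\mu \mid x)\, d\mu,
\qquad
\hat{p}_{U}(y \mid x) = N_p\bigl(y \mid x,\ (v_x + v_y) I\bigr).

% Expected Kullback--Leibler loss (risk) of a predictive density \hat{p}:
R(\mu, \hat{p}) = \int p(x \mid \mu) \int p(y \mid \mu)
  \log \frac{p(y \mid \mu)}{\hat{p}(y \mid x)} \, dy \, dx .
```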
Bayesian semiparametric multiple shrinkage
"... National Institute of Health High dimensional and highly correlated data leading to non or weaklyidentified effects are commonplace. Maximum likelihood will typically fail in such situations and a variety of shrinkage methods have been proposed. Standard techniques, such as ridge regression or th ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
High-dimensional and highly correlated data leading to non-identified or weakly identified effects are commonplace. Maximum likelihood will typically fail in such situations, and a variety of shrinkage methods have been proposed. Standard techniques, such as ridge regression or the lasso, shrink estimates toward zero, with some approaches allowing coefficients to be selected out of the model by achieving a value of zero. When substantive information is available, estimates can be shrunk to non-null values; however, such information may not be available. We propose a Bayesian semiparametric approach that allows shrinkage to multiple locations. Coefficients are given a mixture of heavy-tailed double-exponential priors, with location and scale parameters assigned Dirichlet process hyperpriors to allow groups of coefficients to be shrunk toward the same, possibly nonzero, mean. Our approach favors sparse but flexible structure, by shrinking towards a small number of random locations. The methods are illustrated using a study of genetic polymorphisms and multiple myeloma.
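A generative sketch of the prior structure just described (the base-measure and concentration choices here are hypothetical): location/scale pairs are drawn from a truncated stick-breaking Dirichlet process, so groups of coefficients share a common, possibly nonzero, shrinkage target.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, n_coef, K = 1.0, 10, 50          # K truncates the stick-breaking sum

v = rng.beta(1.0, alpha, size=K)        # stick-breaking weights for DP(alpha, G0)
w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
locs = rng.normal(0.0, 1.0, size=K)     # G0 over locations (assumed N(0, 1))
scales = rng.gamma(2.0, 0.5, size=K)    # G0 over scales (assumed Gamma(2, 0.5))

z = rng.choice(K, size=n_coef, p=w / w.sum())   # cluster assignments
beta = rng.laplace(locs[z], scales[z])          # double-exponential coefficient draws
print(np.round(beta, 2))                        # shared z values induce grouping
```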
Robust Hierarchical Bayes Methodology for Clinical Studies
1996
"... Outlier observations can have an adverse effect on statistical inference. Identification and elimination of such observations are one option, however, dealing with outliers in this manner has many drawbacks. An alternative approach is to utilize statistical methods that are robust to outliers. Robus ..."
Abstract
 Add to MetaCart
Outlier observations can have an adverse effect on statistical inference. Identification and elimination of such observations is one option; however, dealing with outliers in this manner has many drawbacks. An alternative approach is to utilize statistical methods that are robust to outliers. Robustness is a desirable property of statistical estimators because it ensures that the estimator reflects the pattern in the majority of the data, without being too sensitive to a handful of outliers. In this dissertation, robust methodology for constructing empirical Bayes confidence intervals is presented. Three different robust models are proposed: a variance-inflation model, a location-shift model, and a heavy-tailed model. These three general types of models are described within a hierarchical Bayes framework and are applied in two separate contexts. In the first, we apply the robust methodologies to the normal means problem, and in the second we apply them to the modelling of longitudinal data by random-effects models. The Gibbs sampler is used for analysis of these complex models. Four alternative types of confidence intervals are proposed and evaluated. The proposed ...
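A minimal sketch of the variance-inflation idea, one of the three robust models named above (the inflation factor c and prior outlier rate eps are illustrative): each observation is N(theta, s2) with probability 1 - eps, or drawn from the inflated N(theta, c*s2) with probability eps, and Bayes' rule gives a per-observation outlier probability.

```python
import numpy as np
from scipy.stats import norm

def outlier_prob(y, theta=0.0, s2=1.0, c=9.0, eps=0.05):
    f_clean = norm.pdf(y, theta, np.sqrt(s2))      # density under the clean component
    f_infl = norm.pdf(y, theta, np.sqrt(c * s2))   # density under the inflated component
    return eps * f_infl / (eps * f_infl + (1 - eps) * f_clean)

# An observation near theta gets low outlier probability; a distant one, high:
print(np.round(outlier_prob(np.array([0.5, 4.0])), 3))
```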
A discussion
"... In this discussion of Polson and Scott, we emphasize the links with the classical shrinkage literature. It is quite pleasant to witness the links made by Polson and Scott between the current sparse modeling strategies and the more classical (or JamesStein) shrinkage literature of the 70’s and 80’s ..."
Abstract
 Add to MetaCart
In this discussion of Polson and Scott, we emphasize the links with the classical shrinkage literature. It is quite pleasant to witness the links made by Polson and Scott between the current sparse modeling strategies and the more classical (or James-Stein) shrinkage literature of the 1970s and 1980s that was instrumental in the first author's (CPR) personal Bayesian epiphany! Nevertheless, we have some reservations about this unification process, in that (a) MAP estimators do not fit a decision-theoretic framework and (b) the classical shrinkage approach is somewhat adverse to sparsity. Indeed, as shown in Judge and Bock (1978), the so-called pre-test estimators that take the value zero with positive probability are inadmissible and dominated by smooth shrinkage estimators under the classical losses. While the efficiency of priors (relative to others) is not clearly defined in Polson and Scott's paper, the use of a mean sum of squared errors in Table 1 seems to indicate the authors favour the quadratic loss (Berger, 1985) at the core of the James-Stein literature. It would be of considerable interest to connect sparseness and minimaxity, if at all possible.
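A sketch contrasting the two estimator families at issue (the threshold below is illustrative): the positive-part James-Stein rule shrinks smoothly and never returns an exact zero, while a hard-thresholding "pre-test" rule of the kind Judge and Bock study produces sparsity by zeroing small components.

```python
import numpy as np

def james_stein(x, sigma2=1.0):
    # Positive-part James-Stein: dominates the MLE under quadratic loss for p >= 3.
    shrink = max(0.0, 1.0 - (x.size - 2) * sigma2 / np.sum(x ** 2))
    return shrink * x                   # smooth shrinkage, never exactly zero

def pretest(x, thresh=2.0):
    return np.where(np.abs(x) > thresh, x, 0.0)   # sparse hard-threshold rule

x = np.array([0.3, -0.5, 2.8, 0.1, -3.1])
print(james_stein(x))
print(pretest(x))
```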