Flexible empirical Bayes estimation for wavelets
Journal of the Royal Statistical Society, Series B, 2000
Cited by 73 (13 self)
Wavelet shrinkage estimation is an increasingly popular method for signal denoising and compression. Although Bayes estimators can provide excellent mean squared error (MSE) properties, selection of an effective prior is a difficult task. To address this problem, we propose Empirical Bayes (EB) prior selection methods for various error distributions, including the normal and the heavier-tailed Student t distributions. Under such EB prior distributions, we obtain threshold shrinkage estimators based on model selection, and multiple shrinkage estimators based on model averaging. These EB estimators are seen to be computationally competitive with standard classical thresholding methods, and to be robust to outliers in both the data and wavelet domains. Simulated and real examples are used to illustrate the flexibility and improved MSE performance of these methods in a wide variety of settings.
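The distinction between a threshold estimator (model selection) and a multiple shrinkage estimator (model averaging) can be illustrated with a minimal sketch. It is not the paper's EB procedure: it assumes a single wavelet coefficient observed with known normal noise variance, a prior that mixes a point mass at zero with a normal "slab", and fixed hyperparameters. The function name `mixture_shrink` and the parameters `sigma2`, `tau2`, and `pi` are hypothetical, chosen only for illustration.

```python
import numpy as np

def mixture_shrink(d, sigma2=1.0, tau2=25.0, pi=0.2):
    """Shrink empirical wavelet coefficients d under an assumed mixture prior.

    Assumed model (illustrative, not taken from the paper):
        d | theta ~ N(theta, sigma2)
        theta     ~ (1 - pi) * point mass at 0  +  pi * N(0, tau2)
    """
    d = np.asarray(d, dtype=float)
    # Log marginal density of d when the coefficient is nonzero (slab) or zero (spike).
    log_slab = np.log(pi) - 0.5 * (np.log(2 * np.pi * (sigma2 + tau2)) + d**2 / (sigma2 + tau2))
    log_spike = np.log1p(-pi) - 0.5 * (np.log(2 * np.pi * sigma2) + d**2 / sigma2)
    # Posterior probability that the coefficient is nonzero.
    p_nonzero = 1.0 / (1.0 + np.exp(log_spike - log_slab))
    linear = tau2 / (sigma2 + tau2) * d
    # Model averaging: posterior mean, shrinks every coefficient smoothly.
    post_mean = p_nonzero * linear
    # Model selection: keep a coefficient only if "nonzero" is the more probable model.
    thresholded = np.where(p_nonzero > 0.5, linear, 0.0)
    return post_mean, thresholded
```

In an EB version, `pi` and `tau2` would be estimated from the data rather than fixed in advance; that estimation step is deliberately left out of this sketch.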
Prediction via Orthogonalized Model Mixing
Journal of the American Statistical Association, 1994
Cited by 54 (9 self)
In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Comp...
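A rough sketch of why orthogonalization makes model probabilities factor into predictor-specific terms is given below. It is a simplification, not the authors' algorithm: it assumes a known error variance `sigma2`, independent N(0, `tau2`) priors on the coefficients of the orthonormalized predictors, and a common prior inclusion probability. Under those assumptions the factorization is exact, whereas the abstract describes an approximation. The function names are hypothetical.

```python
import numpy as np

def inclusion_probabilities(X, y, sigma2=1.0, tau2=10.0, prior_incl=0.5):
    """Per-predictor posterior inclusion probabilities in an orthonormal basis.

    Assumptions (illustrative only): known error variance sigma2, independent
    N(0, tau2) priors on the coefficients of the orthonormalized predictors.
    """
    Q, _ = np.linalg.qr(X)      # orthonormal basis for the column space of X
    z = Q.T @ y                 # coordinates of y along each orthogonalized predictor

    def log_normal_pdf(v, var):
        return -0.5 * (np.log(2 * np.pi * var) + v**2 / var)

    # Each coordinate z_j is N(0, sigma2 + tau2) if included, N(0, sigma2) if excluded.
    log_in = np.log(prior_incl) + log_normal_pdf(z, sigma2 + tau2)
    log_out = np.log1p(-prior_incl) + log_normal_pdf(z, sigma2)
    return 1.0 / (1.0 + np.exp(log_out - log_in))

def sample_models(probs, n_draws, rng):
    """Draw models directly as independent Bernoulli indicators, with no Markov chain."""
    return rng.random((n_draws, probs.size)) < probs
```

Because the indicators are independent under these assumptions, the probability of any model is the product of its per-predictor terms, and draws from `sample_models` could serve as importance-sampling proposals when the factorization is only approximate.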
Spike and slab variable selection: frequentist and Bayesian strategies
The Annals of Statistics
Cited by 44 (7 self)
Variable selection in the linear regression model takes many apparent faces from both frequentist and Bayesian standpoints. In this paper we introduce a variable selection method referred to as a rescaled spike and slab model. We study the importance of prior hierarchical specifications and draw connections to frequentist generalized ridge regression estimation. Specifically, we study the usefulness of continuous bimodal priors to model hypervariance parameters, and the effect scaling has on the posterior mean through its relationship to penalization. Several model selection strategies, some frequentist and some Bayesian in nature, are developed and studied theoretically. We demonstrate the importance of selective shrinkage for effective variable selection in terms of risk misclassification, and show this is achieved using the posterior from a rescaled spike and slab model. We also show how to verify a procedure’s ability to reduce model uncertainty in finite samples using a specialized forward selection strategy. Using this tool, we illustrate the effectiveness of rescaled spike and slab models in reducing model uncertainty.
Frequentist model average estimators
Journal of the American Statistical Association, 2003
Cited by 43 (1 self)
The traditional use of model selection methods in practice is to proceed as if the final selected model had been chosen in advance, without acknowledging the additional uncertainty introduced by model selection. This often means underreporting of variability and too optimistic confidence intervals. We build a general large-sample likelihood apparatus in which limiting distributions and risk properties of estimators-post-selection as well as of model average estimators are precisely described, also explicitly taking modelling bias into account. This allows a drastic reduction of complexity, as competing model averaging schemes may be developed, discussed and compared inside a statistical prototype experiment where only a few crucial quantities matter. In particular we offer a frequentist view on Bayesian model averaging methods and give a link to generalised ridge estimators. Our work also leads to new model selection criteria. The methods are illustrated with real data applications. Key words: bias and variance balance, growing models, likelihood inference, model average estimators, model information criteria, moderate misspecification.
Empirical Bayes Estimation in Wavelet Nonparametric Regression
Cited by 35 (5 self)
Bayesian methods based on hierarchical mixture models have demonstrated excellent mean squared error properties in constructing data-dependent shrinkage estimators in wavelets; however, subjective elicitation of the hyperparameters is challenging. In this chapter we use an Empirical Bayes approach to estimate the hyperparameters for each level of the wavelet decomposition, bypassing the usual difficulty of hyperparameter specification in the hierarchical model. The EB approach is computationally competitive with standard methods and offers improved MSE performance over several Bayes and classical estimators in a wide variety of examples.
A Statistical Framework for Expression-Based Molecular Classification in Cancer
2002
Cited by 32 (2 self)
In this paper, our aim is to provide a framework to support this three-faceted enterprise. We propose a probabilistic definition of differential expression in the context of unsupervised classification, and we use it to define molecular profiles, and to assess quantities of potential use in classification, such as the probability that a tumour belongs to a given profile and the probability that two tumours have the same profile. Our long-term goals are (a) to provide tools that will facilitate the use of prior knowledge about gene function in the screening process, in an interactive way, to improve the interpretation and clinical validation of the classification that will ultimately emerge from the analysis, and (b) to capture the potentially categorical nature of differential gene expression, by using latent categorical data that can be interpreted as a gene being turned 'on' or 'off' compared with normal expression
Bayesian Tests And Model Diagnostics In Conditionally Independent Hierarchical Models
Journal of the American Statistical Association, 1994
Cited by 19 (1 self)
Consider the conditionally independent hierarchical model (CIHM) where observations $y_i$ are independently distributed from $f(y_i \mid \theta_i)$, the parameters $\theta_i$ are independently distributed from distributions $g(\theta \mid \lambda)$, and the hyperparameters $\lambda$ are distributed according to a distribution $h(\lambda)$. The posterior distribution of all parameters of the CIHM can be efficiently simulated by Markov chain Monte Carlo (MCMC) algorithms. Although these simulation algorithms have facilitated the application of CIHMs, they generally have not addressed the problem of computing quantities useful in model selection. This paper explores how MCMC simulation algorithms and other related computational algorithms can be used to compute Bayes factors that are useful in criticizing a particular CIHM. In the case where the CIHM models a belief that the parameters are exchangeable or lie on a regression surface, the Bayes factor can measure the consistency of the data with the structural prior belief. Bayes factors can ...
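For reference, the three-stage structure described in this abstract can be written out as below, together with the standard definition of a Bayes factor as a ratio of marginal likelihoods. The symbols $\theta_i$ (unit-level parameters), $\lambda$ (hyperparameters), $n$, and the model labels $M_1$, $M_2$ are notation assumed here for illustration; the definitions themselves are the usual ones and are not quoted from the paper.

```latex
\begin{align*}
y_i \mid \theta_i &\sim f(y_i \mid \theta_i), \quad i = 1, \dots, n \ \text{(independently)},\\
\theta_i \mid \lambda &\sim g(\theta_i \mid \lambda) \ \text{(independently)},\\
\lambda &\sim h(\lambda),\\
m(y \mid M) &= \int \Bigl[\, \prod_{i=1}^{n} \int f(y_i \mid \theta_i)\, g(\theta_i \mid \lambda)\, d\theta_i \Bigr] h(\lambda)\, d\lambda,\\
B_{12} &= \frac{m(y \mid M_1)}{m(y \mid M_2)}.
\end{align*}
```

Here $f$, $g$, and $h$ in the marginal likelihood $m(y \mid M)$ are understood to be those specified by model $M$, so $B_{12}$ compares how well two hierarchical specifications predict the observed data.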
Distribution of eigenvalues and eigenvectors of Wishart matrix when the population eigenvalues are infinitely dispersed
2002
Cited by 8 (5 self)
We consider the asymptotic joint distribution of the eigenvalues and eigenvectors of a Wishart matrix when the population eigenvalues become infinitely dispersed. We show that the normalized sample eigenvalues and the relevant elements of the sample eigenvectors are asymptotically all mutually independently distributed. The limiting distributions of the normalized sample eigenvalues are chi-squared distributions with varying degrees of freedom and the distribution of the relevant elements of the eigenvectors is the standard normal distribution. As an application of this result, we investigate tail minimaxity in the estimation of the population covariance matrix of the Wishart distribution with respect to Stein's loss function and the quadratic loss function. Under mild regularity conditions, we show that the behavior of a broad class of minimax estimators is identical when the sample eigenvalues become infinitely dispersed. Keywords and phrases: asymptotic distribution, covariance matrix, minimax estimator, quadratic loss, singular parameter, Stein's loss, tail minimaxity.
Orthogonalizations and Prior Distributions for Orthogonalized Model Mixing
In Modelling and Prediction, 1996
Cited by 7 (3 self)
Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in linear models include Markov chain Monte Carlo methods as well as importance sampling. Clyde, DeSimone and Parmigiani (1996) developed an importance sampling strategy based on expressing the space of predictors in terms of an orthogonal basis. This leads both to a better identified problem and to simple approximations to the posterior model probabilities. Such approximations can be used to construct efficient importance samplers. For brevity, we call this strategy orthogonalized model mixing. Two key elements of orthogonalized model mixing are: a) the orthogonalization method and b) the prior probability distributions assigned to the models and the coefficients. In this paper we consider in further detail the specification of these t...