Results 1–9 of 9
Tweedie’s Formula and Selection Bias
Abstract

Cited by 3 (0 self)
We suppose that the statistician observes some large number of estimates zi, each with its own unobserved expectation parameter µi. The largest few of the zi’s are likely to substantially overestimate their corresponding µi’s, this being an example of selection bias, or regression to the mean. Tweedie’s formula, first reported by Robbins in 1956, offers a simple empirical Bayes approach for correcting selection bias. This paper investigates its merits and limitations. In addition to the methodology, Tweedie’s formula raises more general questions concerning empirical Bayes theory, discussed here as “relevance” and “empirical Bayes information.” There is a close connection between applications of the formula and James–Stein estimation.
Keywords: Bayesian relevance, empirical Bayes information, James–Stein, false discovery rates, regret, winner’s curse
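The abstract above centers on Tweedie’s formula: for z ~ N(µ, σ²), E[µ | z] = z + σ² (d/dz) log f(z), where f is the marginal density of the observed z’s. As a rough, hypothetical illustration (not code from the paper — the kernel-density estimate of f and all settings below are my own assumptions), the selection-bias correction can be sketched as:

```python
import numpy as np

def tweedie_correct(z, sigma=1.0, bandwidth=0.5):
    """Empirical Bayes shrinkage of observed z-values via Tweedie's formula.

    Estimates the marginal density f(z) and its derivative with a simple
    Gaussian kernel density estimate; bandwidth and sigma are illustrative.
    """
    z = np.asarray(z, dtype=float)
    diffs = z[:, None] - z[None, :]                      # pairwise differences
    k = np.exp(-0.5 * (diffs / bandwidth) ** 2)          # Gaussian kernel
    f = k.sum(axis=1)                                    # proportional to f(z_i)
    fprime = (-(diffs / bandwidth**2) * k).sum(axis=1)   # proportional to f'(z_i)
    score = fprime / f                                   # d/dz log f(z_i)
    return z + sigma**2 * score                          # Tweedie's formula

rng = np.random.default_rng(0)
mu = rng.normal(0.0, 1.0, size=1000)       # unobserved expectation parameters
z = mu + rng.normal(0.0, 1.0, size=1000)   # noisy estimates, z_i ~ N(mu_i, 1)
zhat = tweedie_correct(z)
```

The corrected estimates are shrunk toward the center of the marginal distribution, so the largest z’s, which are the ones most likely to overestimate their µ’s, are pulled down.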
Simultaneous Confidence Intervals with more Power to Determine Signs
, 2010
Abstract

Cited by 3 (2 self)
We develop new simultaneous confidence intervals for the components of a multivariate mean. The intervals determine the signs of the parameters more frequently than standard intervals do: the set of data values for which each interval includes parameter values with only one sign is larger. When one or more estimated means are small, the new intervals sacrifice some length to avoid crossing zero. But when all the estimated means are large, the new intervals coincide with standard simultaneous confidence intervals, so there is no sacrifice of precision. The improved ability to determine signs is remarkable. For example, if two means are to be estimated and the intervals are allowed to be at most 80% longer than standard intervals, when only one mean is small its sign is determined almost as well as by a one-sided test that ignores multiplicity and has a prespecified direction. When both are small the sign is determined better than by two-sided tests that ignore multiplicity. The intervals are constructed by inverting level-α tests to form a 1−α confidence set, then projecting that set onto the coordinate axes to get confidence intervals. The tests have hyper-rectangular acceptance regions that minimize the maximum amount by which the acceptance region protrudes from the orthant that contains the hypothesized parameter value, subject to a constraint on the maximum side length of the hyper-rectangle. R and SAS scripts are available online.
Key Words: Nonequivariant hypothesis test, hyper-rectangular acceptance region
BAYESIAN METHODS TO OVERCOME THE WINNER’S CURSE IN GENETIC STUDIES
Abstract

Cited by 2 (1 self)
Parameter estimates for associated genetic variants, reported in the initial discovery samples, are often grossly inflated compared to the values observed in the follow-up replication samples. This type of bias is a consequence of the sequential procedure in which the estimated effect of an associated genetic marker must first pass a stringent significance threshold. We propose a hierarchical Bayes method in which a spike-and-slab prior is used to account for the possibility that the significant test result may be due to chance. We examine the robustness of the method using different priors corresponding to different degrees of confidence in the testing results and propose a Bayesian model averaging procedure to combine estimates produced by different models. The Bayesian estimators yield smaller variance compared to the conditional likelihood estimator and outperform the latter in studies with low power. We investigate the performance of the method with simulations and applications to four real data examples.
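The spike-and-slab idea mentioned in the abstract can be illustrated in a minimal form. In this hypothetical sketch (not the paper’s hierarchical model — the prior β ~ p₀·δ₀ + (1−p₀)·N(0, τ²), the parameter values, and the function name are my own assumptions), the posterior mean of an effect z ~ N(β, se²) is a mixture weight times the usual normal–normal shrinkage:

```python
import math

def spike_slab_posterior_mean(z, se=1.0, p0=0.9, tau=2.0):
    """Posterior mean of an effect under a spike-and-slab prior.

    Prior: beta ~ p0 * delta_0 + (1 - p0) * N(0, tau^2);
    likelihood: z | beta ~ N(beta, se^2). Parameter values are illustrative.
    """
    def normpdf(x, s):
        return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2 * math.pi))

    m_spike = normpdf(z, se)                   # marginal density under the spike
    m_slab = normpdf(z, math.hypot(se, tau))   # marginal density under the slab
    # Posterior probability that the effect is real (came from the slab)
    w = (1 - p0) * m_slab / (p0 * m_spike + (1 - p0) * m_slab)
    shrink = tau**2 / (tau**2 + se**2)         # normal-normal shrinkage factor
    return w * shrink * z
```

Because the spike places prior mass at exactly zero, a marginally significant z is pulled strongly toward zero, while a very large z (for which the spike is implausible) is shrunk only by the ordinary normal–normal factor — the mechanism that counteracts the winner’s curse.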
Adjusted Bayesian inference for selected parameters
, 2009
Abstract

Cited by 1 (0 self)
We address the problem of providing inference for parameters selected after viewing the data. A frequentist solution to this problem is False Discovery Rate adjusted inference. We explain the role of selection in controlling the occurrence of false discoveries in Bayesian analysis, and argue that Bayesian inference may also be affected by selection – in particular, Bayesian inference based on subjective priors. We introduce selection-adjusted Bayesian methodology based on the conditional posterior distribution of the parameters given selection; show how it can be used to specify selection criteria; explain how it relates to the Bayesian FDR approach; and apply it to microarray data.
Biomedical Informatics Graduate Training Program and Division of Systems Medicine,
, 2011
Abstract
Meta-analysis of RNA expression to identify genes with variants associated with immune dysfunction
Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data
Abstract
We consider the application of Efron’s empirical Bayes classification method to risk prediction in a genome-wide association study using the Genetic Analysis Workshop 17 (GAW17) data. A major advantage of using this method is that the effect size distribution for the set of possible features is empirically estimated and that all subsequent parameter estimation and risk prediction is guided by this distribution. Here, we generalize Efron’s method to allow for some of the peculiarities of the GAW17 data. In particular, we introduce two ways to extend Efron’s model: a weighted empirical Bayes model and a joint covariance model that allows the model to properly incorporate the annotation information of single-nucleotide polymorphisms (SNPs). In the course of our analysis, we examine several aspects of the possible simulation model, including the identity of the most important genes, the differing effects of synonymous and nonsynonymous SNPs, and the relative roles of covariates and genes in conferring disease risk. Finally, we compare the three methods to each other and to other classifiers (random forest and neural network).
Associations Between Incident Ischemic Stroke Events and Stroke and Cardiovascular DiseaseRelated GenomeWide Association Studies Single Nucleotide Polymorphisms in the Population Architecture Using Genomics and Epidemiology Study
Abstract
Background—Genome-wide association studies (GWAS) have identified loci associated with ischemic stroke (IS) and cardiovascular disease (CVD) in European-descent individuals, but their replication in different populations has been largely unexplored. Methods and Results—Nine single nucleotide polymorphisms (SNPs) selected from GWAS and meta-analyses of stroke, and 86 SNPs previously associated with myocardial infarction and CVD risk factors, including blood lipids (high-density lipoprotein [HDL], low-density lipoprotein [LDL], and triglycerides), type 2 diabetes, and body mass index (BMI), were …
Genetic Epidemiology 34: 643–652 (2010) Risk Prediction Using Genome-Wide Association Studies
Abstract
Over the last few years, many new genetic associations have been identified by genome-wide association studies (GWAS). There are potentially many uses of these identified variants: a better understanding of disease etiology, personalized medicine, new leads for studying underlying biology, and risk prediction. Recently, there has been some skepticism regarding the prospects of risk prediction using GWAS, primarily motivated by the fact that individual effect sizes of variants associated with the phenotype are mostly small. However, there have also been arguments that many disease-associated variants have not yet been identified; hence, prospects for risk prediction may improve if more variants are included. From a risk prediction perspective, it is reasonable to average a larger number of predictors, of which some may have (limited) predictive power and some may actually be noise. The idea is that, when added together, the combined small signals result in a signal that is stronger than the noise from the unrelated predictors. We examine various aspects of the construction of models for the estimation of disease probability. We compare different methods for constructing such models, examine how the implementation of cross-validation may influence results, and examine which single nucleotide polymorphisms (SNPs) are most useful for prediction. We carry out our investigation on GWAS of the Wellcome Trust Case Control Consortium. For Crohn’s disease, we confirm our results on another GWAS. Our results suggest that utilizing a …
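The averaging argument in this last abstract — many weak predictors summed together can beat any single one — is easy to demonstrate in simulation. This is a toy sketch under my own assumptions (50 weak causal SNPs among 500, simple marginal effect estimates, an additive liability), not the paper’s model or data:

```python
import numpy as np

rng = np.random.default_rng(7)
n_train, n_test, m = 2000, 2000, 500
true_beta = np.zeros(m)
true_beta[:50] = 0.15            # 50 weak causal SNPs; the other 450 are noise

def simulate(n):
    """Simulate genotypes (0/1/2 allele counts) and a continuous liability."""
    g = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    y = g @ true_beta + rng.normal(0.0, 1.0, size=n)
    return g, y

g_tr, y_tr = simulate(n_train)
g_te, y_te = simulate(n_test)

# Marginal per-SNP effect estimates (simple one-SNP-at-a-time regression slopes)
g_tr_c = g_tr - g_tr.mean(axis=0)
beta_hat = g_tr_c.T @ (y_tr - y_tr.mean()) / (g_tr_c**2).sum(axis=0)

# Combined polygenic score vs. the single best SNP, evaluated on held-out data
score = g_te @ beta_hat
best = int(np.argmax(np.abs(beta_hat)))
r_combined = np.corrcoef(score, y_te)[0, 1]
r_single = np.corrcoef(g_te[:, best], y_te)[0, 1]
```

Even though each causal SNP explains a tiny fraction of the variance, and 450 of the 500 estimated coefficients are pure noise, the summed score correlates with the outcome far better than the single strongest SNP — the combined small signals outweigh the noise from the unrelated predictors.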