Results 1  10
of
2,410
The control of the false discovery rate in multiple testing under dependency
 Annals of Statistics
, 2001
"... Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparab ..."
Abstract

Cited by 469 (8 self)
 Add to MetaCart
Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparable procedures which control the traditional familywise error rate. We prove that this same procedure also controls the false discovery rate when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. This condition for positive dependency is general enough to cover many problems of practical interest, including the comparisons of many treatments with a single control, multivariate normal test statistics with positive correlation matrix and multivariate t. Furthermore, the test statistics may be discrete, and the tested hypotheses composite without posing special difficulties. For all other forms of dependency, a simple conservative modification of the procedure controls the false discovery rate. Thus the range of problems for which
A direct approach to false discovery rates
, 2002
"... Summary. Multiplehypothesis testing involves guarding against much more complicated errors than singlehypothesis testing. Whereas we typically control the type I error rate for a singlehypothesis test, a compound error rate is controlled for multiplehypothesis tests. For example, controlling the ..."
Abstract

Cited by 347 (6 self)
 Add to MetaCart
Summary. Multiplehypothesis testing involves guarding against much more complicated errors than singlehypothesis testing. Whereas we typically control the type I error rate for a singlehypothesis test, a compound error rate is controlled for multiplehypothesis tests. For example, controlling the false discovery rate FDR traditionally involves intricate sequential pvalue rejection methods based on the observed data. Whereas a sequential pvalue method fixes the error rate and estimates its corresponding rejection region, we propose the opposite approach—we fix the rejection region and then estimate its corresponding error rate. This new approach offers increased applicability, accuracy and power. We apply the methodology to both the positive false discovery rate pFDR and FDR, and provide evidence for its benefits. It is shown that pFDR is probably the quantity of interest over FDR. Also discussed is the calculation of the qvalue, the pFDR analogue of the pvalue, which eliminates the need to set the error rate beforehand as is traditionally done. Some simple numerical examples are presented that show that this new approach can yield an increase of over eight times in power compared with the Benjamini–Hochberg FDR method.
Limma: linear models for microarray data
 Bioinformatics and Computational Biology Solutions using R and Bioconductor
, 2005
"... This free opensource software implements academic research by the authors and coworkers. If you use it, please support the project by citing the appropriate journal articles listed in Section 2.1.Contents ..."
Abstract

Cited by 285 (5 self)
 Add to MetaCart
This free opensource software implements academic research by the authors and coworkers. If you use it, please support the project by citing the appropriate journal articles listed in Section 2.1.Contents
Empirical Bayes Analysis of a Microarray Experiment
 Journal of the American Statistical Association
, 2001
"... Microarrays are a novel technology that facilitates the simultaneous measurement of thousands of gene expression levels. A typical microarray experiment can produce millions of data points, raising serious problems of data reduction, and simultaneous inference. We consider one such experiment in whi ..."
Abstract

Cited by 269 (15 self)
 Add to MetaCart
Microarrays are a novel technology that facilitates the simultaneous measurement of thousands of gene expression levels. A typical microarray experiment can produce millions of data points, raising serious problems of data reduction, and simultaneous inference. We consider one such experiment in which oligonucleotide arrays were employed to assess the genetic effects of ionizing radiation on seven thousand human genes. A simple nonparametric empirical Bayes model is introduced, which is used to guide the ef � cient reduction of the data to a single summary statistic per gene, and also to make simultaneous inferences concerning which genes were affected by the radiation. Although our focus is on one speci � c experiment, the proposed methods can be applied quite generally. The empirical Bayes inferences are closely related to the frequentist false discovery rate (FDR) criterion. 1.
Thresholding of statistical maps in functional neuroimaging using the false discovery rate
 Neuroimage
, 2002
"... Finding objective and effective thresholds for voxelwise statistics derived from neuroimaging data has been a longstanding problem. With at least one test performed for every voxel in an image, some correction of the thresholds is needed to control the error rates, but standard procedures for multi ..."
Abstract

Cited by 196 (4 self)
 Add to MetaCart
Finding objective and effective thresholds for voxelwise statistics derived from neuroimaging data has been a longstanding problem. With at least one test performed for every voxel in an image, some correction of the thresholds is needed to control the error rates, but standard procedures for multiple hypothesis testing (e.g., Bonferroni) tend to not be sensitive enough to be useful in this context. This paper introduces to the neuroscience literature statistical procedures for controlling the false discovery rate (FDR). Recent theoretical work in statistics suggests that FDRcontrolling procedures will be effective for the analysis of neuroimaging data. These procedures operate simultaneously on all voxelwise test statistics to determine which tests should be considered statistically significant. The innovation of the procedures is that they control the expected proportion of the rejected hypotheses that are falsely rejected. We demonstrate this approach using both simulations and functional magnetic resonance imaging data from two
Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit
, 2006
"... Finding the sparsest solution to underdetermined systems of linear equations y = Φx is NPhard in general. We show here that for systems with ‘typical’/‘random ’ Φ, a good approximation to the sparsest solution is obtained by applying a fixed number of standard operations from linear algebra. Our pr ..."
Abstract

Cited by 172 (20 self)
 Add to MetaCart
Finding the sparsest solution to underdetermined systems of linear equations y = Φx is NPhard in general. We show here that for systems with ‘typical’/‘random ’ Φ, a good approximation to the sparsest solution is obtained by applying a fixed number of standard operations from linear algebra. Our proposal, Stagewise Orthogonal Matching Pursuit (StOMP), successively transforms the signal into a negligible residual. Starting with initial residual r0 = y, at the sth stage it forms the ‘matched filter ’ Φ T rs−1, identifies all coordinates with amplitudes exceeding a speciallychosen threshold, solves a leastsquares problem using the selected coordinates, and subtracts the leastsquares fit, producing a new residual. After a fixed number of stages (e.g. 10), it stops. In contrast to Orthogonal Matching Pursuit (OMP), many coefficients can enter the model at each stage in StOMP while only one enters per stage in OMP; and StOMP takes a fixed number of stages (e.g. 10), while OMP can take many (e.g. n). StOMP runs much faster than competing proposals for sparse solutions, such as ℓ1 minimization and OMP, and so is attractive for solving largescale problems. We use phase diagrams to compare algorithm performance. The problem of recovering a ksparse vector x0 from (y, Φ) where Φ is random n × N and y = Φx0 is represented by a point (n/N, k/n)
Largescale simultaneous hypothesis testing: the choice of a null hypothesis
 JASA
, 2004
"... Current scientific techniques in genomics and image processing routinely produce hypothesis testing problems with hundreds or thousands of cases to consider simultaneously. This poses new difficulties for the statistician, but also opens new opportunities. In particular it allows empirical estimatio ..."
Abstract

Cited by 162 (14 self)
 Add to MetaCart
Current scientific techniques in genomics and image processing routinely produce hypothesis testing problems with hundreds or thousands of cases to consider simultaneously. This poses new difficulties for the statistician, but also opens new opportunities. In particular it allows empirical estimation of an appropriate null hypothesis. The empirical null may be considerably more dispersed than the usual theoretical null distribution that would be used for any one case considered separately. An empirical Bayes analysis plan for this situation is developed, using a local version of the false discovery rate to examine the inference issues. Two genomics problems are used as examples to show the importance of correctly choosing the null hypothesis. Key Words: local false discovery rate, empirical Bayes, microarray analysis, empirical null hypothesis, unobserved covariates
The positive false discovery rate: A Bayesian interpretation and the qvalue
 Annals of Statistics
, 2003
"... Multiple hypothesis testing is concerned with controlling the rate of false positives when testing several hypotheses simultaneously. One multiple hypothesis testing error measure is the false discovery rate (FDR), which is loosely defined to be the expected proportion of false positives among all s ..."
Abstract

Cited by 158 (3 self)
 Add to MetaCart
Multiple hypothesis testing is concerned with controlling the rate of false positives when testing several hypotheses simultaneously. One multiple hypothesis testing error measure is the false discovery rate (FDR), which is loosely defined to be the expected proportion of false positives among all significant hypotheses. The FDR is especially appropriate for exploratory analyses in which one is interested in finding several significant results among many tests. In this work, we introduce a modified version of the FDR called the “positive false discovery rate ” (pFDR). We discuss the advantages and disadvantages of the pFDR and investigate its statistical properties. When assuming the test statistics follow a mixture distribution, we show that the pFDR can be written as a Bayesian posterior probability and can be connected to classification theory. These properties remain asymptotically true under fairly general conditions, even under certain forms of dependence. Also, a new quantity called the “qvalue ” is introduced and investigated, which is a natural “Bayesian posterior pvalue, ” or rather the pFDR analogue of the pvalue.
A Shrinkage Approach to LargeScale Covariance Matrix Estimation and Implications for Functional Genomics
, 2005
"... ..."
Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19: 368–375
, 2003
"... Motivation: DNA microarrays have recently been used for the purpose of monitoring expression levels of thousands of genes simultaneously and identifying those genes that are differentially expressed. The probability that a false identification (type I error) is committed can increase sharply when th ..."
Abstract

Cited by 122 (2 self)
 Add to MetaCart
Motivation: DNA microarrays have recently been used for the purpose of monitoring expression levels of thousands of genes simultaneously and identifying those genes that are differentially expressed. The probability that a false identification (type I error) is committed can increase sharply when the number of tested genes gets large. Correlation between the test statistics attributed to gene coregulation and dependency in the measurement errors of the gene expression levels further complicates the problem. In this paper we address this very large multiplicity problem by adopting the false discovery rate (FDR) controlling approach. In order to address the dependency problem, we present three resamplingbased FDR controlling procedures, that account for the test statistics distribution, and compare their performance to that of the naïve application of the linear stepup procedure in Benjamini and Hochberg (1995). The procedures are studied using simulated microarray data, and their performance is examined relative to their ease of implementation. Results: Comparative simulation analysis shows that all four FDR controlling procedures control the FDR at the desired level, and retain substantially more power then the familywise error rate controlling procedures. In terms of power, using resampling of the marginal distribution of each test statistics substantially improves the performance over the naïve one. The highest power is achieved, at the expense of a more sophisticated algorithm, by the resamplingbased procedures that resample the joint distribution of the test statistics and estimate the level of FDR control.