The control of the false discovery rate in multiple testing under dependency
Annals of Statistics, 2001
Cited by 267 (3 self)
Benjamini and Hochberg suggest that the false discovery rate may be the appropriate error rate to control in many applied multiple testing problems. A simple procedure was given there as an FDR controlling procedure for independent test statistics and was shown to be much more powerful than comparable procedures which control the traditional familywise error rate. We prove that this same procedure also controls the false discovery rate when the test statistics have positive regression dependency on each of the test statistics corresponding to the true null hypotheses. This condition for positive dependency is general enough to cover many problems of practical interest, including the comparisons of many treatments with a single control, multivariate normal test statistics with positive correlation matrix and multivariate t. Furthermore, the test statistics may be discrete, and the tested hypotheses composite without posing special difficulties. For all other forms of dependency, a simple conservative modification of the procedure controls the false discovery rate. Thus the range of problems for which
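The step-up procedure this abstract refers to, together with its conservative modification for arbitrary dependency, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `fdr_step_up`, its parameters, and the example p-values are assumptions made here. The modification is assumed to divide the level q by the harmonic sum 1 + 1/2 + ... + 1/m.

```python
def fdr_step_up(pvalues, q=0.05, arbitrary_dependence=False):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Hypothetical helper for illustration. With arbitrary_dependence=False it
    applies the plain step-up rule; with True it applies the conservative
    variant that divides q by c(m) = sum_{i=1}^{m} 1/i.
    """
    m = len(pvalues)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1)) if arbitrary_dependence else 1.0

    # Indices of the hypotheses, sorted by increasing p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Step-up: find the largest rank k with p_(k) <= k * q / (m * c(m)).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / (m * c_m):
            k_max = rank

    # Reject every hypothesis whose p-value is among the k_max smallest.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values, for illustration only.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(fdr_step_up(pvals, q=0.05))
    print(fdr_step_up(pvals, q=0.05, arbitrary_dependence=True))
```

Because the rule is step-up, a large p-value that clears its own threshold also carries every smaller p-value with it, which is why the loop keeps the largest qualifying rank rather than stopping at the first failure.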
Statistical Comparisons of Classifiers over Multiple Data Sets
2006
Cited by 120 (0 self)
While methods for comparing two learning algorithms on a single data set have been scrutinized for quite some time already, the issue of statistical tests for comparisons of more algorithms on multiple data sets, which is even more essential to typical machine learning studies, has been all but ignored. This article reviews the current practice and then theoretically and empirically examines several suitable tests. Based on that, we recommend a set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers: the Wilcoxon signed ranks test for comparison of two classifiers and the Friedman test with the corresponding post-hoc tests for comparison of more classifiers over multiple data sets. Results of the latter can also be neatly presented with the newly introduced CD (critical difference) diagrams.
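As a rough illustration of the Friedman test mentioned in this abstract, the sketch below ranks classifiers within each data set and computes the Friedman chi-square statistic, plus the Iman-Davenport F correction that is commonly paired with it. The function names, the score matrix, and the choice to ignore any tie correction are assumptions made here, not details taken from the listing.

```python
def average_ranks(scores):
    """scores[i][j] is the score of classifier j on data set i (higher is better).

    Returns the mean rank of each classifier; rank 1 is best and tied
    scores share the average of the ranks they span.
    """
    n, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])
        pos = 0
        while pos < k:
            end = pos
            # Extend the block while the next score ties with the current one.
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2.0 + 1.0  # mean of ranks pos+1 .. end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / n for t in totals]


def friedman_statistic(scores):
    """Friedman chi-square statistic (no tie correction), k - 1 degrees of freedom."""
    n, k = len(scores), len(scores[0])
    ranks = average_ranks(scores)
    return 12.0 * n / (k * (k + 1)) * (sum(r * r for r in ranks) - k * (k + 1) ** 2 / 4.0)


def iman_davenport_statistic(scores):
    """F-distributed correction with (k - 1) and (k - 1)(n - 1) degrees of freedom."""
    n, k = len(scores), len(scores[0])
    chi2 = friedman_statistic(scores)
    return (n - 1) * chi2 / (n * (k - 1) - chi2)


if __name__ == "__main__":
    # Made-up accuracies: 4 data sets (rows) by 3 classifiers (columns).
    scores = [
        [0.90, 0.80, 0.70],
        [0.85, 0.80, 0.60],
        [0.70, 0.75, 0.65],
        [0.95, 0.90, 0.85],
    ]
    print(average_ranks(scores))
    print(friedman_statistic(scores))
    print(iman_davenport_statistic(scores))
```

Turning either statistic into a p-value needs a chi-square or F distribution function, which is left out here to keep the sketch free of third-party dependencies.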
A linear non-gaussian acyclic model for causal discovery
J. Machine Learning Research, 2006
Cited by 33 (16 self)
In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data. Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-Gaussian distributions of non-zero variances. The solution relies on the use of the statistical method known as independent component analysis, and does not require any pre-specified time-ordering of the variables. We provide a complete Matlab package for performing this LiNGAM analysis (short for Linear Non-Gaussian Acyclic Model), and demonstrate the effectiveness of the method using artificially generated data and real-world data.
Controlling the familywise error rate in functional neuroimaging: a comparative review
Statistical Methods in Medical Research, 2003
Cited by 31 (3 self)
Functional neuroimaging data embodies a massive multiple testing problem, where 100 000 correlated test statistics must be assessed. The familywise error rate, the chance of any false positives, is the standard measure of Type I errors in multiple testing. In this paper we review and evaluate three approaches to thresholding images of test statistics: Bonferroni, random field and the permutation test. Owing to recent developments, improved Bonferroni procedures, such as Hochberg’s methods, are now applicable to dependent data. Continuous random field methods use the smoothness of the image to adapt to the severity of the multiple testing problem. Also, increased computing power has made both permutation and bootstrap methods applicable to functional neuroimaging. We evaluate these approaches on t images using simulations and a collection of real datasets. We find that Bonferroni-related tests offer little improvement over Bonferroni, while the permutation method offers substantial improvement over the random field method for low smoothness and low degrees of freedom. We also show the limitations of trying to find an equivalent number of independent tests for an image of correlated test statistics.
An extension on "Statistical comparisons of classifiers over multiple data sets" for all pairwise comparisons
Journal of Machine Learning Research
Cited by 11 (2 self)
In a recently published paper in JMLR, Demšar (2006) recommends a set of non-parametric statistical tests and procedures which can be safely used for comparing the performance of classifiers over multiple data sets. After studying the paper, we realize that the paper correctly introduces the basic procedures and some of the most advanced ones when comparing a control method. However, it does not deal with some advanced topics in depth. Regarding these topics, we focus on more powerful proposals of statistical procedures for comparing n×n classifiers. Moreover, we illustrate an easy way of obtaining adjusted and comparable p-values in multiple comparison procedures.
Bayesian Maximum a Posteriori Multiple Testing Procedure
Sankhya, 2006
Cited by 10 (4 self)
We consider a Bayesian approach to multiple hypothesis testing. A hierarchical prior model is based on imposing a prior distribution π(k) on the number of hypotheses arising from alternatives (false nulls). We then apply the maximum a posteriori (MAP) rule to find the most likely configuration of null and alternative hypotheses. The resulting MAP procedure and its closely related step-up and step-down versions compare ordered Bayes factors of individual hypotheses with a sequence of critical values depending on the prior. We discuss the relations between the proposed MAP procedure and the existing frequentist and Bayesian counterparts. A more detailed analysis is given for the normal data, where we show, in particular, that by choosing a specific π(k), the MAP procedure can mimic several known familywise error (FWE) and false discovery rate (FDR) controlling procedures. The performance of MAP procedures is illustrated on a simulated example. AMS (2000) subject classification. Primary 62F15, 62F03.
Nonparametric Hypothesis Testing for a Spatial Signal
2001
Cited by 9 (0 self)
In this article, we propose a procedure called Enhanced FDR (EFDR), which is based on controlling the false discovery rate (FDR) and a concept known as generalized degrees of freedom (GDF). EFDR differs from the standard FDR procedure by reducing the number of hypotheses tested. This is done in two ways: first, the model is represented more parsimoniously in the wavelet domain, and second, an optimal selection of hypotheses is made using a criterion based on generalized degrees of freedom. Not only does the EFDR procedure tell us whether a spatial signal is present or not, it has the added bonus that, if a signal is deemed present, it can indicate its location and magnitude. We examine EFDR's operating characteristics, and in simulations we show that it outperforms the standard FDR and conventional testing procedures. Finally, the EFDR procedure is applied to an air-temperature data set generated from the Climate System Model (CSM) of the National Center for Atmospheric Research (NCAR), where air temperatures in the 1980s are compared to those in the 1990s. We conclude that temperature change has occurred between the two decades, mostly warming in the central part of the USA and in coastal regions of South America at about 20° S. Key words: Denoising, false discovery rate, generalized degrees of freedom, pixel, power, signal detection, wavelets
On optimality of stepdown and stepup multiple test procedures
Ann. Statist., 2005
Cited by 8 (2 self)
Consider the multiple testing problem of testing k null hypotheses, where the unknown family of distributions is assumed to satisfy a certain monotonicity assumption. Attention is restricted to procedures that control the familywise error rate in the strong sense and which satisfy a monotonicity condition. Under these assumptions, we prove certain maximin optimality results for some well-known stepdown and stepup procedures.

