## Evaluating microarray-based classifiers: an overview (2007)

Citations: 5 (1 self)

### BibTeX

@MISC{Boulesteix07evaluatingmicroarray-based,

author = {A.-L. Boulesteix and C. Strobl and T. Augustin and M. Daumer},

title = {Evaluating microarray-based classifiers: an overview},

year = {2007}

}



### Citations

8973 | Statistical Learning Theory
- Vapnik
- 1998
Citation Context ...alysis of Microarrays (PAM) method based on shrunken centroids (Tibshirani et al., 2002) or the more recent regularized linear discriminant analysis (Guo et al., 2007). Support Vector Machines (SVM) (Vapnik, 1995) or ensemble methods based on recursive partitioning belong to the second category. Ensemble methods include for example bagging procedures (Breiman, 1996) applied to microarray data by Dudoit et al.... |

7339 |
Genetic Algorithms
- Goldberg
- 1989
Citation Context ...ist of univariately best variables (Jäger et al., 2003). In contrast, other authors seek for globally optimal subsets of variables based on sophisticated search algorithms such as genetic algorithms (Goldberg, 1989) applied to microarray data by, e.g., Ooi and Tan (2003). Note that most multivariate variable selection methods take only correlations between variables but not interactions into account, depending ... |

2492 | Bagging predictors
- Breiman
- 1996
Citation Context ...et al., 2007). Support Vector Machines (SVM) (Vapnik, 1995) or ensemble methods based on recursive partitioning belong to the second category. Ensemble methods include for example bagging procedures (Breiman, 1996) applied to microarray data by Dudoit et al. (2002), boosting (Freund and Schapire, 1997) used by Dettling and Bühlmann (2003), BagBoosting (Dettling, 2004) or Breiman’s (2001) random forests exami... |

2308 | A decision-theoretic generalization of online learning and an application to boosting
- Freund, Schapire
- 1997
Citation Context ...ds based on recursive partitioning belong to the second category. Ensemble methods include for example bagging procedures (Breiman, 1996) applied to microarray data by Dudoit et al. (2002), boosting (Freund and Schapire, 1997) used by Dettling and Bühlmann (2003), BagBoosting (Dettling, 2004) or Breiman’s (2001) random forests examined by Diaz-Uriarte and de Andrés (2006) in the context of variable selection for classif... |

2046 |
The Elements of Statistical Learning
- Hastie, Tibshirani, et al.
- 2001
Citation Context ...validation Another option to evaluate prediction accuracy consists of considering all the available observations as test observations successively in a procedure denoted as cross-validation (see, e.g., Hastie et al., 2001). The available observations {1, . . . , n} are divided into m non-overlapping subsets whose indices are given by t(1), . . . , t(m). The cross-validation procedure consists of a succession of... |
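The cross-validation scheme quoted in this context can be sketched as follows. This is a generic illustration, not code from the paper; the nearest-centroid classifier is only a toy stand-in so the sketch runs end to end:

```python
import numpy as np

def cross_validated_error(X, y, m, fit, predict, seed=0):
    """Plain m-fold cross-validation: the indices {0, ..., n-1} are split
    into m non-overlapping subsets t(1), ..., t(m); each subset serves
    exactly once as the test set while the rest is used for training."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), m)
    n_errors = 0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        model = fit(X[train_idx], y[train_idx])
        n_errors += int(np.sum(predict(model, X[test_idx]) != y[test_idx]))
    return n_errors / n  # every observation is predicted exactly once

# Toy stand-in classifier (nearest class centroid), purely illustrative.
def fit_centroid(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_centroid(model, X):
    classes = list(model)
    dists = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes])
    return np.array(classes)[dists.argmin(axis=0)]
```

Because every observation appears in exactly one test fold, the returned rate averages over all n observations, which is the point of the procedure described above.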

1394 | Random forests
- Breiman
- 2001
Citation Context ... but not interactions into account, depending on the considered criterion used to score the variable subsets. The recent method suggested by Diaz-Uriarte and de Andrés (2006) based on random forests (Breiman, 2001) is one of the very few methods taking interactions into account explicitly. Potential pitfalls of multivariate methods are the computational expense, the sensitivity to small changes in the learning... |

1237 |
Statistical Decision Theory and Bayesian Analysis
- Berger
- 1985
Citation Context ...onal issues in statistics, where in the frequentist-Bayesian debate the former usually advocate in favor of the unconditional point of view while Bayesian inference is eo ipso conditional (cf., e.g., Berger, 1980, Section 1.6). Also a look at the corresponding discussion in sampling theory on evaluating post stratification is illuminating in this context (see, e.g., Holt and Smith, 1979, for a classical pa... |

1227 | Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring
- Golub, Slomin, et al.
- 1999
Citation Context ...he term response class refers to the categorical variable that has to be predicted based on gene expression data. It can be, e.g., the presence or absence of disease, a tumor subtype such as ALL/AML (Golub et al., 1999) or the responder status to a therapy (Ghadimi et al., 2005). The number of classes may be higher than two, though binary class prediction is by far the most frequent case in practice. Note that gene... |

1113 |
Pattern recognition and neural networks
- Ripley
- 1996
Citation Context ...cial prior probability for this class, or one has to “tell” the classifier directly that misclassifications of class 0 are more severe by means of specifying higher misclassification costs (cf., e.g., Ripley, 1996). Obviously, such changes in the prior probabilities and costs, which are internally handled as different weights for class 0 and 1 observations, affect sensitivity and specificity. For example, when ... |

728 |
Broad patterns of gene expression revealed by clustering of tumor and normal colon tissues probed by oligonucleotide arrays
- Alon, Barkai, et al.
- 1999
Citation Context ...ticles with strong methodological background (e.g., Natsoulis et al., 2005). Most of these studies include common “benchmark” data sets such as the well-known leukemia (Golub et al., 1999) and colon (Alon et al., 1999) data sets. Table 2 (Appendix B) summarizes the characteristics and results of six published comparison studies, which we took as neutral, because they satisfy the following criteria: • The title incl... |

539 |
The meaning and use of the area under a receiver operating characteristic (ROC) curve
- Hanley, McNeil
- 1982
Citation Context ...ents the better classifier. Confidence bounds for ROC curves can be computed (e.g., Schäfer, 1994). The distance from the diagonal, the so-called area under curve (AUC), is another useful diagnostic (Hanley and McNeil, 1982) and can be estimated via several approaches (e.g., DeLong et al., 1988). The AUC can also be used to compare intersecting ROC curves. After this overview on accuracy measures for the comparison of c... |
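The AUC mentioned in this context has a simple estimator via its Mann-Whitney interpretation, which is the reading Hanley and McNeil (1982) popularized. A minimal sketch (generic illustration, not code from the paper):

```python
import numpy as np

def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the probability that a randomly drawn positive case receives
    a higher classifier score than a randomly drawn negative case,
    counting ties as 1/2 (the Mann-Whitney U interpretation)."""
    sp = np.asarray(scores_pos, dtype=float)[:, None]
    sn = np.asarray(scores_neg, dtype=float)[None, :]
    wins = (sp > sn).sum() + 0.5 * (sp == sn).sum()
    return wins / (sp.size * sn.size)
```

An AUC of 1.0 means perfect separation of the two classes; 0.5 corresponds to the ROC diagonal, i.e. a classifier no better than chance.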

521 | Linear models and empirical bayes methods for assessing differential expression in microarray experiments
- Smyth
Citation Context ...). In the context of differential expression detection, several regularized variants of the standard t-statistic have been proposed in the last few years. They include, e.g., empirical Bayes methods (Smyth, 2004). An overview can be found in Opgen-Rhein and Strimmer (2007). Although these empirical Bayes methods are usually considered as univariate approaches, such methods involve a multivariate component in... |

503 | Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data
- Dudoit, Fridlyand, et al.
- 2002
Citation Context ...reviewed and compared by Dudoit et al. (2002) including linear and quadratic discriminant analysis or Fisher’s linear discriminant analysis, classical logistic regression or k-nearest-neighbors (e.g. Dudoit et al., 2002) which, in principle, could be applied to a high number of variables but performs poorly on noisy data. Many variable selection approaches have been described in the bioinformatics literature. Over... |

398 | Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data - Furey, Cristianini, et al. - 2000 |

362 |
Diagnosis of Multiple Cancer Types by Shrunken Centroids of Gene Expression
- Tibshirani, Hastie, et al.
- 2002
Citation Context ...ne learning community on the other hand. The first category includes, e.g., penalized logistic regression (Zhu, 2004), the Prediction Analysis of Microarrays (PAM) method based on shrunken centroids (Tibshirani et al., 2002) or the more recent regularized linear discriminant analysis (Guo et al., 2007). Support Vector Machines (SVM) (Vapnik, 1995) or ensemble methods based on recursive partitioning belong to the second ... |

272 |
Measuring the accuracy of diagnostic systems
- Swets
- 1988
Citation Context ...he sensitivity and specificity of a classifier are not fixed characteristics, but are influenced by the misclassification cost scheme, the receiver operating characteristic (ROC) approach (cf., e.g., Swets, 1988, for an introduction and application examples) could be borrowed from signal detection, and could be used for comparing classifier performance, incorporating the performance under different cost sche... |

225 | R: A language and environment for statistical computing - R Development Core Team - 2006 |

173 |
The rank analysis of incomplete block designs, i: The method of paired comparisons
- Bradley, Terry
- 1952
Citation Context ...n on the other hand the aim of a benchmark study is a complete ranking of all considered classifiers with respect to any performance measure the Bradley-Terry(-Luce) model for paired comparisons (Bradley and Terry, 1952) or the recent approach of Hornik and Meyer (2007) for consensus rankings are attractive. In addition to the purely descriptive ranking of these approaches statistical inference on the performance di... |

170 | A Leisurely Look at the Bootstrap, the Jackknife, and CrossValidation - Efron, Gong |

127 | Boosting for tumor classification with gene expression data
- Dettling, Buhlmann
- 2003
Citation Context ...ds include for example bagging procedures (Breiman, 1996) applied to microarray data by Dudoit et al. (2002), boosting (Freund and Schapire, 1997) used by Dettling and Bühlmann (2003), BagBoosting (Dettling, 2004) or Breiman’s (2001) random forests examined by Diaz-Uriarte and de Andrés (2006) in the context of variable selection for classification. These methods may be easily applied in the n < p setting, es... |

117 |
Tumor classification by partial least squares using microarray gene expression data
- Nguyen, Rocke
- 2002
Citation Context ...redictors in form of a small number of new components (often linear combinations of the original predictors). Well-known examples are principal component analysis (PCA) or Partial Least Squares (PLS, Nguyen and Rocke, 2002; Boulesteix, 2004; Boulesteix and Strimmer, 2007) and its generalizations (Fort and Lambert-Lacroix, 2005; Ding and Gentleman, 2005). A concise overview of dimension reduction methods that have been ... |

109 | Analysis of representations for domain adaptation
- Ben-David, Blitzer, et al.
- 2007
Citation Context ...etected (Kifer et al., 2004) and modelled accordingly – or in restricted situations it is even possible to formalize conditions under which some performance guarantees can be proven for the test set (Ben-David et al., 2007). When on the other hand the ultimate goal is to find a classifier that is generalizable to all kinds of test sets, including those from different places or points in time, as a consequence we would ... |

107 |
Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach
- DeLong
- 1988
Citation Context ... (e.g., Schäfer, 1994). The distance from the diagonal, the so-called area under curve (AUC), is another useful diagnostic (Hanley and McNeil, 1982) and can be estimated via several approaches (e.g., DeLong et al., 1988). The AUC can also be used to compare intersecting ROC curves. After this overview on accuracy measures for the comparison of classifiers, the next section describes possible sampling strategies for ... |

102 | A model for measurement error for gene expression arrays
- Rocke, Durbin
Citation Context ... fact that due to the many steps involved in the experimental process, from hybridization to image analysis, even in high quality experimental data severe measurement error may be present (see, e.g., Rocke and Durbin, 2001; Tadesse et al., 2005; Purdom and Holmes, 2005). As a consequence, prediction and diagnosis no longer coincide, since prediction is usually still based on the mismeasured variables, while diag... |

101 |
Improvements on cross-validation: the .632+ bootstrap method
- Efron, Tibshirani
- 1997
Citation Context ...stimation methods may depend on whether one considers the conditional or unconditional error rate. For instance when using the unconditional error rate results are somewhat in favor of the bootstrap (Efron and Tibshirani, 1997). Estimating the error rate Suppose we use a learning set Dl to construct the classifier C^M_Dl. The joint distribution function F being unknown, the true conditional error rate Err(C^M_Dl) = E_F(I(Y... |
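The bootstrap error estimation alluded to in this context can be illustrated with Efron's .632 estimator, the simpler precursor of the .632+ rule refined by Efron and Tibshirani (1997). This is a generic sketch, not code from the paper; the majority-class classifier is a placeholder:

```python
import numpy as np

def bootstrap_632_error(X, y, fit, predict, B=50, seed=0):
    """Efron's .632 estimator: a weighted mix of the optimistic
    resubstitution error and the pessimistic out-of-bag bootstrap error."""
    n = len(y)
    rng = np.random.default_rng(seed)
    err_resub = float(np.mean(predict(fit(X, y), X) != y))
    oob_err = np.zeros(n)
    oob_cnt = np.zeros(n)
    for _ in range(B):
        idx = rng.integers(0, n, size=n)            # bootstrap sample (with replacement)
        oob = np.setdiff1d(np.arange(n), idx)       # observations left out of this sample
        model = fit(X[idx], y[idx])
        oob_err[oob] += (predict(model, X[oob]) != y[oob])
        oob_cnt[oob] += 1
    seen = oob_cnt > 0
    err_oob = float(np.mean(oob_err[seen] / oob_cnt[seen]))
    return 0.368 * err_resub + 0.632 * err_oob

# Toy stand-in classifier: always predict the training majority class.
def fit_majority(X, y):
    values, counts = np.unique(y, return_counts=True)
    return values[counts.argmax()]

def predict_majority(model, X):
    return np.full(len(X), model)
```

The 0.632 weight reflects the fact that a bootstrap sample contains on average about 63.2% of the distinct observations, so the out-of-bag error alone would be pessimistically biased.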

94 | Detecting change in data streams
- Kifer, Ben-David, et al.
- 2004
Citation Context ...ssue discussed as “data drift”, “concept drift” or “structural change” in the literature. In this latter case, rather than discarding the classifier, the change in the data stream should be detected (Kifer et al., 2004) and modelled accordingly – or in restricted situations it is even possible to formalize conditions under which some performance guarantees can be proven for the test set (Ben-David et al., 2007). Wh... |

86 | Is cross-validation valid for small-sample microarray classification
- Braga-Neto, Dougherty
- 2004
Citation Context ...ata sets. Contradicting studies have been published on the comparison of CV, MCCV and bootstrap strategies for error rate estimation. The use of CV (Eq. 14) in small sample settings is controversial (Braga-Neto and Dougherty, 2004) because of its high variability compared to MCCV (Eq. 18) or bootstrap sampling (Eq. 19, 20). For instance, in the case of n = 30, each observation accounts for more than 3% in the error rate estimat... |
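The granularity argument quoted in this context is simple arithmetic: with n observations, a cross-validation error estimate can only take the n + 1 values 0/n, 1/n, ..., n/n, so for n = 30 a single misclassified observation shifts the estimate by more than 3%. A one-line illustration:

```python
# With n observations, a CV error estimate moves in steps of 1/n, so the
# estimator is coarse (and hence variable) for small samples.
n = 30
step = 1 / n
possible_values = [k / n for k in range(n + 1)]
print(f"granularity for n = {n}: {step:.1%} per observation")
```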

83 |
A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis
- Statnikov, Aliferis, et al.
- 2005
Citation Context ...ous goals. One goal may be to compare several classification methods from a methodological point of view and explain observed differences (for instance, Dudoit et al., 2002; Romualdi et al., 2003; Statnikov et al., 2005). Medical or biological articles on the other hand are concerned with the performance on future independent data of the best classifier, which should be selected following a strict procedure (typical... |

66 |
New feature subset selection procedures for classification of expression profiles
- Bø, Jonassen
Citation Context ...ltivariate t-statistic). There have also been various proposals regarding the search algorithms. Some methods, which could be denoted as “semi-multivariate”, restrict the search to pairs of variables (Bo and Jonassen, 2002) or subsets of low-correlated and thus presumably non-redundant variables derived from the list of univariately best variables (Jäger et al., 2003). In contrast, other authors seek for globally optim... |

55 | Genetic algorithms applied to multi-class prediction for the analysis of gene expression data - Ooi, Tan - 2003 |

47 | Prediction error estimation: a comparison of resampling methods - Molinaro, Simon, et al. |

44 |
Statistical Regression with Measurement Error
- Cheng, Ness
- 1999
Citation Context ...rstand the material relations between the true variables. While several powerful procedures to correct for measurement error are available for regression models (see, e.g., Wansbeek and Meijer, 2000; Cheng and Ness, 1999; Carroll et al., 2006; Schneeweiß and Augustin, 2006, for surveys considering linear and nonlinear models, respectively), in the classification context well-founded treatment of measurement error is ... |

40 | Classifier technology and the illusion of progress. Statistical Science 21(1):1–14 - Hand |

40 | Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach - Opgen-Rhein, Strimmer |

40 | Nonparametric methods for identifying differentially expressed genes in microarray data - Troyanskaya, Garber, et al. - 2001 |

39 | Improved gene selection for classification of microarrays
- Sengupta, R, et al.
- 2003
Citation Context ...riate” restrict the search to pairs of variables (Bo and Jonassen, 2002) or subsets of low-correlated and thus presumably non-redundant variables derived from the list of univariately best variables (Jäger et al., 2003). In contrast, other authors seek for globally optimal subsets of variables based on sophisticated search algorithms such as genetic algorithms (Goldberg, 1989) applied to microarray data by, e.g., O... |

38 |
Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. J Natl Cancer Inst
- Dupuy, RM
Citation Context ...iewed by Troyanskaya et al. (2002). The t-statistic, the Mann-Whitney statistic and the heuristic signal-to-noise ratio suggested by Golub et al. (1999) are the most widely-used criteria in practice (Dupuy and Simon, 2007). In the context of differential expression detection, several regularized variants of the standard t-statistic have been proposed in the last few years. They include, e.g., empirical Bayes methods (... |

37 | The naive credal classifier - Zaffalon |

30 | Classification Using Partial Least Squares with Penalized Logistic Regression
- Fort, Lambert-Lacroix
- 2005
Citation Context ...redictors). Well-known examples are principal component analysis (PCA) or Partial Least Squares (PLS, Nguyen and Rocke, 2002; Boulesteix, 2004; Boulesteix and Strimmer, 2007) and its generalizations (Fort and Lambert-Lacroix, 2005; Ding and Gentleman, 2005). A concise overview of dimension reduction methods that have been used for classification with microarray data is given in Boulesteix (2006). After dimension reduction, one... |

29 | An extensive comparison of recent classification tools applied to microarray data - Lee, Lee, et al. |

27 |
Microarrays and molecular research : noise discovery. Lancet
- IOANNIDIS
- 2005
Citation Context ...e selection using all n arrays and leaving no separate test set for validation should definitively be banished. Bad practice related to this aspect has probably contributed to much “noise discovery” (Ioannidis, 2005). A further important connection between classifiers and variable selection is the use of classifiers to evaluate the influence of single variables on the response class a posteriori. Parametric mode... |

26 | Partial Least Squares: A Versatile Tool for the Analysis
- Boulesteix, Strimmer
- 2006
Citation Context ... components (often linear combinations of the original predictors). Well-known examples are principal component analysis (PCA) or Partial Least Squares (PLS, Nguyen and Rocke, 2002; Boulesteix, 2004; Boulesteix and Strimmer, 2007) and its generalizations (Fort and Lambert-Lacroix, 2005; Ding and Gentleman, 2005). A concise overview of dimension reduction methods that have been used for classification with microarray data is g... |

25 |
Post stratification
- Holt, Smith
- 1979
Citation Context ...pso conditional (cf., e.g., Berger, 1980, Section 1.6). Also a look at the corresponding discussion in sampling theory on evaluating post stratification is illuminating in this context (see, e.g., Holt and Smith, 1979, for a classical paper). In this article, we arbitrarily use the notation ε̂ for all the estimators, which refers to the unconditional error rate. However, the reviewed estimators can also be seen as... |

22 | Classification Using Generalized Partial Least Squares
- Ding, Gentleman
- 2004
Citation Context ...are principal component analysis (PCA) or Partial Least Squares (PLS, Nguyen and Rocke, 2002; Boulesteix, 2004; Boulesteix and Strimmer, 2007) and its generalizations (Fort and Lambert-Lacroix, 2005; Ding and Gentleman, 2005). A concise overview of dimension reduction methods that have been used for classification with microarray data is given in Boulesteix (2006). After dimension reduction, one can basically apply any c... |

22 |
Classification of a large microarray data set: algorithm comparison and analysis of drug signatures. Genome Research
- Natsoulis, Ghaoui, et al.
- 2005
Citation Context ...3); Man et al. (2004); Lee et al. (2005); Statnikov et al. (2005). Comparison of different classification methods can also be found in biological articles with strong methodological background (e.g., Natsoulis et al., 2005). Most of these studies include common “benchmark” data sets such as the well-known leukemia (Golub et al., 1999) and colon (Alon et al., 1999) data sets. Table 2 (Appendix B) summarizes the character... |

22 |
Probabilistic Prediction in Patient-Management and ClinicalTrials
- Spiegelhalter
- 1986
Citation Context ...ions of recursive partitioning, the difference between the predicted class probability and the true class membership can be computed, e.g., by the Brier Score (i.e. the quadratic distance, see, e.g., Spiegelhalter, 1986, for an introduction). Since the theoretical joint distribution F is always unknown in real data analysis, the classifier has to be estimated from an available data set. Moreover, once a classifier i... |
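The Brier score mentioned in this context is just the mean quadratic distance between predicted class probabilities and the 0/1 outcomes. A minimal sketch for the binary case (generic illustration, not code from the paper):

```python
import numpy as np

def brier_score(prob_class1, y_true):
    """Mean squared distance between the predicted probability of class 1
    and the observed 0/1 outcome: 0 is perfect, and a constant 0.5
    forecast scores 0.25 regardless of the outcomes."""
    p = np.asarray(prob_class1, dtype=float)
    y = np.asarray(y_true, dtype=float)
    return float(np.mean((p - y) ** 2))
```

Unlike the error rate, the Brier score rewards well-calibrated probabilities rather than just correct hard class assignments.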

22 |
Bias in random forest variable importance measures: Illustrations, sources and a solution
- Strobl, Boulesteix, et al.
- 2007
Citation Context ...surement or their number of categories, as, e.g., when both genetic and clinical covariates are considered, the computation of the variable importance can be biased and must be performed differently (Strobl et al., 2007). 3 Measures of Classification Accuracy We have seen in the previous section that in large-scale association studies classification can either be conducted with previous variable selection, dimension... |

20 |
Regularized linear discriminant analysis and its application in microarrays
- GUO, HASTIE, et al.
- 2007
Citation Context ...logistic regression (Zhu, 2004), the Prediction Analysis of Microarrays (PAM) method based on shrunken centroids (Tibshirani et al., 2002) or the more recent regularized linear discriminant analysis (Guo et al., 2007). Support Vector Machines (SVM) (Vapnik, 1995) or ensemble methods based on recursive partitioning belong to the second category. Ensemble methods include for example bagging procedures (Breiman, 199... |

18 | Reliable diagnoses of dementia by the naive credal classifier inferred from incomplete cognitive data. Artificial Intelligence in Medicine 29(12): 61-79
- Zaffalon, Wesnes
- 2003
Citation Context ...called “credal classification”, where a subset of possible classes for each configuration of predictor variables is returned when there is not enough information to predict one single class (see also Zaffalon et al., 2003, for an application to dementia diagnosis). 4 Evaluation Strategies For simplicity, we assume in the following that the error rate is used as an accuracy measure, but the same principles hold for oth... |

17 | Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data - Jeffery, Higgins, et al. - 2006 |