Results 1 - 10 of 85
Consensus clustering -- A resampling-based method for class discovery and visualization of gene expression microarray data
Machine Learning, Functional Genomics Special Issue, 2003
Prediction by supervised principal components
Journal of the American Statistical Association, 2006
Cited by 36 (5 self)
In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors selected based on their association with the outcome. Supervised principal components can be applied to regression and generalized regression problems, such as survival analysis. It compares favorably to other techniques for this type of problem, and can also account for the effects of other covariates and help identify which predictor variables are most important. We also provide asymptotic consistency results to help support our empirical findings. These methods could become important tools for DNA microarray data, where they may be used to more accurately diagnose and treat cancer. KEY WORDS: Gene expression; Microarray; Regression; Survival analysis.
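As a rough illustration of the procedure this abstract describes (not the authors' implementation), the following sketch screens predictors by their association with the outcome, takes the first principal component of the retained predictors, and regresses the outcome on that component. The use of absolute Pearson correlation as the association score, the 0.5 threshold, the toy data, and all function names are assumptions made for this example.

```python
import math


def center(col):
    """Subtract the mean from a list of numbers."""
    m = sum(col) / len(col)
    return [v - m for v in col]


def abs_corr(a, b):
    """Absolute Pearson correlation of two already-centered vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return abs(num / den) if den > 0 else 0.0


def first_pc(cols, iters=200):
    """Leading principal direction of centered columns, by power iteration."""
    p, n = len(cols), len(cols[0])
    w = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        # s = X w, then w <- X^T s, renormalized to unit length
        s = [sum(cols[j][i] * w[j] for j in range(p)) for i in range(n)]
        w = [sum(cols[j][i] * s[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w


def supervised_pc(X_cols, y, threshold):
    """Screen predictors, extract one component, regress y on it.

    X_cols is a list of predictor columns; returns the indices kept,
    the component loadings, and the least-squares slope of y on the scores.
    """
    yc = center(y)
    cols = [center(c) for c in X_cols]
    # Step 1: keep predictors whose association with y exceeds the threshold
    keep = [j for j, c in enumerate(cols) if abs_corr(c, yc) > threshold]
    if not keep:
        raise ValueError("no predictor passes the threshold")
    sel = [cols[j] for j in keep]
    # Step 2: first principal component of the reduced predictor set
    w = first_pc(sel)
    scores = [sum(sel[k][i] * w[k] for k in range(len(sel)))
              for i in range(len(y))]
    # Step 3: simple linear regression of centered y on the component scores
    slope = sum(s * t for s, t in zip(scores, yc)) / sum(s * s for s in scores)
    return keep, w, slope


# Toy data: two predictors track the outcome, two are unrelated to it.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
X_cols = [
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 9.0],
    [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0],
    [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0],
]
keep, w, slope = supervised_pc(X_cols, y, threshold=0.5)
print("kept predictors:", keep)
```

In practice the threshold would presumably be chosen by cross-validation rather than fixed in advance; the fixed value here only keeps the example short.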
Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma
Cancer Res, 2002
Cited by 35 (1 self)
The pathological distinction between malignant pleural mesothelioma (MPM) and adenocarcinoma (ADCA) of the lung can be cumbersome using established methods. We propose that a simple technique, based on the expression levels of a small number of genes, can be useful in the early and accurate diagnosis of MPM and lung cancer. This method is designed to accurately distinguish between genetically disparate tissues using gene expression ratios and rationally chosen thresholds. Here we have tested the fidelity of ratio-based diagnosis in differentiating between MPM and lung cancer in 181 tissue samples (31 MPM and 150 ADCA). A training set of 32 samples (16 MPM and 16 ADCA) was used to identify pairs of genes with highly significant, inversely correlated expression levels to form a total of 15 diagnostic ratios using expression profiling data. Any single ratio of the 15 examined was at least 90% accurate in predicting diagnosis for the remaining 149 samples (i.e., the test set). We then examined (in the test
Stability selection
Cited by 18 (2 self)
Bayesian model averaging: development of an improved multi-class, gene selection and classification tool for microarray data
2005
Partial least squares: A versatile tool for the analysis of high-dimensional genomic data
Briefings in Bioinformatics, 2007
Cited by 15 (5 self)
Partial Least Squares (PLS) is a highly efficient statistical regression technique that is well suited for the analysis of high-dimensional genomic data. In this paper we review the theory and applications of PLS both under methodological and biological points of view. Focusing on microarray expression data we provide a systematic comparison of the PLS approaches currently employed, and discuss problems as different as tumor classification, identification of relevant genes, survival analysis and modeling of gene networks.
Estimating dataset size requirements for classifying DNA Microarray data
Journal of Computational Biology, 2003
"... microarray analysis, sample size estimation. ..."
Optimal gene expression analysis by microarrays
Cancer Cell, 2002
Cited by 11 (2 self)
DNA microarrays make possible the rapid and comprehensive assessment of the transcriptional activity of a cell, and as such have proven valuable in assessing the molecular contributors to biological processes and in the classification of human cancers. The major challenge in using this technology is the analysis of its massive data output, which requires computational means for interpretation and a heightened need for quality data. The optimal analysis requires an accounting and control of the many sources of variance within the system, an understanding of the limitations of the statistical approaches, and the ability to make sense of the results through intelligent database interrogation.
Expression array technology
Expression genomics is an approach that examines gene expression in a comprehensive and massively parallel fashion. The core technology in expression genomics is microarrays, whereby thousands of DNA probes are immobilized on a solid surface and hybridized against fluorophore-labeled cDNA or cRNA targets from template RNA sources. The two major platforms for microarrays are spotted arrays, where the probes are mechanically deposited onto modified glass slides by contact or
Statistical challenges in functional genomics
Statist. Sci., 2003
Cited by 8 (0 self)
On February 12, 2001 the Human Genome Project announced that it had assembled a draft physical map of the human genome, the genetic blueprint for a human being. Now the challenge is to annotate this map, by understanding the functions of genes and their interplay with proteins and the environment to create complex, dynamic living systems. This is the goal of functional genomics. Recent technological advances enable biomedical investigators to observe the genome of entire organisms in action by simultaneously measuring the level of activation of thousands of genes under the same experimental conditions. This technology, known as microarrays, today provides unparalleled discovery opportunities and is reshaping the biomedical sciences. One of the main aspects of this revolution is the introduction of heavily quantitative data-analytical methods in biomedical research. This paper reviews the foundations of this technology
Robust sparse hyperplane classifiers: application to uncertain molecular profiling data
Journal of Computational Biology, 2004
Cited by 8 (1 self)
Key words: robust sparse hyperplanes; second-order cone program; linear programming; breast cancer; molecular profiling; two-class high-dimensional data

