Results 1 - 10 of 33
Prediction by supervised principal components
- Journal of the American Statistical Association, 2006
"... In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal co ..."
Abstract
-
Cited by 98 (9 self)
In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors selected based on their association with the outcome. Supervised principal components can be applied to regression and generalized regression problems, such as survival analysis. It compares favorably to other techniques for this type of problem, and can also account for the effects of other covariates and help identify which predictor variables are most important. We also provide asymptotic consistency results to help support our empirical findings. These methods could become important tools for DNA microarray data, where they may be used to more accurately diagnose and treat cancer. KEY WORDS: Gene expression; Microarray; Regression; Survival analysis.
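As a rough illustration of the recipe just described, here is a minimal base-R sketch on simulated data; the correlation screen, the 95th-percentile threshold, and the use of a single component are illustrative assumptions rather than the authors' reference implementation.

```r
set.seed(1)
n <- 100; p <- 1000
X <- matrix(rnorm(n * p), n, p)
y <- drop(X[, 1:20] %*% rep(0.5, 20)) + rnorm(n)  # only 20 predictors carry signal

## Step 1: score every predictor by its univariate association with the outcome.
score <- drop(abs(cor(X, y)))

## Step 2: retain the predictors whose score exceeds a threshold
## (in practice the threshold is chosen by cross-validation).
keep <- score > quantile(score, 0.95)

## Step 3: ordinary PCA on the retained block, then regress the outcome on the
## leading "supervised" principal component.
pc  <- prcomp(X[, keep], scale. = TRUE)
fit <- lm(y ~ pc$x[, 1])
summary(fit)$r.squared
```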
Prediction error estimation: a comparison of resampling methods
- Bioinformatics
"... In genomic studies, thousands of features are collected on relatively few samples. One of the goals of these studies is to build classifiers to predict the outcome of future observations. There are three inherent steps to this process: feature selection, model selection, and prediction assessment. W ..."
Abstract
-
Cited by 84 (12 self)
In genomic studies, thousands of features are collected on relatively few samples. One of the goals of these studies is to build classifiers to predict the outcome of future observations. There are three inherent steps to this process: feature selection, model selection, and prediction assessment. With a focus on prediction assessment, we compare several methods for estimating the 'true' prediction error of a prediction model in the presence of feature selection. For small studies where features are selected from thousands of candidates, the resubstitution and simple split-sample estimates are seriously biased. In these small samples, leave-one-out cross-validation (LOOCV), 10-fold cross-validation (CV), and the .632+ bootstrap have the smallest bias for diagonal discriminant analysis, nearest neighbor, and classification trees. LOOCV and 10-fold CV have the smallest bias for linear discriminant analysis. Additionally, LOOCV, 5- and 10-fold CV, and the .632+ bootstrap have the lowest mean square error. The .632+ bootstrap is quite biased in small sample sizes with strong signal-to-noise ratios. The differences in performance among resampling methods are reduced as the number of specimens available increases. Supplementary Information: R code for simulations and analyses is available from the authors. Tables and figures for all analyses are available at
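The practical point is that the feature-selection step has to be repeated inside every resampling loop. The small base-R sketch below makes the bias visible on pure-noise data, where the true error rate is 50%; the t-statistic filter and nearest-centroid classifier are illustrative stand-ins for the classifiers compared in the paper.

```r
set.seed(1)
n <- 50; p <- 2000
X <- matrix(rnorm(n * p), n, p)
y <- rep(c(0, 1), each = n / 2)              # no real signal: true error is 50%

top_features <- function(X, y, m = 10) {
  tstat <- apply(X, 2, function(x) abs(t.test(x[y == 1], x[y == 0])$statistic))
  order(tstat, decreasing = TRUE)[1:m]
}
centroid_error <- function(Xtr, ytr, Xte, yte) {
  mu0 <- colMeans(Xtr[ytr == 0, , drop = FALSE])
  mu1 <- colMeans(Xtr[ytr == 1, , drop = FALSE])
  d0 <- rowSums(sweep(Xte, 2, mu0)^2); d1 <- rowSums(sweep(Xte, 2, mu1)^2)
  mean(as.integer(d1 < d0) != yte)
}

folds <- sample(rep(1:10, length.out = n))   # 10-fold CV assignment
pre <- top_features(X, y)                    # WRONG: features chosen before CV
err_wrong <- err_right <- numeric(10)
for (k in 1:10) {
  tr <- folds != k
  err_wrong[k] <- centroid_error(X[tr, pre], y[tr], X[!tr, pre, drop = FALSE], y[!tr])
  sel <- top_features(X[tr, ], y[tr])        # RIGHT: features re-selected per fold
  err_right[k] <- centroid_error(X[tr, sel], y[tr], X[!tr, sel, drop = FALSE], y[!tr])
}
c(biased = mean(err_wrong), honest = mean(err_right))  # first is optimistically low, second near 0.5
```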
High dimensional classification using features annealed independence rules
- Ann. Statist., 2008
"... ABSTRACT. Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel an ..."
Abstract
-
Cited by 79 (19 self)
Classification using high-dimensional features arises frequently in many contemporary statistical studies, such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra, and they propose the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. It is therefore paramount to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently the threshold value of the test statistic, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
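The following base-R sketch is a rough, simplified rendering of the FAIR idea on simulated data: rank features by the two-sample t-statistic and apply the independence (diagonal) rule to the top m features only. The fixed choice m = 20 is an illustrative assumption; the paper derives a data-driven choice of m from an upper bound on the classification error.

```r
set.seed(2)
n <- 80; p <- 5000
X <- matrix(rnorm(n * p), n, p)
cls <- rep(c(0, 1), each = n / 2)
X[cls == 1, 1:10] <- X[cls == 1, 1:10] + 1   # only 10 features separate the classes

## rank all features by the absolute two-sample t-statistic and keep the top m
tstat <- apply(X, 2, function(x) abs(t.test(x[cls == 1], x[cls == 0])$statistic))
sel <- order(tstat, decreasing = TRUE)[1:20]

## independence rule on the selected features: assign to the nearer class
## centroid, weighting each coordinate by its pooled within-class variance
mu0 <- colMeans(X[cls == 0, sel]); mu1 <- colMeans(X[cls == 1, sel])
s2  <- (apply(X[cls == 0, sel], 2, var) + apply(X[cls == 1, sel], 2, var)) / 2
fair_rule <- function(xnew) as.integer(sum((xnew - mu1)^2 / s2) < sum((xnew - mu0)^2 / s2))
fair_rule(X[n, sel])   # classify one observation (in-sample, purely for illustration)
```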
A selective overview of variable selection in high dimensional feature space
2010
"... High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded ..."
Abstract
-
Cited by 70 (6 self)
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality, and they have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of recent developments in the theory, methods, and implementations of high dimensional variable selection. Questions about the limits of dimensionality such methods can handle, the role of penalty functions, and the resulting statistical properties rapidly drive advances in the field. The properties of non-concave penalized likelihood and its role in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
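As a hedged illustration of the two-scale strategy mentioned at the end of the abstract, the sketch below combines marginal correlation screening with the lasso from the glmnet package on simulated data; the lasso stands in for the non-concave penalties (such as SCAD) that the article emphasizes, and the screening size d is a conventional rather than prescribed choice.

```r
library(glmnet)
set.seed(3)
n <- 100; p <- 10000
X <- matrix(rnorm(n * p), n, p)
beta <- c(rep(2, 5), rep(0, p - 5))          # only 5 truly active predictors
y <- drop(X %*% beta + rnorm(n))

## Scale 1: independence screening keeps the d predictors with the largest
## marginal correlation with the response.
d <- floor(n / log(n))
keep <- order(abs(drop(cor(X, y))), decreasing = TRUE)[1:d]

## Scale 2: penalized regression (here the lasso) on the screened predictors,
## with the penalty level chosen by cross-validation.
fit <- cv.glmnet(X[, keep], y)
selected <- keep[as.numeric(coef(fit, s = "lambda.min"))[-1] != 0]
selected   # indices of the predictors retained after screening and penalization
```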
Sufficient dimension reduction via inverse regression: a minimum discrepancy approach
- J. Am. Stat. Assoc., 2005
"... A family of dimension-reduction methods, the inverse regression (IR) family, is developed by minimizing a quadratic objective function. An optimal member of this family, the inverse regression estimator (IRE), is proposed, along with inference methods and a computational algorithm. The IRE has at le ..."
Abstract
-
Cited by 56 (8 self)
A family of dimension-reduction methods, the inverse regression (IR) family, is developed by minimizing a quadratic objective function. An optimal member of this family, the inverse regression estimator (IRE), is proposed, along with inference methods and a computational algorithm. The IRE has at least three desirable properties: (1) Its estimated basis of the central dimension reduction subspace is asymptotically efficient, (2) its test statistic for dimension has an asymptotic chi-squared distribution, and (3) it provides a chi-squared test of the conditional independence hypothesis that the response is independent of a selected subset of predictors given the remaining predictors. Current methods like sliced inverse regression belong to a suboptimal class of the IR family. Comparisons of these methods are reported through simulation studies. The approach developed here also allows a relatively straightforward derivation of the asymptotic null distribution of the test statistic for dimension used in sliced average variance estimation. KEY WORDS: Inverse regression estimator; Sliced average variance estimation; Sliced inverse regression; Sufficient dimension reduction.
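For readers unfamiliar with the IR family, the following base-R sketch implements plain sliced inverse regression (SIR), which the abstract identifies as a suboptimal member of that family; the optimal IRE requires the paper's minimum-discrepancy machinery and is not reproduced here. The simulated single-index model and the choice of ten slices are illustrative.

```r
set.seed(4)
n <- 400; p <- 10
X <- matrix(rnorm(n * p), n, p)
lp <- drop(X %*% c(1, 1, rep(0, p - 2)))
y  <- lp + 0.2 * lp^3 + rnorm(n)              # monotone single-index model

H <- 10                                       # number of slices of the response
slice <- cut(rank(y, ties.method = "first"), H, labels = FALSE)
Sig <- cov(X)
eS  <- eigen(Sig)
Sig_inv_sqrt <- eS$vectors %*% diag(1 / sqrt(eS$values)) %*% t(eS$vectors)
Z <- scale(X, center = TRUE, scale = FALSE) %*% Sig_inv_sqrt  # standardized predictors

## weighted covariance of the within-slice means of the standardized predictors
M <- matrix(0, p, p)
for (h in 1:H) {
  idx <- slice == h
  M <- M + mean(idx) * tcrossprod(colMeans(Z[idx, , drop = FALSE]))
}
dir <- Sig_inv_sqrt %*% eigen(M)$vectors[, 1]
round(drop(dir) / sqrt(sum(dir^2)), 2)        # proportional, up to sign, to (1, 1, 0, ..., 0)
```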
Fisher lecture: Dimension reduction in regression
- Statist. Sci., 2007
"... Abstract. Beginning with a discussion of R. A. Fisher’s early written remarks that relate to dimension reduction, this article revisits principal components as a reductive method in regression, develops several model-based extensions and ends with descriptions of general approaches to model-based an ..."
Abstract
-
Cited by 54 (4 self)
Beginning with a discussion of R. A. Fisher’s early written remarks that relate to dimension reduction, this article revisits principal components as a reductive method in regression, develops several model-based extensions and ends with descriptions of general approaches to model-based and model-free dimension reduction in regression. It is argued that the role for principal components and related methodology may be broader than previously seen and that the common practice of conditioning on observed values of the predictors may unnecessarily limit the choice of regression methodology. Key words and phrases: Central subspace, Grassmann manifolds, inverse regression, minimum average variance estimation, principal components, principal fitted components, sliced inverse regression, sufficient dimension reduction.
Dimension reduction methods for microarrays with application to censored survival data
- Bioinformatics, 2004
"... Motivation: Recent research has shown that gene expression profiles can potentially be used for predicting various clinical phenotypes, such as tumor class, drug response and survival time. While there has been extensive studies on tumor clas-sification, there has been less emphasis on other phenoty ..."
Abstract
-
Cited by 30 (1 self)
Motivation: Recent research has shown that gene expression profiles can potentially be used for predicting various clinical phenotypes, such as tumor class, drug response and survival time. While there have been extensive studies on tumor classification, there has been less emphasis on other phenotypic features, in particular patient survival time or time to cancer recurrence, which are subject to right censoring. We consider in this paper an analysis of censored survival time based on microarray gene expression profiles. Results: We propose a dimension reduction strategy, which combines principal components analysis and sliced inverse regression, to identify linear combinations of genes that both account for the variability in the gene expression levels and preserve the phenotypic information. The extracted gene combinations are then employed as covariates in a predictive survival model formulation. We apply the proposed method to a large diffuse large-B-cell lymphoma dataset, which consists of 240 patients and 7399 genes, and build a Cox proportional hazards model based on the derived gene expression components. The proposed method is shown to provide good predictive performance for patient survival, as demonstrated by both the significant survival difference between the predicted risk groups and receiver operating characteristic analysis. Availability: R programs are available upon request from the authors.
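A much-simplified sketch of this two-stage pipeline, on simulated data with the survival package, is given below: reduce the expression matrix by PCA and fit a Cox proportional hazards model on the leading components. The sliced-inverse-regression refinement that the paper inserts between the two stages, and its handling of censoring within that step, are omitted here.

```r
library(survival)
set.seed(5)
n <- 240; p <- 2000
X <- matrix(rnorm(n * p), n, p)              # stand-in for gene expression profiles
risk  <- drop(X[, 1:50] %*% rep(0.05, 50))   # a few genes drive the hazard
etime <- rexp(n, rate = exp(risk))           # event times
ctime <- rexp(n, rate = 0.2)                 # independent censoring times
time   <- pmin(etime, ctime)
status <- as.integer(etime <= ctime)         # 1 = event observed, 0 = censored

pcs <- prcomp(X, scale. = TRUE)$x[, 1:5]     # leading principal components
fit <- coxph(Surv(time, status) ~ pcs)       # Cox model on the derived components
summary(fit)$coefficients
```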
Principal fitted components for dimension reduction in regression
- Statistical Science
"... Abstract. We provide a remedy for two concerns that have dogged the use of principal components in regression: (i) principal components are computed from the predictors alone and do not make apparent use of the response, and (ii) principal components are not invariant or equivariant under full rank ..."
Abstract
-
Cited by 17 (6 self)
We provide a remedy for two concerns that have dogged the use of principal components in regression: (i) principal components are computed from the predictors alone and do not make apparent use of the response, and (ii) principal components are not invariant or equivariant under full rank linear transformation of the predictors. The development begins with principal fitted components [Cook, R. D. (2007). Fisher lecture: Dimension reduction in regression (with discussion). Statist. Sci. 22 1–26] and uses normal models for the inverse regression of the predictors on the response to gain reductive information for the forward regression of interest. This approach includes methodology for testing hypotheses about the number of components and about conditional independencies among the predictors. Key words and phrases: Central subspace, dimension reduction, inverse regression, principal components.
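The base-R sketch below is a hedged rendering of the principal fitted components idea in its simplest (isotropic error) form: regress the centered predictors on a few basis functions of the response and take the leading eigenvectors of the covariance of the fitted values as the reduction. The cubic polynomial basis f(y) = (y, y^2, y^3) is an illustrative choice, not the paper's recommendation.

```r
set.seed(6)
n <- 300; p <- 8
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(1, -1, rep(0, p - 2))) + rnorm(n)

Fy <- scale(cbind(y, y^2, y^3), center = TRUE, scale = FALSE)  # centered basis of y
Xc <- scale(X, center = TRUE, scale = FALSE)                   # centered predictors
B  <- solve(crossprod(Fy), crossprod(Fy, Xc))   # OLS of each predictor on f(y)
Sigma_fit <- crossprod(Fy %*% B) / n            # covariance of the fitted values
dir <- eigen(Sigma_fit)$vectors[, 1]
round(dir / sqrt(sum(dir^2)), 2)                # close, up to sign, to (1, -1, 0, ..., 0)/sqrt(2)
```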
Likelihood-based sufficient dimension reduction
- J. Amer. Statist. Assoc., 2009
"... We obtain the maximum likelihood estimator of the central subspace under conditional normality of the predictors given the response. Analytically and in simulations we found that our new estimator can preform much better than sliced inverse regression, sliced average variance estimation and directio ..."
Abstract
-
Cited by 12 (4 self)
We obtain the maximum likelihood estimator of the central subspace under conditional normality of the predictors given the response. Analytically and in simulations we found that our new estimator can perform much better than sliced inverse regression, sliced average variance estimation and directional regression, and that it seems quite robust to deviations from normality.
The Centrality of
1992
"... This Article is brought to you for free and open access by the Biochemistry, Department of at DigitalCommons@University of Nebraska- Lincoln. It ..."
Abstract
-
Cited by 10 (4 self)