Results 11–20 of 33
Localized sliced inverse regression, 2008
Abstract

Cited by 8 (4 self)
We develop an extension of sliced inverse regression (SIR) that we call localized sliced inverse regression (LSIR). This method allows for supervised dimension reduction by projection onto a linear subspace that captures the nonlinear subspace relevant to predicting the response. The method is also extended to the semisupervised setting where one is given labeled and unlabeled data. We introduce a simple algorithm that implements this method and illustrate its utility on real and simulated data.
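The abstract does not spell out the LSIR algorithm, but a minimal sketch of the plain SIR estimator it extends (slice the sorted response, average the whitened predictors within each slice, and eigendecompose the covariance of those slice means) might look like the following; all names and defaults are illustrative, not the authors' code:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=1):
    """Sliced inverse regression: eigenvectors of the covariance of
    slice-wise means of the whitened predictors, mapped back to the
    original predictor scale."""
    n, p = X.shape
    mu = X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(X, rowvar=False))
    Z = np.linalg.solve(L, (X - mu).T).T          # whitened predictors
    # Slice the observations by sorted response into equal-count bins
    slices = np.array_split(np.argsort(y), n_slices)
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)      # weighted cov of slice means
    vals, vecs = np.linalg.eigh(M)                # ascending eigenvalues
    B = np.linalg.solve(L.T, vecs[:, ::-1][:, :d])  # top-d, back-transformed
    return B / np.linalg.norm(B, axis=0)
```

How LSIR localizes this construction, and how the semi-supervised variant uses unlabeled data, is specified in the paper itself.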
Graphical tools for quadratic discriminant analysis
Technometrics, 2007
Abstract

Cited by 4 (1 self)
Sufficient dimension reduction methods provide effective ways to visualize discriminant analysis problems. For example, Cook and Yin (2001) showed that the dimension reduction method of sliced average variance estimation (save) identifies variates that are equivalent to a quadratic discriminant analysis (qda) solution. This article makes this connection explicit to motivate the use of save variates in exploratory graphics for discriminant analysis. Classification can then be based on the save variates using a suitable distance measure. If the chosen measure is Mahalanobis distance, then classification is identical to qda using the original variables. Just as canonical variates provide a useful way to visualize linear discriminant analysis (lda), so save variates help to visualize qda; this would appear to be particularly useful given the lack of graphical tools for qda in current software. Furthermore, while lda and qda can be sensitive to nonnormality, save is more robust. Key words: canonical variates, classification, dimension reduction, linear discriminant analysis (lda), quadratic discriminant analysis (qda), sliced average variance estimation (save).
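As a companion to the abstract above, here is a rough sketch of how save variates might be computed for a categorical response (whitened predictors, class-wise covariances); the names and normalization are illustrative assumptions, not the article's code:

```python
import numpy as np

def save_directions(X, labels, d=2):
    """Sliced average variance estimation with a categorical response:
    eigenvectors of sum_k w_k (I - Cov(Z | class k))^2 computed on
    whitened predictors Z, then mapped back to the original scale."""
    n, p = X.shape
    mu = X.mean(axis=0)
    L = np.linalg.cholesky(np.cov(X, rowvar=False))
    Z = np.linalg.solve(L, (X - mu).T).T          # whitened predictors
    M = np.zeros((p, p))
    for k in np.unique(labels):
        Zk = Z[labels == k]
        D = np.eye(p) - np.cov(Zk, rowvar=False)  # deviation from identity
        M += (len(Zk) / n) * (D @ D)
    vals, vecs = np.linalg.eigh(M)
    return np.linalg.solve(L.T, vecs[:, ::-1][:, :d])  # top-d save variates
```

Per the article's result, classifying by Mahalanobis distance on the save variates then reproduces qda in the original variables.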
A simple method for finding molecular signatures from gene expression data, 2004
Abstract

Cited by 2 (0 self)
Background: “Molecular signatures” or “gene-expression signatures” are used to model patients’ clinically relevant information (e.g., prognosis, survival time) using expression data from co-expressed genes. Signatures are a key feature in cancer research because they can provide insight into biological mechanisms and have potential diagnostic use. However, available methods to search for signatures fail to address key requirements of signatures and signature components, especially the discovery of tightly co-expressed sets of genes. Results: We suggest a method with good predictive performance that follows from the biologically relevant features of signatures. After identifying a seed gene with good predictive abilities, we search for a group of genes that is highly correlated with the seed gene, shows tight co-expression, and has good predictive abilities; this set of genes is reduced to a signature component using Principal Components Analysis. The process is repeated until no further component is found. We show that the suggested method can recover signatures present in the data, and has predictive performance comparable to state-of-the-art methods. The code (R with C++) is freely available under the GNU GPL license. Conclusions: Our method is unique because it returns signature components that fulfill what are understood as biologically relevant features of signatures. Moreover, it can help identify cases where the data are inconsistent with the assumptions underlying the existence of a few, easily interpretable, signature components of co-expressed genes.
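The search for one signature component, as the abstract describes it (seed gene, tightly correlated companions, PCA summary), could be sketched as follows; the seed selection and predictive screening steps are omitted, and the correlation cutoff `r_min` is a hypothetical parameter, so this is not the authors' exact algorithm:

```python
import numpy as np

def signature_component(expr, seed_idx, r_min=0.8):
    """Collect genes tightly correlated with a seed gene and summarize
    them by their first principal component (one signature component).
    expr: samples x genes expression matrix."""
    R = np.corrcoef(expr, rowvar=False)               # gene-gene correlations
    members = np.where(np.abs(R[seed_idx]) >= r_min)[0]
    sub = expr[:, members]
    sub = sub - sub.mean(axis=0)                      # center before PCA
    U, S, Vt = np.linalg.svd(sub, full_matrices=False)
    return members, U[:, 0] * S[0]                    # PC1 scores per sample
```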
Invited Review Article: A Selective Overview of Variable Selection in High Dimensional Feature Space, 2009
Comment on “Dimension Reduction for Classification with Gene Expression Microarray Data” that appeared in Statistical Applications in Genetics and Molecular Biology
Abstract
Anne-Laure Boulesteix. This note is a comment on the article “Dimension Reduction for Classification with Gene Expression Microarray Data”.
Big-data Feature Screening Using Bregman Divergence
Abstract
Modern biomedical data usually are big: they have high dimensions and/or massive volumes as a result of fast advancement in sensing and information technologies. Many components of the data, however, may not play any important role for the tasks of interest. Reducing the dimensionality of the data turns out to be critical for obtaining accurate results and good generalization capability. This paper introduces an approach for using Bregman divergence to describe the discriminating capability of a subset of features. An important advantage of this method is that the nonlinear relationships existing in the data can be exploited with the Bregman divergence. Moreover, the optimization of the proposed discrimination measure turns out to be particularly simple, which leads to a fast algorithm.
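The abstract does not give the screening criterion itself, but the Bregman divergence it is built on has a standard definition, D_φ(x, y) = φ(x) − φ(y) − ⟨∇φ(y), x − y⟩, which can be coded generically; the two generators below recover squared Euclidean distance and KL divergence:

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

# phi(v) = ||v||^2 recovers squared Euclidean distance
sq = lambda v: np.dot(v, v)
sq_grad = lambda v: 2 * v

# phi(p) = sum p log p (negative entropy) recovers KL divergence
# for probability vectors p, q (same total mass)
ne = lambda p: np.sum(p * np.log(p))
ne_grad = lambda p: np.log(p) + 1
```

How the divergence is aggregated into a per-subset discrimination score is the paper's contribution and is not reproduced here.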
High-Dimensional Classification Using Features Annealed Independence Rules
© Institute of Mathematical Statistics, 2008
Abstract
Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. In a seminal paper, Bickel and Levina [Bernoulli 10 (2004) 989–1010] show that the Fisher discriminant performs poorly due to diverging spectra, and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as poor as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as poorly as random guessing. Thus, it is important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
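A toy sketch of the FAIR recipe described above (rank features by two-sample t-statistics, keep the top m, classify with the diagonal-covariance independence rule) might look like this; choosing m from the paper's upper bound on classification error is omitted, so m is fixed by hand here:

```python
import numpy as np

def fair_classifier(X, y, m):
    """Rank features by two-sample t-statistics, retain the top m, and
    classify new points with the independence (diagonal-covariance) rule."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    se = np.sqrt(X0.var(axis=0, ddof=1) / n0 + X1.var(axis=0, ddof=1) / n1)
    keep = np.argsort(-np.abs(diff / se))[:m]      # top-m |t| features
    mu0, mu1 = X0[:, keep].mean(axis=0), X1[:, keep].mean(axis=0)
    s2 = ((n0 - 1) * X0[:, keep].var(axis=0, ddof=1)
          + (n1 - 1) * X1[:, keep].var(axis=0, ddof=1)) / (n0 + n1 - 2)

    def predict(Xnew):
        # Independence rule: nearest centroid under per-feature scaling
        d0 = ((Xnew[:, keep] - mu0) ** 2 / s2).sum(axis=1)
        d1 = ((Xnew[:, keep] - mu1) ** 2 / s2).sum(axis=1)
        return (d1 < d0).astype(int)

    return keep, predict
```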
University Joseph Fourier of Grenoble, LMC-IMAG
Abstract
The statistical problem of estimating the effective dimension-reduction (EDR) subspace in the multi-index regression model with deterministic design and additive noise is considered. A new procedure for recovering the directions of the EDR subspace is proposed. Many methods for estimating the EDR subspace perform principal component analysis on a family of vectors, say β̂_1, …, β̂_L, nearly lying in the EDR subspace. This is in particular the case for the structure-adaptive approach proposed by Hristache et al. (2001a). In the present work, we propose to estimate the projector onto the EDR subspace by the solution to the optimization problem

    minimize  max_{ℓ=1,…,L}  β̂_ℓᵀ (I − A) β̂_ℓ   subject to  A ∈ 𝒜_m,

where 𝒜_m is the set of all symmetric matrices with eigenvalues in [0, 1] and trace less than or equal to m, with m being the true structural dimension. Under mild assumptions, √n-consistency of the proposed procedure is proved (up to a logarithmic factor) in the case when the structural dimension is not larger than 4. Moreover, the stochastic error of the estimator of the projector onto the EDR subspace is shown to depend on L logarithmically. This enables us to use a large number of vectors β̂_ℓ for estimating the EDR subspace. The empirical behavior of the algorithm is studied through numerical simulations.
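To make the optimization problem in the abstract concrete, the snippet below evaluates the objective max_ℓ β̂_ℓᵀ(I − A)β̂_ℓ at one feasible point of the constraint set: the rank-m PCA projector of the β̂ vectors, i.e. the classical approach the abstract contrasts with. This only illustrates the criterion; it is not the paper's estimation procedure:

```python
import numpy as np

def pca_projector(betas, m):
    """Rank-m orthogonal projector onto the span of the leading principal
    directions of the stacked beta vectors (rows of `betas`)."""
    G = betas.T @ betas                     # p x p Gram matrix
    vals, vecs = np.linalg.eigh(G)          # ascending eigenvalues
    V = vecs[:, ::-1][:, :m]                # top-m eigenvectors
    return V @ V.T                          # symmetric, eigenvalues in {0,1}

def minimax_criterion(betas, A):
    """The objective max_l beta_l^T (I - A) beta_l being minimized."""
    R = np.eye(A.shape[0]) - A
    return max(b @ R @ b for b in betas)
```

If every β̂_ℓ lies exactly in an m-dimensional subspace, the PCA projector drives the criterion to zero; with noisy β̂_ℓ the paper's procedure searches the larger feasible set 𝒜_m instead.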
Marginal tests with sliced average variance estimation
Biometrika, 2007
Abstract
We present a new computationally feasible test for the dimension of the central subspace in a regression problem based on sliced average variance estimation. We also provide a marginal coordinate test. Under the null hypothesis, both the test of dimension and the marginal coordinate test involve test statistics that asymptotically have chi-squared distributions given normally distributed predictors, and have a distribution that is a linear combination of chi-squared distributions in general. Some key words: Marginal coordinate test; Sufficient dimension reduction.
On Sliced Inverse Regression With High-Dimensional Covariates
Abstract
Sliced inverse regression is a promising method for the estimation of the central dimension-reduction subspace (CDR space) in semiparametric regression models. It is particularly useful in tackling cases with high-dimensional covariates. In this article we study the asymptotic behavior of the estimate of the CDR space with high-dimensional covariates, that is, when the dimension of the covariates goes to infinity as the sample size goes to infinity. Strong and weak convergence are obtained. We also suggest an estimation procedure of the Bayes information criterion type to ascertain the dimension of the CDR space and derive its consistency. A simulation study is conducted.
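The abstract mentions a BIC-type procedure for choosing the dimension of the CDR space without stating it; a generic criterion of that flavor (a fit term from the leading SIR eigenvalues minus a complexity penalty) is sketched below, with the penalty constant `c` purely hypothetical and the eigenvalues assumed sorted in descending order:

```python
import numpy as np

def bic_dimension(eigvals, n, c=2.0):
    """Pick the dimension d maximizing a BIC-type criterion:
    n * (sum of the d leading eigenvalues) - c * d * log(n).
    This generic form is an illustration; the paper's exact
    criterion and penalty differ."""
    best_d, best = 0, 0.0
    for d in range(1, len(eigvals) + 1):
        score = n * sum(eigvals[:d]) - c * d * np.log(n)
        if score > best:
            best_d, best = d, score
    return best_d
```

The penalty grows with n, so spurious small eigenvalues stop contributing once the sample is large enough.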