Results 1–10 of 156
Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces
Journal of Machine Learning Research, 2004
Abstract

Cited by 117 (26 self)
We propose a novel method of dimensionality reduction for supervised learning problems. Given a regression or classification problem in which we wish to predict a response variable Y from an explanatory variable X, we treat the problem of dimensionality reduction as that of finding a low-dimensional “effective subspace” for X which retains the statistical relationship between X and Y. We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we establish a general nonparametric characterization of conditional independence using covariance operators on reproducing kernel Hilbert spaces. This characterization allows us to derive a contrast function for estimation of the effective subspace. Unlike many conventional methods for dimensionality reduction in supervised learning, the proposed method requires neither assumptions on the marginal distribution of X nor a parametric model of the conditional distribution of Y. We present experiments that compare the performance of the method with conventional methods.
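The covariance-operator characterization can be made concrete with Gram matrices. The sketch below is a rough illustration rather than the paper's implementation: it evaluates a trace-form contrast Tr[G_Y (G_Z + nεI)⁻¹] for a candidate projection B, where G_Y and G_Z are centered Gaussian Gram matrices (the kernel width sigma, regularizer eps, and toy data are all assumed choices). A smaller value suggests Z = XB better retains the dependence of Y on X.

```python
import numpy as np

def centered_gram(A, sigma):
    # Gaussian Gram matrix, doubly centered: H K H with H = I - 11'/n.
    sq = np.sum(A**2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2 * A @ A.T) / (2 * sigma**2))
    n = len(A)
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kdr_contrast(X, Y, B, eps=0.1, sigma=1.0):
    # Trace-form contrast Tr[G_Y (G_Z + n*eps*I)^(-1)] with Z = X B;
    # smaller values indicate Z retains more of the X-Y dependence.
    n = len(X)
    Gy = centered_gram(Y, sigma)
    Gz = centered_gram(X @ B, sigma)
    return np.trace(Gy @ np.linalg.inv(Gz + n * eps * np.eye(n)))

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
Y = np.sin(X[:, :1]) + 0.1 * rng.normal(size=(60, 1))  # Y depends only on X[:, 0]
B_good = np.array([[1.0], [0.0], [0.0], [0.0]])  # true effective direction
B_bad = np.array([[0.0], [0.0], [0.0], [1.0]])   # irrelevant direction
print(kdr_contrast(X, Y, B_good), kdr_contrast(X, Y, B_bad))
```

In this toy setup the contrast for the informative direction comes out smaller than for the irrelevant one, which is what an optimizer over B would exploit.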
An Introduction to Regression Graphics
1994
Abstract

Cited by 70 (9 self)
This article, which is based on an Interface tutorial, presents an overview of regression graphics, along with an annotated bibliography. The intent is to discuss basic ideas and issues without delving into methodological or theoretical details, and to provide a guide to the literature.
Generalized Partially Linear Single-Index Models
Journal of the American Statistical Association, 1998
Abstract

Cited by 63 (24 self)
The typical generalized linear model for a regression of a response Y on predictors (X, Z) has a conditional mean function based upon a linear combination of (X, Z). We generalize these models to have a nonparametric component, replacing the linear combination β₀ᵀX + α₀ᵀZ by η₀(β₀ᵀX) + α₀ᵀZ, where η₀(·) is an unknown function. We call these generalized partially linear single-index models (GPLSIM). The models include the "single-index" models, which have α₀ = 0. Using local linear methods, estimates of the unknown parameters (α₀, β₀) and the unknown function η₀(·) are proposed, and their asymptotic distributions are obtained. Examples illustrate the models and the proposed estimation methodology.
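To make the model form concrete, here is a minimal sketch of the GPLSIM conditional mean, with an assumed η₀ (sin) and a logistic inverse link chosen purely for illustration; the parameter values and data are toy assumptions, not estimates:

```python
import numpy as np

def gplsim_mean(x, z, beta0, alpha0, eta0, inv_link):
    # GPLSIM conditional mean: g^(-1)( eta0(beta0' x) + alpha0' z ),
    # with eta0 the unknown univariate function (an assumed one is plugged in).
    return inv_link(eta0(x @ beta0) + z @ alpha0)

eta0 = np.sin                                   # assumed shape for illustration
inv_logit = lambda u: 1.0 / (1.0 + np.exp(-u))  # logistic inverse link

beta0 = np.array([0.6, 0.8])                    # unit-norm index direction
alpha0 = np.array([0.5])
x = np.array([[1.0, 0.0], [0.0, 1.0]])
z = np.array([[1.0], [-1.0]])
mu = gplsim_mean(x, z, beta0, alpha0, eta0, inv_logit)
print(mu)  # two conditional means, each strictly between 0 and 1
```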
Prediction by supervised principal components
Journal of the American Statistical Association, 2006
Abstract

Cited by 54 (6 self)
In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors selected based on their association with the outcome. Supervised principal components can be applied to regression and generalized regression problems such as survival analysis. It compares favorably to other techniques for this type of problem, and can also account for the effects of other covariates and help identify which predictor variables are most important. We also provide asymptotic consistency results to help support our empirical findings. These methods could become important tools for DNA microarray data, where they may be used to more accurately diagnose and treat cancer.
KEY WORDS: Gene expression; Microarray; Regression; Survival analysis.
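The two-step procedure described above (screen predictors by their association with the outcome, then take principal components of the survivors) can be sketched as follows; the threshold value and toy data are assumptions for illustration only:

```python
import numpy as np

def supervised_pc(X, y, threshold, n_components=1):
    # 1) score each predictor by its standardized univariate association
    #    with the outcome, 2) keep predictors whose score exceeds the
    #    threshold, 3) take principal components of the reduced matrix.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scores = np.abs(Xc.T @ yc) / np.linalg.norm(Xc, axis=0)
    keep = scores > threshold
    U, S, _ = np.linalg.svd(Xc[:, keep], full_matrices=False)
    return U[:, :n_components] * S[:n_components], keep

rng = np.random.default_rng(1)
n, p = 40, 200                      # far more predictors than observations
X = rng.normal(size=(n, p))
y = X[:, :5].sum(axis=1) + 0.5 * rng.normal(size=n)  # 5 informative predictors
pcs, keep = supervised_pc(X, y, threshold=4.0)
print(pcs.shape, keep.sum())
```

The screening step is what distinguishes this from ordinary PCA: the leading component is computed only from predictors that already show outcome association.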
Prediction via Orthogonalized Model Mixing
Journal of the American Statistical Association, 1994
Abstract

Cited by 50 (9 self)
In this paper we introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on expressing the space of models in terms of an orthogonalization of the design matrix. Advantages are both statistical and computational. Statistically, orthogonalization often leads to a reduction in the number of competing models by eliminating correlations. Computationally, large model spaces cannot be enumerated; recent approaches are based on sampling models with high posterior probability via Markov chains. Based on orthogonalization of the space of candidate predictors, we can approximate the posterior probabilities of models by products of predictor-specific terms. This leads to an importance sampling function for sampling directly from the joint distribution over the model space, without resorting to Markov chains. Comp...
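The factorization idea can be sketched numerically: after orthogonalizing the design, each column's inclusion weight can be computed independently of the others. The weight form below is a simplified, assumed Bayes-factor-style term for illustration, not the paper's exact expression:

```python
import numpy as np

def orthogonal_inclusion_probs(X, y, g=20.0, prior_incl=0.5):
    # Orthogonalize the centered design via QR: coefficients on the new
    # columns are uncorrelated, so the model posterior approximately
    # factorizes into independent, column-specific inclusion terms.
    Q, _ = np.linalg.qr(X - X.mean(axis=0))
    yc = y - y.mean()
    b = Q.T @ yc                    # coefficients on orthonormal columns
    n = len(y)
    s2 = yc @ yc / n                # crude variance estimate
    # Simplified, assumed Bayes-factor-style weight per column
    # (illustrative only; not the paper's exact expression).
    logbf = 0.5 * b**2 / s2 * (g / (1 + g)) - 0.5 * np.log(1 + g)
    odds = prior_incl / (1 - prior_incl) * np.exp(logbf)
    return odds / (1 + odds)

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))
y = 2.0 * X[:, 0] + rng.normal(size=100)   # only the first predictor matters
probs = orthogonal_inclusion_probs(X, y)
print(np.round(probs, 3))
```

Because the columns are orthogonal, these per-column probabilities can drive direct importance sampling over subsets, with no Markov chain needed.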
Effective dimension reduction methods for tumor classification using gene expression data
Bioinformatics, 2003
Abstract

Cited by 35 (2 self)
Motivation: One particular application of microarray data is to uncover the molecular variation among cancers. One feature of microarray studies is the fact that the number n of samples collected is relatively small compared to the number p of genes per sample, which is usually in the thousands. In statistical terms, this very large number of predictors compared to a small number of samples or observations makes the classification problem difficult. An efficient way to solve this problem is by using dimension reduction statistical techniques in conjunction with nonparametric discriminant procedures. Results: We view the classification problem as a regression problem with few observations and many predictor variables. We use an adaptive dimension reduction method for generalized semiparametric regression models that allows us to solve the ‘curse of dimensionality’ problem arising in the context of expression data. The predictive performance of the resulting classification rule is illustrated on two well-known data sets in the microarray literature: the leukemia data, which is known to contain classes that are easily ‘separable’, and the colon data set. Availability: Software that implements the procedures on which this paper focuses is freely available at
Kernel dimension reduction in regression
2006
Abstract

Cited by 27 (11 self)
Acknowledgements. The authors thank the editor and anonymous referees for their helpful comments. The authors also thank Dr. Yoichi Nishiyama for his helpful comments on the uniform convergence of empirical processes. We would like to acknowledge support from JSPS KAKENHI 15700241,
Partial least squares: A versatile tool for the analysis of highdimensional genomic data
Briefings in Bioinformatics, 2007
Abstract

Cited by 26 (7 self)
Partial Least Squares (PLS) is a highly efficient statistical regression technique that is well suited for the analysis of high-dimensional genomic data. In this paper we review the theory and applications of PLS from both methodological and biological points of view. Focusing on microarray expression data, we provide a systematic comparison of the PLS approaches currently employed, and discuss problems as different as tumor classification, identification of relevant genes, survival analysis and modeling of gene networks.
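A minimal PLS1 sketch for a univariate response (a basic NIPALS-style loop for intuition, not any specific package's implementation; the toy data mimic the p ≫ n setting of expression studies):

```python
import numpy as np

def pls1(X, y, n_components):
    # Basic PLS1 (NIPALS-style) for a univariate response: each weight
    # vector maximizes covariance with the current residual, then X and y
    # are deflated before extracting the next component.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    T = []
    for _ in range(n_components):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)              # covariance-maximizing direction
        t = Xc @ w                          # component scores
        p_load = Xc.T @ t / (t @ t)         # X loadings
        Xc = Xc - np.outer(t, p_load)       # deflate X
        yc = yc - t * (t @ yc) / (t @ t)    # deflate y
        T.append(t)
    return np.column_stack(T)

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 500))             # p >> n, as with expression data
y = X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=30)
T = pls1(X, y, n_components=2)
print(T.shape)
```

Unlike PCA, the weight vectors here are steered by the response, which is why PLS components tend to be immediately predictive; successive score vectors come out mutually orthogonal because of the deflation step.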
An Adaptive Estimation of Dimension Reduction Space
2002
Abstract

Cited by 22 (0 self)
Searching for an effective dimension reduction space is an important problem in regression, especially for high dimensional data. In this paper, we propose an adaptive approach based on semiparametric models, which we call the minimum average (conditional) variance estimation (MAVE) method, within quite a general setting. The MAVE method has the following advantages: (1) Most existing methods have to undersmooth the nonparametric link function estimator in order to achieve a faster rate of consistency for the estimator of the parameters (than for that of the nonparametric function). In contrast, a faster consistency rate can be achieved by the MAVE method even without undersmoothing the nonparametric link function estimator. (2) The MAVE method is applicable to a wide range of models, with fewer restrictions on the distribution of the covariates, to the extent that even time series can be included. (3) Because of the faster rate of consistency for the parameter estimators, it is possible for us to estimate the dimension of the space consistently.
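A closely related gradient-based sketch is the outer-product-of-gradients (OPG) idea discussed in the same line of work: kernel-weighted local linear fits estimate the gradient of E[y|x] at each sample point, and the leading eigenvectors of the averaged outer products span the effective dimension reduction space. The bandwidth and toy single-index data below are assumptions for illustration:

```python
import numpy as np

def opg_directions(X, y, bandwidth, n_dirs=1):
    # Outer product of gradients: at each sample point, a kernel-weighted
    # local linear fit estimates the gradient of E[y|x]; leading eigenvectors
    # of the averaged outer products span the estimated reduction space.
    n, p = X.shape
    M = np.zeros((p, p))
    for i in range(n):
        d = X - X[i]
        w = np.exp(-np.sum(d**2, axis=1) / (2 * bandwidth**2))
        sw = np.sqrt(w)
        Z = np.hstack([np.ones((n, 1)), d])        # intercept + local coordinates
        beta = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)[0]
        M += np.outer(beta[1:], beta[1:]) / n      # beta[1:] = local gradient
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, -n_dirs:]                       # eigenvalues in ascending order

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
y = np.sin(X[:, 0]) + 0.05 * rng.normal(size=100)  # single-index model along e1
B = opg_directions(X, y, bandwidth=1.0)
print(np.round(np.abs(B[:, -1]), 2))
```

MAVE itself goes further by minimizing the average conditional variance jointly over the local fits and the directions, which is what yields the faster consistency rate without undersmoothing.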
Structure adaptive approach for dimension reduction
Annals of Statistics, 2001
Abstract

Cited by 20 (3 self)
We propose a new method of effective dimension reduction for a multi-index model which is based on iterative improvement of the family of average derivative estimates. The procedure is computationally straightforward and does not require any prior information about the structure of the underlying model. We show that in the case when the effective dimension m of the index space does not exceed 3, this space can be estimated with the rate n^(−1/2) under rather mild assumptions on the model.