@MISC{_kerneldimensionality, author = {}, title = {Kernel Dimensionality Reduction for Supervised Learning}, year = {} }


Abstract

We propose a novel method of dimensionality reduction for supervised learning. Given a regression or classification problem in which we wish to predict a variable Y from an explanatory vector X, we treat the problem of dimensionality reduction as that of finding a low-dimensional "effective subspace" of X which retains the statistical relationship between X and Y. We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we characterize the notion of conditional independence using covariance operators on reproducing kernel Hilbert spaces; this allows us to derive a contrast function for estimation of the effective subspace. Unlike many conventional methods, the proposed method requires neither assumptions on the marginal distribution of X, nor a parametric model of the conditional distribution of Y.

1 Introduction

Many statistical learning problems involve some form of dimensionality reduction. The goal may be one of feature selection, in which we aim to find linear or nonlinear combinations of the original set of variables, or one of variable selection, in which we wish to select a subset of variables from the original set. Motivations for such dimensionality reduction include providing a simplified explanation and visualization for a human, suppressing noise so as to make a better prediction or decision, or reducing the computational burden. We study dimensionality reduction for supervised learning, in which the data consists of (
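As a rough illustration of the kind of contrast function the abstract describes, the sketch below evaluates a candidate projection matrix B by comparing centered Gram matrices of the projected data Z = XB and the response Y. The specific kernel (Gaussian RBF), bandwidth, regularization constant, and the trace-based contrast used here are assumptions for illustration, not the paper's exact estimator; smaller contrast values are taken to indicate that the subspace retains more of the statistical relationship between X and Y.

```python
import numpy as np

def centered_gram(data, sigma):
    """Centered Gaussian RBF Gram matrix: H K H with H = I - (1/n) 11^T."""
    sq = np.sum(data ** 2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2 * data @ data.T)
               / (2 * sigma ** 2))
    n = data.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kdr_contrast(X, Y, B, sigma=1.0, eps=1e-3):
    """Illustrative contrast for a candidate effective subspace.

    Z = X B projects X onto the candidate subspace; the trace term is an
    empirical stand-in for a conditional covariance of Y given Z, so a
    subspace that captures the X-Y relationship yields a smaller value.
    (Hypothetical sketch, not the paper's exact objective.)
    """
    Z = X @ B
    n = X.shape[0]
    Gz = centered_gram(Z, sigma)
    Gy = centered_gram(Y, sigma)
    # Regularized inverse damps directions of Gy already explained by Z.
    return float(np.trace(Gy @ np.linalg.inv(Gz + n * eps * np.eye(n))))
```

For example, if Y depends only on the first coordinate of a two-dimensional X, projecting onto that coordinate should give a smaller contrast than projecting onto the irrelevant one; in the full method, B would be optimized over such projections rather than compared by hand.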