@MISC{Ouimet_greedyspectral, author = {Marie Ouimet and others}, title = {Greedy spectral embedding}, year = {} }
Abstract
Spectral dimensionality reduction methods and spectral clustering methods require computing the principal eigenvectors of an n × n matrix, where n is the number of examples. Following up on previously proposed techniques that speed up kernel methods by focusing on a subset of m examples, we study a greedy selection procedure for this subset, based on the feature-space distance between a candidate example and the span of the previously chosen ones. In the case of kernel PCA or spectral clustering, this reduces the computation to O(m²n). For the same computational complexity, we can also compute the feature-space projection of the non-selected examples onto the subspace spanned by the selected examples, so that the embedding function is estimated from all the data, which yields a considerably better estimate of the embedding function. The algorithm can be formulated in an online setting, and we can bound the error on the approximation of the Gram matrix.
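
The selection procedure described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say exactly how the distance is used to choose a candidate, so the sketch assumes the candidate farthest from the current span is picked at each step, which coincides with a pivoted incomplete Cholesky factorization of the Gram matrix. The names `rbf_kernel`, `greedy_select`, `embed`, `gamma`, and `tol` are hypothetical, and the Gaussian kernel is only an example choice.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian kernel between rows of X and rows of Y (example kernel, an assumption).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def greedy_select(X, m, kernel=rbf_kernel, tol=1e-10):
    """Greedily pick up to m examples; return their indices and an n x m factor L.

    residual[i] is the squared feature-space distance from example i to the
    span of the examples selected so far. Row i of L holds the coordinates of
    the projection of example i onto that span, so L @ L.T approximates the
    Gram matrix.
    """
    n = X.shape[0]
    # Before anything is selected, the squared distance is just k(x_i, x_i).
    residual = np.array([kernel(X[i:i + 1], X[i:i + 1])[0, 0] for i in range(n)])
    L = np.zeros((n, m))
    selected = []
    for j in range(m):
        i = int(np.argmax(residual))      # assumed rule: farthest candidate
        if residual[i] <= tol:            # every example already lies in the span
            break
        col = kernel(X, X[i:i + 1])[:, 0]  # one kernel column, O(n) evaluations
        # Gram-Schmidt step in feature space: O(n j) work, O(m^2 n) in total.
        L[:, j] = (col - L[:, :j] @ L[i, :j]) / np.sqrt(residual[i])
        residual = np.maximum(residual - L[:, j] ** 2, 0.0)
        selected.append(i)
    return selected, L[:, :len(selected)]

def embed(L, d):
    """Kernel-PCA-style embedding of all n examples from the projected data."""
    Lc = L - L.mean(axis=0)               # centers the approximate Gram matrix
    w, V = np.linalg.eigh(Lc.T @ Lc)      # m x m eigenproblem instead of n x n
    top = np.argsort(w)[::-1][:d]         # indices of the d largest eigenvalues
    return Lc @ V[:, top]
```

In this sketch the full n × n Gram matrix is never formed: only one kernel column per selected example is evaluated, and the eigendecomposition is carried out on an m × m matrix built from all n projected examples rather than from the selected subset alone.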