Results 1–10 of 124
Locality Preserving Projections
, 2002
"... Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data s ..."
Abstract

Cited by 354 (16 self)
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA), a classical linear technique that projects the data along the directions of maximal variance. When the high dimensional data lies on a low dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and, more crucially, is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high dimensional data sets.
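The variational problem behind LPP reduces to a generalized eigenproblem over a neighborhood graph. The following is a minimal sketch of that idea (not the authors' code); the heat-kernel weighting, the neighborhood size `n_neighbors`, the kernel width `t`, and the regularization term are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_neighbors=5, n_components=2, t=1.0):
    """Sketch of Locality Preserving Projections.
    X: (n_samples, n_features). Returns a projection matrix A
    whose columns are the locality-preserving directions."""
    n = X.shape[0]
    # Pairwise squared distances and heat-kernel weights on a kNN graph.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]  # skip self
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i]] = np.exp(-d2[i, idx[i]] / t)
    W = np.maximum(W, W.T)              # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                           # graph Laplacian
    # Generalized eigenproblem  X^T L X a = lambda X^T D X a;
    # the smallest eigenvalues give the locality-preserving directions.
    A_l = X.T @ L @ X
    A_d = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])  # regularize for stability
    vals, vecs = eigh(A_l, A_d)         # ascending eigenvalues
    return vecs[:, :n_components]
```

Because the result is a linear map, it can be applied to any point in the ambient space (`X_new @ A`), which is the property the abstract emphasizes over Laplacian Eigenmaps and LLE.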
On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning
 JOURNAL OF MACHINE LEARNING RESEARCH
, 2005
"... A problem for many kernelbased methods is that the amount of computation required to find the solution scales as O(n³), where n is the number of training examples. We develop and analyze an algorithm to compute an easilyinterpretable lowrank approximation to an nn Gram matrix G such that compu ..."
Abstract

Cited by 156 (10 self)
A problem for many kernel-based methods is that the amount of computation required to find the solution scales as O(n³), where n is the number of training examples. We develop and analyze an algorithm to compute an easily-interpretable low-rank approximation to an n × n Gram matrix G such that computations of interest may be performed more rapidly. The approximation is of the form G̃k = C Wk⁺ Cᵀ, where C is a matrix consisting of a small number c of columns of G and Wk is the best rank-k approximation to W, the matrix formed by the intersection between those c columns of G and the corresponding c rows of G. An important aspect of the algorithm is the probability distribution used to randomly sample the columns; we will use a judiciously-chosen and data-dependent nonuniform probability distribution. Let ‖·‖₂ and ‖·‖F denote the spectral norm and the Frobenius norm, respectively, of a matrix, and let Gk be the best rank-k approximation to G. We prove that by choosing O(k/ε⁴) columns, ‖G − G̃k‖ξ ≤ ‖G − Gk‖ξ + ε Σᵢ Gᵢᵢ², both in expectation and with high probability, for both ξ = 2, F, and for all k: 0 ≤ k ≤ rank(W). This approximation can be computed using O(n) additional space and time, after making two passes over the data from external storage. The relationships between this algorithm, other related matrix decompositions, and the Nyström method from integral equation theory are discussed.
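A minimal sketch of the construction described above, sampling columns with probabilities proportional to Gᵢᵢ² (a data-dependent distribution of the kind the abstract mentions); the rescaling scheme, the tolerance, and the parameter names are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def nystrom_approx(G, c, k, rng=None):
    """Sketch of a Nystrom-style approximation G_k ~= C W_k^+ C^T
    of an SPSD Gram matrix G, from c sampled (and rescaled) columns."""
    rng = np.random.default_rng(rng)
    n = G.shape[0]
    p = np.diag(G) ** 2                    # data-dependent sampling weights
    p = p / p.sum()
    idx = rng.choice(n, size=c, replace=True, p=p)
    scale = 1.0 / np.sqrt(c * p[idx])      # rescale sampled columns
    C = G[:, idx] * scale                  # n x c
    W = (G[np.ix_(idx, idx)] * scale).T * scale  # c x c intersection
    # Pseudoinverse of the best rank-k approximation of W.
    U, s, _ = np.linalg.svd(W)
    s_inv = np.array([1.0 / si if i < k and si > 1e-12 else 0.0
                      for i, si in enumerate(s)])
    Wk_pinv = (U * s_inv) @ U.T
    return C @ Wk_pinv @ C.T
```

The point of the construction is that only c columns of G are ever formed, so the O(n³) kernel-method bottleneck is avoided.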
Diffusion Wavelets
, 2004
"... We present a multiresolution construction for efficiently computing, compressing and applying large powers of operators that have high powers with low numerical rank. This allows the fast computation of functions of the operator, notably the associated Green’s function, in compressed form, and their ..."
Abstract

Cited by 114 (14 self)
We present a multiresolution construction for efficiently computing, compressing and applying large powers of operators whose high powers have low numerical rank. This allows the fast computation of functions of the operator, notably the associated Green’s function, in compressed form, and their fast application. Classes of operators satisfying these conditions include diffusion-like operators, in any dimension, on manifolds, graphs, and in nonhomogeneous media. In this case our construction can be viewed as a far-reaching generalization of Fast Multipole Methods, achieved through a different point of view, and of the nonstandard wavelet representation of Calderón-Zygmund and pseudodifferential operators, achieved through a different multiresolution analysis adapted to the operator. We show how the dyadic powers of an operator can be used to induce a multiresolution analysis, as in classical Littlewood-Paley and wavelet theory, and we show how to construct, with fast and stable algorithms, scaling function and wavelet bases associated to this multiresolution analysis, and the corresponding downsampling operators, and use them to compress the corresponding powers of the operator. This makes it possible to extend multiscale signal processing to general spaces (such as manifolds and graphs) in a very natural way, with corresponding fast algorithms.
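A small numerical illustration (not the paper's construction) of the premise the abstract relies on: dyadic powers T^(2^j) of a diffusion operator rapidly lose numerical rank, here demonstrated with the lazy random walk on a path graph; the graph, the size, and the rank tolerance are illustrative choices:

```python
import numpy as np

# Lazy random-walk operator on a path graph: spectrum in [0, 1].
n = 64
A = np.eye(n, k=1) + np.eye(n, k=-1)   # path-graph adjacency
P = A / A.sum(axis=1, keepdims=True)   # random-walk operator
T = 0.5 * (np.eye(n) + P)              # lazy walk

def numerical_rank(M, eps=1e-6):
    """Count singular values above eps relative to the largest."""
    s = np.linalg.svd(M, compute_uv=False)
    return int((s > eps * s[0]).sum())

# Repeated squaring gives the dyadic powers T^2, T^4, ..., T^64;
# their numerical rank shrinks, which is what diffusion wavelets
# exploit to compress the powers level by level.
ranks = []
M = T.copy()
for j in range(6):
    M = M @ M
    ranks.append(numerical_rank(M))
```

As the power grows, only the slowly-decaying (low-frequency) part of the spectrum survives, so each dyadic level can be represented in a much smaller basis.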
Dimensionality reduction by learning an invariant mapping
 In Proc. Computer Vision and Pattern Recognition Conference (CVPR’06)
, 2006
"... Dimensionality reduction involves mapping a set of high dimensional input points onto a low dimensional manifold so that “similar ” points in input space are mapped to nearby points on the manifold. We present a methodcalled Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) for lea ..."
Abstract

Cited by 73 (12 self)
Dimensionality reduction involves mapping a set of high dimensional input points onto a low dimensional manifold so that “similar” points in input space are mapped to nearby points on the manifold. We present a method, called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM), for learning a globally coherent nonlinear function that maps the data evenly to the output manifold. The learning relies solely on neighborhood relationships and does not require any distance measure in the input space. The method can learn mappings that are invariant to certain transformations of the inputs, as is demonstrated with a number of experiments. Comparisons are made to other techniques, in particular LLE.
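The neighborhood-only learning signal can be sketched with a contrastive loss of the kind DrLIM trains with: similar pairs are pulled together, dissimilar pairs pushed apart up to a margin. The function name, the 0.5 scaling, and the default margin below are illustrative assumptions:

```python
import numpy as np

def contrastive_loss(y1, y2, similar, margin=1.0):
    """Sketch of a DrLIM-style contrastive loss on two mapped
    points y1, y2. similar: 1 for a neighbor pair, 0 otherwise.
    Note it only needs the neighbor label, not any input-space
    distance measure."""
    d = np.linalg.norm(y1 - y2)
    if similar:
        return 0.5 * d ** 2                    # attract similar pairs
    return 0.5 * max(0.0, margin - d) ** 2     # repel within the margin
```

In the paper this loss drives the parameters of a nonlinear mapping network; dissimilar pairs already farther apart than the margin contribute nothing, which is what lets the output spread evenly rather than collapse.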
RELATIVE-ERROR CUR MATRIX DECOMPOSITIONS
 SIAM J. MATRIX ANAL. APPL
, 2008
"... Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of “components.” Typically, these components are linear combinations of the rows and columns of the matrix, and are thus difficult to interpret in terms of the original features of the ..."
Abstract

Cited by 55 (12 self)
Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of “components.” Typically, these components are linear combinations of the rows and columns of the matrix, and are thus difficult to interpret in terms of the original features of the input data. In this paper, we propose and study matrix approximations that are explicitly expressed in terms of a small number of columns and/or rows of the data matrix, and are thereby more amenable to interpretation in terms of the original data. Our main algorithmic results are two randomized algorithms which take as input an m × n matrix A and a rank parameter k. In our first algorithm, C is chosen, and we let A′ = CC⁺A, where C⁺ is the Moore–Penrose generalized inverse of C. In our second algorithm C, U, R are chosen, and we let A′ = CUR. (C and R are matrices that consist of actual columns and rows, respectively, of A, and U is a generalized inverse of their intersection.) For each algorithm, we show that with probability at least 1 − δ, ‖A − A′‖F ≤ (1 + ε)‖A − Ak‖F, where Ak is the “best” rank-k approximation provided by truncating the SVD of A, and where ‖X‖F is the Frobenius norm of the matrix X. The number of columns of C and rows of R is a low-degree polynomial in k, 1/ε, and log(1/δ). Both the Numerical Linear Algebra community and the Theoretical Computer Science community have studied variants ...
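A minimal sketch of the A′ = CUR form. The paper's guarantees depend on a carefully chosen data-dependent sampling distribution; the uniform sampling used here is a simplification for illustration only:

```python
import numpy as np

def cur_approx(A, c, r, rng=None):
    """Sketch of a CUR decomposition: C holds c actual columns of A,
    R holds r actual rows, and U is the generalized inverse of their
    intersection, so A' = C U R is expressed in the original data."""
    rng = np.random.default_rng(rng)
    cols = rng.choice(A.shape[1], size=c, replace=False)
    rows = rng.choice(A.shape[0], size=r, replace=False)
    C = A[:, cols]                    # m x c: actual columns of A
    R = A[rows, :]                    # r x n: actual rows of A
    U = np.linalg.pinv(C[rows, :])    # generalized inverse of intersection
    return C @ U @ R
```

Unlike a truncated SVD, the factors C and R are genuine columns and rows of A, which is exactly the interpretability property the abstract argues for.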
Data fusion and multicue data matching by diffusion maps
 IEEE Transactions on Pattern Analysis and Machine Intelligence
"... Abstract—Data fusion and multicue data matching are fundamental tasks of highdimensional data analysis. In this paper, we apply the recently introduced diffusion framework to address these tasks. Our contribution is threefold: First, we present the LaplaceBeltrami approach for computing density i ..."
Abstract

Cited by 50 (5 self)
Abstract—Data fusion and multicue data matching are fundamental tasks of high-dimensional data analysis. In this paper, we apply the recently introduced diffusion framework to address these tasks. Our contribution is threefold: First, we present the Laplace-Beltrami approach for computing density invariant embeddings, which are essential for integrating different sources of data. Second, we describe a refinement of the Nyström extension algorithm called “geometric harmonics.” We also explain how to use this tool for data assimilation. Finally, we introduce a multicue data matching scheme based on nonlinear spectral graph alignment. The effectiveness of the presented schemes is validated by applying them to the problems of lipreading and image sequence alignment. Index Terms—Pattern matching, graph theory, graph algorithms, Markov processes, machine learning, data mining, image databases.
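The density-invariant (Laplace-Beltrami) embedding mentioned above corresponds, in the diffusion-maps literature, to normalizing the kernel by density estimates before building the Markov matrix. A minimal sketch under that reading; the kernel width `eps`, the exponent `alpha`, and the eigenvector handling are simplified assumptions:

```python
import numpy as np

def diffusion_map(X, eps=1.0, n_components=2, alpha=1.0):
    """Sketch of a diffusion-map embedding with the alpha = 1
    (density-invariant, Laplace-Beltrami) normalization."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / eps)                        # Gaussian kernel
    q = K.sum(axis=1)                            # density estimate
    K_a = K / np.outer(q ** alpha, q ** alpha)   # remove density influence
    P = K_a / K_a.sum(axis=1, keepdims=True)     # Markov matrix
    vals, vecs = np.linalg.eig(P)
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Skip the trivial constant eigenvector; scale by eigenvalues.
    return vecs[:, 1:n_components + 1] * vals[1:n_components + 1]
```

With α = 1 the embedding depends on the geometry of the data rather than on how densely each source happened to sample it, which is why this normalization matters when fusing heterogeneous data sets.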
Discriminant locally linear embedding with high-order tensor data
 IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
, 2008
"... Abstract—Graphembedding along with its linearization and kernelization provides a general framework that unifies most traditional dimensionality reduction algorithms. From this framework, we propose a new manifold learning technique called discriminant locally linear embedding (DLLE), in which th ..."
Abstract

Cited by 44 (12 self)
Abstract—Graph embedding along with its linearization and kernelization provides a general framework that unifies most traditional dimensionality reduction algorithms. From this framework, we propose a new manifold learning technique called discriminant locally linear embedding (DLLE), in which the local geometric properties within each class are preserved according to the locally linear embedding (LLE) criterion, and the separability between different classes is enforced by maximizing margins between point pairs in different classes. To deal with the out-of-sample problem in visual recognition with vector input, the linear version of DLLE, i.e., linearization of DLLE (DLLE/L), is directly proposed through the graph-embedding framework. Moreover, we propose its multilinear version, i.e., tensorization of DLLE, for the out-of-sample problem with high-order tensor input. Based on DLLE, a procedure for gait recognition is described. We conduct comprehensive experiments on both gait and face recognition, and observe that: 1) DLLE along with its linearization and tensorization outperforms the related versions of linear discriminant analysis, and DLLE/L demonstrates greater effectiveness than the linearization of LLE; 2) algorithms based on tensor representations are generally superior to linear algorithms when dealing with intrinsically high-order data; and 3) for human gait recognition, DLLE/L generally obtains higher accuracy than state-of-the-art gait recognition algorithms on the standard University of South Florida gait database. Index Terms—Dimensionality reduction, face recognition, human gait recognition, manifold learning, tensor representation.
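The within-class LLE criterion that DLLE preserves rests on local reconstruction weights. A minimal sketch of that weight step only (not the DLLE algorithm itself); the regularization factor is an illustrative assumption:

```python
import numpy as np

def lle_weights(X, i, neighbors):
    """Sketch of the LLE reconstruction-weight step: find weights w
    minimizing ||x_i - sum_j w_j x_j||^2 over the given neighbors,
    subject to sum_j w_j = 1."""
    Z = X[neighbors] - X[i]           # shift neighbors so x_i is the origin
    G = Z @ Z.T                       # local Gram matrix
    G += 1e-6 * np.trace(G) * np.eye(len(neighbors))  # regularize
    w = np.linalg.solve(G, np.ones(len(neighbors)))
    return w / w.sum()                # enforce the sum-to-one constraint
```

In DLLE these weights encode the within-class local geometry to be preserved, while a separate margin term handles between-class separability.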
Geometric Methods for Feature Extraction and Dimensional Reduction
 In L. Rokach and O. Maimon (Eds.), Data
, 2005
"... Abstract We give a tutorial overview of several geometric methods for feature extraction and dimensional reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component anal ..."
Abstract

Cited by 42 (1 self)
Abstract We give a tutorial overview of several geometric methods for feature extraction and dimensional reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, and oriented PCA; and for the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps and spectral clustering. The Nyström method, which links several of the algorithms, is also reviewed. The goal is to provide a self-contained review of the concepts and mathematics underlying these algorithms.
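Of the projective methods this tutorial reviews, PCA is the baseline the others extend. A minimal sketch via the SVD of the centered data (an illustrative implementation, not the tutorial's code):

```python
import numpy as np

def pca(X, n_components=2):
    """Minimal PCA sketch: project centered data onto the directions
    of maximal variance, read off from the right singular vectors."""
    Xc = X - X.mean(axis=0)                       # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T               # top-component scores
```

Kernel PCA, probabilistic PCA, and oriented PCA each modify one ingredient of this recipe (the inner product, the noise model, or the variance criterion), which is what makes the projective methods natural to present together.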
Shape priors using Manifold Learning Techniques
 in 11th IEEE International Conference on Computer Vision, Rio de Janeiro
, 2007
"... We introduce a nonlinear shape prior for the deformable model framework that we learn from a set of shape samples using recent manifold learning techniques. We model a category of shapes as a finite dimensional manifold which we approximate using Diffusion maps, that we call the shape prior manifol ..."
Abstract

Cited by 35 (2 self)
We introduce a nonlinear shape prior for the deformable model framework that we learn from a set of shape samples using recent manifold learning techniques. We model a category of shapes as a finite dimensional manifold, which we approximate using Diffusion maps and call the shape prior manifold. Our method computes a Delaunay triangulation of the reduced space, considered as Euclidean, and uses the resulting space partition to identify the closest neighbors of any given shape based on its Nyström extension. Our contribution lies in three aspects. First, we propose a solution to the preimage problem and define the projection of a shape onto the manifold. Based on closest neighbors for the Diffusion distance, we then describe a variational framework for manifold denoising. Finally, we introduce a shape prior term for the deformable framework through a nonlinear energy term designed to attract a shape towards the manifold at a given constant embedding. Results on shapes of cars and ventricle nuclei are presented and demonstrate the potential of our method.