Results 1-10 of 13
Vector diffusion maps and the connection Laplacian
 Comm. Pure Appl. Math.
Cited by 48 (13 self)
Abstract. We introduce vector diffusion maps (VDM), a new mathematical framework for organizing and analyzing massive high-dimensional data sets, images and shapes. VDM is a mathematical and algorithmic generalization of diffusion maps and other nonlinear dimensionality reduction methods, such as LLE, ISOMAP and Laplacian eigenmaps. While existing methods are either directly or indirectly related to the heat kernel for functions over the data, VDM is based on the heat kernel for vector fields. VDM provides tools for organizing complex data sets, embedding them in a low-dimensional space, and interpolating and regressing vector fields over the data. In particular, it equips the data with a metric, which we refer to as the vector diffusion distance. In the manifold learning setup, where the data set is distributed on (or near) a low-dimensional manifold M^d embedded in R^p, we prove the relation between VDM and the connection-Laplacian operator for vector fields over the manifold. Key words. Dimensionality reduction, vector fields, heat kernel, parallel transport, local principal component analysis, alignment.
Hybrid linear modeling via local best-fit flats
 in IEEE Conference on Computer Vision and Pattern Recognition
Cited by 37 (4 self)
In this paper we present a simple and fast geometric method for modeling data by a union of affine sets. The method begins by forming a collection of local best-fit affine subspaces. The correct sizes of the local neighborhoods are determined automatically by the Jones' β_2 numbers; we prove under certain geometric conditions that good local neighborhoods exist and are found by our method. The collection is further processed by a greedy selection procedure or a spectral method to generate the final model. We discuss applications to tracking-based motion segmentation and clustering of faces under different illumination conditions. We give extensive experimental evidence demonstrating the state-of-the-art accuracy and speed of the suggested algorithms on these problems, on synthetic hybrid linear data, and on the MNIST handwritten digits data; and we demonstrate how to use our algorithms for fast determination of the number of affine subspaces.
Multiscale geometric methods for data sets I: Multiscale covariances, noise and curvature
 submitted, 2012
Cited by 7 (1 self)
Large data sets are often modeled as being noisy samples from probability distributions µ in R^D, with D large. It has been noticed that oftentimes the support M of these probability distributions seems to be well approximated by low-dimensional sets, perhaps even by manifolds. We shall consider sets that are locally well approximated by k-dimensional planes, with k ≪ D, with k-dimensional manifolds isometrically embedded in R^D being a special case. Samples from µ are furthermore corrupted by D-dimensional noise. Certain tools from multiscale geometric measure theory and harmonic analysis seem well suited to be adapted to the study of samples from such probability distributions, in order to yield quantitative geometric information about them. In this paper we introduce and study multiscale covariance matrices, i.e. covariances corresponding to the distribution restricted to a ball of radius r, with a fixed center and varying r, and under rather general geometric assumptions we study how their empirical, noisy counterparts behave. We prove that in the range of scales where these covariance matrices are most informative, the empirical, noisy covariances are close to their expected, noiseless counterparts. In fact, this is true as soon as the number of samples in the balls where the covariance matrices are computed is linear in the intrinsic dimension of M. As an application, we present an algorithm for estimating the intrinsic dimension of M.
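The idea of a multiscale covariance matrix in the abstract above can be illustrated with a toy sketch. It is not the paper's algorithm: the data set (a noisy line segment in R^2, so k = 1 and D = 2), the noise level `sigma`, the radii, the growth threshold of 4, and every function name are assumptions chosen for illustration. The sketch computes the covariance of the samples inside a ball of radius r around a fixed center for several r, and counts the eigenvalues that grow with r as a crude intrinsic-dimension guess.

```python
import math
import random

def covariance_eigenvalues_2d(points):
    """Eigenvalues (largest first) of the empirical covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed form for a symmetric 2x2 matrix [[cxx, cxy], [cxy, cyy]].
    half_trace = 0.5 * (cxx + cyy)
    gap = math.sqrt((0.5 * (cxx - cyy)) ** 2 + cxy ** 2)
    return half_trace + gap, half_trace - gap

def multiscale_spectrum(points, center, radii):
    """Map each radius r to the covariance eigenvalues of the samples in the ball B(center, r)."""
    spectrum = {}
    for r in radii:
        ball = [p for p in points if math.dist(p, center) <= r]
        spectrum[r] = covariance_eigenvalues_2d(ball)
    return spectrum

# Toy data (assumption): a segment along the x-axis with Gaussian noise in y.
rng = random.Random(1)
sigma = 0.01
points = [(rng.uniform(-1.0, 1.0), rng.gauss(0.0, sigma)) for _ in range(5000)]

radii = [0.1, 0.2, 0.4, 0.8]
spectrum = multiscale_spectrum(points, (0.0, 0.0), radii)

# Heuristic (assumption): an eigenvalue that grows markedly from the smallest
# to the largest scale is attributed to the set itself; a flat one to noise.
growth = [spectrum[radii[-1]][i] / spectrum[radii[0]][i] for i in range(2)]
estimated_dimension = sum(1 for g in growth if g > 4.0)
```

In this toy setting the leading eigenvalue scales roughly like r^2 while the trailing one stays near `sigma ** 2`, which is the kind of separation across scales that the abstract refers to.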
Tangent space estimation for smooth embeddings of Riemannian manifolds
Cited by 3 (0 self)
Abstract. Numerous dimensionality reduction problems in data analysis involve the recovery of low-dimensional models or the learning of manifolds underlying sets of data. Many manifold learning methods require the estimation of the tangent space of the manifold at a point from locally available data samples. Local sampling conditions such as (i) the size of the neighborhood (sampling width) and (ii) the number of samples in the neighborhood (sampling density) affect the performance of learning algorithms. In this work, we propose a theoretical analysis of local sampling conditions for the estimation of the tangent space at a point P lying on an m-dimensional Riemannian manifold S in R^n. Assuming a smooth embedding of S in R^n, we estimate the tangent space T_P S by performing a Principal Component Analysis (PCA) on points sampled from the neighborhood of P on S. Our analysis explicitly takes into account the second-order properties of the manifold at P, namely the principal curvatures as well as the higher-order terms. We consider a random sampling framework and leverage recent results from random matrix theory to derive conditions on the sampling width and the local sampling density for an accurate estimation of tangent subspaces. We measure the estimation accuracy by the angle between the estimated tangent space T̂_P S and the true tangent space T_P S and we give conditions for this angle to be bounded with high probability. In particular, we observe that the local sampling conditions are highly dependent on the correlation between the components in the second-order local approximation of the manifold. We finally provide numerical simulations to validate our theoretical findings.
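The estimation procedure described in this abstract (PCA on samples from a neighborhood of P, with accuracy measured by the angle to the true tangent space) can be sketched on the simplest possible case. Everything specific here is an assumption made for illustration and does not come from the paper: the manifold is the unit circle in R^2 (m = 1, n = 2), P = (1, 0), the sampling width is 0.3 radians of arc, the sample count is 200, and the function names are hypothetical.

```python
import math
import random

def estimate_tangent_2d(points):
    """Unit vector along the leading principal direction of 2-D points (local PCA)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Orientation of the major axis of a symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (math.cos(theta), math.sin(theta))

def subspace_angle(u, v):
    """Angle in [0, pi/2] between the lines spanned by unit vectors u and v."""
    dot = abs(u[0] * v[0] + u[1] * v[1])
    return math.acos(min(1.0, dot))

def sample_circle_neighborhood(width, count, rng):
    """Draw `count` points on the unit circle within arc half-width `width` of P = (1, 0)."""
    angles = (rng.uniform(-width, width) for _ in range(count))
    return [(math.cos(t), math.sin(t)) for t in angles]

rng = random.Random(0)
true_tangent = (0.0, 1.0)  # tangent line of the unit circle at P = (1, 0)
points = sample_circle_neighborhood(width=0.3, count=200, rng=rng)
estimated_tangent = estimate_tangent_2d(points)
angle = subspace_angle(estimated_tangent, true_tangent)
```

Varying `width` and `count` in this sketch gives a rough feel for how sampling width and sampling density interact with curvature, although it says nothing about the bounds derived in the paper.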
Tangent space estimation for smooth embeddings of Riemannian manifolds
 Information and Inference: A Journal of the IMA (2013) 2, 69–114, doi:10.1093/imaiai/iat003
Numerous dimensionality reduction problems in data analysis involve the recovery of low-dimensional models or the learning of manifolds underlying sets of data. Many manifold learning methods require the estimation of the tangent space of the manifold at a point from locally available data samples. Local sampling conditions such as (i) the size of the neighborhood (sampling width) and (ii) the number of samples in the neighborhood (sampling density) affect the performance of learning algorithms. In this work, we propose a theoretical analysis of local sampling conditions for the estimation of the tangent space at a point P lying on an m-dimensional Riemannian manifold S in R^n. Assuming a smooth embedding of S in R^n, we estimate the tangent space T_P S by performing a principal component analysis (PCA) on points sampled from the neighborhood of P on S. Our analysis explicitly takes into account the second-order properties of the manifold at P, namely the principal curvatures as well as the higher-order terms. We consider a random sampling framework and leverage recent results from random matrix theory to derive conditions on the sampling width and the local sampling density for an accurate estimation of tangent subspaces. We measure the estimation accuracy by the angle between the estimated tangent space T̂_P S and the true tangent space T_P S and we give conditions for this angle to be bounded with high probability. In particular, we observe that the local sampling conditions are highly dependent on the correlation between the components in the second-order local approximation of the manifold. We finally provide numerical simulations to validate our theoretical findings.
G. Wolf, A. Averbuch, Linear-projection diffusion on ... (article in press)
 2012
Applied and Computational Harmonic Analysis
 www.elsevier.com/locate/acha
Applied and Computational Harmonic Analysis
This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the author's institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or TeX form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier's archiving and manuscript policies are encouraged to visit: