Results 1–10 of 24
Multi-Manifold Semi-Supervised Learning
"... We study semisupervised learning when the data consists of multiple intersecting manifolds. We give a finite sample analysis to quantify the potential gain of using unlabeled data in this multimanifold setting. We then propose a semisupervised learning algorithm that separates different manifolds ..."
Abstract

Cited by 147 (9 self)
We study semi-supervised learning when the data consists of multiple intersecting manifolds. We give a finite sample analysis to quantify the potential gain of using unlabeled data in this multi-manifold setting. We then propose a semi-supervised learning algorithm that separates different manifolds into decision sets, and performs supervised learning within each set. Our algorithm involves a novel application of Hellinger distance and size-constrained spectral clustering. Experiments demonstrate the benefit of our multi-manifold semi-supervised learning approach.
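The decision-set idea above can be sketched in a few lines: partition the data by plain spectral bi-partitioning (substituting for the paper's Hellinger-distance, size-constrained variant), then classify within each set using only the labels that set contains. Everything here — the Gaussian affinity, the two-set split, the 1-NN classifier — is an illustrative assumption, not the authors' algorithm.

```python
import numpy as np

def decision_set_ssl(X, y, sigma=1.0):
    """Split data into two decision sets by spectral bi-partitioning, then
    run 1-NN classification inside each set using its own labeled points.
    y[i] is a class label, or -1 for unlabeled points."""
    y = np.asarray(y)
    # Gaussian affinity and symmetric normalized Laplacian
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-D2 / (2 * sigma ** 2))
    d = W.sum(1)
    L = np.eye(len(X)) - W / np.sqrt(np.outer(d, d))
    # The sign of the Fiedler vector (second-smallest eigenvector) defines the sets
    _, vecs = np.linalg.eigh(L)
    sets = vecs[:, 1] > 0
    pred = y.copy()
    for s in (False, True):
        idx = np.where(sets == s)[0]
        lab = idx[y[idx] >= 0]            # labeled points in this decision set
        if len(lab) == 0:
            lab = np.where(y >= 0)[0]     # fall back to all labeled points
        for i in idx:
            if y[i] < 0:
                pred[i] = y[lab[np.argmin(D2[i, lab])]]
    return pred
```

With two well-separated clusters and one label each, every unlabeled point inherits the label of its own cluster, even though a global 1-NN with a bad metric could leak labels across manifolds.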
Foundations of a Multiway Spectral Clustering Framework for Hybrid Linear Modeling
, 2009
"... Abstract The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem, however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curva ..."
Abstract

Cited by 37 (10 self)
The problem of Hybrid Linear Modeling (HLM) is to model and segment data using a mixture of affine subspaces. Different strategies have been proposed to solve this problem; however, rigorous analysis justifying their performance is missing. This paper suggests the Theoretical Spectral Curvature Clustering (TSCC) algorithm for solving the HLM problem and provides careful analysis to justify it. The TSCC algorithm is practically a combination of Govindu's multi-way spectral clustering framework (CVPR 2005) and Ng et al.'s spectral clustering algorithm (NIPS 2001). The main result of this paper states that if the given data is sampled from a mixture of distributions concentrated around affine subspaces, then with high sampling probability the TSCC algorithm segments the different underlying clusters well. The goodness of clustering depends on the within-cluster errors, the between-clusters interaction, and a tuning parameter applied by TSCC. The proof also provides new insights for the analysis of Ng et al. (NIPS 2001). Keywords: Hybrid linear modeling · d-flats clustering · Multi-way clustering · Spectral clustering · Polar curvature · Perturbation analysis · Concentration inequalities. Communicated by Albert Cohen. This work was supported by NSF grant #0612608.
Sparse Manifold Clustering and Embedding
"... We propose an algorithm called Sparse Manifold Clustering and Embedding (SMCE) for simultaneous clustering and dimensionality reduction of data lying in multiple nonlinear manifolds. Similar to most dimensionality reduction methods, SMCE finds a small neighborhood around each data point and connects ..."
Abstract

Cited by 31 (1 self)
We propose an algorithm called Sparse Manifold Clustering and Embedding (SMCE) for simultaneous clustering and dimensionality reduction of data lying in multiple nonlinear manifolds. Similar to most dimensionality reduction methods, SMCE finds a small neighborhood around each data point and connects each point to its neighbors with appropriate weights. The key difference is that SMCE finds both the neighbors and the weights automatically. This is done by solving a sparse optimization problem, which encourages selecting nearby points that lie in the same manifold and approximately span a low-dimensional affine subspace. The optimal solution encodes information that can be used for clustering and dimensionality reduction using spectral clustering and embedding. Moreover, the size of the optimal neighborhood of a data point, which can be different for different points, provides an estimate of the dimension of the manifold to which the point belongs. Experiments demonstrate that our method can effectively handle multiple manifolds that are very close to each other, manifolds with non-uniform sampling and holes, as well as estimate the intrinsic dimensions of the manifolds.
Mathematical Methods for Diffusion MRI Processing
, 2008
"... In this article, we review recent mathematical models and computational methods for the processing of diffusion Magnetic Resonance Images, including stateoftheart reconstruction of diffusion models, cerebral white matter connectivity analysis, and segmentation techniques. We focus on Diffusion Te ..."
Abstract

Cited by 14 (2 self)
In this article, we review recent mathematical models and computational methods for the processing of diffusion Magnetic Resonance Images, including state-of-the-art reconstruction of diffusion models, cerebral white matter connectivity analysis, and segmentation techniques. We focus on Diffusion Tensor Images (DTI) and Q-Ball Images (QBI).
Estimation of intrinsic dimensionality of samples from noisy low-dimensional manifolds in high dimensions with multiscale SVD
, 2009
"... The problem of estimating the intrinsic dimensionality of certain point clouds is of interest in many applications in statistics and analysis of highdimensional data sets. Our setting is the following: the points are sampled from a manifoldM of dimension k, embedded in RD, with k D, and corrupte ..."
Abstract

Cited by 13 (4 self)
The problem of estimating the intrinsic dimensionality of certain point clouds is of interest in many applications in statistics and analysis of high-dimensional data sets. Our setting is the following: the points are sampled from a manifold M of dimension k, embedded in R^D, with k ≪ D, and corrupted by D-dimensional noise. When M is a linear manifold (hyperplane), one may analyse this situation by SVD, hoping the noise only slightly perturbs the rank-k covariance matrix. When M is a nonlinear manifold, SVD performed globally may dramatically overestimate the intrinsic dimensionality. We discuss a multiscale version of SVD that is useful in estimating the intrinsic dimensionality of nonlinear manifolds. Index Terms — Multiscale analysis, intrinsic dimensionality, high dimensional data, manifolds, point clouds, sample
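A toy illustration of the multiscale idea (a sketch only; the paper's estimator is considerably more refined, and the helper names and the 0.2 threshold below are hypothetical choices): restrict the data to a ball of radius r, take the SVD, and count how many singular values stand out above the noise floor.

```python
import numpy as np

def local_singular_values(X, center, r):
    """Singular values of the centered points of X inside the ball B(center, r)."""
    P = X[np.linalg.norm(X - center, axis=1) <= r]
    return np.linalg.svd(P - P.mean(0), compute_uv=False)

def intrinsic_dim(X, center, r, thresh=0.2):
    """Crude multiscale-SVD-style estimate: at a well-chosen scale r, the top
    k singular values (tangent directions) dominate, while the remaining ones
    sit at the noise level; count those above thresh * largest."""
    s = local_singular_values(X, center, r)
    return int((s > thresh * s[0]).sum())
```

On a noisy line in R^3 (k = 1, D = 3), the local SVD at a moderate scale shows one dominant singular value and two noise-level ones, giving the correct dimension estimate.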
Kernelized spectral curvature clustering (KSCC)
 In ICCV Workshop on Dynamical Vision
, 2009
"... Multimanifold modeling is increasingly used in segmentation and data representation tasks in computer vision and related fields. While the general problem, modeling data by mixtures of manifolds, is very challenging, several approaches exist for modeling data by mixtures of affine subspaces (which ..."
Abstract

Cited by 9 (4 self)
Multi-manifold modeling is increasingly used in segmentation and data representation tasks in computer vision and related fields. While the general problem, modeling data by mixtures of manifolds, is very challenging, several approaches exist for modeling data by mixtures of affine subspaces (which is often referred to as hybrid linear modeling). We translate some important instances of multi-manifold modeling to hybrid linear modeling in embedded spaces, without explicitly performing the embedding but applying the kernel trick. The resulting algorithm, Kernel Spectral Curvature Clustering, uses kernels at two levels: both as an implicit embedding method to linearize non-flat manifolds and as a principled method to convert a multi-way affinity problem into a spectral clustering one. We demonstrate the effectiveness of the method by comparing it with other state-of-the-art methods on both synthetic data and a real-world problem of segmenting multiple motions from two perspective camera views.
Approximation of Points on low-dimensional manifolds via random linear projections. arXiv:1204.3337
"... ar ..."
Data Skeletonization via Reeb Graphs
"... Recovering hidden structure from complex and noisy nonlinear data is one of the most fundamental problems in machine learning and statistical inference. While such data is often highdimensional, it is of interest to approximate it with a lowdimensional or even onedimensional space, since many im ..."
Abstract

Cited by 8 (0 self)
Recovering hidden structure from complex and noisy nonlinear data is one of the most fundamental problems in machine learning and statistical inference. While such data is often high-dimensional, it is of interest to approximate it with a low-dimensional or even one-dimensional space, since many important aspects of data are often intrinsically low-dimensional. Furthermore, there are many scenarios where the underlying structure is graph-like, e.g., river/road networks or various trajectories. In this paper, we develop a framework to extract, as well as to simplify, a one-dimensional "skeleton" from unorganized data using the Reeb graph. Our algorithm is very simple, does not require complex optimizations and can be easily applied to unorganized high-dimensional data such as point clouds or proximity graphs. It can also represent arbitrary graph structures in the data. We also give theoretical results to justify our method. We provide a number of experiments to demonstrate the effectiveness and generality of our algorithm, including comparisons to existing methods, such as principal curves. We believe that the simplicity and practicality of our algorithm will help to promote skeleton graphs as a data analysis tool for a broad range of applications.
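A minimal discrete Reeb graph along these lines (a sketch under simplifying assumptions — the paper's construction and its simplification step are more involved): bin the vertices of a proximity graph by a scalar function f, contract each connected component within a bin to a single Reeb node, and keep the edges that cross bins.

```python
def reeb_graph(n, edges, f, n_bins):
    """Discrete Reeb graph of a function f on a graph with n vertices.
    Returns (nodes, arcs): nodes are (bin, component-root) pairs, arcs are
    pairs of nodes joined by an edge crossing two bins."""
    lo, hi = min(f), max(f)
    width = (hi - lo) / n_bins
    b = [min(int((f[v] - lo) / width), n_bins - 1) for v in range(n)]

    parent = list(range(n))                 # union-find over vertices
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:                      # merge neighbors in the same bin
        if b[u] == b[v]:
            parent[find(u)] = find(v)

    nodes = {(b[v], find(v)) for v in range(n)}
    arcs = {frozenset({(b[u], find(u)), (b[v], find(v))})
            for u, v in edges if b[u] != b[v]}
    return nodes, arcs
```

On a cycle graph with a height-like function, the resulting Reeb graph is itself a loop — precisely the kind of graph structure that a principal curve cannot represent.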
Multiscale geometric methods for data sets I: Multiscale covariances, noise and curvature,” submitted
, 2012
"... Large data sets are often modeled as being noisy samples from probability distributionsµinR D, withDlarge. It has been noticed that oftentimes the supportM of these probability distributions seems to be wellapproximated by lowdimensional sets, perhaps even by manifolds. We shall consider sets that ..."
Abstract

Cited by 7 (1 self)
Large data sets are often modeled as being noisy samples from probability distributions µ in R^D, with D large. It has been noticed that oftentimes the support M of these probability distributions seems to be well-approximated by low-dimensional sets, perhaps even by manifolds. We shall consider sets that are locally well approximated by k-dimensional planes, with k ≪ D, with k-dimensional manifolds isometrically embedded in R^D being a special case. Samples from µ are furthermore corrupted by D-dimensional noise. Certain tools from multiscale geometric measure theory and harmonic analysis seem well-suited to be adapted to the study of samples from such probability distributions, in order to yield quantitative geometric information about them. In this paper we introduce and study multiscale covariance matrices, i.e. covariances corresponding to the distribution restricted to a ball of radius r, with a fixed center and varying r, and under rather general geometric assumptions we study how their empirical, noisy counterparts behave. We prove that in the range of scales where these covariance matrices are most informative, the empirical, noisy covariances are close to their expected, noiseless counterparts. In fact, this is true as soon as the number of samples in the balls where the covariance matrices are computed is linear in the intrinsic dimension of M. As an application, we present an algorithm for estimating the intrinsic dimension of M.
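The multiscale covariance can be illustrated on a clean curved example (an illustrative sketch; `multiscale_cov_eigs` is a hypothetical helper, not the paper's estimator): on a unit circle, the second covariance eigenvalue of the ball B(center, r) is a curvature term of order r^4 against a tangent term of order r^2, so the gap between them identifies k = 1 at small scales and narrows as curvature contaminates larger ones.

```python
import numpy as np

def multiscale_cov_eigs(X, center, radii):
    """Eigenvalues (descending) of the empirical covariance of X restricted
    to the balls B(center, r), one array per radius r."""
    out = []
    for r in radii:
        P = X[np.linalg.norm(X - center, axis=1) <= r]
        out.append(np.sort(np.linalg.eigvalsh(np.cov(P.T)))[::-1])
    return out
```

At radius 0.3 the second eigenvalue is well under 1% of the first; at radius 1.5 the arc bends enough that the ratio climbs above 20%, so the informative range of scales here is the small-r regime.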
ON THE NON-UNIFORM COMPLEXITY OF BRAIN CONNECTIVITY
, 2007
"... A stratification and manifold learning approach for analyzing High Angular Resolution Diffusion Imaging (HARDI) data is introduced in this paper. HARDI data provides highdimensional signals measuring the complex microstructure of biological tissues, such as the cerebral white matter. We show that th ..."
Abstract

Cited by 5 (3 self)
A stratification and manifold learning approach for analyzing High Angular Resolution Diffusion Imaging (HARDI) data is introduced in this paper. HARDI data provides high-dimensional signals measuring the complex microstructure of biological tissues, such as the cerebral white matter. We show that these high-dimensional spaces may be understood as unions of manifolds of varying dimensions/complexity and densities. With such analysis, we use clustering to characterize the structural complexity of the white matter. We briefly present the underlying framework and numerical experiments illustrating this original and promising approach. Key words: Stratification and manifold learning, DTI, HARDI, complexity, white matter connectivity.