Results 1–10 of 12
Robust Recovery of Subspace Structures by Low-Rank Representation
Abstract

Cited by 119 (25 self)
In this work we address the subspace recovery problem. Given a set of data samples (vectors) approximately drawn from a union of multiple subspaces, our goal is to segment the samples into their respective subspaces and to correct the possible errors as well. To this end, we propose a novel method termed Low-Rank Representation (LRR), which seeks the lowest-rank representation among all the candidates that can represent the data samples as linear combinations of the bases in a given dictionary. It is shown that LRR solves the subspace recovery problem well: when the data is clean, we prove that LRR exactly captures the true subspace structures; for data contaminated by outliers, we prove that under certain conditions LRR can exactly recover the row space of the original data and detect the outliers as well; for data corrupted by arbitrary errors, LRR can also approximately recover the row space with theoretical guarantees. Since the subspace membership is provably determined by the row space, these results further imply that LRR can perform robust subspace segmentation and error correction in an efficient way.
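For clean data, the exact recovery this abstract describes has a well-known closed form: with the data matrix itself as the dictionary, the lowest-rank representation is Z = V Vᵀ built from the skinny SVD X = U S Vᵀ (the Shape Interaction Matrix). A minimal numpy sketch on synthetic data from two independent subspaces; all sizes and seeds here are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# 20 clean samples from each of two independent 2-D subspaces of R^10.
U1 = np.linalg.qr(rng.standard_normal((10, 2)))[0]
U2 = np.linalg.qr(rng.standard_normal((10, 2)))[0]
X = np.hstack([U1 @ rng.standard_normal((2, 20)),
               U2 @ rng.standard_normal((2, 20))])   # 10 x 40

# With dictionary A = X, the lowest-rank representation solving
# min ||Z||_* s.t. X = XZ is Z = V V^T from the skinny SVD X = U S V^T.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = int(np.sum(s > 1e-10))    # numerical rank (here 4)
V = Vt[:r].T
Z = V @ V.T

# Z reconstructs the data exactly, and for independent subspaces it is
# block-diagonal: entries linking the two groups of samples vanish.
exact = np.allclose(X @ Z, X)
cross = np.abs(Z[:20, 20:]).max()
```

The block-diagonal pattern of Z is what makes the representation usable as an affinity matrix for segmentation.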
A geometric analysis of subspace clustering with outliers
 ANNALS OF STATISTICS, 2012
Abstract

Cited by 61 (3 self)
This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information about their dimensions. We develop a novel geometric analysis of an algorithm named sparse subspace clustering (SSC) [11], which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also prove that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. A numerical study complements our theoretical analysis and demonstrates the effectiveness of these methods.
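The self-expressiveness idea underlying SSC can be sketched with off-the-shelf tools: each point is written as a sparse combination of the remaining points via an ℓ1-regularized regression, and spectral clustering is run on the symmetrized affinity. The Lasso penalty, subspace dimensions, and cluster count below are illustrative choices, not the paper's settings:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(1)

# 15 points on each of two random lines (1-D subspaces) in R^5.
d1, d2 = rng.standard_normal(5), rng.standard_normal(5)
X = np.hstack([np.outer(d1, rng.standard_normal(15)),
               np.outer(d2, rng.standard_normal(15))])   # 5 x 30
n = X.shape[1]

# Self-expressiveness: represent each point sparsely by the others,
# excluding the point itself to rule out the trivial solution.
C = np.zeros((n, n))
for i in range(n):
    mask = np.arange(n) != i
    model = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
    model.fit(X[:, mask], X[:, i])
    C[mask, i] = model.coef_

# Symmetrized affinity, then spectral clustering (2 clusters assumed).
W = np.abs(C) + np.abs(C).T
labels = SpectralClustering(n_clusters=2, affinity='precomputed',
                            random_state=0).fit_predict(W)
```

Because each point on one line can be reproduced exactly by other points on the same line, the sparse coefficient matrix C concentrates within the two blocks, which is what the affinity graph exploits.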
Fixed-rank representation for unsupervised visual learning
Abstract

Cited by 15 (2 self)
Subspace clustering and feature extraction are two of the most commonly used unsupervised learning techniques in computer vision and pattern recognition. State-of-the-art techniques for subspace clustering make use of recent advances in sparsity and rank minimization. However, existing techniques are computationally expensive and may result in degenerate solutions that degrade clustering performance in the case of insufficient data sampling. To partially solve these problems, and inspired by existing work on matrix factorization, this paper proposes fixed-rank representation (FRR) as a unified framework for unsupervised visual learning. FRR is able to reveal the structure of multiple subspaces in closed form when the data is noiseless. Furthermore, we prove that under some suitable conditions, even with insufficient observations, FRR can still reveal the true subspace memberships. To achieve robustness to outliers and noise, a sparse regularizer is introduced into the FRR framework. Beyond subspace clustering, FRR can be used for unsupervised feature extraction. As a non-trivial by-product, a fast numerical solver is developed for FRR. Experimental results on both synthetic data and real applications validate our theoretical analysis and demonstrate the benefits of FRR for unsupervised visual learning.
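The closed-form flavor of a fixed-rank self-representation can be illustrated with a truncated SVD: restricting Z to rank r and minimizing ||X − XZ||_F admits the minimizer Z = V_r V_rᵀ, since XZ then equals the best rank-r approximation of X (Eckart–Young). This is a hedged sketch of the general principle, not the paper's exact FRR model:

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy data whose clean part has rank 4.
A = rng.standard_normal((10, 4)) @ rng.standard_normal((4, 40))
X = A + 0.01 * rng.standard_normal((10, 40))

# Fixed-rank self-representation: restrict Z to rank r and minimize
# ||X - X Z||_F. One minimizer is Z = V_r V_r^T from the truncated SVD,
# because X Z then equals the best rank-r approximation of X.
r = 4
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Z = Vt[:r].T @ Vt[:r]

# The residual matches the Eckart-Young value: the tail singular values.
res = np.linalg.norm(X - X @ Z)
tail = np.sqrt(np.sum(s[r:] ** 2))
```

Fixing the rank up front, rather than penalizing the nuclear norm, is what lets this kind of solution be computed directly from a partial SVD.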
Provable Subspace Clustering: When LRR meets SSC
Abstract

Cited by 2 (1 self)
Sparse Subspace Clustering (SSC) and Low-Rank Representation (LRR) are both considered state-of-the-art methods for subspace clustering. The two methods are fundamentally similar in that both are convex optimizations exploiting the intuition of “Self-Expressiveness”. The main difference is that SSC minimizes the vector ℓ1 norm of the representation matrix to induce sparsity, while LRR minimizes the nuclear norm (aka trace norm) to promote a low-rank structure. Because the representation matrix is often simultaneously sparse and low-rank, we propose a new algorithm, termed Low-Rank Sparse Subspace Clustering (LRSSC), by combining SSC and LRR, and develop theoretical guarantees of when the algorithm succeeds. The results reveal interesting insights into the strengths and weaknesses of SSC and LRR, and demonstrate how LRSSC can take advantage of both methods in preserving the “Self-Expressiveness Property” and “Graph Connectivity” at the same time.
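The two norms being combined correspond to two proximal operators that ADMM-style solvers for such objectives alternate between: entrywise soft thresholding for the ℓ1 term and singular value thresholding for the nuclear norm. A minimal sketch of the two building blocks (a full LRSSC solver would wrap these in ADMM iterations; the threshold 0.5 is arbitrary):

```python
import numpy as np

def soft_threshold(M, tau):
    # Prox of tau * ||.||_1: entrywise shrinkage (the sparsity step).
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svt(M, tau):
    # Prox of tau * ||.||_*: singular value thresholding (low-rank step).
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 4)) @ rng.standard_normal((4, 6))  # rank 4

S = soft_threshold(M, 0.5)   # promotes entrywise sparsity
L = svt(M, 0.5)              # shrinks the spectrum, promotes low rank
```

An LRSSC-style ADMM would apply these two operators to separate copies of the representation matrix and reconcile the copies through dual updates.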
Fast Low-Rank Subspace Segmentation
 JOURNAL OF IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Abstract

Cited by 1 (0 self)
Subspace segmentation is the problem of segmenting (or grouping) a set of n data points into a number of clusters, with each cluster being a (linear) subspace. Recently established algorithms such as Sparse Subspace Clustering (SSC), Low-Rank Representation (LRR) and Low-Rank Subspace Segmentation (LRSS) are effective in terms of segmentation accuracy, but computationally inefficient, as they possess a complexity of O(n^3), which is too high to afford when n is very large. In this paper we devise a fast subspace segmentation algorithm with complexity of O(n log(n)). This is achieved by firstly using partial Singular Value Decomposition (SVD) to approximate the solution of LRSS, secondly utilizing Locality Sensitive Hashing (LSH) to build a sparse affinity graph that encodes the subspace memberships, and finally adopting a fast Normalized Cut (NCut) algorithm to produce the final segmentation results. Besides its high efficiency, our algorithm also achieves effectiveness comparable to the original LRSS method.
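The three-stage pipeline can be approximated with standard components; the sketch below uses scikit-learn stand-ins (TruncatedSVD for the partial SVD, a k-nearest-neighbor graph in place of LSH, and SpectralClustering for NCut), so it mirrors the structure of the method rather than reproducing it:

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(4)

# 50 samples from each of two 2-D subspaces of R^20, as columns.
B1 = np.linalg.qr(rng.standard_normal((20, 2)))[0]
B2 = np.linalg.qr(rng.standard_normal((20, 2)))[0]
X = np.hstack([B1 @ rng.standard_normal((2, 50)),
               B2 @ rng.standard_normal((2, 50))])

# Stage 1: partial SVD -- embed each sample via its top right-singular
# directions (an approximation to the row-space features used by LRSS).
V = TruncatedSVD(n_components=4, random_state=0).fit_transform(X.T)
V /= np.linalg.norm(V, axis=1, keepdims=True)   # angle-based similarity

# Stage 2: sparse affinity graph. (The paper uses LSH; a k-NN graph is
# a simple stand-in that also yields a sparse graph.)
G = kneighbors_graph(V, n_neighbors=5, include_self=False)
W = 0.5 * (G + G.T).toarray()

# Stage 3: normalized-cut-style spectral clustering on the sparse graph.
labels = SpectralClustering(n_clusters=2, affinity='precomputed',
                            random_state=0).fit_predict(W)
```

The point of the design is that every stage avoids the dense n × n affinity matrix that makes the original methods O(n^3).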
Algorithms and theory for clustering . . .
, 2014
Abstract

Cited by 1 (0 self)
In this dissertation we discuss three problems characterized by hidden structure or information. The first part of this thesis focuses on extracting subspace structures from data. Subspace Clustering is the problem of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space. As with most clustering problems, popular techniques for subspace clustering are difficult to analyze theoretically, as they are often non-convex in nature. Theoretical analysis of these algorithms becomes even more challenging in the presence of noise and missing data. We introduce a collection of subspace clustering algorithms which are tractable and provably robust to various forms of data imperfections. We further illustrate our methods with numerical experiments on a wide variety of data segmentation problems. In the second part of the thesis, we consider the problem of recovering the seemingly hidden phase of an object from intensity-only measurements, a problem which naturally appears in X-ray crystallography and related disciplines. We formulate the
A Counterexample for the Validity of Using Nuclear Norm as a Convex Surrogate of Rank
Abstract

Cited by 1 (0 self)
Abstract. Rank minimization has attracted a lot of attention due to its robustness in data recovery. To overcome the computational difficulty, rank is often replaced with the nuclear norm. For several rank minimization problems, such a replacement has been theoretically proven to be valid, i.e., the solution to the nuclear norm minimization problem is also the solution to the rank minimization problem. Although it is easy to believe that such a replacement may not always be valid, no concrete example had ever been found. We argue that such validity checking cannot be done by numerical computation and show, by analyzing the noiseless latent low-rank representation (LatLRR) model, that even for very simple rank minimization problems the validity may break down. As a by-product, we find that the solution to the nuclear norm minimization formulation of LatLRR is non-unique. Hence the results of LatLRR reported in the literature may be questionable.
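The gap between rank and its convex surrogate is easy to exhibit: the nuclear norm is the sum of singular values, so two matrices of equal rank can have very different nuclear norms, and a nuclear-norm minimizer need not be a rank minimizer. A generic numerical illustration (not the paper's LatLRR construction):

```python
import numpy as np

def nuclear_norm(M):
    # Sum of singular values -- the usual convex surrogate for rank.
    return np.linalg.svd(M, compute_uv=False).sum()

# Two matrices of the SAME rank with very different nuclear norms:
# minimizing the surrogate need not minimize rank itself.
A = np.diag([1.0, 1.0, 0.0])      # rank 2, nuclear norm 2.0
B = np.diag([10.0, 0.1, 0.0])     # rank 2, nuclear norm 10.1
```

A numerical solver only ever sees such finite-precision values, which is one intuition behind the paper's claim that the validity of the surrogate cannot be settled by computation alone.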
Global Solver and Its Efficient Approximation for Variational Bayesian Low-Rank Subspace Clustering
Abstract
When a probabilistic model and its prior are given, Bayesian learning offers inference with automatic parameter tuning. However, Bayesian learning is often obstructed by computational difficulty: rigorous Bayesian learning is intractable in many models, and its variational Bayesian (VB) approximation is prone to local minima. In this paper, we overcome this difficulty for low-rank subspace clustering (LRSC) by providing an exact global solver and its efficient approximation. LRSC extracts a low-dimensional structure of data by embedding samples into the union of low-dimensional subspaces, and its variational Bayesian variant has shown good performance. We first prove a key property that the VB-LRSC model is highly redundant. Thanks to this property, the optimization problem of VB-LRSC can be separated into small subproblems, each of which has only a small number of unknown variables. Our exact global solver relies on another key property: the stationary condition of each subproblem consists of a set of polynomial equations, which is solvable with the homotopy method. For further computational efficiency, we also propose an efficient approximate variant, for which the stationary condition can be written as a polynomial equation in a single variable. Experimental results show the usefulness of our approach.
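The approximate variant's reduction to a single-variable polynomial equation means each stationary condition can be solved globally by companion-matrix root finding, rather than by a local iterative method. A sketch with numpy.roots, using a hypothetical cubic as a stand-in for the model's actual stationary condition:

```python
import numpy as np

# HYPOTHETICAL single-variable stationary condition p(x) = 0, standing
# in for the approximate variant's per-subproblem polynomial equation.
coeffs = [1.0, -3.0, 0.0, 2.0]        # p(x) = x^3 - 3x^2 + 2
roots = np.roots(coeffs)              # companion-matrix eigenvalues

# Keep the real stationary points; ALL candidates are found at once,
# so the global optimum can be selected by direct comparison.
real_roots = np.sort(roots[np.abs(roots.imag) < 1e-9].real)
```

Here p factors as (x − 1)(x² − 2x − 2), so the real stationary points are 1 − √3, 1, and 1 + √3; a solver would evaluate its objective at each and keep the best.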