Results 1  10
of
126
Robust recovery of subspace structures by lowrank representation
 IEEE Trans. Pattern Anal. Mach. Intell
, 2013
"... Abstract—In this paper, we address the subspace clustering problem. Given a set of data samples (vectors) approximately drawn from a union of multiple subspaces, our goal is to cluster the samples into their respective subspaces and remove possible outliers as well. To this end, we propose a novel o ..."
Abstract

Cited by 113 (23 self)
 Add to MetaCart
(Show Context)
Abstract—In this paper, we address the subspace clustering problem. Given a set of data samples (vectors) approximately drawn from a union of multiple subspaces, our goal is to cluster the samples into their respective subspaces and remove possible outliers as well. To this end, we propose a novel objective function named LowRank Representation (LRR), which seeks the lowest rank representation among all the candidates that can represent the data samples as linear combinations of the bases in a given dictionary. It is shown that the convex program associated with LRR solves the subspace clustering problem in the following sense: When the data is clean, we prove that LRR exactly recovers the true subspace structures; when the data are contaminated by outliers, we prove that under certain conditions LRR can exactly recover the row space of the original data and detect the outlier as well; for data corrupted by arbitrary sparse errors, LRR can also approximately recover the row space with theoretical guarantees. Since the subspace membership is provably determined by the row space, these further imply that LRR can perform robust subspace clustering and error correction in an efficient and effective way. Index Terms—Lowrank representation, subspace clustering, segmentation, outlier detection Ç 1
A geometric analysis of subspace clustering with outliers
 ANNALS OF STATISTICS
, 2012
"... This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information a ..."
Abstract

Cited by 59 (3 self)
 Add to MetaCart
(Show Context)
This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information about their dimensions. We develop a novel geometric analysis of an algorithm named sparse subspace clustering (SSC) [11], which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also prove that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. A numerical study complements our theoretical analysis and demonstrates the effectiveness of these methods.
Linearized Alternating Direction Method with Adaptive Penalty for LowRank Representation
"... Many machine learning and signal processing problems can be formulated as linearly constrained convex programs, which could be efficiently solved by the alternating direction method (ADM). However, usually the subproblems in ADM are easily solvable only when the linear mappings in the constraints ar ..."
Abstract

Cited by 51 (7 self)
 Add to MetaCart
(Show Context)
Many machine learning and signal processing problems can be formulated as linearly constrained convex programs, which could be efficiently solved by the alternating direction method (ADM). However, usually the subproblems in ADM are easily solvable only when the linear mappings in the constraints are identities. To address this issue, we propose a linearized ADM (LADM) method by linearizing the quadratic penalty term and adding a proximal term when solving the subproblems. For fast convergence, we also allow the penalty to change adaptively according a novel update rule. We prove the global convergence of LADM with adaptive penalty (LADMAP). As an example, we apply LADMAP to solve lowrank representation (LRR), which is an important subspace clustering technique yet suffers from high computation cost. By combining LADMAP with a skinny SVD representation technique, we are able to reduce the complexity O(n 3) of the original ADM based method to O(rn 2), where r and n are the rank and size of the representation matrix, respectively, hence making LRR possible for large scale applications. Numerical experiments verify that for LRR our LADMAP based methods are much faster than stateoftheart algorithms. 1
A Closed Form Solution to Robust Subspace Estimation and Clustering
"... We consider the problem of fitting one or more subspaces to a collection of data points drawn from the subspaces and corrupted by noise/outliers. We pose this problem as a rank minimization problem, where the goal is to decompose the corrupted data matrix as the sum of a clean, selfexpressive, low ..."
Abstract

Cited by 45 (3 self)
 Add to MetaCart
(Show Context)
We consider the problem of fitting one or more subspaces to a collection of data points drawn from the subspaces and corrupted by noise/outliers. We pose this problem as a rank minimization problem, where the goal is to decompose the corrupted data matrix as the sum of a clean, selfexpressive, lowrank dictionary plus a matrix of noise/outliers. Our key contribution is to show that, for noisy data, this nonconvex problem can be solved very efficiently and in closed form from the SVD of the noisy data matrix. Remarkably, this is true for both one or more subspaces. An important difference with respect to existing methods is that our framework results in a polynomial thresholding of the singular values with minimal shrinkage. Indeed, a particular case of our framework in the case of a single subspace leads to classical PCA, which requires no shrinkage. In the case of multiple subspaces, our framework provides an affinity matrix that can be used to cluster the data according to the subspaces. In the case of data corrupted by outliers, a closedform solution appears elusive. We thus use an augmented Lagrangian optimization framework, which requires a combination of our proposed polynomial thresholding operator with the more traditional shrinkagethresholding operator. 1.
Multitask lowrank affinity pursuit for image segmentation
 In ICCV
"... This paper investigates how to boost regionbased image segmentation by pursuing a new solution to fuse multiple types of image features. A collaborative image segmentation framework, called multitask lowrank affinity pursuit, is presented for such a purpose. Given an image described with mul ..."
Abstract

Cited by 29 (1 self)
 Add to MetaCart
(Show Context)
This paper investigates how to boost regionbased image segmentation by pursuing a new solution to fuse multiple types of image features. A collaborative image segmentation framework, called multitask lowrank affinity pursuit, is presented for such a purpose. Given an image described with multiple types of features, we aim at inferring a unified affinity matrix that implicitly encodes the segmentation of the image. This is achieved by seeking the sparsityconsistent lowrank affinities from the joint decompositions of multiple feature matrices into pairs of sparse and lowrank matrices, the latter of which is expressed as the production of the image feature matrix and its corresponding image affinity matrix. The inference process is formulated as a constrained nuclear norm and `2,1norm minimization problem, which is convex and can be solved efficiently with the Augmented Lagrange Multiplier method. Compared to previous methods, which are usually based on a single type of features, the proposed method seamlessly integrates multiple types of features to jointly produce the affinity matrix within a single inference step, and produces more accurate and reliable segmentation results. Experiments on the MSRC dataset and Berkeley segmentation dataset well validate the superiority of using multiple features over single feature and also the superiority of our method over conventional methods for feature fusion. Moreover, our method is shown to be very competitive while comparing to other stateoftheart methods. 1.
A TUTORIAL ON SUBSPACE CLUSTERING
"... The past few years have witnessed an explosion in the availability of data from multiple sources and modalities. For example, millions of cameras have been installed in buildings, streets, airports and cities around the world. This has generated extraordinary advances on how to acquire, compress, st ..."
Abstract

Cited by 27 (0 self)
 Add to MetaCart
(Show Context)
The past few years have witnessed an explosion in the availability of data from multiple sources and modalities. For example, millions of cameras have been installed in buildings, streets, airports and cities around the world. This has generated extraordinary advances on how to acquire, compress, store, transmit and process massive amounts of complex highdimensional data. Many of these advances have relied on the observation that, even though these data sets are highdimensional, their intrinsic dimension is often much smaller than the dimension of the ambient space. In computer vision, for example, the number of pixels in an image can be rather large, yet most computer vision models use only a few parameters to describe the appearance, geometry and dynamics of a scene. This has motivated the development of a number of techniques for finding a lowdimensional representation
Matrix Completion for Multilabel Image Classification
"... Recently, image categorization has been an active research topic due to the urgent need to retrieve and browse digital images via semantic keywords. This paper formulates image categorization as a multilabel classification problem using recent advances in matrix completion. Under this setting, clas ..."
Abstract

Cited by 22 (3 self)
 Add to MetaCart
(Show Context)
Recently, image categorization has been an active research topic due to the urgent need to retrieve and browse digital images via semantic keywords. This paper formulates image categorization as a multilabel classification problem using recent advances in matrix completion. Under this setting, classification of testing data is posed as a problem of completing unknown label entries on a data matrix that concatenates training and testing features with training labels. We propose two convex algorithms for matrix completion based on a Rank Minimization criterion specifically tailored to visual data, and prove its convergence properties. A major advantage of our approach w.r.t. standard discriminative classification methods for image categorization is its robustness to outliers, background noise and partial occlusions both in the feature and label space. Experimental validation on several datasets shows how our method outperforms stateoftheart algorithms, while effectively capturing semantic concepts of classes. 1
See All by Looking at A Few: Sparse Modeling for Finding Representative Objects
"... We consider the problem of finding a few representatives for a dataset, i.e., a subset of data points that efficiently describes the entire dataset. We assume that each data point can be expressed as a linear combination of the representatives and formulate the problem of finding the representatives ..."
Abstract

Cited by 22 (3 self)
 Add to MetaCart
(Show Context)
We consider the problem of finding a few representatives for a dataset, i.e., a subset of data points that efficiently describes the entire dataset. We assume that each data point can be expressed as a linear combination of the representatives and formulate the problem of finding the representatives as a sparse multiple measurement vector problem. In our formulation, both the dictionary and the measurements are given by the data matrix, and the unknown sparse codes select the representatives via convex optimization. In general, we do not assume that the data are lowrank or distributed around cluster centers. When the data do come from a collection of lowrank models, we show that our method automatically selects a few representatives from each lowrank model. We also analyze the geometry of the representatives and discuss their relationship to the vertices of the convex hull of the data. We show that our framework can be extended to detect and reject outliers in datasets, and to efficiently deal with new observations and large datasets. The proposed framework and theoretical foundations are illustrated with examples in video summarization and image classification using representatives. 1.
Robust Visual Domain Adaptation with LowRank Reconstruction
"... Visual domain adaptation addresses the problem of adapting the sample distribution of the source domain to the target domain, where the recognition task is intended but the data distributions are different. In this paper, we present a lowrank reconstruction method to reduce the domain distribution ..."
Abstract

Cited by 21 (0 self)
 Add to MetaCart
(Show Context)
Visual domain adaptation addresses the problem of adapting the sample distribution of the source domain to the target domain, where the recognition task is intended but the data distributions are different. In this paper, we present a lowrank reconstruction method to reduce the domain distribution disparity. Specifically, we transform the visual samples in the source domain into an intermediate representation such that each transformed source sample can be linearly reconstructed by the samples of the target domain. Unlike the existing work, our method captures the intrinsic relatedness of the source samples during the adaptation process while uncovering the noises and outliers in the source domain that cannot be adapted, making it more robust than previous methods. We formulate our problem as a constrained nuclear norm and ℓ2,1 norm minimization objective and then adopt the Augmented Lagrange Multiplier (ALM) method for the optimization. Extensive experiments on various visual adaptation tasks show that the proposed method consistently and significantly beats the stateoftheart domain adaptation methods. 1.
StreettoShop: CrossScenario Clothing Retrieval via Parts Alignment and Auxiliary Set
, 2012
"... In this paper, we address a practical problem of crossscenario clothing retrieval given a daily human photo captured in general environment, e.g., on street, finding similar clothing in online shops, where the photos are captured more professionally and with clean background. There are large dis ..."
Abstract

Cited by 21 (6 self)
 Add to MetaCart
In this paper, we address a practical problem of crossscenario clothing retrieval given a daily human photo captured in general environment, e.g., on street, finding similar clothing in online shops, where the photos are captured more professionally and with clean background. There are large discrepancies between daily photo scenario and online shopping scenario. We first propose to alleviate the human pose discrepancy by locating 30 human parts detected by a well trained human detector. Then, founded on part features, we propose a twostep calculation to obtain more reliable onetomany similarities between the query daily photo and online shopping photos: 1) the withinscenario onetomany similarities between a query daily photo and the auxiliary set are derived by direct sparse reconstruction; and 2) by a crossscenario manytomany similarity transfer matrix inferred offline from an extra auxiliary set and the online shopping set, the reliable crossscenario onetomany similarities between the query daily photo and all online shopping photos are obtained. We collect a large online shopping dataset and a daily photo dataset, both of which are thoroughly labeled with 15 clothing attributes via Mechanic Turk. The extensive experimental evaluations on the collected datasets well demonstrate the effectiveness of the proposed framework for crossscenario clothing retrieval.