Robust Principal Component Analysis?
, 2009
Cited by 138 (6 self)
This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the ℓ1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
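As a rough illustration of the decomposition described in this abstract, the sketch below minimizes the nuclear norm of L plus λ times the ℓ1 norm of S subject to L + S = M, using a basic augmented-Lagrangian (ADMM-style) iteration. The function names `pcp`, `shrink`, and `svd_threshold` are hypothetical, and the default weights λ = 1/√max(n1, n2) and μ = n1·n2 / (4‖M‖1) are assumptions chosen for this example; it is not claimed to be the exact algorithm the paper discusses.

```python
import numpy as np

def shrink(X, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value soft-thresholding: the proximal operator of tau * ||X||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Sketch: split M into a low-rank part L and a sparse part S."""
    n1, n2 = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(n1, n2))       # assumed default weight
    if mu is None:
        mu = n1 * n2 / (4.0 * np.abs(M).sum())  # assumed penalty parameter
    L = np.zeros((n1, n2))
    S = np.zeros((n1, n2))
    Y = np.zeros((n1, n2))                      # Lagrange multiplier
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S
```

Calling `pcp(M)` on a matrix built as a low-rank product plus a few large entrywise corruptions returns estimates of the two parts; how close they are depends on the rank, the corruption fraction, and the iteration budget.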
Matrix Completion with Noise
Cited by 74 (4 self)
On the heels of compressed sensing, a remarkable new field has very recently emerged. This field addresses a broad range of problems of significant practical interest, namely, the recovery of a data matrix from what appears to be incomplete, and perhaps even corrupted, information. In its simplest form, the problem is to recover a matrix from a small sample of its entries, and comes up in many areas of science and engineering including collaborative filtering, machine learning, control, remote sensing, and computer vision to name a few. This paper surveys the novel literature on matrix completion, which shows that under some suitable conditions, one can recover an unknown low-rank matrix from a nearly minimal set of entries by solving a simple convex optimization problem, namely, nuclear-norm minimization subject to data constraints. Further, this paper introduces novel results showing that matrix completion is provably accurate even when the few observed entries are corrupted with a small amount of noise. A typical result is that one can recover an unknown n × n matrix of low rank r from just about nr log² n noisy samples with an error which is proportional to the noise level. We present numerical results which complement our quantitative analysis and show that, in practice, nuclear norm minimization accurately fills in the many missing entries of large low-rank matrices from just a few noisy samples. Some analogies between matrix completion and compressed sensing are discussed throughout.
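To make the idea of filling in missing entries by nuclear-norm minimization concrete, here is a minimal sketch. It does not solve the constrained program mentioned in the abstract; instead it swaps in a Soft-Impute-style proximal iteration for the penalized form ½‖P_Ω(X − M)‖² + τ‖X‖*, where P_Ω keeps only observed entries. The names `complete_matrix` and `svd_soft_threshold` and the values of `tau` and `n_iter` are assumptions made for illustration.

```python
import numpy as np

def svd_soft_threshold(X, tau):
    """Shrink every singular value of X by tau, clipping at zero."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_matrix(M_obs, mask, tau=0.5, n_iter=500):
    """Sketch of nuclear-norm-regularized completion.

    M_obs : array holding the observed values (other entries are ignored)
    mask  : boolean array, True where an entry was observed
    """
    X = np.zeros(M_obs.shape, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries, fill the rest with the current estimate.
        filled = np.where(mask, M_obs, X)
        # Pull the filled matrix toward low rank.
        X = svd_soft_threshold(filled, tau)
    return X
```

A smaller `tau` fits the observed entries more closely, while a larger one pushes harder toward low rank; with noisy samples some shrinkage is usually desirable.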
Matrix completion from a few entries
Cited by 68 (5 self)
Let M be a random nα × n matrix of rank r ≪ n, and assume that a uniformly random subset E of its entries is observed. We describe an efficient algorithm that reconstructs M from |E| = O(rn) observed entries with relative root mean square error RMSE ≤ C(α)
A simpler approach to matrix completion
 the Journal of Machine Learning Research
Cited by 58 (3 self)
This paper provides the best bounds to date on the number of randomly sampled entries required to reconstruct an unknown low-rank matrix. These results improve on prior work by Candès and Recht [4], Candès and Tao [7], and Keshavan, Montanari, and Oh [18]. The reconstruction is accomplished by minimizing the nuclear norm, or sum of the singular values, of the hidden matrix subject to agreement with the provided entries. If the underlying matrix satisfies a certain incoherence condition, then the number of entries required is equal to a quadratic logarithmic factor times the number of parameters in the singular value decomposition. The proof of this assertion is short, self-contained, and uses very elementary analysis. The novel techniques herein are based on recent work in quantum information theory.
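The two quantities this abstract relies on can be written down in a few lines. The sketch below computes the nuclear norm as the sum of singular values, and one common count of the free parameters of a rank-r matrix of size n1 × n2, namely r(n1 + n2 − r). The function names are hypothetical, and the parameter count is an assumption about what the abstract means by "number of parameters in the singular value decomposition".

```python
import numpy as np

def nuclear_norm(X):
    """Sum of the singular values of X."""
    return np.linalg.svd(X, compute_uv=False).sum()

def svd_parameter_count(n1, n2, r):
    """Assumed degrees of freedom of an n1 x n2 matrix of rank r."""
    return r * (n1 + n2 - r)
```

For a 100 × 100 matrix of rank 5 this count is 975, far fewer than the 10,000 entries, which is why recovery from a subset of entries is plausible at all.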
Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization
 Advances in Neural Information Processing Systems 22
, 2009
Cited by 44 (3 self)
The supplementary material to the NIPS version of this paper [4] contains a critical error, which was discovered several days before the conference. Unfortunately, it was too late to withdraw the paper from the proceedings. Fortunately, since that time, a correct analysis of the proposed convex programming relaxation has been developed by Emmanuel Candes of Stanford University. That analysis is reported in a joint paper, Robust Principal Component Analysis? by Emmanuel Candes, Xiaodong Li, Yi Ma and John Wright,
Matrix Completion from Noisy Entries
Cited by 43 (2 self)
Given a matrix M of low rank, we consider the problem of reconstructing it from noisy observations of a small, random subset of its entries. The problem arises in a variety of applications, from collaborative filtering (the ‘Netflix problem’) to structure-from-motion and positioning. We study a low complexity algorithm introduced in [1], based on a combination of spectral techniques and manifold optimization, that we call here OPTSPACE. We prove performance guarantees that are order-optimal in a number of circumstances.
FINDING STRUCTURE WITH RANDOMNESS: PROBABILISTIC ALGORITHMS FOR CONSTRUCTING APPROXIMATE MATRIX DECOMPOSITIONS
Cited by 40 (0 self)
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, speed, and robustness. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition
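The two-stage scheme described in this abstract (random sampling to find a subspace, then a deterministic factorization of the compressed matrix) can be sketched as follows. This is a minimal version assuming a Gaussian test matrix and a small oversampling parameter; the function name `randomized_svd` and its arguments are hypothetical, and refinements such as power iterations are omitted.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Sketch of an approximate rank-k SVD via random sampling."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(k + oversample, n)
    # Stage A: sample the range of A with a random test matrix.
    Omega = rng.standard_normal((n, ell))
    Q, _ = np.linalg.qr(A @ Omega)   # orthonormal basis for the sampled range
    # Stage B: compress A to that subspace and factor the small matrix.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k]
```

When A has rank at most k the sampled subspace captures its range and the factorization is accurate up to rounding; for matrices with slowly decaying singular values the error depends on the tail of the spectrum and on the oversampling.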
Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations
Cited by 37 (24 self)
Nonparametric Bayesian techniques are considered for learning dictionaries for sparse image representations, with applications in denoising, inpainting and compressive sensing (CS). The beta process is employed as a prior for learning the dictionary, and this nonparametric method naturally infers an appropriate dictionary size. The Dirichlet process and a probit stick-breaking process are also considered to exploit structure within an image. The proposed method can learn a sparse dictionary in situ; training images may be exploited if available, but they are not required. Further, the noise variance need not be known, and can be nonstationary. Another virtue of the proposed method is that sequential inference can be readily employed, thereby allowing scaling to large images. Several example results are presented, using both Gibbs and variational Bayesian inference, with comparisons to other state-of-the-art approaches.
Restricted strong convexity and (weighted) matrix completion: Optimal bounds with noise
, 2010
Cited by 36 (6 self)
We consider the matrix completion problem under a form of row/column weighted entrywise sampling, including the case of uniform entrywise sampling as a special case. We analyze the associated random observation operator, and prove that with high probability, it satisfies a form of restricted strong convexity with respect to weighted Frobenius norm. Using this property, we obtain as corollaries a number of error bounds on matrix completion in the weighted Frobenius norm under noisy sampling and for both exact and near low-rank matrices. Our results are based on measures of the “spikiness” and “low-rankness” of matrices that are less restrictive than the incoherence conditions imposed in previous work. Our technique involves an M-estimator that includes controls on both the rank and spikiness of the solution, and we establish non-asymptotic error bounds in weighted Frobenius norm for recovering matrices lying within ℓq-“balls” of bounded spikiness. Using information-theoretic methods, we show that no algorithm can achieve better estimates (up to a logarithmic factor) over these same sets, showing that our conditions on matrices and associated rates are essentially optimal.
Robust PCA via outlier pursuit
, 2010
Cited by 35 (5 self)
Singular Value Decomposition (and Principal Component Analysis) is one of the most widely used techniques for dimensionality reduction: successful and efficiently computable, it is nevertheless plagued by a well-known, well-documented sensitivity to outliers. Recent work has considered the setting where each point has a few arbitrarily corrupted components. Yet, in applications of SVD or PCA such as robust collaborative filtering or bioinformatics, malicious agents, defective genes, or simply corrupted or contaminated experiments may effectively yield entire points that are completely corrupted. We present an efficient convex optimization-based algorithm we call Outlier Pursuit, that under some mild assumptions on the uncorrupted points (satisfied, e.g., by the standard generative assumption in PCA problems) recovers the exact optimal low-dimensional subspace, and identifies the corrupted points. Such identification of corrupted points that do not conform to the low-dimensional approximation is of paramount interest in bioinformatics and financial applications, and beyond. Our techniques involve matrix decomposition using nuclear norm minimization; however, our results, setup, and approach necessarily differ considerably from the existing line of work in matrix completion and matrix decomposition, since we develop an approach to recover the correct column space of the uncorrupted matrix, rather than the exact matrix itself.
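The abstract does not spell out the convex program, so the following is only a sketch under an assumption: that whole corrupted points (columns) are penalized through the sum of column ℓ2 norms rather than an entrywise ℓ1 norm. Under that assumption, the building block that differs from entrywise sparse decomposition is a column-wise shrinkage step, shown here with the hypothetical name `column_shrink`.

```python
import numpy as np

def column_shrink(C, tau):
    """Proximal operator of tau * (sum of column l2 norms).

    Columns with norm at most tau are set to zero; the remaining
    columns are scaled toward zero by a factor (1 - tau / norm).
    """
    norms = np.linalg.norm(C, axis=0)
    # Guard against division by zero for all-zero columns.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return C * scale
```

Because the operator zeroes entire columns at once, the nonzero columns of the result can be read as the points flagged as outliers, which matches the abstract's goal of identifying corrupted points rather than corrupted entries.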