Results 1  10
of
75
Robust Principal Component Analysis?
, 2009
"... This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a lowrank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the lowrank and the sparse co ..."
Abstract

Cited by 138 (6 self)
 Add to MetaCart
This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a lowrank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the lowrank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the ℓ1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
A unified framework for highdimensional analysis of Mestimators with decomposable regularizers
"... ..."
RASL: Robust Alignment by Sparse and Lowrank Decomposition for Linearly Correlated Images ∗
"... This paper studies the problem of simultaneously aligning a batch of linearly correlated images despite gross corruption (such as occlusion). Our method seeks an optimal set of image domain transformations such that the matrix of transformed images can be decomposed as the sum of a sparse matrix of ..."
Abstract

Cited by 44 (1 self)
 Add to MetaCart
This paper studies the problem of simultaneously aligning a batch of linearly correlated images despite gross corruption (such as occlusion). Our method seeks an optimal set of image domain transformations such that the matrix of transformed images can be decomposed as the sum of a sparse matrix of errors and a lowrank matrix of recovered aligned images. We reduce this extremely challenging optimization problem to a sequence of convex programs that minimize the sum of ℓ1norm and nuclear norm of the two component matrices, which can be efficiently solved by scalable convex optimization techniques with guaranteed fast convergence. We verify the efficacy of the proposed robust alignment algorithm with extensive experiments with both controlled and uncontrolled real data, demonstrating higher accuracy and efficiency than existing methods over a wide range of realistic misalignments and corruptions. 1.
The Convex Geometry of Linear Inverse Problems
, 2010
"... In applications throughout science and engineering one is often faced with the challenge of solving an illposed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constr ..."
Abstract

Cited by 38 (10 self)
 Add to MetaCart
In applications throughout science and engineering one is often faced with the challenge of solving an illposed inverse problem, where the number of available measurements is smaller than the dimension of the model to be estimated. However in many practical situations of interest, models are constrained structurally so that they only have a few degrees of freedom relative to their ambient dimension. This paper provides a general framework to convert notions of simplicity into convex penalty functions, resulting in convex optimization solutions to linear, underdetermined inverse problems. The class of simple models considered are those formed as the sum of a few atoms from some (possibly infinite) elementary atomic set; examples include wellstudied cases such as sparse vectors (e.g., signal processing, statistics) and lowrank matrices (e.g., control, statistics), as well as several others including sums of a few permutations matrices (e.g., ranked elections, multiobject tracking), lowrank tensors (e.g., computer vision, neuroscience), orthogonal matrices (e.g., machine learning), and atomic measures (e.g., system identification). The convex programming formulation is based on minimizing the norm induced by the convex hull of the atomic set; this norm is referred to as the atomic norm. The facial
Robust PCA via outlier pursuit
, 2010
"... Singular Value Decomposition (and Principal Component Analysis) is one of the most widely used techniques for dimensionality reduction: successful and efficiently computable, it is nevertheless plagued by a wellknown, welldocumented sensitivity to outliers. Recent work has considered the setting w ..."
Abstract

Cited by 35 (5 self)
 Add to MetaCart
Singular Value Decomposition (and Principal Component Analysis) is one of the most widely used techniques for dimensionality reduction: successful and efficiently computable, it is nevertheless plagued by a wellknown, welldocumented sensitivity to outliers. Recent work has considered the setting where each point has a few arbitrarily corrupted components. Yet, in applications of SVD or PCA such as robust collaborative filtering or bioinformatics, malicious agents, defective genes, or simply corrupted or contaminated experiments may effectively yield entire points that are completely corrupted. We present an efficient convex optimizationbased algorithm we call Outlier Pursuit, that under some mild assumptions on the uncorrupted points (satisfied, e.g., by the standard generative assumption in PCA problems) recovers the exact optimal lowdimensional subspace, and identifies the corrupted points. Such identification of corrupted points that do not conform to the lowdimensional approximation, is of paramount interest in bioinformatics and financial applications, and beyond. Our techniques involve matrix decomposition using nuclear norm minimization, however, our results, setup, and approach, necessarily differ considerably from the existing line of work in matrix completion and matrix decomposition, since we develop an approach to recover the correct column space of the uncorrupted matrix, rather than the exact matrix itself. 1
Fast convex optimization algorithms for exact recovery of a corrupted lowrank matrix
 In Intl. Workshop on Comp. Adv. in MultiSensor Adapt. Processing, Aruba, Dutch Antilles
, 2009
"... Abstract. This paper studies algorithms for solving the problem of recovering a lowrank matrix with a fraction of its entries arbitrarily corrupted. This problem can be viewed as a robust version of classical PCA, and arises in a number of application domains, including image processing, web data r ..."
Abstract

Cited by 33 (6 self)
 Add to MetaCart
Abstract. This paper studies algorithms for solving the problem of recovering a lowrank matrix with a fraction of its entries arbitrarily corrupted. This problem can be viewed as a robust version of classical PCA, and arises in a number of application domains, including image processing, web data ranking, and bioinformatic data analysis. It was recently shown that under surprisingly broad conditions, it can be exactly solved via a convex programming surrogate that combines nuclear norm minimization and ℓ1norm minimization. This paper develops and compares two complementary approaches for solving this convex program. The first is an accelerated proximal gradient algorithm directly applied to the primal; while the second is a gradient algorithm applied to the dual problem. Both are several orders of magnitude faster than the previous stateoftheart algorithm for this problem, which was based on iterative thresholding. Simulations demonstrate the performance improvement that can be obtained via these two algorithms, and clarify their relative merits.
Latent Variable Graphical Model Selection via Convex Optimization
, 2010
"... Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistic ..."
Abstract

Cited by 22 (2 self)
 Add to MetaCart
Suppose we have samples of a subset of a collection of random variables. No additional information is provided about the number of latent variables, nor of the relationship between the latent and observed variables. Is it possible to discover the number of hidden components, and to learn a statistical model over the entire collection of variables? We address this question in the setting in which the latent and observed variables are jointly Gaussian, with the conditional statistics of the observed variables conditioned on the latent variables being specified by a graphical model. As a first step we give natural conditions under which such latentvariable Gaussian graphical models are identifiable given marginal statistics of only the observed variables. Essentially these conditions require that the conditional graphical model among the observed variables is sparse, while the effect of the latent variables is “spread out ” over most of the observed variables. Next we propose a tractable convex program based on regularized maximumlikelihood for model selection in this latentvariable setting; the regularizer uses both the ℓ1 norm and the nuclear norm. Our modeling framework can be viewed as a combination of dimensionality reduction (to identify latent variables) and graphical modeling (to capture remaining statistical structure not attributable to the latent variables), and it consistently estimates both the number of hidden components and the conditional graphical model structure among the observed variables. These results are applicable in the highdimensional setting in which the number of latent/observed variables grows with the number of samples of the observed variables. The geometric properties of the algebraic varieties of sparse matrices and of lowrank matrices play an important role in our analysis.
Stable principal component pursuit
 in International Symposium on Information Theory
, 2010
"... Abstract—In this paper, we study the problem of recovering a lowrank matrix (the principal components) from a highdimensional data matrix despite both small entrywise noise and gross sparse errors. Recently, it has been shown that a convex program, named Principal Component Pursuit (PCP), can reco ..."
Abstract

Cited by 20 (1 self)
 Add to MetaCart
Abstract—In this paper, we study the problem of recovering a lowrank matrix (the principal components) from a highdimensional data matrix despite both small entrywise noise and gross sparse errors. Recently, it has been shown that a convex program, named Principal Component Pursuit (PCP), can recover the lowrank matrix when the data matrix is corrupted by gross sparse errors. We further prove that the solution to a related convex program (a relaxed PCP) gives an estimate of the lowrank matrix that is simultaneously stable to small entrywise noise and robust to gross sparse errors. More precisely, our result shows that the proposed convex program recovers the lowrank matrix even though a positive fraction of its entries are arbitrarily corrupted, with an error bound proportional to the noise level. We present simulation results to support our result and demonstrate that the new convex program accurately recovers the principal components (the lowrank matrix) under quite broad conditions. To our knowledge, this is the first result that shows the classical Principal Component Analysis (PCA), optimal for small i.i.d. noise, can be made robust to gross sparse errors; or the first that shows the newly proposed PCP can be made stable to small entrywise perturbations. I.
Noisy matrix decomposition via convexrelaxation: Optimal rates in high dimensions
 Annals of Statistics,40(2):1171
"... We analyze a class of estimators based on convex relaxation for solving highdimensional matrix decomposition problems. The observations are noisy realizations of a linear transformation X of the sum of an (approximately) low rank matrix � ⋆ with a second matrix Ɣ ⋆ endowed with a complementary for ..."
Abstract

Cited by 20 (7 self)
 Add to MetaCart
We analyze a class of estimators based on convex relaxation for solving highdimensional matrix decomposition problems. The observations are noisy realizations of a linear transformation X of the sum of an (approximately) low rank matrix � ⋆ with a second matrix Ɣ ⋆ endowed with a complementary form of lowdimensional structure; this setup includes many statistical models of interest, including factor analysis, multitask regression and robust covariance estimation. We derive a general theorem that bounds the Frobenius norm error for an estimate of the pair ( � ⋆,Ɣ ⋆ ) obtained by solving a convex optimization problem that combines the nuclear norm with a general decomposable regularizer. Our results use a “spikiness ” condition that is related to, but milder than, singular vector incoherence. We specialize our general result to two cases that have been studied in past work: low rank plus an entrywise sparse matrix, and low rank plus a columnwise sparse matrix. For both models, our theory yields nonasymptotic Frobenius error bounds for both deterministic and stochastic noise matrices, and applies to matrices � ⋆ that can be exactly or approximately low rank, and matrices Ɣ ⋆ that can be exactly or approximately sparse. Moreover, for the case of stochastic noise matrices and the identity observation operator, we establish matching lower bounds on the minimax error. The sharpness of our nonasymptotic predictions is confirmed by numerical simulations. 1. Introduction. The
TWO PROPOSALS FOR ROBUST PCA USING SEMIDEFINITE PROGRAMMING
, 1012
"... Abstract. The performance of principal component analysis (PCA) suffers badly in the presence of outliers. This paper proposes two novel approaches for robust PCA based on semidefinite programming. The first method, maximum mean absolute deviation rounding (MDR), seeks directions of large spread in ..."
Abstract

Cited by 17 (1 self)
 Add to MetaCart
Abstract. The performance of principal component analysis (PCA) suffers badly in the presence of outliers. This paper proposes two novel approaches for robust PCA based on semidefinite programming. The first method, maximum mean absolute deviation rounding (MDR), seeks directions of large spread in the data while damping the effect of outliers. The second method produces a lowleverage decomposition (LLD) of the data that attempts to form a lowrank model for the data by separating out corrupted observations. This paper also presents efficient computational methods for solving these SDPs. Numerical experiments confirm the value of these new techniques. 1.