## A Singular Value Thresholding Algorithm for Matrix Completion (2008)


Citations: 223 (13 self)

### BibTeX

    @MISC{Cai08asingular,
      author = {Jian-Feng Cai and Emmanuel J. Candès and Zuowei Shen},
      title  = {A Singular Value Thresholding Algorithm for Matrix Completion},
      year   = {2008}
    }


### Abstract

This paper introduces a novel algorithm to approximate the matrix with minimum nuclear norm among all matrices obeying a set of convex constraints. This problem may be understood as the convex relaxation of a rank minimization problem and arises in many important applications, such as the task of recovering a large matrix from a small subset of its entries (the famous Netflix problem). Off-the-shelf algorithms such as interior point methods are not directly amenable to large problems of this kind, with over a million unknown entries. This paper develops a simple, first-order, easy-to-implement algorithm that is extremely efficient at addressing problems in which the optimal solution has low rank. The algorithm is iterative: it produces a sequence of matrices {X^k, Y^k} and, at each step, mainly performs a soft-thresholding operation on the singular values of the matrix Y^k. Two remarkable features make this algorithm attractive for low-rank matrix completion problems. The first is that the soft-thresholding operation is applied to a sparse matrix; the second is that the rank of the iterates {X^k} is empirically nondecreasing. Both facts allow the algorithm to use minimal storage and keep the computational cost of each iteration low.
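The iteration summarized in the abstract can be sketched in a few lines of NumPy. This is a hedged illustration, not the authors' reference implementation: the shrinkage operator follows the description above, but the parameters `tau`, `delta`, and the iteration count are arbitrary choices for a toy example, and the paper gives principled ways to set them.

```python
import numpy as np

def svt_shrink(Y, tau):
    """D_tau: soft-threshold the singular values of Y, i.e. shrink
    each one toward zero by tau and drop those that fall below tau."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def svt_complete(M, mask, tau, delta, n_iter):
    """Sketch of the SVT iteration:
        X^k = D_tau(Y^{k-1}),  Y^k = Y^{k-1} + delta * P_Omega(M - X^k),
    where mask is the 0/1 indicator of the set Omega of observed entries."""
    Y = np.zeros_like(M)
    X = np.zeros_like(M)
    for _ in range(n_iter):
        X = svt_shrink(Y, tau)
        Y = Y + delta * mask * (M - X)
    return X
```

A production implementation would exploit the two features the abstract highlights: Y^k is sparse (supported on Ω), so a partial SVD from a Lanczos-type solver suffices; the dense SVD here is only for illustration.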

### Citations

3621 | Convex Analysis - Rockafellar - 1970 |

1864 | Compressed sensing - Donoho - 2006 |

936 | Shape and motion from image streams under orthography: A factorization method - Tomasi, Kanade - 1992 |

Citation Context: …approximately low-rank matrix from very limited information. This problem occurs in many areas of engineering and applied science such as machine learning [2–4], control [54] and computer vision, see [62]. As a motivating example, consider the problem of recovering a data matrix from a sampling of its entries. This routinely comes up whenever one collects partially filled out surveys, and one would li…

887 | Near-optimal signal recovery from random projections: universal encoding strategies - Candes, Tao - 2006 |

703 | Decoding by linear programming - Candes, Tao - 2005 |

464 | Convex Analysis and Minimization Algorithms - Hiriart-Urruty, Lemaréchal - 1991 |

Citation Context: …ntries of the input are below threshold. The singular value thresholding operator is the proximity operator associated with the nuclear norm. Details about the proximity operator can be found in e.g. [42]. Theorem 2.1. For each τ ≥ 0 and Y ∈ R^(n1×n2), the singular value shrinkage operator (2.2) obeys Dτ(Y) = argmin_X ½‖X − Y‖²_F + τ‖X‖∗. (2.3) Proof. Since the function h0(X) := τ‖X‖∗ + ½‖X…
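Theorem 2.1 quoted above, that singular value shrinkage is the proximity operator of the nuclear norm, is easy to sanity-check numerically. The sketch below, assuming an arbitrary random 6×4 test matrix, verifies that Dτ(Y) attains a lower value of the objective ½‖X − Y‖²_F + τ‖X‖∗ than randomly perturbed candidates; since the objective is strictly convex, the global minimizer must beat every other point.

```python
import numpy as np

def nuclear_norm(X):
    # Sum of the singular values of X.
    return np.linalg.svd(X, compute_uv=False).sum()

def shrink(Y, tau):
    # Singular value shrinkage operator D_tau.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def prox_objective(X, Y, tau):
    # h(X) = 1/2 ||X - Y||_F^2 + tau ||X||_*, the objective in (2.3).
    return 0.5 * np.linalg.norm(X - Y, 'fro')**2 + tau * nuclear_norm(X)

rng = np.random.default_rng(0)
Y = rng.standard_normal((6, 4))
tau = 1.0
X_hat = shrink(Y, tau)
best = prox_objective(X_hat, Y, tau)
# Strict convexity: the shrinkage output must beat every perturbation.
for _ in range(100):
    Z = X_hat + 0.1 * rng.standard_normal(Y.shape)
    assert prox_objective(Z, Y, tau) >= best
```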

459 | An iterative thresholding algorithm for linear inverse problems with a sparsity constraint - Daubechies, Defrise, et al. |

Citation Context: …ithms in connection with ℓ1 or total-variation minimization have quite a bit of history in signal and image processing and we would like to mention the works [14,48] for total-variation minimization, [28,29,36] for ℓ1 minimization, and [5,9,10,22,23,32,33,59] for some recent applications in the area of image inpainting and image restoration. Just as iterative soft-thresholding methods are designed to find…

445 | The Dantzig selector: statistical estimation when p is much larger than n, Annals of Statistics - Candès, Tao |

Citation Context: …traints. Clearly, one could apply this methodology with general cone constraints of the form F(X) + d ∈ K, where K is some closed and pointed convex cone. Inspired by the work on the Dantzig selector [18], which was originally developed for estimating sparse parameter vectors from noisy data, another approach is to set a constraint on the spectral norm of A∗(r)—recall that r is the residual vector b…

349 | Exact matrix completion via convex optimization. Preprint, arXiv:0805.4471 - Candès, Recht - 2008 |

Citation Context: …e that the unknown has (approximately) low rank radically changes the problem, making the search for solutions feasible since the lowest-rank solution now tends to be the right one. In a recent paper [16], Candès and Recht showed that matrix completion is not as ill-posed as people thought. Indeed, they proved that most low-rank matrices can be recovered exactly from most sets of sampled entries even…

294 | Signal recovery by proximal forward-backward splitting, Multiscale Modeling and Simulation - Combettes, Wajs - 2005 |

Citation Context: …(2.7) with the popular iterative soft-thresholding algorithm used in many papers in image processing and perhaps best known under the name of Proximal Forward-Backward Splitting method (PFBS), see [10,26,28,36,40,63,64] for example. The constrained minimization problem (1.4) may be relaxed into minimize λ‖X‖∗ + ½‖PΩ(X) − PΩ(M)‖²_F (2.9) for some λ > 0. Theorem 2.1 asserts that Dλ is the proximity operator of λ‖X…
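The relaxed problem (2.9) quoted in this context can be attacked directly with proximal forward-backward splitting: a gradient step on the smooth data-fit term followed by the nuclear-norm prox (singular value shrinkage). The sketch below is illustrative only; `lam`, the step size, and the iteration count are arbitrary choices, not values from the paper.

```python
import numpy as np

def shrink(Y, tau):
    # Singular value soft-thresholding, the prox of tau*||.||_*.
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def objective(X, M, mask, lam):
    # lam*||X||_* + 1/2 ||P_Omega(X - M)||_F^2, the relaxed objective (2.9).
    nuc = np.linalg.svd(X, compute_uv=False).sum()
    return lam * nuc + 0.5 * np.linalg.norm(mask * (X - M), 'fro')**2

def pfbs(M, mask, lam, step, n_iter):
    """PFBS: forward gradient step on the smooth term, then the prox.
    step <= 1 is safe because the gradient of the smooth term is
    1-Lipschitz (P_Omega is an orthogonal projection)."""
    X = np.zeros_like(M)
    for _ in range(n_iter):
        X = shrink(X - step * mask * (X - M), step * lam)
    return X
```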

251 | An EM algorithm for wavelet-based image restoration - Figueiredo, Nowak |

233 | Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization - Recht, Fazel, et al. - 2007 |

Citation Context: …ent in the sense that they have exactly the same unique solution. 1.2. Algorithm outline. Because minimizing the nuclear norm both provably recovers the lowest-rank matrix subject to constraints (see [57] for related results) and gives generally good empirical results in a variety of situations, it is understandably of great interest to develop numerical methods for solving (1.1). In [16], this optimi…

185 | Sparse reconstruction by separable approximation - Wright, Nowak, et al. - 2009 |

Citation Context: …start with a value of τ which is large enough so that (2.8) admits a low-rank solution, and at the same time for which the algorithm converges rapidly. One could then use a continuation method as in [66] to increase the value of τ sequentially according to a schedule τ0, τ1, …, and use the solution to the previous problem with τ = τi−1 as an initial guess for the solution to the current problem w…

145 | Multi-task feature learning - Argyriou, Evgeniou, et al. |

139 | Matrix Rank Minimization with Applications - Fazel - 2002 |

Citation Context: …In (1.1), the functional ‖X‖∗ is the nuclear norm of the matrix X, which is the sum of its singular values. The optimization problem (1.1) is convex and can be recast as a semidefinite program [34,35]. In some sense, this is the tightest convex relaxation of the NP-hard rank minimization problem minimize rank(X) subject to Xij = Mij, (i, j) ∈ Ω, (1.2) since the nuclear ball {X : ‖X‖∗ ≤ 1} is the c…

133 | The split Bregman method for L1-regularized problems - Goldstein, Osher - 2009 |

Citation Context: …urring problem while on the theoretical side, the references [11,12] gave a rigorous analysis of the convergence of such iterations. New developments keep on coming out at a rapid pace and recently, [39] introduced a new iteration, the split Bregman iteration, to extend Bregman-type iterations (such as linearized Bregman iterations) to problems involving the minimization of ℓ1-like functionals such a…

132 | Simultaneous Cartoon and Texture Image Inpainting Using Morphological Component Analysis - Elad, Starck, et al. - 2005 |

101 | Fixed point and Bregman iterative methods for matrix rank minimization - Ma, Goldfarb, et al. - 2009 |

Citation Context: …e matrix completion problems (in cases where computing the SVD is prohibitive). Since the original submission of this paper, however, we note that several papers proposed some working implementations [51,61]. 2.4. Interpretation as a Lagrange multiplier method. In this section, we recast the SVT algorithm as a type of Lagrange multiplier algorithm known as Uzawa's algorithm. An important consequence is t…

101 | An iterative regularization method for total variation-based image restoration - Osher, Burger, et al. |

Citation Context: …tions with minimum ℓ1 norm. In fact, Theorem 2.1 asserts that the singular value thresholding algorithm can be formulated as a linearized Bregman iteration. Bregman iterations were first introduced in [55] as a convenient tool for solving computational problems in the imaging sciences, and a later paper [67] showed that they were useful for solving ℓ1-norm minimization problems in the area of compresse…

93 | Eigenvalue optimization - Lewis, Overton - 1996 |

Citation Context: …the functional h0 at the point X̂, i.e. 0 ∈ X̂ − Y + τ∂‖X̂‖∗, (2.5) where ∂‖X̂‖∗ is the set of subgradients of the nuclear norm. Let X ∈ R^(n1×n2) be an arbitrary matrix and UΣV∗ its SVD. It is known [16,46,65] that ∂‖X‖∗ = { UV∗ + W : W ∈ R^(n1×n2), U∗W = 0, WV = 0, ‖W‖2 ≤ 1 }. (2.6) Set X̂ := Dτ(Y) for short. In order to show that X̂ obeys (2.5), decompose the SVD of Y as Y = U0Σ0V0∗ + U1…
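The subdifferential formula (2.6) quoted above implies the standard subgradient inequality ‖Z‖∗ ≥ ‖X‖∗ + ⟨G, Z − X⟩ for any G ∈ ∂‖X‖∗. The sketch below checks this numerically for a random full-column-rank X, in which case the W component may be taken to be zero so that G = UV∗; the matrix sizes are arbitrary.

```python
import numpy as np

def nuc(A):
    # Nuclear norm: sum of singular values.
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))        # full column rank almost surely
U, s, Vt = np.linalg.svd(X, full_matrices=False)
G = U @ Vt                             # element of the subdifferential (W = 0)

# Subgradient inequality: ||Z||_* >= ||X||_* + <G, Z - X> for all Z.
for _ in range(50):
    Z = rng.standard_normal((5, 3))
    assert nuc(Z) >= nuc(X) + np.sum(G * (Z - X)) - 1e-9
```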

90 | Matrix completion with noise - Candès, Plan - 2010 |

Citation Context: …(3.12) is fast, and provides statistically accurate answers since it predicts the unseen entries with an accuracy which is about equal to the standard deviation of the noise. In fact, very recent work [15] performed after the original submission of this paper suggests that even with considerable side information about the unknown matrix, one would not be able to do much better. As seen in the table, al…

84 | New multiscale transforms, minimum total variation synthesis: Applications to edge-preserving image reconstruction - Candès, Guo - 2002 |

80 | An accelerated proximal gradient algorithm for nuclear norm regularized least squares problems - Toh, Yun |

79 | On the Goldstein-Levitin-Polyak gradient projection method - Bertsekas - 1976 |

Citation Context: …ansparent. We have seen that SVT iterations are projected gradient-descent algorithms applied to the dual problems. The convergence of projected gradient-descent algorithms has been well studied, see [6,25,38,43,45,52,68] for example. 4.1. Convergence for matrix completion. We begin by recording a lemma which establishes the strong convexity of the objective fτ. Lemma 4.1. Let Z ∈ ∂fτ(X) and Z′ ∈ ∂fτ(X′). Then ⟨…

78 | Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices - Fazel, Hindi, et al. - 2003 |

70 | Further Applications of a Splitting Algorithm to Decomposition - Tseng - 1990 |

66 | Uncovering shared structures in multiclass classification - Amit, Fink, et al. - 2007 |

65 | Bregman iterative algorithms for ℓ1 minimization with application to compressed sensing - Yin, Osher, et al. - 2008 |

Citation Context: …an be formulated as a linearized Bregman iteration. Bregman iterations were first introduced in [55] as a convenient tool for solving computational problems in the imaging sciences, and a later paper [67] showed that they were useful for solving ℓ1-norm minimization problems in the area of compressed sensing. Linearized Bregman iterations were proposed in [27] to improve performance of plain Bregman i…

62 | Linearized Bregman iterations for compressed sensing - Cai, Osher, et al. |

Citation Context: …speed of convergence called kicking are described in [56]. On the practical side, the paper [13] applied Bregman iterations to solve a deblurring problem while on the theoretical side, the references [11,12] gave a rigorous analysis of the convergence of such iterations. New developments keep on coming out at a rapid pace and recently, [39] introduced a new iteration, the split Bregman iteration, to exte…

58 | Interior-point method for nuclear norm approximation with application to system identification - Liu, Vandenberghe |

Citation Context: …tion. Note that an n × n matrix of rank r depends upon r(2n − r) degrees of freedom. None of these general purpose solvers use the fact that the solution may have low rank. We refer the reader to [50] for some recent progress on interior-point methods concerning some special nuclear norm-minimization problems. This paper develops the singular value thresholding algorithm for approximately solving…

53 | Convex Programming in Hilbert Space - Goldstein - 1964 |

51 | Astronomical image representation by the curvelet transform - Starck, Candès, et al. - 2003 |

51 | A modified forward-backward splitting method for maximal monotone mappings - Tseng |

50 | On the basic theorem of complementarity - Eaves - 1971 |

Citation Context: …strong duality holds, which is automatically true if the constraints obey constraint qualifications such as Slater's condition [7]. We first establish a preparatory lemma, whose proof can be found in [31]. Lemma 4.3. Let (X⋆, y⋆) be a primal-dual optimal pair for (3.4). Then for each δ > 0, y⋆ obeys y⋆ = [y⋆ + δF(X⋆)]+. (4.3) We are now in the position to state our general convergence result…

46 | A framelet-based image inpainting algorithm - Cai, Chan, et al. |

46 | On the rank minimization problem over a positive semidefinite linear matrix inequality - Mesbahi, Papavassilopoulos - 1997 |

42 | Characterization of the subdifferential of some matrix norms - Watson - 1992 |

39 | Convex Optimization, Cambridge University Press - Boyd, Vandenberghe - 2004 |

Citation Context: …by L(X, y) = fτ(X) + ⟨y, F(X)⟩, y ≥ 0. To simplify, we will assume that strong duality holds, which is automatically true if the constraints obey constraint qualifications such as Slater's condition [7]. We first establish a preparatory lemma, whose proof can be found in [31]. Lemma 4.3. Let (X⋆, y⋆) be a primal-dual optimal pair for (3.4). Then for each δ > 0, y⋆ obeys y⋆ = [y⋆ + δF(X⋆)]+…

38 | SDPT3 – a MATLAB software package for semidefinite programming, version 1.3, Optimization Methods and Software - Toh, Todd, Tütüncü - 1999 |

Citation Context: …tandably of great interest to develop numerical methods for solving (1.1). In [16], this optimization problem was solved using one of the most advanced semidefinite programming solvers, namely, SDPT3 [60]. This solver and others like SeDuMi are based on interior-point methods, and are problematic when the size of the matrix is large because they need to solve huge systems of linear equations to comput…

37 | Inpainting and zooming using sparse representations - Fadili, Starck, et al. - 2007 |

36 | Wavelet algorithms for high-resolution image reconstruction - Chan, Chan, et al. - 2003 |

35 | A ℓ1-unified variational framework for image restoration - Bect, Blanc-Féraud, et al. |

34 | Sparsity and Incoherence - Candès, Romberg - 2007 |

33 | Iteratively solving linear inverse problems under general convex constraints - Daubechies, Teschke, et al. - 2007 |

32 | Low-rank matrix factorization with attributes. Ecole des Mines de Paris - Abernethy, Bach, et al. |

31 | Fixed-point continuation for ℓ1-minimization: Methodology and convergence - Hale, Yin, et al. |

30 | PROPACK – Software for large and sparse SVD calculations. Available at http://sun.stanford.edu/∼rmunk/PROPACK - Larsen |

Citation Context: …methods is a relatively mature area in scientific computing and numerical linear algebra in particular. In fact, many high-quality packages are readily available. Our implementation uses PROPACK, see [44] for documentation and availability. One reason for this choice is convenience: PROPACK comes in a Matlab and a Fortran version, and we find it convenient to use the well-documented Matlab version. Mo…

28 | Co-coercivity and its role in the convergence of iterative schemes for solving variational inequalities - Zhu, Marcotte - 1996 |

27 | Deconvolution: A wavelet frame approach - Chai, Shen |

27 | Recovering the missing components in a large noisy low-rank matrix: application to SFM - Chen, Suter - 2004 |

Citation Context: …or "missing data" which occur because of occlusion or tracking failures. However, when properly stacked and indexed, these images form a matrix which has very low rank (e.g. rank 3 under orthography) [24,62]. Other examples of low-rank matrix fitting abound; e.g. in control (system identification), machine learning (multi-class learning) and so on. Having said this, the premise that the unknown has (appr…