Results 1  10
of
400
A fast iterative shrinkagethresholding algorithm with application to . . .
, 2009
"... We consider the class of Iterative ShrinkageThresholding Algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods is attractive due to its simplicity, however, they are also known to converge quite slowly. In this paper we present a Fast Iterat ..."
Abstract

Cited by 1058 (9 self)
 Add to MetaCart
(Show Context)
We consider the class of Iterative ShrinkageThresholding Algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods is attractive due to its simplicity, however, they are also known to converge quite slowly. In this paper we present a Fast Iterative ShrinkageThresholding Algorithm (FISTA) which preserves the computational simplicity of ISTA, but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for waveletbased image deblurring demonstrate the capabilities of FISTA.
Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
, 2010
"... ..."
Robust principal component analysis?
 Journal of the ACM,
, 2011
"... Abstract This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a lowrank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the lowrank and the ..."
Abstract

Cited by 569 (26 self)
 Add to MetaCart
(Show Context)
Abstract This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a lowrank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the lowrank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the 1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
Sparse Reconstruction by Separable Approximation
, 2007
"... Finding sparse approximate solutions to large underdetermined linear systems of equations is a common problem in signal/image processing and statistics. Basis pursuit, the least absolute shrinkage and selection operator (LASSO), waveletbased deconvolution and reconstruction, and compressed sensing ..."
Abstract

Cited by 373 (38 self)
 Add to MetaCart
Finding sparse approximate solutions to large underdetermined linear systems of equations is a common problem in signal/image processing and statistics. Basis pursuit, the least absolute shrinkage and selection operator (LASSO), waveletbased deconvolution and reconstruction, and compressed sensing (CS) are a few wellknown areas in which problems of this type appear. One standard approach is to minimize an objective function that includes a quadratic (ℓ2) error term added to a sparsityinducing (usually ℓ1) regularizer. We present an algorithmic framework for the more general problem of minimizing the sum of a smooth convex function and a nonsmooth, possibly nonconvex, sparsityinducing function. We propose iterative methods in which each step is an optimization subproblem involving a separable quadratic term (diagonal Hessian) plus the original sparsityinducing term. Our approach is suitable for cases in which this subproblem can be solved much more rapidly than the original problem. In addition to solving the standard ℓ2 − ℓ1 case, our approach handles other problems, e.g., ℓp regularizers with p � = 1, or groupseparable (GS) regularizers. Experiments with CS problems show that our approach provides stateoftheart speed for the standard ℓ2 − ℓ1 problem, and is also efficient on problems with GS regularizers. Index Terms — sparse approximation, compressed sensing, optimization, reconstruction.
Online learning for matrix factorization and sparse coding
, 2010
"... Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to ad ..."
Abstract

Cited by 330 (31 self)
 Add to MetaCart
Sparse coding—that is, modelling data vectors as sparse linear combinations of basis elements—is widely used in machine learning, neuroscience, signal processing, and statistics. This paper focuses on the largescale matrix factorization problem that consists of learning the basis set in order to adapt it to specific data. Variations of this problem include dictionary learning in signal processing, nonnegative matrix factorization and sparse principal component analysis. In this paper, we propose to address these tasks with a new online optimization algorithm, based on stochastic approximations, which scales up gracefully to large data sets with millions of training samples, and extends naturally to various matrix factorization formulations, making it suitable for a wide range of learning problems. A proof of convergence is presented, along with experiments with natural images and genomic data demonstrating that it leads to stateoftheart performance in terms of speed and optimization for both small and large data sets.
An interiorpoint method for largescale l1regularized logistic regression
 Journal of Machine Learning Research
, 2007
"... Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interiorpoint method for solving largescale ℓ1regularized logistic regression problems. Small problems with up to a thousand ..."
Abstract

Cited by 290 (9 self)
 Add to MetaCart
(Show Context)
Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interiorpoint method for solving largescale ℓ1regularized logistic regression problems. Small problems with up to a thousand or so features and examples can be solved in seconds on a PC; medium sized problems, with tens of thousands of features and examples, can be solved in tens of seconds (assuming some sparsity in the data). A variation on the basic method, that uses a preconditioned conjugate gradient method to compute the search step, can solve very large problems, with a million features and examples (e.g., the 20 Newsgroups data set), in a few minutes, on a PC. Using warmstart techniques, a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.
A unified framework for highdimensional analysis of Mestimators with decomposable regularizers
"... ..."
Structured variable selection with sparsityinducing norms
, 2011
"... We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsityinducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual ℓ1norm and the group ℓ1norm by allowing the subsets to ov ..."
Abstract

Cited by 187 (27 self)
 Add to MetaCart
(Show Context)
We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsityinducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual ℓ1norm and the group ℓ1norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problems. We first explore the relationship between the groups defining the norm and the resulting nonzero patterns, providing both forward and backward algorithms to go back and forth from groups to patterns. This allows the design of norms adapted to specific prior knowledge expressed in terms of nonzero patterns. We also present an efficient active set algorithm, and analyze the consistency of variable selection for leastsquares linear regression in low and highdimensional settings.
NESTA: A Fast and Accurate FirstOrder Method for Sparse Recovery
, 2009
"... Accurate signal recovery or image reconstruction from indirect and possibly undersampled data is a topic of considerable interest; for example, the literature in the recent field of compressed sensing is already quite immense. Inspired by recent breakthroughs in the development of novel firstorder ..."
Abstract

Cited by 171 (2 self)
 Add to MetaCart
Accurate signal recovery or image reconstruction from indirect and possibly undersampled data is a topic of considerable interest; for example, the literature in the recent field of compressed sensing is already quite immense. Inspired by recent breakthroughs in the development of novel firstorder methods in convex optimization, most notably Nesterov’s smoothing technique, this paper introduces a fast and accurate algorithm for solving common recovery problems in signal processing. In the spirit of Nesterov’s work, one of the key ideas of this algorithm is a subtle averaging of sequences of iterates, which has been shown to improve the convergence properties of standard gradientdescent algorithms. This paper demonstrates that this approach is ideally suited for solving largescale compressed sensing reconstruction problems as 1) it is computationally efficient, 2) it is accurate and returns solutions with several correct digits, 3) it is flexible and amenable to many kinds of reconstruction problems, and 4) it is robust in the sense that its excellent performance across a wide range of problems does not depend on the fine tuning of several parameters. Comprehensive numerical experiments on realistic signals exhibiting a large dynamic range show that this algorithm compares favorably with recently proposed stateoftheart methods. We also apply the algorithm to solve other problems for which there are fewer alternatives, such as totalvariation minimization, and
Fast gradientbased algorithms for constrained total variation image denoising and deblurring problems
 IEEE TRANSACTION ON IMAGE PROCESSING
, 2009
"... This paper studies gradientbased schemes for image denoising and deblurring problems based on the discretized total variation (TV) minimization model with constraints. We derive a fast algorithm for the constrained TVbased image deburring problem. To achieve this task we combine an acceleration of ..."
Abstract

Cited by 168 (2 self)
 Add to MetaCart
(Show Context)
This paper studies gradientbased schemes for image denoising and deblurring problems based on the discretized total variation (TV) minimization model with constraints. We derive a fast algorithm for the constrained TVbased image deburring problem. To achieve this task we combine an acceleration of the well known dual approach to the denoising problem with a novel monotone version of a fast iterative shrinkage/thresholding algorithm (FISTA) we have recently introduced. The resulting gradientbased algorithm shares a remarkable simplicity together with a proven global rate of convergence which is significantly better than currently known gradient projectionsbased methods. Our results are applicable to both the anisotropic and isotropic discretized TV functionals. Initial numerical results demonstrate the viability and efficiency of the proposed algorithms on image deblurring problems with box constraints.