Results 1 - 10 of 190
Stable principal component pursuit
In Proc. of International Symposium on Information Theory, 2010. Cited by 94 (3 self).
Abstract:
We consider the problem of recovering a target matrix that is a superposition of low-rank and sparse components, from a small set of linear measurements. This problem arises in compressed sensing of structured high-dimensional signals such as videos and hyperspectral images, as well as in the analysis of transformation invariant low-rank structure recovery. We analyze the performance of the natural convex heuristic for solving this problem, under the assumption that measurements are chosen uniformly at random. We prove that this heuristic exactly recovers low-rank and sparse terms, provided the number of observations exceeds the number of intrinsic degrees of freedom of the component signals by a polylogarithmic factor. Our analysis introduces several ideas that may be of independent interest for the more general problem of compressed sensing and decomposing superpositions of multiple structured signals.
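The natural convex heuristic analyzed in this paper can be written down directly. Below is a minimal sketch (not the authors' code) using cvxpy: minimize the nuclear norm plus a weighted ℓ1 norm subject to agreement with random linear measurements. The problem sizes, the Gaussian measurement operator A, and the weight lam are illustrative assumptions.

    # Sketch of stable principal component pursuit from linear measurements.
    # Sizes, the measurement operator, and the weight lam are assumptions.
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n1, n2, m = 20, 20, 300                  # matrix shape, number of measurements
    L0 = rng.standard_normal((n1, 2)) @ rng.standard_normal((2, n2))  # rank-2 part
    S0 = np.zeros((n1, n2))
    S0[rng.random((n1, n2)) < 0.05] = 5.0                             # sparse part
    A = rng.standard_normal((m, n1 * n2)) / np.sqrt(m)                # random measurements
    y = A @ (L0 + S0).flatten(order="F")     # column-major, matching cp.vec below

    L = cp.Variable((n1, n2))
    S = cp.Variable((n1, n2))
    lam = 1.0 / np.sqrt(max(n1, n2))         # a common heuristic weight (assumed)
    problem = cp.Problem(cp.Minimize(cp.normNuc(L) + lam * cp.norm1(S)),
                         [A @ cp.vec(L + S, order="F") == y])
    problem.solve()                          # L.value + S.value approximates L0 + S0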
Revisiting Frank-Wolfe: Projection-free sparse convex optimization
In ICML, 2013. Cited by 86 (2 self).
Abstract:
We provide stronger and more general primal-dual convergence results for Frank-Wolfe-type algorithms (a.k.a. conditional gradient) for constrained convex optimization, enabled by a simple framework of duality gap certificates. Our analysis also holds if the linear subproblems are only solved approximately (as well as if the gradients are inexact), and is proven to be worst-case optimal in the sparsity of the obtained solutions. On the application side, this allows us to unify a large variety of existing sparse greedy methods, in particular for optimization over convex hulls of an atomic set, even if those sets can only be approximated, including sparse (or structured sparse) vectors or matrices, low-rank matrices, permutation matrices, or max-norm bounded matrices. We present a new general framework for convex optimization over matrix factorizations, where every Frank-Wolfe iteration will consist of a low-rank update, and discuss the broad application areas of this approach.
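As a concrete illustration of the duality-gap certificate this paper builds on, here is a hedged numpy sketch of a Frank-Wolfe iteration for least squares over an ℓ1-ball; the objective, ball radius, and tolerance are assumptions, not choices from the paper.

    # One Frank-Wolfe variant with a duality-gap stopping certificate.
    import numpy as np

    def frank_wolfe_l1(A, b, radius=1.0, tol=1e-6, max_iter=1000):
        x = np.zeros(A.shape[1])
        for k in range(max_iter):
            grad = A.T @ (A @ x - b)
            # Linear minimization oracle over the l1-ball: a signed vertex.
            i = int(np.argmax(np.abs(grad)))
            s = np.zeros_like(x)
            s[i] = -radius * np.sign(grad[i])
            gap = grad @ (x - s)              # duality gap: upper-bounds f(x) - f*
            if gap <= tol:
                break
            x += (2.0 / (k + 2.0)) * (s - x)  # standard step size 2/(k+2)
        return x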
Structured sparsity-inducing norms through submodular functions
In Advances in Neural Information Processing Systems, 2010. Cited by 60 (10 self).
Abstract:
Sparse methods for supervised learning aim at finding good linear predictors from as few variables as possible, i.e., with small cardinality of their supports. This combinatorial selection problem is often turned into a convex optimization problem by replacing the cardinality function by its convex envelope (tightest convex lower bound), in this case the ℓ1-norm. In this paper, we investigate more general set-functions than the cardinality that may incorporate prior knowledge or structural constraints common in many applications: namely, we show that for nondecreasing submodular set-functions, the corresponding convex envelope can be obtained from its Lovász extension, a common tool in submodular analysis. This defines a family of polyhedral norms, for which we provide generic algorithmic tools (subgradients and proximal operators) and theoretical results (conditions for support recovery or high-dimensional inference). By selecting specific submodular functions, we can give a new interpretation to known norms, such as those based on rank-statistics or grouped norms with potentially overlapping groups; we also define new norms, in particular ones that can be used as non-factorial priors for supervised learning.
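The construction can be made concrete: the norm is evaluated greedily from the Lovász extension by sorting |w| and charging each coordinate the marginal gain of F. The sketch below is illustrative (not from the paper) and uses the cardinality function, for which the norm reduces to the ℓ1 norm, as the final line checks.

    # Polyhedral norm from the Lovasz extension of a nondecreasing
    # submodular function F with F(empty set) = 0.
    import numpy as np

    def lovasz_norm(w, F):
        # Omega(w) = sum_j |w|_(j) * (F(top-j set) - F(top-(j-1) set)),
        # where |w|_(1) >= |w|_(2) >= ... is the decreasing rearrangement.
        order = np.argsort(-np.abs(w))
        total, prev = 0.0, 0.0
        for j in range(1, len(w) + 1):
            cur = F(set(order[:j].tolist()))
            total += abs(w[order[j - 1]]) * (cur - prev)
            prev = cur
        return total

    w = np.array([3.0, -1.0, 2.0])
    print(lovasz_norm(w, lambda S: len(S)))   # cardinality -> l1 norm: prints 6.0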
Computational and Statistical Tradeoffs via Convex Relaxation
2012. Cited by 45 (1 self).
Abstract:
In modern data analysis, one is frequently faced with statistical inference problems involving massive datasets. Processing such large datasets is usually viewed as a substantial computational challenge. However, if data are a statistician’s main resource then access to more data should be viewed as an asset rather than as a burden. In this paper we describe a computational framework based on convex relaxation to reduce the computational complexity of an inference procedure when one has access to increasingly larger datasets. Convex relaxation techniques have been widely used in theoretical computer science as they give tractable approximation algorithms to many computationally intractable tasks. We demonstrate the efficacy of this methodology in statistical estimation in providing concrete time-data tradeoffs in a class of denoising problems. Thus, convex relaxation offers a principled approach to exploit the statistical gains from larger datasets to reduce the runtime of inference algorithms.
Simultaneously Structured Models with Application to Sparse and Low-rank Matrices
2014. Cited by 41 (5 self).
Abstract:
The topic of recovery of a structured model given a small number of linear observations has been well-studied in recent years. Examples include recovering sparse or group-sparse vectors, low-rank matrices, and the sum of sparse and low-rank matrices, among others. In various applications in signal processing and machine learning, the model of interest is known to be structured in several ways at the same time, for example, a matrix that is simultaneously sparse and low-rank. Often norms that promote each individual structure are known, and allow for recovery using an order-wise optimal number of measurements (e.g., ℓ1 norm for sparsity, nuclear norm for matrix rank). Hence, it is reasonable to minimize a combination of such norms. We show that, surprisingly, if we use multi-objective optimization with these norms, then we can do no better, order-wise, than an algorithm that exploits only one of the present structures. This result suggests that to fully exploit the multiple structures, we need an entirely new convex relaxation, i.e., not one that is a function of the convex relaxations used for each structure. We then specialize our results to the case of sparse and low-rank matrices. We show that a nonconvex formulation of the problem can recover the model from very few measurements, which is on the order of the degrees of freedom of the matrix, whereas the convex problem obtained from a combination of the ℓ1 and nuclear norms requires many more measurements. This proves an order-wise gap between the performance of the convex and nonconvex recovery problems in this case. Our framework applies to arbitrary structure-inducing norms as well as to a wide range of measurement ensembles. This allows us to give performance bounds for problems such as sparse phase retrieval and low-rank tensor completion.
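For concreteness, the combined-norm convex program analyzed here (and shown to be order-wise no better than using a single norm) looks like the following hedged cvxpy sketch; the sizes, the weight tau, and the Gaussian measurement model are illustrative assumptions.

    # Combined l1 + nuclear norm recovery of a simultaneously sparse
    # and low-rank matrix from random linear measurements.
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(1)
    n, m = 15, 120
    u = np.zeros(n)
    u[:3] = rng.standard_normal(3)
    X0 = np.outer(u, u)                        # simultaneously sparse and rank-1
    A = rng.standard_normal((m, n * n)) / np.sqrt(m)
    y = A @ X0.flatten(order="F")

    X = cp.Variable((n, n))
    tau = 1.0                                  # relative weight (assumed)
    problem = cp.Problem(cp.Minimize(cp.norm1(X) + tau * cp.normNuc(X)),
                         [A @ cp.vec(X, order="F") == y])
    problem.solve()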
Accurate Prediction of Phase Transitions in Compressed Sensing via a Connection to Minimax Denoising
2012. Cited by 41 (5 self).
Abstract:
Compressed sensing posits that, within limits, one can undersample a sparse signal and yet reconstruct it accurately. Knowing the precise limits to such undersampling is important both for theory and practice. We present a formula that characterizes the allowed undersampling of generalized sparse objects. The formula applies to Approximate Message Passing (AMP) algorithms for compressed sensing, which are here generalized to employ denoising operators besides the traditional scalar soft thresholding denoiser. This paper gives several examples including scalar denoisers not derived from convex penalization – the firm shrinkage nonlinearity and the minimax nonlinearity – and also nonscalar denoisers – block thresholding, monotone regression, and total variation minimization. Let the variables ε = k/N and δ = n/N denote the generalized sparsity and undersampling fractions for sampling the k-generalized-sparse N-vector x0 according to y = Ax0. Here A is an n × N measurement matrix whose entries are iid standard Gaussian. The formula states that the phase transition curve δ = δ(ε) separating successful from unsuccessful reconstruction of x0 is given by the per-coordinate minimax mean-squared error of the optimally tuned denoiser in the corresponding scalar denoising problem.
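For reference, here is a hedged sketch of the baseline AMP recursion with the scalar soft-thresholding denoiser that the paper generalizes; the threshold multiplier alpha and the iteration count are assumptions.

    # Approximate Message Passing with scalar soft thresholding.
    import numpy as np

    def soft(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def amp_soft(A, y, alpha=1.5, iters=30):
        n, N = A.shape                            # n measurements of an N-vector
        x, z = np.zeros(N), y.copy()
        for _ in range(iters):
            sigma = np.linalg.norm(z) / np.sqrt(n)    # effective noise level
            x_new = soft(x + A.T @ z, alpha * sigma)
            # Onsager correction: residual reweighted by the number of
            # active coordinates divided by n.
            z = y - A @ x_new + z * (np.count_nonzero(x_new) / n)
            x = x_new
        return x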
Living on the edge: A geometric theory of phase transitions in convex optimization
2013. Cited by 36 (4 self).
Abstract:
Recent empirical research indicates that many convex optimization problems with random constraints exhibit a phase transition as the number of constraints increases. For example, this phenomenon emerges in the ℓ1 minimization method for identifying a sparse vector from random linear samples. Indeed, this approach succeeds with high probability when the number of samples exceeds a threshold that depends on the sparsity level; otherwise, it fails with high probability. This paper provides the first rigorous analysis that explains why phase transitions are ubiquitous in random convex optimization problems. It also describes tools for making reliable predictions about the quantitative aspects of the transition, including the location and the width of the transition region. These techniques apply to regularized linear inverse problems with random measurements, to demixing problems under a random incoherence model, and also to cone programs with random affine constraints. These applications depend on foundational research in conic geometry. This paper introduces a new summary parameter, called the statistical dimension, that canonically extends the dimension of a linear subspace to the class of convex cones. The main technical result demonstrates that the sequence of conic intrinsic volumes of a convex cone concentrates sharply near the statistical dimension. This fact leads to an approximate version of the conic kinematic formula that gives bounds on the probability that a randomly oriented cone shares a ray with a fixed cone.
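The statistical dimension is computable for concrete cones. The sketch below numerically evaluates the standard closed-form recipe for the descent cone of the ℓ1 norm at an s-sparse vector, which by this theory locates the ℓ1 phase transition; the values of N and s are illustrative, and the integrand is the Gaussian expectation E[(|g| - tau)_+^2] in closed form.

    # Statistical dimension of the l1 descent cone at an s-sparse N-vector.
    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import minimize_scalar

    def stat_dim_l1(N, s):
        def J(tau):
            # E[(|g| - tau)_+^2] = 2 * ((1 + tau^2) * Phi(-tau) - tau * phi(tau))
            tail = (1 + tau**2) * norm.cdf(-tau) - tau * norm.pdf(tau)
            return s * (1 + tau**2) + (N - s) * 2.0 * tail
        return minimize_scalar(J, bounds=(0.0, 10.0), method="bounded").fun

    N = 1000
    for s in (10, 50, 100):
        # Roughly the number of Gaussian measurements at which l1 recovery
        # flips from failure to success.
        print(s, round(stat_dim_l1(N, s)))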
Atomic norm denoising with applications to line spectral estimation
2012. Cited by 29 (4 self).
Abstract:
The sub-Nyquist estimation of line spectra is a classical problem in signal processing, but currently popular subspace-based techniques have few guarantees in the presence of noise and rely on a priori knowledge about system model order. Motivated by recent work on atomic norms in inverse problems, we propose a new approach to line spectral estimation that provides theoretical guarantees for the mean-squared-error performance in the presence of noise and without advance knowledge of the model order. We propose an abstract theory of denoising with atomic norms and specialize this theory to provide a convex optimization problem for estimating the frequencies and phases of a mixture of complex exponentials with guaranteed bounds on the mean-squared error. We show that the associated convex optimization problem, called Atomic norm Soft Thresholding (AST), can be solved in polynomial time via semidefinite programming. For very large-scale problems, we provide an alternative, efficient algorithm, called Discretized Atomic norm Soft Thresholding (DAST), based on the Fast Fourier Transform that achieves nearly the same error rate as that guaranteed by the semidefinite programming approach. We compare both AST and DAST with Cadzow’s canonical alternating projection algorithm and demonstrate that AST outperforms DAST, which outperforms Cadzow, in terms of mean-square reconstruction error over a wide range of signal-to-noise ratios. For very large problems, DAST is considerably faster than both AST and Cadzow.
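A rough sketch in the spirit of the discretized variant: atomic-norm soft thresholding approximated by ℓ1-regularized least squares over a fine Fourier grid, solved here with cvxpy rather than the FFT-based algorithm of the paper. The grid size, the weight tau, and the test signal are assumptions.

    # Gridded approximation to atomic norm denoising of a sinusoid mixture.
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, grid = 64, 512
    t = np.arange(n)
    y = (np.exp(2j * np.pi * 0.123 * t) + 0.5 * np.exp(2j * np.pi * 0.345 * t)
         + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n)))
    F = np.exp(2j * np.pi * np.outer(t, np.arange(grid) / grid))   # gridded atoms

    c = cp.Variable(grid, complex=True)
    tau = 2.0
    cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - F @ c)
                           + tau * cp.norm1(c))).solve()
    freqs = np.flatnonzero(np.abs(c.value) > 1e-2) / grid   # recovered frequencies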
Compressed Sensing Off the Grid
2012. Cited by 28 (2 self).
Abstract:
This work investigates the problem of estimating the frequency components of a mixture of s complex sinusoids from a random subset of n regularly spaced samples. Unlike previous work in compressed sensing, the frequencies are not assumed to lie on a grid, but can assume any values in the normalized frequency domain [0, 1]. An atomic norm minimization approach is proposed to exactly recover the unobserved samples and identify the unknown frequencies, which is then reformulated as an exact semidefinite program. Even with this continuous dictionary, it is shown that O(s log s log n) random samples are sufficient to guarantee exact frequency localization with high probability, provided the frequencies are well separated. Numerical experiments are performed to illustrate the effectiveness of the proposed method.
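The semidefinite reformulation mentioned in the abstract can be sketched as follows (a hedged illustration, not the authors' implementation): a Hermitian PSD variable whose top-left block is constrained to be Toeplitz, with the observed samples pinned in the last column. The signal length, the frequencies, and the sample set are illustrative assumptions.

    # Atomic norm minimization for off-grid frequency recovery, as an SDP.
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(2)
    n = 32
    f = np.array([0.1, 0.37, 0.62])                     # well-separated frequencies
    x0 = np.exp(2j * np.pi * np.outer(np.arange(n), f)).sum(axis=1)
    J = np.sort(rng.choice(n, size=20, replace=False))  # observed sample indices

    Z = cp.Variable((n + 1, n + 1), hermitian=True)
    cons = [Z >> 0]
    # Top-left n-by-n block must be Toeplitz (constant along diagonals).
    cons += [Z[i, i + k] == Z[0, k] for k in range(n) for i in range(1, n - k)]
    # The last column carries the full signal; pin its observed entries.
    cons += [Z[j, n] == x0[j] for j in J]
    obj = 0.5 * (cp.real(cp.trace(Z[:n, :n])) / n + cp.real(Z[n, n]))
    cp.Problem(cp.Minimize(obj), cons).solve()
    x_hat = Z.value[:n, n]                              # recovered off-grid signal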