Atomic Decomposition by Basis Pursuit, 1995
"... The Time-Frequency and Time-Scale communities have recently developed a large number of overcomplete waveform dictionaries -- stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for d ..."
Abstract
-
Cited by 2728 (61 self)
The Time-Frequency and Time-Scale communities have recently developed a large number of overcomplete waveform dictionaries -- stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the Method of Frames (MOF), Matching Pursuit (MP), and, for special dictionaries, the Best Orthogonal Basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest ℓ1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP and BOB, including better sparsity and super-resolution. BP has interesting relations to ideas in areas as diverse as ill-posed problems, abstract harmonic analysis, total variation de-noising, and multi-scale edge denoising. Basis Pursuit in highly overcomplete dictionaries leads to large-scale optimization problems. With signals of length 8192 and a wavelet packet dictionary, one gets an equivalent linear program of size 8192 by 212,992. Such problems can be attacked successfully only because of recent advances in linear programming by interior-point methods. We obtain reasonable success with a primal-dual logarithmic barrier method and conjugate-gradient solver.
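For concreteness, here is a minimal sketch of the LP equivalence the abstract mentions: min ‖c‖ℓ1 subject to Φc = s becomes a linear program once c is split into nonnegative parts. The sketch below uses SciPy's linprog on a small random overcomplete dictionary (a hypothetical stand-in for a wavelet packet dictionary); it is not the paper's primal-dual barrier solver, and the sizes are arbitrary.

```python
# Basis Pursuit: min ||c||_1  subject to  Phi @ c = s,
# recast as the LP  min 1'(u + v)  s.t.  Phi @ (u - v) = s,  u, v >= 0.
# Minimal sketch with a small random overcomplete dictionary; not the
# paper's primal-dual logarithmic barrier solver.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p = 64, 256                      # signal length, number of atoms
Phi = rng.standard_normal((n, p))
Phi /= np.linalg.norm(Phi, axis=0)  # unit-norm atoms

# synthesize a signal that truly has a sparse decomposition
c_true = np.zeros(p)
c_true[rng.choice(p, 5, replace=False)] = rng.standard_normal(5)
s = Phi @ c_true

# LP in the stacked variable z = [u; v], with c = u - v
cost = np.ones(2 * p)
A_eq = np.hstack([Phi, -Phi])
res = linprog(cost, A_eq=A_eq, b_eq=s, bounds=(0, None), method="highs")
c_bp = res.x[:p] - res.x[p:]

print("recovery error:", np.linalg.norm(c_bp - c_true))
```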
Robust Uncertainty Principles: Exact Signal Reconstruction From Highly Incomplete Frequency Information, 2006
"... This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal and a randomly chosen set of frequencies. Is it possible to reconstruct from the partial knowledge of its Fourier coefficients on the set? A typical result of this pa ..."
Abstract
-
Cited by 2632 (50 self)
This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal f ∈ C^N and a randomly chosen set of frequencies Ω. Is it possible to reconstruct f from the partial knowledge of its Fourier coefficients on the set Ω? A typical result of this paper is as follows. Suppose that f is a superposition of |T| spikes, f(t) = Σ_{τ∈T} f(τ) δ(t − τ), obeying |T| ≤ C_M · (log N)^(−1) · |Ω| for some constant C_M > 0. We do not know the locations of the spikes nor their amplitudes. Then with probability at least 1 − O(N^(−M)), f can be reconstructed exactly as the solution to the ℓ1 minimization problem min_g Σ_t |g(t)| s.t. ĝ(ω) = f̂(ω) for all ω ∈ Ω.
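The model problem lends itself to a small numerical sketch: a real spike train observed on a random set of DFT frequencies, recovered by ℓ1 minimization. The linear program below (real and imaginary parts of the frequency constraints stacked as equality constraints) is an illustrative reformulation, not the authors' code; the sizes and random seed are arbitrary.

```python
# Sketch of the model problem: recover a real spike train from a random
# subset of its DFT coefficients by l1 minimization, solved here as an LP
# after splitting x = u - v and stacking real/imaginary constraints.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
N, n_spikes, n_freqs = 128, 4, 30
x_true = np.zeros(N)
x_true[rng.choice(N, n_spikes, replace=False)] = rng.standard_normal(n_spikes)

F = np.fft.fft(np.eye(N))                    # full DFT matrix
omega = rng.choice(N, n_freqs, replace=False)
F_omega = F[omega]                           # observed frequencies
y = F_omega @ x_true                         # partial Fourier data (complex)

# min 1'(u+v)  s.t.  Re/Im of F_omega (u - v) = Re/Im of y,  u, v >= 0
A = np.vstack([F_omega.real, F_omega.imag])
b = np.concatenate([y.real, y.imag])
A_eq = np.hstack([A, -A])
res = linprog(np.ones(2 * N), A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
x_hat = res.x[:N] - res.x[N:]

print("max reconstruction error:", np.max(np.abs(x_hat - x_true)))
```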
Near Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?, 2004
"... Suppose we are given a vector f in RN. How many linear measurements do we need to make about f to be able to recover f to within precision ɛ in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects— discrete digital signals, images, etc; how many linear m ..."
Abstract
-
Cited by 1513 (20 self)
Suppose we are given a vector f in R^N. How many linear measurements do we need to make about f to be able to recover f to within precision ɛ in the Euclidean (ℓ2) metric? Or more exactly, suppose we are interested in a class F of such objects -- discrete digital signals, images, etc.; how many linear measurements do we need to recover objects from this class to within accuracy ɛ? This paper shows that if the objects of interest are sparse or compressible in the sense that the reordered entries of a signal f ∈ F decay like a power-law (or if the coefficient sequence of f in a fixed basis decays like a power-law), then it is possible to reconstruct f to within very high accuracy from a small number of random measurements. A typical result is as follows: we rearrange the entries of f (or its coefficients in a fixed basis) in decreasing order of magnitude |f|_(1) ≥ |f|_(2) ≥ ... ≥ |f|_(N), and define the weak-ℓp ball as the class F of those elements whose entries obey the power decay law |f|_(n) ≤ C · n^(−1/p). We take measurements 〈f, X_k〉, k = 1, ..., K, where the X_k are N-dimensional Gaussian vectors with independent standard normal entries.
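A rough numerical illustration of this claim: a signal with power-law-decaying entries, K random Gaussian measurements, and equality-constrained ℓ1 reconstruction, compared against the best K-term approximation of the signal. The sketch assumes cvxpy is installed; the decay exponent and problem sizes are arbitrary choices, not the paper's.

```python
# Sketch: a signal with power-law-decaying entries, K random Gaussian
# measurements, and l1 reconstruction (equality-constrained basis pursuit),
# compared with the best K-term approximation.  Assumes cvxpy is installed.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
N, K = 512, 80
# compressible signal: sorted magnitudes decay like n^(-1/p) with p = 0.5
mags = np.arange(1, N + 1, dtype=float) ** (-2.0)
f = rng.permutation(mags) * rng.choice([-1, 1], N)

X = rng.standard_normal((K, N)) / np.sqrt(K)  # Gaussian measurement vectors
y = X @ f                                     # measurements <f, X_k>

g = cp.Variable(N)
cp.Problem(cp.Minimize(cp.norm(g, 1)), [X @ g == y]).solve()
f_hat = g.value

# best K-term approximation: keep only the K largest-magnitude entries
best_K = f.copy()
best_K[np.argsort(np.abs(f))[:-K]] = 0.0
print("l1 recovery error :", np.linalg.norm(f_hat - f))
print("best K-term error :", np.linalg.norm(best_K - f))
```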
Compressive sampling, 2006
"... Conventional wisdom and common practice in acquisition and reconstruction of images from frequency data follow the basic principle of the Nyquist density sampling theory. This principle states that to reconstruct an image, the number of Fourier samples we need to acquire must match the desired res ..."
Abstract
-
Cited by 1441 (15 self)
Conventional wisdom and common practice in acquisition and reconstruction of images from frequency data follow the basic principle of the Nyquist density sampling theory. This principle states that to reconstruct an image, the number of Fourier samples we need to acquire must match the desired resolution of the image, i.e. the number of pixels in the image. This paper surveys an emerging theory which goes by the name of “compressive sampling” or “compressed sensing,” and which says that this conventional wisdom is inaccurate. Perhaps surprisingly, it is possible to reconstruct images or signals of scientific interest accurately and sometimes even exactly from a number of samples which is far smaller than the desired resolution of the image/signal, e.g. the number of pixels in the image. It is believed that compressive sampling has far reaching implications. For example, it suggests the possibility of new data acquisition protocols that translate analog information into digital form with fewer sensors than what was considered necessary. This new sampling theory may come to underlie procedures for sampling and compressing data simultaneously. In this short survey, we provide some of the key mathematical insights underlying this new theory, and explain some of the interactions between compressive sampling and other fields such as statistics, information theory, coding theory, and theoretical computer science.
Decoding by Linear Programming, 2004
"... This paper considers the classical error correcting problem which is frequently discussed in coding theory. We wish to recover an input vector f ∈ Rn from corrupted measurements y = Af + e. Here, A is an m by n (coding) matrix and e is an arbitrary and unknown vector of errors. Is it possible to rec ..."
Abstract
-
Cited by 1399 (16 self)
This paper considers the classical error correcting problem which is frequently discussed in coding theory. We wish to recover an input vector f ∈ R^n from corrupted measurements y = Af + e. Here, A is an m by n (coding) matrix and e is an arbitrary and unknown vector of errors. Is it possible to recover f exactly from the data y? We prove that under suitable conditions on the coding matrix A, the input f is the unique solution to the ℓ1-minimization problem (‖x‖ℓ1 := Σ_i |x_i|) min_{g ∈ R^n} ‖y − Ag‖ℓ1, provided that the support of the vector of errors is not too large: ‖e‖ℓ0 := |{i : e_i ≠ 0}| ≤ ρ · m for some ρ > 0. In short, f can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program). In addition, numerical experiments suggest that this recovery procedure works unreasonably well; f is recovered exactly even in situations where a significant fraction of the output is corrupted. This work is related to the problem of finding sparse solutions to vastly underdetermined systems of linear equations. There are also significant connections with the problem of recovering signals from highly incomplete measurements. In fact, the results introduced in this paper improve on our earlier work [5]. Finally, underlying the success of ℓ1 is a crucial property we call the uniform uncertainty principle that we shall describe in detail.
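The ℓ1 minimization above is itself a linear program once slack variables bound the residuals. The following sketch makes that explicit with SciPy's linprog on synthetic data (random Gaussian coding matrix, 10% of the output corrupted); the dimensions and corruption level are illustrative assumptions, not values from the paper.

```python
# Decoding sketch: recover f from y = A f + e by solving min_g ||y - A g||_1,
# written as the LP  min 1't  s.t.  -t <= y - A g <= t  (t are slack variables).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n, m = 128, 512                      # message length, codeword length
A = rng.standard_normal((m, n))
f = rng.standard_normal(n)

e = np.zeros(m)
corrupt = rng.choice(m, m // 10, replace=False)   # corrupt 10% of the output
e[corrupt] = 10 * rng.standard_normal(len(corrupt))
y = A @ f + e

# variables z = [g; t]: minimize sum(t), with A g - t <= y and -A g - t <= -y
cost = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
b_ub = np.concatenate([y, -y])
res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * n + [(0, None)] * m, method="highs")
f_hat = res.x[:n]

print("decoding error:", np.linalg.norm(f_hat - f))
```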
Stable signal recovery from incomplete and inaccurate measurements, Comm. Pure Appl. Math., 2006
"... Abstract Suppose we wish to recover a vector x 0 ∈ R m (e.g., a digital signal or image) from incomplete and contaminated observations y = Ax 0 + e; A is an n × m matrix with far fewer rows than columns (n m) and e is an error term. Is it possible to recover x 0 accurately based on the data y? To r ..."
Abstract
-
Cited by 1397 (38 self)
Suppose we wish to recover a vector x0 ∈ R^m (e.g., a digital signal or image) from incomplete and contaminated observations y = Ax0 + e; A is an n × m matrix with far fewer rows than columns (n ≪ m) and e is an error term. Is it possible to recover x0 accurately based on the data y? To recover x0, we consider the solution x to the ℓ1-regularization problem min ‖x̃‖ℓ1 subject to ‖Ax̃ − y‖ℓ2 ≤ ε, where ε is the size of the error term e. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the vector x0 is sufficiently sparse, then the solution is within the noise level: ‖x − x0‖ℓ2 ≤ C · ε. As a first example, suppose that A is a Gaussian random matrix; then stable recovery occurs for almost all such A's provided that the number of nonzeros of x0 is of about the same order as the number of observations. As a second instance, suppose one observes few Fourier samples of x0; then stable recovery occurs for almost any set of n coefficients provided that the number of nonzeros is of the order of n/(log m)^6. In the case where the error term vanishes, the recovery is of course exact, and this work actually provides novel insights into the exact recovery phenomenon discussed in earlier papers. The methodology also explains why one can also very nearly recover approximately sparse signals.
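The ℓ1-regularization problem with an ℓ2 noise constraint is a second-order cone program. The sketch below poses it directly with cvxpy (assumed installed) on noisy Gaussian measurements of a synthetic sparse vector; it illustrates the recovery-within-the-noise-level behavior rather than reproducing the paper's experiments, and all sizes are arbitrary.

```python
# Stable recovery sketch: min ||x||_1  s.t.  ||A x - y||_2 <= eps,
# with noisy Gaussian measurements of a sparse x0.  Uses cvxpy.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n, m, k, sigma = 80, 256, 8, 0.02
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)                  # unit-normed columns

x0 = np.zeros(m)
x0[rng.choice(m, k, replace=False)] = rng.standard_normal(k)
e = sigma * rng.standard_normal(n)
y = A @ x0 + e
eps = np.linalg.norm(e)                         # size of the error term

x = cp.Variable(m)
cp.Problem(cp.Minimize(cp.norm(x, 1)), [cp.norm(A @ x - y, 2) <= eps]).solve()

print("noise level   :", eps)
print("recovery error:", np.linalg.norm(x.value - x0))
```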
K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation, 2006
"... In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and inc ..."
Abstract
-
Cited by 935 (41 self)
In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method—the K-SVD algorithm—generalizing the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.
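A compact sketch of the alternation just described: sparse-code the training signals with OMP, then update each dictionary atom (and the coefficients that use it) from a rank-1 SVD of the restricted residual. The coding step borrows scikit-learn's orthogonal_mp, which is an implementation choice of this sketch, not part of the paper; the synthetic data and sizes are illustrative.

```python
# K-SVD sketch: alternate (1) sparse-coding the training signals Y against the
# current dictionary D with OMP, and (2) updating each atom d_k (and the
# coefficients that use it) from a rank-1 SVD of the residual restricted to
# the signals that use that atom.
import numpy as np
from sklearn.linear_model import orthogonal_mp

rng = np.random.default_rng(5)
n, K, n_signals, sparsity, n_iters = 20, 50, 500, 3, 15

# synthetic training set generated from a hidden random dictionary
D_true = rng.standard_normal((n, K))
D_true /= np.linalg.norm(D_true, axis=0)
X_true = np.zeros((K, n_signals))
for j in range(n_signals):
    X_true[rng.choice(K, sparsity, replace=False), j] = rng.standard_normal(sparsity)
Y = D_true @ X_true

D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)
for _ in range(n_iters):
    # (1) sparse coding of all signals against the current dictionary
    X = orthogonal_mp(D, Y, n_nonzero_coefs=sparsity)
    # (2) update atoms one at a time
    for k in range(K):
        users = np.flatnonzero(X[k])          # signals that use atom k
        if users.size == 0:
            continue
        E_k = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
        U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
        D[:, k] = U[:, 0]
        X[k, users] = s[0] * Vt[0]
    print("residual:", np.linalg.norm(Y - D @ X))
```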
Greed is Good: Algorithmic Results for Sparse Approximation, 2004
"... This article presents new results on using a greedy algorithm, orthogonal matching pursuit (OMP), to solve the sparse approximation problem over redundant dictionaries. It provides a sufficient condition under which both OMP and Donoho’s basis pursuit (BP) paradigm can recover the optimal representa ..."
Abstract
-
Cited by 916 (9 self)
This article presents new results on using a greedy algorithm, orthogonal matching pursuit (OMP), to solve the sparse approximation problem over redundant dictionaries. It provides a sufficient condition under which both OMP and Donoho’s basis pursuit (BP) paradigm can recover the optimal representation of an exactly sparse signal. It leverages this theory to show that both OMP and BP succeed for every sparse input signal from a wide class of dictionaries. These quasi-incoherent dictionaries offer a natural generalization of incoherent dictionaries, and the cumulative coherence function is introduced to quantify the level of incoherence. This analysis unifies all the recent results on BP and extends them to OMP. Furthermore, the paper develops a sufficient condition under which OMP can identify atoms from an optimal approximation of a nonsparse signal. From there, it argues that OMP is an approximation algorithm for the sparse problem over a quasi-incoherent dictionary. That is, for every input signal, OMP calculates a sparse approximant whose error is only a small factor worse than the minimal error that can be attained with the same number of terms.
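For reference, here is a bare-bones version of the greedy step the article analyzes: at each iteration OMP selects the atom most correlated with the current residual, then re-fits by least squares on the selected set. This is a minimal numpy sketch with a fixed number of iterations and a random dictionary, not the article's analysis code.

```python
# Bare-bones orthogonal matching pursuit: greedily pick the atom most
# correlated with the current residual, then least-squares re-fit on the
# selected atoms and update the residual.  No stopping rule beyond a fixed
# number of terms.
import numpy as np

def omp(D, s, n_terms):
    """Return sparse coefficients x (one per atom) with D @ x ~= s."""
    residual = s.copy()
    selected = []
    x = np.zeros(D.shape[1])
    for _ in range(n_terms):
        k = int(np.argmax(np.abs(D.T @ residual)))   # best-matching atom
        selected.append(k)
        coef, *_ = np.linalg.lstsq(D[:, selected], s, rcond=None)
        residual = s - D[:, selected] @ coef
    x[selected] = coef
    return x

# quick check on an exactly sparse signal over a random dictionary
rng = np.random.default_rng(6)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
x_true = np.zeros(256)
x_true[rng.choice(256, 5, replace=False)] = rng.standard_normal(5)
s = D @ x_true
print("OMP error:", np.linalg.norm(omp(D, s, 5) - x_true))
```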
The Dantzig selector: statistical estimation when p is much larger than n, 2005
"... In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ ..."
Abstract
-
Cited by 879 (14 self)
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Ax + z, where x ∈ R^p is a parameter vector of interest, A is a data matrix with possibly far fewer rows than columns, n ≪ p, and the z_i's are i.i.d. N(0, σ^2). Is it possible to estimate x reliably based on the noisy data y? To estimate x, we introduce a new estimator—we call it the Dantzig selector—which is the solution to the ℓ1-regularization problem min_{x̃ ∈ R^p} ‖x̃‖ℓ1 subject to ‖A^T r‖ℓ∞ ≤ (1 + t^(−1)) √(2 log p) · σ, where r is the residual vector y − Ax̃ and t is a positive scalar. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the true parameter vector x is sufficiently sparse (which here roughly guarantees that the model is identifiable), then with very large probability ‖x̂ − x‖ℓ2² ≤ C² · 2 log p · (σ² + Σ_i min(x_i², σ²)). Our results are nonasymptotic and we give values for the constant C. In short, our estimator achieves a loss within a logarithmic factor of the ideal mean squared error one would achieve with an oracle which would supply perfect information about which coordinates are nonzero, and which were above the noise level. In multivariate regression and from a model selection viewpoint, our result says that it is possible nearly to select the best subset of variables by solving a very simple convex program, which in fact can easily be recast as a convenient linear program (LP).
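As the abstract notes, the Dantzig selector can be recast as a linear program; the sketch below does that with SciPy's linprog on synthetic data, using the split x = u − v and the Gram matrix AᵀA in the constraints. The choice t = 1 and the problem sizes are illustrative assumptions of this sketch.

```python
# Dantzig selector sketch: min ||x||_1  s.t.  ||A'(y - A x)||_inf <= lam,
# solved as an LP with the split x = u - v, u, v >= 0.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(7)
n, p, k, sigma = 72, 256, 5, 1.0
A = rng.standard_normal((n, p))
A /= np.linalg.norm(A, axis=0)              # unit-normed columns

x_true = np.zeros(p)
x_true[rng.choice(p, k, replace=False)] = 5 * rng.standard_normal(k)
y = A @ x_true + sigma * rng.standard_normal(n)

t = 1.0
lam = (1 + 1 / t) * np.sqrt(2 * np.log(p)) * sigma
G = A.T @ A
c = A.T @ y
cost = np.ones(2 * p)
# constraints: -lam <= c - G (u - v) <= lam, rewritten as two one-sided blocks
A_ub = np.block([[G, -G], [-G, G]])
b_ub = np.concatenate([lam + c, lam - c])
res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
x_hat = res.x[:p] - res.x[p:]

print("squared error     :", np.sum((x_hat - x_true) ** 2))
print("ideal oracle risk ~", np.sum(np.minimum(x_true ** 2, sigma ** 2)))
```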
The adaptive LASSO and its oracle properties, Journal of the American Statistical Association
"... The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this work we derive a necessary condition for the lasso variable selection to be consistent. Consequently, there exist certain sc ..."
Abstract
-
Cited by 683 (10 self)
The lasso is a popular technique for simultaneous estimation and variable selection. Lasso variable selection has been shown to be consistent under certain conditions. In this work we derive a necessary condition for the lasso variable selection to be consistent. Consequently, there exist certain scenarios where the lasso is inconsistent for variable selection. We then propose a new version of the lasso, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty. We show that the adaptive lasso enjoys the oracle properties; namely, it performs as well as if the true underlying model were given in advance. Similar to the lasso, the adaptive lasso is shown to be near-minimax optimal. Furthermore, the adaptive lasso can be solved by the same efficient algorithm for solving the lasso. We also discuss the extension of the adaptive lasso in generalized linear models and show that the oracle properties still hold under mild regularity conditions. As a byproduct of our theory, the nonnegative garotte is shown to be consistent for variable selection.
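Since the weighted ℓ1 penalty can be handled by the same algorithm as the ordinary lasso, a short sketch: compute OLS-based weights, rescale the columns of X by the inverse weights, run a standard lasso (here scikit-learn's Lasso, an implementation choice of this sketch), and unscale the coefficients. The weight exponent γ = 1, the penalty level, and the n > p design are assumptions made for illustration.

```python
# Adaptive lasso sketch: weights w_j = 1 / |beta_ols_j|^gamma, then the
# weighted-l1 problem is solved with an ordinary lasso after rescaling the
# columns of X by 1/w_j and unscaling the fitted coefficients.
# Assumes n > p so that OLS initial estimates exist; gamma = 1.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(8)
n, p = 200, 10
beta = np.array([3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0])
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
w = 1.0 / np.abs(beta_ols)                  # adaptive weights (gamma = 1)

X_scaled = X / w                            # column j scaled by 1/w_j
model = Lasso(alpha=0.1, fit_intercept=False).fit(X_scaled, y)
beta_adaptive = model.coef_ / w             # undo the scaling

print("selected variables:", np.flatnonzero(np.abs(beta_adaptive) > 1e-8))
print("coefficients      :", np.round(beta_adaptive, 3))
```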