Results 1  10
of
209
Atomic decomposition by basis pursuit
 SIAM Journal on Scientific Computing
, 1998
"... Abstract. The timefrequency and timescale communities have recently developed a large number of overcomplete waveform dictionaries — stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several meth ..."
Abstract

Cited by 1660 (43 self)
 Add to MetaCart
Abstract. The timefrequency and timescale communities have recently developed a large number of overcomplete waveform dictionaries — stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the method of frames (MOF), Matching pursuit (MP), and, for special dictionaries, the best orthogonal basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an “optimal ” superposition of dictionary elements, where optimal means having the smallest l 1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP, and BOB, including better sparsity and superresolution. BP has interesting relations to ideas in areas as diverse as illposed problems, in abstract harmonic analysis, total variation denoising, and multiscale edge denoising. BP in highly overcomplete dictionaries leads to largescale optimization problems. With signals of length 8192 and a wavelet packet dictionary, one gets an equivalent linear program of size 8192 by 212,992. Such problems can be attacked successfully only because of recent advances in linear programming by interiorpoint methods. We obtain reasonable success with a primaldual logarithmic barrier method and conjugategradient solver.
Adapting to unknown smoothness via wavelet shrinkage
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1995
"... We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the princip ..."
Abstract

Cited by 675 (19 self)
 Add to MetaCart
We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein Unbiased Estimate of Risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N log(N) as a function of the sample size N. SureShrink is smoothnessadaptive: if the unknown function contains jumps, the reconstruction (essentially) does also; if the unknown function has a smooth piece, the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothnessadaptive: it is nearminimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods  kernels, splines, and orthogonal series estimates  even with optimal choices of the smoothing parameter, would be unable to perform
Translationinvariant denoising
, 1995
"... DeNoising with the traditional (orthogonal, maximallydecimated) wavelet transform sometimes exhibits visual artifacts; we attribute some of these – for example, Gibbs phenomena in the neighborhood of discontinuities – to the lack of translation invariance of the wavelet basis. One method to suppre ..."
Abstract

Cited by 228 (8 self)
 Add to MetaCart
DeNoising with the traditional (orthogonal, maximallydecimated) wavelet transform sometimes exhibits visual artifacts; we attribute some of these – for example, Gibbs phenomena in the neighborhood of discontinuities – to the lack of translation invariance of the wavelet basis. One method to suppress such artifacts, termed “cycle spinning ” by Coifman, is to “average out ” the translation dependence. For a range of shifts, one shifts the data (right or left as the case may be), DeNoises the shifted data, and then unshifts the denoised data. Doing this for each of a range of shifts, and averaging the several results so obtained, produces a reconstruction subject to far weaker Gibbs phenomena than thresholding based DeNoising using the traditional orthogonal wavelet transform. CycleSpinning over the range of all circulant shifts can be accomplished in order nlog 2(n) time; it is equivalent to denoising using the undecimated or stationary wavelet transform. Cyclespinning exhibits benefits outside of wavelet denoising, for example in cosine packet denoising, where it helps suppress ‘clicks’. It also has a counterpart in frequency domain denoising, where the goal of translationinvariance is replaced by modulation invariance, and the central shiftDeNoiseunshift operation is replaced by modulateDeNoisedemodulate. We illustrate these concepts with extensive computational examples; all figures presented here are reproducible using the WaveLab software package. 1
From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images
, 2007
"... A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinato ..."
Abstract

Cited by 202 (31 self)
 Add to MetaCart
A fullrank matrix A ∈ IR n×m with n < m generates an underdetermined system of linear equations Ax = b having infinitely many solutions. Suppose we seek the sparsest solution, i.e., the one with the fewest nonzero entries: can it ever be unique? If so, when? As optimization of sparsity is combinatorial in nature, are there efficient methods for finding the sparsest solution? These questions have been answered positively and constructively in recent years, exposing a wide variety of surprising phenomena; in particular, the existence of easilyverifiable conditions under which optimallysparse solutions can be found by concrete, effective computational methods. Such theoretical results inspire a bold perspective on some important practical problems in signal and image processing. Several wellknown signal and image processing problems can be cast as demanding solutions of undetermined systems of equations. Such problems have previously seemed, to many, intractable. There is considerable evidence that these problems often have sparse solutions. Hence, advances in finding sparse solutions to underdetermined systems energizes research on such signal and image processing problems – to striking effect. In this paper we review the theoretical results on sparse solutions of linear systems, empirical
An interiorpoint method for largescale l1regularized logistic regression
 Journal of Machine Learning Research
, 2007
"... Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interiorpoint method for solving largescale ℓ1regularized logistic regression problems. Small problems with up to a thousand ..."
Abstract

Cited by 153 (6 self)
 Add to MetaCart
Logistic regression with ℓ1 regularization has been proposed as a promising method for feature selection in classification problems. In this paper we describe an efficient interiorpoint method for solving largescale ℓ1regularized logistic regression problems. Small problems with up to a thousand or so features and examples can be solved in seconds on a PC; medium sized problems, with tens of thousands of features and examples, can be solved in tens of seconds (assuming some sparsity in the data). A variation on the basic method, that uses a preconditioned conjugate gradient method to compute the search step, can solve very large problems, with a million features and examples (e.g., the 20 Newsgroups data set), in a few minutes, on a PC. Using warmstart techniques, a good approximation of the entire regularization path can be computed much more efficiently than by solving a family of problems independently.
Basis Pursuit
, 1994
"... The TimeFrequency and TimeScale communities have recently developed an enormous number of overcomplete signal dictionaries  wavelets, wavelet packets, cosine packets, wilson bases, chirplets, warped bases, and hyperbolic cross bases being a few examples. Basis Pursuit is a technique for decompos ..."
Abstract

Cited by 119 (15 self)
 Add to MetaCart
The TimeFrequency and TimeScale communities have recently developed an enormous number of overcomplete signal dictionaries  wavelets, wavelet packets, cosine packets, wilson bases, chirplets, warped bases, and hyperbolic cross bases being a few examples. Basis Pursuit is a technique for decomposing a signal into an "optimal" superposition of dictionary elements. The optimization criterion is the l 1 norm of coefficients. The method has several advantages over Matching Pursuit and Best Ortho Basis, including superresolution and stability. 1 Introduction Over the last five years or so, there has been an explosion of awareness of alternatives to traditional signal representations. Instead of just representing objects as superpositions of sinusoids (the traditional Fourier representation) we now have available alternate dictionaries  signal representation schemes  of which the Wavelets dictionary is only the most wellknown. Wavelet dictionaries, Gabor dictionaries, Multiscale...
Multiple Shrinkage and Subset Selection in Wavelets
, 1997
"... This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by u ..."
Abstract

Cited by 118 (16 self)
 Add to MetaCart
This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients to zero. We approach wavelet shrinkage by using Bayesian hierarchical models, assigning a positive prior probability to the wavelet coefficients being zero. The resulting estimator for the wavelet coefficients is a multiple shrinkage estimator that exhibits a wide variety of nonlinear shrinkage patterns. We discuss fast computational implementations, with a focus on easytocompute analytic approximations as well as importance sampling and Markov chain Monte Carlo methods. Multiple shrinkage estimators prove to have excellent mean squared error performance in reconstructing standard test functions. We demonstrate this in simulated test examples, comparing various implementations of multiple shrinkage to commonly used shrinkage rules. Finally, we illustrate our approach with an application to the socalled "glint" data.
Nonlinear Wavelet Shrinkage With Bayes Rules and Bayes Factors
 Journal of the American Statistical Association
, 1998
"... this article a wavelet shrinkage by coherent ..."
Adapting to unknown sparsity by controlling the false discovery rate
, 2000
"... We attempt to recover a highdimensional vector observed in white noise, where the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing powerlaw decay bounds on the order ..."
Abstract

Cited by 108 (15 self)
 Add to MetaCart
We attempt to recover a highdimensional vector observed in white noise, where the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing powerlaw decay bounds on the ordered entries; and controlling the ℓp norm for p small. We obtain a procedure which is asymptotically minimax for ℓr loss, simultaneously throughout a range of such sparsity classes. The optimal procedure is a dataadaptive thresholding scheme, driven by control of the False Discovery Rate (FDR). FDR control is a recent innovation in simultaneous testing, in which one seeks to ensure that at most a certain fraction of the rejected null hypotheses will correspond to false rejections. In our treatment, the FDR control parameter q also plays a controlling role in asymptotic minimaxity. Our results say that letting q = qn → 0 with problem size n is sufficient for asymptotic minimaxity, while keeping fixed q>1/2prevents asymptotic minimaxity. To our knowledge, this relation between ideas in simultaneous inference and asymptotic decision theory is new. Our work provides a new perspective on a class of model selection rules which has been introduced recently by several authors. These new rules impose complexity penalization of the form 2·log ( potential model size / actual model size). We exhibit a close connection with FDRcontrolling procedures having q tending to 0; this connection strongly supports a conjecture of simultaneous asymptotic minimaxity for such model selection rules.