Results 1  10
of
56
DeNoising By SoftThresholding
, 1992
"... Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function f on [0; 1] from noisy data di = f(ti)+ zi, iid i =0;:::;n 1, ti = i=n, zi N(0; 1). The reconstruction fn ^ is de ned in the wavelet domain by translating all the empirical wavelet coe cients of d towards 0 by an a ..."
Abstract

Cited by 1104 (13 self)
 Add to MetaCart
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function f on [0; 1] from noisy data di = f(ti)+ zi, iid i =0;:::;n 1, ti = i=n, zi N(0; 1). The reconstruction fn ^ is de ned in the wavelet domain by translating all the empirical wavelet coe cients of d towards 0 by an amount p 2 log(n) = p n. We prove two results about that estimator. [Smooth]: With high probability ^ fn is at least as smooth as f, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to f as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
Adapting to unknown smoothness via wavelet shrinkage
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1995
"... We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the princip ..."
Abstract

Cited by 864 (19 self)
 Add to MetaCart
We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein Unbiased Estimate of Risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N log(N) as a function of the sample size N. SureShrink is smoothnessadaptive: if the unknown function contains jumps, the reconstruction (essentially) does also; if the unknown function has a smooth piece, the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothnessadaptive: it is nearminimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods  kernels, splines, and orthogonal series estimates  even with optimal choices of the smoothing parameter, would be unable to perform
Minimax Estimation via Wavelet Shrinkage
, 1992
"... We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coe cients. The shrinkage can be tuned to be nearly minim ..."
Abstract

Cited by 289 (32 self)
 Add to MetaCart
We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coe cients. The shrinkage can be tuned to be nearly minimax over any member of a wide range of Triebel and Besovtype smoothness constraints, and asymptotically minimax over Besov bodies with p q. Linear estimates cannot achieve even the minimax rates over Triebel and Besov classes with p <2, so our method can signi cantly outperform every linear method (kernel, smoothing spline, sieve,:::) in a minimax sense. Variants of our method based on simple threshold nonlinearities are nearly minimax. Our method possesses the interpretation of spatial adaptivity: it reconstructs using a kernel which mayvary in shape and bandwidth from point to point, depending on the data. Least favorable distributions for certain of the Triebel and Besov scales generate objects with sparse wavelet transforms. Many real objects have similarly sparse transforms, which suggests that these minimax results are relevant for practical problems. Sequels to this paper discuss practical implementation, spatial adaptation properties and applications to inverse problems.
Wavelet shrinkage: asymptopia
 Journal of the Royal Statistical Society, Ser. B
, 1995
"... Considerable e ort has been directed recently to develop asymptotically minimax methods in problems of recovering in nitedimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators bein ..."
Abstract

Cited by 273 (35 self)
 Add to MetaCart
Considerable e ort has been directed recently to develop asymptotically minimax methods in problems of recovering in nitedimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons { sometimes, similarity to known methods, sometimes, computational intractability, and sometimes, lack of spatial adaptivity. We discuss a method for curve estimation based on n noisy data; one translates the empirical wavelet coe cients towards the origin by an amount p p 2 log(n) = n. The method is di erent from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions { e.g. pointwise error, global error measured in L p norms, pointwise and global error in estimation of derivatives { and for a wide range of smoothness classes, including standard Holder classes, Sobolev classes, and Bounded Variation. This is amuch broader nearoptimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and informationbased complexity.
Nonlinear solution of linear inverse problems by waveletvaguelette decomposition
, 1992
"... We describe the WaveletVaguelette Decomposition (WVD) of a linear inverse problem. It is a substitute for the singular value decomposition (SVD) of an inverse problem, and it exists for a class of special inverse problems of homogeneous type { such asnumerical di erentiation, inversion of Abeltype ..."
Abstract

Cited by 220 (12 self)
 Add to MetaCart
(Show Context)
We describe the WaveletVaguelette Decomposition (WVD) of a linear inverse problem. It is a substitute for the singular value decomposition (SVD) of an inverse problem, and it exists for a class of special inverse problems of homogeneous type { such asnumerical di erentiation, inversion of Abeltype transforms, certain convolution transforms, and the Radon Transform. We propose to solve illposed linear inverse problems by nonlinearly \shrinking" the WVD coe cients of the noisy, indirect data. Our approach o ers signi cant advantages over traditional SVD inversion in the case of recovering spatially inhomogeneous objects. We suppose that observations are contaminated by white noise and that the object is an unknown element of a Besov space. We prove that nonlinear WVD shrinkage can be tuned to attain the minimax rate of convergence, for L 2 loss, over the entire Besov scale. The important case of Besov spaces Bp;q, p <2, which model spatial inhomogeneity, is included. In comparison, linear procedures { SVD included { cannot attain optimal rates of convergence over such classes in the case p<2. For example, our methods achieve faster rates of convergence, for objects known to lie in the Bump Algebra or in Bounded Variation, than any linear procedure.
Unconditional bases are optimal bases for data compression and for statistical estimation
 Applied and Computational Harmonic Analysis
, 1993
"... An orthogonal basis of L 2 which is also an unconditional basis of a functional space F is a kind of optimal basis for compressing, estimating, and recovering functions in F. Simple thresholding operations, applied in the unconditional basis, work essentially better for compressing, estimating, and ..."
Abstract

Cited by 165 (23 self)
 Add to MetaCart
(Show Context)
An orthogonal basis of L 2 which is also an unconditional basis of a functional space F is a kind of optimal basis for compressing, estimating, and recovering functions in F. Simple thresholding operations, applied in the unconditional basis, work essentially better for compressing, estimating, and recovering than they do in any other orthogonal basis. In fact, simple thresholding in an unconditional basis works essentially better for recovery and estimation than other methods, period. (Performance is measured in an asymptotic minimax sense.) As an application, we formalize and prove Mallat's Heuristic, which says that wavelet bases are optimal for representing functions containing singularities, when there may be an arbitrary number of singularities, arbitrarily distributed.
Asymptotic equivalence of density estimation and Gaussian white noise
 Ann. Statist
, 1996
"... Signal recovery in Gaussian white noise with variance tending to zero has served for some time as a representative model for nonparametric curve estimation, having all the essential traits in a pure form. The equivalence has mostly been stated informally, but an approximation in the sense of Le Cam’ ..."
Abstract

Cited by 103 (5 self)
 Add to MetaCart
Signal recovery in Gaussian white noise with variance tending to zero has served for some time as a representative model for nonparametric curve estimation, having all the essential traits in a pure form. The equivalence has mostly been stated informally, but an approximation in the sense of Le Cam’s deficiency distance ∆ would make it precise. The models are then asymptotically equivalent for all purposes of statistical decision with bounded loss. In nonparametrics, a first result of this kind has recently been established for Gaussian regression (Brown and Low, 1993). We consider the analogous problem for the experiment given by n i. i. d. observations having density f on the unit interval. Our basic result concerns the parameter space of densities which are in a Hölder ball with exponent α> 12 and which are uniformly bounded away from zero. We show that an i. i. d. sample of size n with density f is globally asymptotically equivalent to a white noise experiment with drift f1/2 and variance 14n −1. This represents a nonparametric analog of Le Cam’s heteroscedastic Gaussian approximation in the finite dimensional case.
Sharp Adaptation for Inverse Problems With Random Noise
, 2000
"... We consider a heteroscedastic sequence space setup with polynomially increasing variances of observations that allows to treat a number of inverse problems, in particular multivariate ones. We propose an adaptive estimator that attains simultaneously exact asymptotic minimax constants on every ellip ..."
Abstract

Cited by 66 (8 self)
 Add to MetaCart
We consider a heteroscedastic sequence space setup with polynomially increasing variances of observations that allows to treat a number of inverse problems, in particular multivariate ones. We propose an adaptive estimator that attains simultaneously exact asymptotic minimax constants on every ellipsoid of functions within a wide scale (that includes ellipoids with polynomially and exponentially decreasing axes) and, at the same time, satisfies asymptotically exact oracle inequalities within any class of linear estimates having monotone nondecreasing weights. As application, we construct sharp adaptive estimators in the problems of deconvolution and tomography.
Recovering Edges in IllPosed Inverse Problems: Optimality of Curvelet Frames
, 2000
"... We consider a model problem of recovering a function f(x1,x2) from noisy Radon data. The function f to be recovered is assumed smooth apart from a discontinuity along a C2 curve – i.e. an edge. We use the continuum white noise model, with noise level ɛ. Traditional linear methods for solving such in ..."
Abstract

Cited by 62 (14 self)
 Add to MetaCart
(Show Context)
We consider a model problem of recovering a function f(x1,x2) from noisy Radon data. The function f to be recovered is assumed smooth apart from a discontinuity along a C2 curve – i.e. an edge. We use the continuum white noise model, with noise level ɛ. Traditional linear methods for solving such inverse problems behave poorly in the presence of edges. Qualitatively, the reconstructions are blurred near the edges; quantitatively, they give in our model Mean Squared Errors (MSEs) that tend to zero with noise level ɛ only as O(ɛ1/2)asɛ → 0. A recent innovation – nonlinear shrinkage in the wavelet domain – visually improves edge sharpness and improves MSE convergence to O(ɛ2/3). However, as we show here, this rate is not optimal. In fact, essentially optimal performance is obtained by deploying the recentlyintroduced tight frames of curvelets in this setting. Curvelets are smooth, highly anisotropic elements ideally suited for detecting and synthesizing curved edges. To deploy them in the Radon setting, we construct a curveletbased biorthogonal decomposition
Minimax rates of estimation for highdimensional linear regression over balls
, 2009
"... Abstract—Consider the highdimensional linear regression model,where is an observation vector, is a design matrix with, is an unknown regression vector, and is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating in eitherloss andprediction loss, assuming tha ..."
Abstract

Cited by 49 (13 self)
 Add to MetaCart
(Show Context)
Abstract—Consider the highdimensional linear regression model,where is an observation vector, is a design matrix with, is an unknown regression vector, and is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating in eitherloss andprediction loss, assuming that belongs to anball for some.Itisshown that under suitable regularity conditions on the design matrix, the minimax optimal rate inloss andprediction loss scales as. The analysis in this paper reveals that conditions on the design matrix enter into the rates forerror andprediction error in complementary ways in the upper and lower bounds. Our proofs of the lower bounds are information theoretic in nature, based on Fano’s inequality and results on the metric entropy of the balls, whereas our proofs of the upper bounds are constructive, involving direct analysis of least squares overballs. For the special case, corresponding to models with an exact sparsity constraint, our results show that although computationally efficientbased methods can achieve the minimax rates up to constant factors, they require slightly stronger assumptions on the design matrix than optimal algorithms involving leastsquares over theball. Index Terms—Compressed sensing, minimax techniques, regression analysis. I.