Results 1  10
of
124
Ideal spatial adaptation by wavelet shrinkage
 Biometrika
, 1994
"... With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle o ers dramatic ad ..."
Abstract

Cited by 838 (4 self)
 Add to MetaCart
With ideal spatial adaptation, an oracle furnishes information about how best to adapt a spatially variable estimator, whether piecewise constant, piecewise polynomial, variable knot spline, or variable bandwidth kernel, to the unknown function. Estimation with the aid of an oracle o ers dramatic advantages over traditional linear estimation by nonadaptive kernels � however, it is a priori unclear whether such performance can be obtained by a procedure relying on the data alone. We describe a new principle for spatiallyadaptive estimation: selective wavelet reconstruction. Weshowthatvariableknot spline ts and piecewisepolynomial ts, when equipped with an oracle to select the knots, are not dramatically more powerful than selective wavelet reconstruction with an oracle. We develop a practical spatially adaptive method, RiskShrink, which works by shrinkage of empirical wavelet coe cients. RiskShrink mimics the performance of an oracle for selective wavelet reconstruction as well as it is possible to do so. A new inequality inmultivariate normal decision theory which wecallthe oracle inequality shows that attained performance di ers from ideal performance by at most a factor 2logn, where n is the sample size. Moreover no estimator can give a better guarantee than this. Within the class of spatially adaptive procedures, RiskShrink is essentially optimal. Relying only on the data, it comes within a factor log 2 n of the performance of piecewise polynomial and variableknot spline methods equipped with an oracle. In contrast, it is unknown how or if piecewise polynomial methods could be made to function this well when denied access to an oracle and forced to rely on data alone.
DeNoising By SoftThresholding
, 1992
"... Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function f on [0; 1] from noisy data di = f(ti)+ zi, iid i =0;:::;n 1, ti = i=n, zi N(0; 1). The reconstruction fn ^ is de ned in the wavelet domain by translating all the empirical wavelet coe cients of d towards 0 by an a ..."
Abstract

Cited by 798 (13 self)
 Add to MetaCart
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function f on [0; 1] from noisy data di = f(ti)+ zi, iid i =0;:::;n 1, ti = i=n, zi N(0; 1). The reconstruction fn ^ is de ned in the wavelet domain by translating all the empirical wavelet coe cients of d towards 0 by an amount p 2 log(n) = p n. We prove two results about that estimator. [Smooth]: With high probability ^ fn is at least as smooth as f, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to f as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
Adapting to unknown smoothness via wavelet shrinkage
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1995
"... We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the princip ..."
Abstract

Cited by 675 (19 self)
 Add to MetaCart
We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein Unbiased Estimate of Risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N log(N) as a function of the sample size N. SureShrink is smoothnessadaptive: if the unknown function contains jumps, the reconstruction (essentially) does also; if the unknown function has a smooth piece, the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothnessadaptive: it is nearminimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods  kernels, splines, and orthogonal series estimates  even with optimal choices of the smoothing parameter, would be unable to perform
Minimax Estimation via Wavelet Shrinkage
, 1992
"... We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coe cients. The shrinkage can be tuned to be nearly minim ..."
Abstract

Cited by 246 (32 self)
 Add to MetaCart
We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coe cients. The shrinkage can be tuned to be nearly minimax over any member of a wide range of Triebel and Besovtype smoothness constraints, and asymptotically minimax over Besov bodies with p q. Linear estimates cannot achieve even the minimax rates over Triebel and Besov classes with p <2, so our method can signi cantly outperform every linear method (kernel, smoothing spline, sieve,:::) in a minimax sense. Variants of our method based on simple threshold nonlinearities are nearly minimax. Our method possesses the interpretation of spatial adaptivity: it reconstructs using a kernel which mayvary in shape and bandwidth from point to point, depending on the data. Least favorable distributions for certain of the Triebel and Besov scales generate objects with sparse wavelet transforms. Many real objects have similarly sparse transforms, which suggests that these minimax results are relevant for practical problems. Sequels to this paper discuss practical implementation, spatial adaptation properties and applications to inverse problems.
Wavelet shrinkage: asymptopia
 Journal of the Royal Statistical Society, Ser. B
, 1995
"... Considerable e ort has been directed recently to develop asymptotically minimax methods in problems of recovering in nitedimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators bein ..."
Abstract

Cited by 239 (35 self)
 Add to MetaCart
Considerable e ort has been directed recently to develop asymptotically minimax methods in problems of recovering in nitedimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons { sometimes, similarity to known methods, sometimes, computational intractability, and sometimes, lack of spatial adaptivity. We discuss a method for curve estimation based on n noisy data; one translates the empirical wavelet coe cients towards the origin by an amount p p 2 log(n) = n. The method is di erent from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions { e.g. pointwise error, global error measured in L p norms, pointwise and global error in estimation of derivatives { and for a wide range of smoothness classes, including standard Holder classes, Sobolev classes, and Bounded Variation. This is amuch broader nearoptimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and informationbased complexity.
New tight frames of curvelets and optimal representations of objects with piecewise C² singularities
 COMM. ON PURE AND APPL. MATH
, 2002
"... This paper introduces new tight frames of curvelets to address the problem of finding optimally sparse representations of objects with discontinuities along C2 edges. Conceptually, the curvelet transform is a multiscale pyramid with many directions and positions at each length scale, and needleshap ..."
Abstract

Cited by 232 (17 self)
 Add to MetaCart
This paper introduces new tight frames of curvelets to address the problem of finding optimally sparse representations of objects with discontinuities along C2 edges. Conceptually, the curvelet transform is a multiscale pyramid with many directions and positions at each length scale, and needleshaped elements at fine scales. These elements have many useful geometric multiscale features that set them apart from classical multiscale representations such as wavelets. For instance, curvelets obey a parabolic scaling relation which says that at scale 2−j, each element has an envelope which is aligned along a ‘ridge ’ of length 2−j/2 and width 2−j. We prove that curvelets provide an essentially optimal representation of typical objects f which are C2 except for discontinuities along C2 curves. Such representations are nearly as sparse as if f were not singular and turn out to be far more sparse than the wavelet decomposition of the object. For instance, the nterm partial reconstruction f C n obtained by selecting the n largest terms in the curvelet series obeys ‖f − f C n ‖ 2 L2 ≤ C · n−2 · (log n) 3, n → ∞. This rate of convergence holds uniformly over a class of functions which are C 2 except for discontinuities along C 2 curves and is essentially optimal. In comparison, the squared error of nterm wavelet approximations only converges as n −1 as n → ∞, which is considerably worst than the optimal behavior.
Nonlinear solution of linear inverse problems by waveletvaguelette decomposition
, 1992
"... We describe the WaveletVaguelette Decomposition (WVD) of a linear inverse problem. It is a substitute for the singular value decomposition (SVD) of an inverse problem, and it exists for a class of special inverse problems of homogeneous type { such asnumerical di erentiation, inversion of Abeltype ..."
Abstract

Cited by 182 (12 self)
 Add to MetaCart
We describe the WaveletVaguelette Decomposition (WVD) of a linear inverse problem. It is a substitute for the singular value decomposition (SVD) of an inverse problem, and it exists for a class of special inverse problems of homogeneous type { such asnumerical di erentiation, inversion of Abeltype transforms, certain convolution transforms, and the Radon Transform. We propose to solve illposed linear inverse problems by nonlinearly \shrinking" the WVD coe cients of the noisy, indirect data. Our approach o ers signi cant advantages over traditional SVD inversion in the case of recovering spatially inhomogeneous objects. We suppose that observations are contaminated by white noise and that the object is an unknown element of a Besov space. We prove that nonlinear WVD shrinkage can be tuned to attain the minimax rate of convergence, for L 2 loss, over the entire Besov scale. The important case of Besov spaces Bp;q, p <2, which model spatial inhomogeneity, is included. In comparison, linear procedures { SVD included { cannot attain optimal rates of convergence over such classes in the case p<2. For example, our methods achieve faster rates of convergence, for objects known to lie in the Bump Algebra or in Bounded Variation, than any linear procedure.
Wavelet Threshold Estimators for Data With Correlated Noise
, 1994
"... Wavelet threshold estimators for data with stationary correlated noise are constructed by the following prescription. First, form the discrete wavelet transform of the data points. Next, apply a leveldependent soft threshold to the individual coefficients, allowing the thresholds to depend on the l ..."
Abstract

Cited by 182 (13 self)
 Add to MetaCart
Wavelet threshold estimators for data with stationary correlated noise are constructed by the following prescription. First, form the discrete wavelet transform of the data points. Next, apply a leveldependent soft threshold to the individual coefficients, allowing the thresholds to depend on the level in the wavelet transform. Finally, transform back to obtain the estimate in the original domain. The threshold used at level j is s j p 2 log n, where s j is the standard deviation of the coefficients at that level, and n is the overall sample size. The minimax properties of the estimators are investigated by considering a general problem in multivariate normal decision theory, concerned with the estimation of the mean vector of a general multivariate normal distribution subject to squared error loss. An ideal risk is obtained by the use of an `oracle' that provides the optimum diagonal projection estimate. This `benchmark' risk can be considered in its own right as a measure of the s...
Data compression and harmonic analysis
 IEEE Trans. Inform. Theory
, 1998
"... In this paper we review some recent interactions between harmonic analysis and data compression. The story goes back of course to Shannon’s R(D) theory... ..."
Abstract

Cited by 140 (24 self)
 Add to MetaCart
In this paper we review some recent interactions between harmonic analysis and data compression. The story goes back of course to Shannon’s R(D) theory...
Unconditional bases are optimal bases for data compression and for statistical estimation
 Applied and Computational Harmonic Analysis
, 1993
"... An orthogonal basis of L 2 which is also an unconditional basis of a functional space F is a kind of optimal basis for compressing, estimating, and recovering functions in F. Simple thresholding operations, applied in the unconditional basis, work essentially better for compressing, estimating, and ..."
Abstract

Cited by 140 (23 self)
 Add to MetaCart
An orthogonal basis of L 2 which is also an unconditional basis of a functional space F is a kind of optimal basis for compressing, estimating, and recovering functions in F. Simple thresholding operations, applied in the unconditional basis, work essentially better for compressing, estimating, and recovering than they do in any other orthogonal basis. In fact, simple thresholding in an unconditional basis works essentially better for recovery and estimation than other methods, period. (Performance is measured in an asymptotic minimax sense.) As an application, we formalize and prove Mallat's Heuristic, which says that wavelet bases are optimal for representing functions containing singularities, when there may be an arbitrary number of singularities, arbitrarily distributed.