Results 1 - 10 of 16
De-Noising by Soft-Thresholding
, 1992
Cited by 804 (13 self)
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function $f$ on $[0, 1]$ from noisy data $d_i = f(t_i) + z_i$, $i = 0, \dots, n-1$, where $t_i = i/n$ and the $z_i$ are iid $N(0, 1)$. The reconstruction $\hat{f}_n$ is defined in the wavelet domain by translating all the empirical wavelet coefficients of $d$ towards 0 by an amount $\sqrt{2 \log(n)}/\sqrt{n}$. We prove two results about that estimator. [Smooth]: With high probability, $\hat{f}_n$ is at least as smooth as $f$, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to $f$ as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
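As an illustration of the shrinkage rule in this abstract, here is a minimal Python sketch; it is not the authors' code. It assumes the empirical wavelet coefficients have already been computed by some orthonormal wavelet transform (not shown) and are normalized as in the abstract, with unit noise variance in the data. The function names `soft_threshold` and `universal_threshold` are hypothetical.

```python
import math


def universal_threshold(n):
    """Return sqrt(2 log n) / sqrt(n), the shrinkage amount quoted in the abstract.

    Assumes unit noise variance in the data, as in the abstract's model.
    """
    return math.sqrt(2.0 * math.log(n)) / math.sqrt(n)


def soft_threshold(coeffs, t):
    """Translate every coefficient towards 0 by t.

    Coefficients whose magnitude is at most t are set to exactly 0.
    """
    shrunk = []
    for c in coeffs:
        if c > t:
            shrunk.append(c - t)
        elif c < -t:
            shrunk.append(c + t)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical coefficients, for illustration only.
example = soft_threshold([3.0, -2.0, 0.5, -0.2], 1.0)
```

Applying the inverse wavelet transform to the shrunken coefficients would then give the reconstruction; that step depends on the wavelet basis chosen and is left out here.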
Adapting to unknown smoothness via wavelet shrinkage
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1995
Cited by 679 (19 self)
We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein Unbiased Estimate of Risk (Sure) for threshold estimates. The computational effort of the overall procedure is of order $N \log(N)$ as a function of the sample size $N$. SureShrink is smoothness-adaptive: if the unknown function contains jumps, the reconstruction (essentially) does also; if the unknown function has a smooth piece, the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothness-adaptive: it is near-minimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods (kernels, splines, and orthogonal series estimates), even with optimal choices of the smoothing parameter, would be unable to perform
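The level-wise threshold choice described in this abstract can be sketched as follows. This is a simplified illustration, not the SureShrink procedure itself. It assumes the coefficients at one resolution level have been rescaled to unit noise variance, and it uses the standard expression for Stein's unbiased risk estimate of the soft-threshold estimator. It searches candidates by brute force (quadratic time) rather than with the order $N \log(N)$ computation the abstract refers to. The names `sure_soft` and `sure_threshold` are hypothetical.

```python
def sure_soft(x, t):
    """Stein's unbiased risk estimate for soft thresholding at level t.

    Assumes the entries of x have unit noise variance:
    SURE(t; x) = n - 2 * #{i : |x_i| <= t} + sum_i min(|x_i|, t)^2.
    """
    n = len(x)
    n_small = sum(1 for v in x if abs(v) <= t)
    return n - 2 * n_small + sum(min(abs(v), t) ** 2 for v in x)


def sure_threshold(x):
    """Pick the candidate threshold that minimizes the SURE criterion.

    Candidates are 0 and the observed magnitudes |x_i|; this is a
    brute-force search, kept simple for readability.
    """
    candidates = [0.0] + sorted(abs(v) for v in x)
    return min(candidates, key=lambda t: sure_soft(x, t))


# Hypothetical coefficients at one resolution level: three small, one large.
level = [0.1, -0.2, 5.0, 0.3]
chosen = sure_threshold(level)
```

In this toy example the criterion selects a threshold just large enough to remove the three small coefficients while leaving the large one almost intact.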
Minimax Estimation via Wavelet Shrinkage
, 1992
Cited by 246 (32 self)
We attempt to recover an unknown function from noisy, sampled data. Using orthonormal bases of compactly supported wavelets, we develop a nonlinear method which works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coefficients. The shrinkage can be tuned to be nearly minimax over any member of a wide range of Triebel- and Besov-type smoothness constraints, and asymptotically minimax over Besov bodies with $p \le q$. Linear estimates cannot achieve even the minimax rates over Triebel and Besov classes with $p < 2$, so our method can significantly outperform every linear method (kernel, smoothing spline, sieve, ...) in a minimax sense. Variants of our method based on simple threshold nonlinearities are nearly minimax. Our method possesses the interpretation of spatial adaptivity: it reconstructs using a kernel which may vary in shape and bandwidth from point to point, depending on the data. Least favorable distributions for certain of the Triebel and Besov scales generate objects with sparse wavelet transforms. Many real objects have similarly sparse transforms, which suggests that these minimax results are relevant for practical problems. Sequels to this paper discuss practical implementation, spatial adaptation properties and applications to inverse problems.
Wavelet shrinkage: asymptopia
 Journal of the Royal Statistical Society, Ser. B
, 1995
Cited by 239 (35 self)
Considerable effort has been directed recently to develop asymptotically minimax methods in problems of recovering infinite-dimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons: sometimes similarity to known methods, sometimes computational intractability, and sometimes lack of spatial adaptivity. We discuss a method for curve estimation based on $n$ noisy data; one translates the empirical wavelet coefficients towards the origin by an amount $\sqrt{2 \log(n)}/\sqrt{n}$. The method is different from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions (e.g. pointwise error, global error measured in $L^p$ norms, pointwise and global error in estimation of derivatives) and for a wide range of smoothness classes, including standard Hölder classes, Sobolev classes, and Bounded Variation. This is a much broader near-optimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and information-based complexity.
Density estimation by wavelet thresholding
 Ann. Statist
, 1996
Cited by 140 (8 self)
Density estimation is a commonly used test case for nonparametric estimation methods. We explore the asymptotic properties of estimators based on thresholding of empirical wavelet coefficients. Minimax rates of convergence are studied over a large range of Besov function classes $B_{s,p,q}$ and for a range of global $L_{p'}$ error measures, $1 \le p' < \infty$. A single wavelet threshold estimator is asymptotically minimax within logarithmic terms simultaneously over a range of spaces and error measures. In particular, when $p' > p$, some form of nonlinearity is essential, since the minimax linear estimators are suboptimal by polynomial powers of $n$. A second approach, using an approximation of a Gaussian white noise model in a Mallows metric, is used to attain exactly optimal rates of convergence for quadratic error ($p' = 2$).
Ideal denoising in an orthonormal basis chosen from a library of bases
 Comptes Rendus Acad. Sci., Ser. I
, 1994
Information-Theoretic Determination of Minimax Rates of Convergence
 Ann. Stat
, 1997
Cited by 93 (18 self)
In this paper, we present some general results determining minimax bounds on statistical risk for density estimation based on certain information-theoretic considerations. These bounds depend only on metric entropy conditions and are used to identify the minimax rates of convergence.
Adaptive model selection using empirical complexities
 Annals of Statistics
, 1999
Cited by 37 (8 self)
Key words and phrases. Complexity regularization, classification, pattern recognition, regression estimation, curve fitting, minimum description length. Given $n$ independent replicates of a jointly distributed pair $(X, Y) \in \mathbb{R}^d \times \mathbb{R}$, we wish to select from a fixed sequence of model classes $\mathcal{F}_1, \mathcal{F}_2, \dots$ a deterministic prediction rule $f: \mathbb{R}^d \to \mathbb{R}$ whose risk is small. We investigate the possibility of empirically assessing the complexity of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error.
Minimax Bayes, asymptotic minimax and sparse wavelet priors, in
 Sciences Paris (A
, 1994
Cited by 35 (9 self)
Pinsker (1980) gave a precise asymptotic evaluation of the minimax mean squared error of estimation of a signal in Gaussian noise when the signal is known a priori to lie in a compact ellipsoid in Hilbert space. This 'Minimax Bayes' method can be applied to a variety of global nonparametric estimation settings with parameter spaces far from ellipsoidal. For example, it leads to a theory of exact asymptotic minimax estimation over norm balls in Besov and Triebel spaces using simple coordinatewise estimators and wavelet bases. This paper outlines some features of the method common to several applications. In particular, we derive new results on the exact asymptotic minimax risk over weak $\ell_p$ balls in $\mathbb{R}^n$ as $n \to \infty$, and also for a class of 'local' estimators on the Triebel scale. By its very nature, the method reveals the structure of asymptotically least favorable distributions. Thus we may simulate 'least favorable' sample paths. We illustrate this for estimation of a signal in Gaussian white noise over norm balls in certain Besov spaces. In wavelet bases, when $p < 2$, the least favorable priors are sparse, and the resulting sample paths are strikingly different from those observed in Pinsker's ellipsoidal setting ($p = 2$).
Fast learning rates for plug-in classifiers
 Ann. Statist
, 2007
Cited by 26 (3 self)
It has been recently shown that, under the margin (or low noise) assumption, there exist classifiers attaining fast rates of convergence of the excess Bayes risk, that is, rates faster than $n^{-1/2}$. The work on this subject has suggested the following two conjectures: (i) the best achievable fast rate is of the order $n^{-1}$, and (ii) the plug-in classifiers generally converge more slowly than the classifiers based on empirical risk minimization. We show that both conjectures are not correct. In particular, we construct plug-in classifiers that can achieve not only fast, but also super-fast rates, that is, rates faster than $n^{-1}$. We establish minimax lower bounds showing that the obtained rates cannot be improved.