## Adapting to unknown smoothness by aggregation of thresholded wavelet estimators (2006)

Citations: 4 (2 self)

### BibTeX

@MISC{Chesneau06adaptingto,

author = {Christophe Chesneau and Guillaume Lecué},

title = {Adapting to unknown smoothness by aggregation of thresholded wavelet estimators},

year = {2006}

}

### Abstract

We study the performance of an adaptive procedure based on a convex combination, with data-driven weights, of term-by-term thresholded wavelet estimators. For the bounded regression model with random uniform design and for the nonparametric density model, we show that the resulting estimator is optimal in the minimax sense over all Besov balls under the $L^2$ risk, without any logarithmic factor.
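The procedure summarized in the abstract can be illustrated with a minimal sketch for the bounded regression model with uniform design. Everything specific here is an assumption made for illustration and is not taken from the paper: the Haar basis, the half-and-half sample split, the threshold grid, the `temperature` parameter, and all function names. The first half of the sample builds one hard-thresholded estimator per threshold; the second half computes exponential weights from empirical squared risks; the final estimator is the resulting convex combination.

```python
import math
import random


def haar_psi(j, k, x):
    """Haar wavelet psi_{j,k}(x) = 2^{j/2} psi(2^j x - k) on [0, 1)."""
    t = (2 ** j) * x - k
    if 0.0 <= t < 0.5:
        return 2 ** (j / 2)
    if 0.5 <= t < 1.0:
        return -(2 ** (j / 2))
    return 0.0


def hard_threshold(value, u):
    """Keep a coefficient only if its magnitude reaches the threshold u."""
    return value if abs(value) >= u else 0.0


def fit_thresholded_estimator(xs, ys, threshold, max_level):
    """Term-by-term hard-thresholded Haar estimator (illustrative only)."""
    n = len(xs)
    alpha = sum(ys) / n  # scaling coefficient; the Haar phi equals 1 on [0, 1)
    coeffs = {}
    for j in range(max_level):
        for k in range(2 ** j):
            # Empirical wavelet coefficient under uniform design.
            beta = sum(y * haar_psi(j, k, x) for x, y in zip(xs, ys)) / n
            beta = hard_threshold(beta, threshold)
            if beta != 0.0:
                coeffs[(j, k)] = beta

    def estimator(x):
        return alpha + sum(b * haar_psi(j, k, x) for (j, k), b in coeffs.items())

    return estimator


def exponential_weights(estimators, xs, ys, temperature=1.0):
    """Weights proportional to exp(-m * empirical risk / temperature)."""
    m = len(xs)
    risks = [sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / m for f in estimators]
    best = min(risks)  # subtract the minimum for numerical stability
    raw = [math.exp(-m * (r - best) / temperature) for r in risks]
    total = sum(raw)
    return [w / total for w in raw]


def aggregate(estimators, weights):
    """Convex combination of the candidate estimators."""
    def combined(x):
        return sum(w * g(x) for w, g in zip(weights, estimators))
    return combined


if __name__ == "__main__":
    random.seed(0)
    n = 400
    xs = [random.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + random.uniform(-0.2, 0.2) for x in xs]
    half = n // 2
    thresholds = [0.0, 0.05, 0.1, 0.2, 0.4]  # arbitrary illustrative grid
    candidates = [
        fit_thresholded_estimator(xs[:half], ys[:half], u, max_level=4)
        for u in thresholds
    ]
    weights = exponential_weights(candidates, xs[half:], ys[half:])
    f_hat = aggregate(candidates, weights)
    print("weights:", [round(w, 3) for w in weights])
    print("f_hat(0.25):", f_hat(0.25))
```

Because the weights are non-negative and sum to one, no minimization over thresholds is needed: the data decide how much each candidate contributes.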

### Citations

8981 | Statistical learning theory
- Vapnik
- 1998
Citation Context ...he exact oracle inequality of Section 2 is given in a general framework. Two aggregation procedures satisfy this oracle inequality. The well-known ERM (for Empirical Risk Minimization) procedure (cf. [51], [38] and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme stud...

678 | Adapting to unknown smoothness via wavelet shrinkage
- Donoho, Johnstone
- 1995
Citation Context ...y adaptive and (near) optimal over a wide range of function classes. Standard approaches are based on term-by-term thresholds. A well-known example is the hard thresholded estimator introduced by [21]. If we observe n statistical data and if the unknown function f has an expansion of the form f = ∑ j associated wavelet coefficients, then the term-by-term wavelet thresholded method consists in thre...

550 | Ideal spatial adaptation by wavelet shrinkage, Biometrika
- Donoho, Johnstone
- 1994
Citation Context ...denotes a large enough constant. In the literature, several techniques have been proposed to determine the 'best' adaptive threshold. There are, for instance, the RiskShrink and SureShrink methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of t...

246 | Aggregating strategies
- Vovk
- 1990
Citation Context ...recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have been widely studied (cf. e.g. [52] and [15]). A recent result of [42] shows that the ERM procedure is suboptimal for strictly convex losses (which is the case for density and regression estimation when the integrated squared risk is u...

238 | Ondelettes et opérateurs
- Meyer
- 1990
Citation Context ...elet series $f^*(x) = \sum_{k=0}^{2^l-1} \alpha_{l,k}\phi_{l,k}(x) + \sum_{j=l}^{\infty} \sum_{k=0}^{2^j-1} \beta_{j,k}\psi_{j,k}(x)$, where $\alpha_{j,k} = \int_0^1 f^*(x)\phi_{j,k}(x)\,dx$ and $\beta_{j,k} = \int_0^1 f^*(x)\psi_{j,k}(x)\,dx$. Further details on wavelet theory can be found in [44] and [18]. Now, let us define the main function spaces of the study. Let $M \in (0, \infty)$, $s \in (0, N)$, $p \in [1, \infty)$ and $q \in [1, \infty)$. Let us set $\beta_{\tau-1,k} = \alpha_{\tau,k}$. We say that a function $f^*$ belongs to the Besov ba...

201 | Wavelet thresholding via a Bayesian approach
- Abramovich, Sapatinas, et al.
- 1998
Citation Context ...s (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. In the present paper, we propose to study the performance of an adaptive wavelet estimator based on a convex combination of $\hat f_\lambda$'s. In the f...

173 | Adaptive Bayesian wavelet shrinkage
- Chipman, Kolaczyk, et al.
- 1997
Citation Context ...nk methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. In the present paper, we propose to study the performance of an adaptive wavelet estimator based on a convex combination of $\hat f_\lambda$'s. ...

151 | Optimal aggregation of classifiers in statistical learning
- Tsybakov
- 2004
Citation Context ...1] and [40]. Now, we introduce an assumption which improves the quality of estimation in our framework. This assumption was first introduced by [43], for the problem of discriminant analysis, and [50], for the classification problem. With this assumption, parametric rates of convergence can be achieved, for instance, in the classification problem (cf. [50], [48]). Margin Assumption (MA): The probab...

140 | Density estimation by wavelet thresholding
- Donoho, Johnstone, et al.
- 1996
Citation Context ...ence over Besov balls under the $L^2([0, 1])$ risk for the density model can be found in [19] and [29]. For further details about density estimation via adaptive wavelet thresholded estimators, see [23], [19] and [47]. See also [30] for a practical study. 4.2 Bounded regression. In the framework of the bounded regression model with uniform random design, Theorem 4 below investigates the rate of conve...

120 | Gaussian model selection
- Birgé, Massart
Citation Context ...tor has minimax performance similar to that of the empirical Bayes wavelet methods (see [55] and [32]) and several term-by-term wavelet thresholded estimators defined with a random threshold (see [33] and [7]). Finally, it is important to mention that the multi-thresholding estimator does not need any minimization step and is relatively easy to implement. 5 Proofs. Proof of Theorem 1. We recall the notatio...

113 | Statistical behavior and consistency of classification methods based on convex risk minimization
- Zhang
- 2003
Citation Context ... is given by $Q((x, y), f) = \phi(y f(x))$ for any $(x, y) \in X \times \{-1, 1\}$. Most of the time a minimizer $f^*$ of the $\phi$-risk $A$ over $F$ or its sign is equal to the Bayes rule $f^*(x) = \mathrm{Sign}(2\eta(x) - 1)$, $\forall x \in X$ (cf. [56]). In this paper we obtain an oracle inequality in the general framework described at the beginning of this Subsection. Then, we use it in the density estimation and the bounded regression framework...

109 | Adapting to unknown sparsity by controlling the false discovery rate
- Abramovich, Benjamini, et al.
- 2006

101 | Smooth Discriminant Analysis
- Mammen, Tsybakov
- 1999
Citation Context ...equality in the classification setup, we refer to [41] and [40]. Now, we introduce an assumption which improves the quality of estimation in our framework. This assumption was first introduced by [43], for the problem of discriminant analysis, and [50], for the classification problem. With this assumption, parametric rates of convergence can be achieved, for instance, in the classification problem...

98 | Adaptive wavelet estimation: a block thresholding and oracle inequality approach, The Annals of Statistics
- Cai
- 1999
Citation Context ...he result obtained, for instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 an...

86 | Empirical Bayes selection of wavelet thresholds
- Johnstone, Silverman
- 2005
Citation Context ...n the optimal rate of convergence without any extra logarithmic factor. In fact, the multi-thresholding estimator has minimax performance similar to that of the empirical Bayes wavelet methods (see [55] and [32]) and several term-by-term wavelet thresholded estimators defined with a random threshold (see [33] and [7]). Finally, it is important to mention that the multi-thresholding estimator does not need an...

81 | Statistical learning theory and stochastic optimization, ser
- Catoni
Citation Context ...and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have been widely studied (cf. e.g. [52] and [15]). A recent result of [42] shows th...

75 | Local Rademacher complexities and oracle inequalities in risk minimization
- Koltchinskii
Citation Context ...ct oracle inequality of Section 2 is given in a general framework. Two aggregation procedures satisfy this oracle inequality. The well-known ERM (for Empirical Risk Minimization) procedure (cf. [51], [38] and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by...

72 | Wavelet estimators in nonparametric regression: a comparative simulation study
- Antoniadis, Bigot, et al.
- 2001
Citation Context ... [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. In the present paper, we propose to study the performance of an adaptive wavelet estimator based on a convex combination of $\hat f_\lambda$'s. In the framework of nonparametric density estimation and bounded ...

53 | Incorporating information on neighboring coefficients into wavelet estimation. Web page available at www.stat.purdue.edu/~tcai/neighblock.html
- Cai, Silverman
- 1999
Citation Context ...he result obtained, for instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 an...

52 | Wavelet shrinkage denoising using the non-negative garrote
- Gao
- 1998
Citation Context ...d${}_u(x) = x\,\mathbb{1}_{\{|x| \geq u\}}$, the soft thresholding rule $\Upsilon^{\mathrm{soft}}_u(x) = \mathrm{sign}(x)(|x| - u)\,\mathbb{1}_{\{|x| \geq u\}}$ (see [21], [22] and [19]) and the non-negative garrote thresholding rule $\Upsilon^{\mathrm{NG}}_u(x) = (x - u^2/x)\,\mathbb{1}_{\{|x| \geq u\}}$ (see [26]). If we consider the minimax point of view over Besov balls under the integrated squared risk, then [19] makes the conditions on $\hat\alpha_{j,k}$, $\hat\beta_{j,k}$ and the threshold λ such that the estimator $\hat f_\lambda(D_n, \cdot)$ ...

52 | Fast rates for support vector machines using Gaussian kernels. The Annals of Statistics
- Scovel, Steinwart
- 2007
Citation Context ...oblem of discriminant analysis, and [50], for the classification problem. With this assumption, parametric rates of convergence can be achieved, for instance, in the classification problem (cf. [50], [48]). Margin Assumption (MA): The probability measure $\pi$ satisfies the margin assumption MA($\kappa$, $c$, $F_0$), where $\kappa \geq 1$, $c > 0$ and $F_0$ is a subset of $F$ if $\mathbb{E}[(Q(Z, f) - Q(Z, f^*))^2] \leq c\,(A(f) - A^*)^{1/\kappa}$, for ...

49 | Adaptive thresholding of wavelet coefficients
- Abramovich, Benjamini
- 1996
Citation Context ...' adaptive threshold. There are, for instance, the RiskShrink and SureShrink methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. In the present paper, we propose to study the performance...

43 | Block threshold rules for curve estimation using kernel and wavelet methods
- Hall, Kerkyacharian, et al.
- 1998
Citation Context ...ained, for instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 and 4. This ...

39 | On the minimax optimality of block thresholded wavelet estimators
- Hall, Kerkyacharian, et al.
- 1999
Citation Context ...ained, for instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 and 4. This ...

39 | Mixing strategies for density estimation
- Yang
- 2000
Citation Context ...ferences therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have been widely studied (cf. e.g. [52] and [15]). A recent result of [42] shows that the...

37 | Information theory and mixing least-squares regressions
- Leung, Barron
- 2006
Citation Context ...ality. The well-known ERM (for Empirical Risk Minimization) procedure (cf. [51], [38] and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weig...

36 | On Minimax Wavelet Estimators
- Delyon, Juditsky
- 1996
Citation Context ...'best' threshold λ. In particular, we show that this estimator is optimal, in the minimax sense, over all Besov balls under the $L^2$ risk. The proof is based on a non-adaptive minimax result proved by [19] and a powerful oracle inequality satisfied by aggregation methods. There are two steps in our approach. A first step, called the training step, where non-adaptive thresholded wavelet estimators ar...

34 | Model selection via testing: an alternative to (penalized) maximum likelihood estimators. Preprint n.862, Laboratoire de Probabilités et Modèles Aléatoires, Universités Paris 6 and Paris 7 (available at http://www.proba.jussieu.fr/mathdoc/preprints
- Birgé
- 2003
Citation Context ...2 Aggregation Procedures. Let's work with the notations introduced in the beginning of the previous Subsection. The aggregation framework considered, among others, by [34], [54], [13], [46], [49], [5], [6] is the following: take $F_0$ a finite subset of $F$, our aim is to mimic (up to an additive residual) the best function in $F_0$ w.r.t. the risk $A$. For this, we consider two aggregation procedures. by The ...

32 | Functional aggregation for nonparametric estimation
- Juditsky, Nemirovski
Citation Context ...sumption with parameter κ = 1). 2.2 Aggregation Procedures. Let's work with the notations introduced in the beginning of the previous Subsection. The aggregation framework considered, among others, by [34], [54], [13], [46], [49], [5], [6] is the following: take $F_0$ a finite subset of $F$, our aim is to mimic (up to an additive residual) the best function in $F_0$ w.r.t. the risk $A$. For this, we consider tw...

26 | Wavelet estimators: Adapting to unknown smoothness. Math
- Juditsky
- 1997
Citation Context ...nstance, the RiskShrink and SureShrink methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. In the present paper, we propose to study the performance of an adaptive wavelet estimator based...

23 | Choice of the threshold parameter in wavelet function estimation
- Nason
- 1995
Citation Context ...ature, several techniques have been proposed to determine the 'best' adaptive threshold. There are, for instance, the RiskShrink and SureShrink methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and...

22 | Wavelet shrinkage: Asymptopia?
- Donoho, Johnstone, et al.
- 1995
Citation Context ...$2^{-1}u$}), (11) for any x ∈ R and y ∈ R. The inequality (11) holds for the hard thresholding rule $\Upsilon^{\mathrm{hard}}_u(x) = x\,\mathbb{1}_{\{|x| \geq u\}}$, the soft thresholding rule $\Upsilon^{\mathrm{soft}}_u(x) = \mathrm{sign}(x)(|x| - u)\,\mathbb{1}_{\{|x| \geq u\}}$ (see [21], [22] and [19]) and the non-negative garrote thresholding rule $\Upsilon^{\mathrm{NG}}_u(x) = (x - u^2/x)\,\mathbb{1}_{\{|x| \geq u\}}$ (see [26]). If we consider the minimax point of view over Besov balls under the integrated squared risk, the...

19 | Penalized blockwise Stein’s method, monotone oracles and sharp adaptive estimation
- Cavalier, Tsybakov
Citation Context ...resholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 and 4. This is because, in contrast to those works, we obtain the optimal rate of convergence witho...

18 | General empirical Bayes wavelet methods and exactly adaptive minimax estimation
- Zhang
- 2005
Citation Context ... we obtain the optimal rate of convergence without any extra logarithmic factor. In fact, the multi-thresholding estimator has minimax performance similar to that of the empirical Bayes wavelet methods (see [55] and [32]) and several term-by-term wavelet thresholded estimators defined with a random threshold (see [33] and [7]). Finally, it is important to mention that the multi-thresholding estimator does no...

16 | Adaptive confidence interval for pointwise curve estimation
- Picard, Tribouley
- 2000
Citation Context ... balls under the $L^2([0, 1])$ risk for the density model can be found in [19] and [29]. For further details about density estimation via adaptive wavelet thresholded estimators, see [23], [19] and [47]. See also [30] for a practical study. 4.2 Bounded regression. In the framework of the bounded regression model with uniform random design, Theorem 4 below investigates the rate of convergence achieved...

16 | Wavelet Shrinkage and Generalized Cross Validation for Image Denoising
- Weyrich, Warhola
- 1998
Citation Context ... several techniques have been proposed to determine the 'best' adaptive threshold. There are, for instance, the RiskShrink and SureShrink methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. ...

14 | Simultaneous adaptation to the margin and to complexity in classification
- Lecué
- 2007
Citation Context ... well-known ERM (for Empirical Risk Minimization) procedure (cf. [51], [38] and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have be...

13 | Quasi-linear wavelet estimation
- Efromovich
- 1999
Citation Context ... instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 and 4. This is because...

12 | Block thresholding for density estimation: Local and global adaptivity
- Chicken, Cai
- 2005
Citation Context ...ard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 and 4. This is because, on the differ...

12 | $L^p$ adaptive density estimation
- Kerkyacharian, Picard, et al.
- 1996
Citation Context ...mators developed in the literature. To the authors' knowledge, the result obtained, for instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one ...

11 | Suboptimality of penalized empirical risk minimization in classification
- Lecué
- 2007
Citation Context ...udied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have been widely studied (cf. e.g. [52] and [15]). A recent result of [42] shows that the ERM procedure is suboptimal for strictly convex losses (which is the case for density and regression estimation when the integrated squared risk is used). Thus, in our case it is bette...

11 | Optimal rates of aggregation. Computational Learning Theory and Kernel Machines. B. Schölkopf and M. Warmuth, eds
- Tsybakov
- 2003
Citation Context ... κ = 1). 2.2 Aggregation Procedures. Let's work with the notations introduced in the beginning of the previous Subsection. The aggregation framework considered, among others, by [34], [54], [13], [46], [49], [5], [6] is the following: take $F_0$ a finite subset of $F$, our aim is to mimic (up to an additive residual) the best function in $F_0$ w.r.t. the risk $A$. For this, we consider two aggregation procedure...

10 | Recursive aggregation of estimators via the Mirror Descent Algorithm with averaging. Problems of Information Transmission
- Juditsky, Nazin, et al.
- 2005
Citation Context ...es therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have been widely studied (cf. e.g. [52] and [15]). A recent result of [42] shows that the ERM p...

6 | Some new methods for wavelet density estimation
- Herrick, Nason, et al.
- 2001
Citation Context ...e $L^2([0, 1])$ risk for the density model can be found in [19] and [29]. For further details about density estimation via adaptive wavelet thresholded estimators, see [23], [19] and [47]. See also [30] for a practical study. 4.2 Bounded regression. In the framework of the bounded regression model with uniform random design, Theorem 4 below investigates the rate of convergence achieved by the multi-t...

6 | Noise reduction by wavelet thresholding, volume 161
- Jansen
Citation Context ...techniques have been proposed to determine the 'best' adaptive threshold. There are, for instance, the RiskShrink and SureShrink methods (see [20, 21]), the cross-validation methods (see [45], [53] and [31]), the methods based on hypothesis tests (see [1] and [2]), the Lepski methods (see [33]) and the Bayesian methods (see [17] and [3]). Most of them are described in detail in [45] and [4]. In the pr...

6 | Optimal oracle inequality for aggregation of classifiers under low noise condition
- Lecué
Citation Context ...known ERM (for Empirical Risk Minimization) procedure (cf. [51], [38] and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights have been wid...

5 | On adaptivity of blockshrink wavelet estimator over Besov spaces
- Cai
- 1997
Citation Context ...he result obtained, for instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 an...

4 | Sharp linear and block shrinkage wavelet estimation
- Efromovich
- 2000
Citation Context ... instance, by the hard thresholded estimator (see [21]), by the global wavelet block thresholded estimator (see [37]), by the localized wavelet block thresholded estimator (see [9, 12, 10], [28, 27], [24, 25], [16] and [11]) and, in particular, the penalized Blockwise Stein method (see [14]) is worse than the one obtained by the multi-thresholding estimator and stated in Theorems 3 and 4. This is because...

3 | Online prediction algorithms for aggregation of arbitrary estimators of a conditional mean
- Bunea, Nobel
- 2005
Citation Context .... The well-known ERM (for Empirical Risk Minimization) procedure (cf. [51], [38] and references therein) and an exponential weighting aggregation scheme, which has been studied, among others, by [5], [8], [40], [41] and [39]. There is a recursive version of this scheme studied by [13], [54], [35] and [36]. In the sequential prediction problem, weighted average predictions with exponential weights h...

3 | Topics in Non-parametric Statistics, volume 1738 of Ecole d’été de Probabilités de Saint-Flour
- Nemirovski
- 2000
Citation Context ...ameter κ = 1). 2.2 Aggregation Procedures. Let's work with the notations introduced in the beginning of the previous Subsection. The aggregation framework considered, among others, by [34], [54], [13], [46], [49], [5], [6] is the following: take $F_0$ a finite subset of $F$, our aim is to mimic (up to an additive residual) the best function in $F_0$ w.r.t. the risk $A$. For this, we consider two aggregation pro...