Results 1–10 of 11
De-Noising by Soft-Thresholding
, 1992
Cited by 798 (13 self)
Abstract:
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function f on [0, 1] from noisy data d_i = f(t_i) + z_i, i = 0, ..., n-1, t_i = i/n, with z_i iid N(0, 1). The reconstruction f̂_n is defined in the wavelet domain by translating all the empirical wavelet coefficients of d towards 0 by an amount √(2 log(n))/√n. We prove two results about that estimator. [Smooth]: With high probability f̂_n is at least as smooth as f, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to f as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
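The thresholding rule in this abstract is simple enough to sketch directly. A minimal Python illustration, not the authors' full wavelet pipeline: it assumes the empirical wavelet coefficients are already given as a plain list, and applies the universal threshold √(2 log(n))/√n to each.

```python
import math

def soft_threshold(coeffs, n):
    """Shrink each empirical coefficient towards 0 by the universal
    threshold sqrt(2*log(n))/sqrt(n); coefficients whose magnitude
    falls below the threshold are set exactly to zero."""
    t = math.sqrt(2.0 * math.log(n)) / math.sqrt(n)
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

# Example: with n = 100 the threshold is about 0.303, so the small
# middle coefficient is killed and the other two are shrunk towards 0.
shrunk = soft_threshold([0.5, -0.2, 1.0], n=100)
```

Setting small coefficients exactly to zero is what makes the reconstruction no rougher than the truth with high probability: pure-noise coefficients rarely exceed the threshold.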
Wavelet shrinkage: asymptopia
 Journal of the Royal Statistical Society, Ser. B
, 1995
Cited by 239 (35 self)
Abstract:
Considerable effort has been directed recently to develop asymptotically minimax methods in problems of recovering infinite-dimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons: sometimes, similarity to known methods, sometimes, computational intractability, and sometimes, lack of spatial adaptivity. We discuss a method for curve estimation based on n noisy data; one translates the empirical wavelet coefficients towards the origin by an amount √(2 log(n))/√n. The method is different from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions (e.g. pointwise error, global error measured in L^p norms, pointwise and global error in estimation of derivatives) and for a wide range of smoothness classes, including standard Hölder classes, Sobolev classes, and Bounded Variation. This is a much broader near-optimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and information-based complexity.
Local Rademacher complexities
 Annals of Statistics
, 2002
Cited by 106 (18 self)
Abstract:
We propose new bounds on the error of learning algorithms in terms of a data-dependent notion of complexity. The estimates we establish give optimal rates and are based on a local and empirical version of Rademacher averages, in the sense that the Rademacher averages are computed from the data, on a subset of functions with small empirical error. We present some applications to classification and prediction with convex function classes, and with kernel classes in particular.
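For a finite class, the empirical Rademacher average in question can be estimated by plain Monte Carlo. A hedged sketch: the function name and the localization radius r are illustrative, not from the paper; restricting to functions with small empirical mean mimics the "subset of functions with small empirical error" that defines the local version.

```python
import random

def local_rademacher(values_by_f, r=float("inf"), n_trials=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher average
    E_sigma sup_f (1/n) sum_i sigma_i f(x_i) over a finite class,
    localized to functions whose empirical mean is at most r."""
    subset = [v for v in values_by_f if sum(v) / len(v) <= r]
    rng = random.Random(seed)
    n = len(subset[0])
    total = 0.0
    for _ in range(n_trials):
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        total += max(sum(s * x for s, x in zip(sigma, v)) / n
                     for v in subset)
    return total / n_trials
```

Shrinking r discards functions with large empirical error, so the localized average, and hence the resulting error bound, is smaller than the global one.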
Nonparametric time series prediction through adaptive model selection
 Machine Learning
, 2000
Cited by 28 (0 self)
Abstract:
We consider the problem of one-step-ahead prediction for time series generated by an underlying stationary stochastic process obeying the condition of absolute regularity, which describes the mixing nature of the process. We make use of recent results from the theory of empirical processes, and adapt the uniform convergence framework of Vapnik and Chervonenkis to the problem of time series prediction, obtaining finite-sample bounds. Furthermore, by allowing both the model complexity and memory size to be adaptively determined by the data, we derive nonparametric rates of convergence through an extension of the method of structural risk minimization suggested by Vapnik. All our results are derived for general L^p error measures, and apply to both exponentially and algebraically mixing processes.
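The adaptive choice of memory size can be illustrated with a toy predictor. In this sketch both the predictor (the mean of the last d observations) and the complexity penalty c·√(d/n) are stand-ins chosen for brevity; the paper's actual penalty comes from its finite-sample bounds, not from this formula.

```python
import math

def predict(history, d):
    # Toy one-step-ahead predictor: mean of the last d observations.
    window = history[-d:]
    return sum(window) / len(window)

def choose_memory(series, max_d, c=1.0):
    """Pick the memory size d minimizing empirical one-step squared
    error plus an illustrative complexity penalty c*sqrt(d/n), in the
    spirit of structural risk minimization."""
    best = None
    n = len(series)
    for d in range(1, max_d + 1):
        errs = [(series[t] - predict(series[:t], d)) ** 2
                for t in range(d, n)]
        risk = sum(errs) / len(errs) + c * math.sqrt(d / n)
        if best is None or risk < best[0]:
            best = (risk, d)
    return best[1]
```

On a constant series the penalty dominates and the smallest memory wins; on a period-2 series a two-observation window predicts best, so d = 2 is selected.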
Empirical minimization
 Probability Theory and Related Fields, 135(3):311–334
, 2003
Cited by 19 (7 self)
Abstract:
We investigate the behavior of the empirical minimization algorithm using various methods. We first analyze it by comparing the empirical (random) structure and the original one on the class, either in an additive sense, via the uniform law of large numbers, or in a multiplicative sense, using isomorphic coordinate projections. We then show that a direct analysis of the empirical minimization algorithm yields a significantly better bound, and that the estimates we obtain are essentially sharp. The method of proof we use is based on Talagrand’s concentration inequality for empirical processes.
Nonparametric Regression Estimation Using Penalized Least Squares
 IEEE TRANSACTIONS ON INFORMATION THEORY
, 2000
Cited by 8 (6 self)
Abstract:
We present multivariate penalized least squares regression estimates. We use Vapnik-Chervonenkis theory and bounds on the covering numbers to analyze convergence of the estimates. We show strong consistency of the truncated versions of the estimates without any conditions on the underlying distribution.
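A one-parameter caricature of a penalized least squares estimate, together with the truncation used for consistency results of this kind. The closed form below is for the toy model y ≈ a·x with a ridge penalty on a, an assumption made for brevity; the paper's estimates range over multivariate function classes.

```python
def penalized_slope(xs, ys, lam):
    """Closed-form minimizer of sum_i (y_i - a*x_i)^2 + lam*a^2:
    a = sum x_i*y_i / (sum x_i^2 + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def truncated_estimate(xs, ys, lam, beta):
    """Fit the penalized slope, then truncate predictions to
    [-beta, beta], mirroring the truncated versions of the estimates."""
    a = penalized_slope(xs, ys, lam)
    return lambda x: max(-beta, min(beta, a * x))
```

Increasing lam shrinks the fitted slope towards 0, while the truncation level beta keeps the estimate bounded regardless of how heavy-tailed the responses are.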
Nonparametric Regression with Additional Measurement Errors in the Dependent Variable
 Universitat Stuttgart
, 2003
Cited by 5 (2 self)
Abstract:
Estimation of a regression function from data consisting of an independent and identically distributed sample of the underlying distribution with additional measurement errors in the dependent variable is considered. The measurement errors are allowed to be dependent and to have nonzero mean. It is shown that the rate of convergence of least squares estimates applied to this data is similar to the rate of convergence of least squares estimates applied to an independent and identically distributed sample of the underlying distribution, as long as the measurement errors are small. As an application, estimation of the conditional variance from residuals is considered.
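The closing application, estimating the conditional variance from residuals, can be sketched in its simplest form. Here both regression steps are reduced to fitting a constant (the global mean), an assumption made purely for brevity; in general each step is a full least squares fit.

```python
def conditional_variance_from_residuals(data):
    """Two-step sketch: (1) fit the regression function by least
    squares (here just the mean of the Y_i, the simplest such fit);
    (2) average the squared residuals, i.e. fit a constant to them,
    giving a (constant) estimate of Var(Y | X)."""
    ys = [y for _, y in data]
    m = sum(ys) / len(ys)
    sq_res = [(y - m) ** 2 for y in ys]
    return sum(sq_res) / len(sq_res)
```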
Consistency for the Least Squares Estimator in Non-Parametric Regression.
, 1996
Cited by 2 (0 self)
Abstract:
We shall study the general regression model Y = g_0(X) + ε, where X and ε are independent. The available information about g_0 can be expressed by g_0 ∈ G for some class G. As an estimator of g_0 we choose the least squares estimator. We shall give necessary and sufficient conditions for consistency of this estimator in terms of (basically) geometric properties of G. Our main tool will be the theory of empirical processes.

1. Introduction. In this paper, we consider the following regression model: Y = g_0(X) + σε, where X is a random variable with values in ℝ^k and probability distribution P, and ε is a real-valued random variable with distribution K. We assume that ε and X are independent, and that Eε = 0, Eε² = 1. Thus σ² ≥ 0 is the variance of the error e = σ·ε. We also require that ∫ g_0² dP < ∞, i.e. g_0 ∈ L²(P). The regression function g_0 is unknown and to be estimated from independent copies (X_1, Y_1), ..., (X_n, Y_n) of (X, Y). Let G ...
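A minimal sketch of the least squares estimator over a class G, assuming for illustration that G is a finite set of candidate functions (the paper's classes are infinite; this shows only the empirical-risk argmin itself):

```python
def least_squares_over_G(data, G):
    """Empirical least squares over a finite candidate class G:
    returns argmin_g (1/n) sum_i (Y_i - g(X_i))^2, evaluated on the
    sample (X_1, Y_1), ..., (X_n, Y_n) given as (x, y) pairs."""
    def emp_risk(g):
        return sum((y - g(x)) ** 2 for x, y in data) / len(data)
    return min(G, key=emp_risk)
```

Whether this argmin converges to g_0 as n grows is exactly what the paper characterizes through geometric properties of G.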
Optimal sample-based estimates of the expectation of the empirical minimizer
, 2005
Cited by 2 (2 self)
Abstract:
We study sample-based estimates of the expectation of the function produced by the empirical minimization algorithm. We investigate the extent to which one can estimate the rate of convergence of the empirical minimizer in a data-dependent manner. We establish three main results. First, we provide an algorithm that upper bounds the expectation of the empirical minimizer in a completely data-dependent manner. This bound is based on a structural result in [3], which relates expectations to sample averages. Second, we show that these structural ...
Application of Structural Risk Minimization to Multivariate Smoothing Spline Regression Estimates
, 2000
Cited by 2 (1 self)
Abstract:
Estimation of regression functions from bounded, independent and identically distributed data is considered. Motivated by Vapnik's principle of structural risk minimization, a data-dependent choice of the smoothing parameter of multivariate smoothing spline estimates is proposed. The corresponding smoothing spline estimates automatically adapt to the unknown smoothness of the regression function, and their L² errors achieve the optimal rate of convergence up to a logarithmic factor. The result is valid without any regularity conditions on the distribution of the design.