Results 1–10 of 18
De-Noising by Soft-Thresholding
, 1992
Abstract

Cited by 798 (13 self)
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function f on [0, 1] from noisy data d_i = f(t_i) + z_i, i = 0, …, n − 1, t_i = i/n, with z_i iid N(0, 1). The reconstruction f̂_n is defined in the wavelet domain by translating all the empirical wavelet coefficients of d towards 0 by an amount √(2 log n)/√n. We prove two results about that estimator. [Smooth]: With high probability, f̂_n is at least as smooth as f, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to f as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
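The shrinkage step itself is simple. A minimal sketch of soft thresholding at the universal level √(2 log n)/√n, assuming the wavelet transform has already produced the coefficients (the transform itself is not shown, and the function name is illustrative):

```python
import math

def soft_threshold(coeffs, n):
    """Shrink empirical wavelet coefficients toward 0 by the universal
    threshold t = sqrt(2 log n) / sqrt(n). Coefficients smaller than t in
    magnitude are set to 0; larger ones are moved toward 0 by t."""
    t = math.sqrt(2.0 * math.log(n)) / math.sqrt(n)
    out = []
    for c in coeffs:
        if abs(c) <= t:
            out.append(0.0)
        else:
            out.append(c - t if c > 0 else c + t)
    return out
```

Coefficients near zero (mostly noise) are killed outright, which is why the reconstruction inherits the smoothness of f with high probability.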
Adapting to unknown smoothness via wavelet shrinkage
 JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION
, 1995
Abstract

Cited by 675 (19 self)
We attempt to recover a function of unknown smoothness from noisy, sampled data. We introduce a procedure, SureShrink, which suppresses noise by thresholding the empirical wavelet coefficients. The thresholding is adaptive: a threshold level is assigned to each dyadic resolution level by the principle of minimizing the Stein Unbiased Estimate of Risk (Sure) for threshold estimates. The computational effort of the overall procedure is order N log(N) as a function of the sample size N. SureShrink is smoothness-adaptive: if the unknown function contains jumps, the reconstruction (essentially) does also; if the unknown function has a smooth piece, the reconstruction is (essentially) as smooth as the mother wavelet will allow. The procedure is in a sense optimally smoothness-adaptive: it is near-minimax simultaneously over a whole interval of the Besov scale; the size of this interval depends on the choice of mother wavelet. We know from a previous paper by the authors that traditional smoothing methods (kernels, splines, and orthogonal series estimates), even with optimal choices of the smoothing parameter, would be unable to perform ...
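The per-level threshold choice can be sketched directly from the SURE criterion. For unit-variance coefficients x at one resolution level, SURE(t) = n − 2·#{i : |x_i| ≤ t} + Σ min(x_i², t²), and it suffices to search over thresholds equal to the |x_i| themselves (a simplified sketch of the principle, not the paper's full hybrid procedure):

```python
def sure_threshold(x):
    """Pick the threshold minimizing Stein's Unbiased Risk Estimate for
    soft-thresholding unit-variance coefficients x at one dyadic level.
    Candidate thresholds are the absolute coefficient values plus 0."""
    n = len(x)

    def sure(t):
        return (n
                - 2 * sum(1 for xi in x if abs(xi) <= t)
                + sum(min(xi * xi, t * t) for xi in x))

    candidates = sorted(abs(xi) for xi in x) + [0.0]
    return min(candidates, key=sure)
```

Because SURE is piecewise quadratic between the sorted |x_i|, evaluating it only at those points finds the exact minimizer; a sorted-prefix-sum implementation brings the cost down to O(n log n), consistent with the overall N log(N) effort quoted above.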
Wavelet shrinkage using cross-validation
, 1996
Abstract

Cited by 77 (13 self)
Wavelets are orthonormal basis functions with special properties that show potential in many areas of mathematics and statistics. This article concentrates on the estimation of functions and images from noisy data using wavelet shrinkage. A modified form of two-fold cross-validation is introduced to choose a threshold for wavelet shrinkage estimators operating on data sets whose length is a power of two. The cross-validation algorithm is then extended to data sets of any length and to multidimensional data sets. The algorithms are compared to established threshold choosers using simulation. An application to a real data set arising from anaesthesia is presented.
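The core idea of two-fold cross-validation for threshold choice can be sketched as follows: denoise one interleaved half of the data, predict the held-out half from the fitted values, and swap roles. This is an illustrative simplification, not the article's exact algorithm; `estimate` stands in for "shrink with a candidate threshold and reconstruct":

```python
def twofold_cv_score(data, estimate):
    """Interleaved two-fold CV score for a denoising rule (sketch).
    `estimate` maps a list of noisy values to a denoised list of equal
    length. Lower scores indicate a better threshold choice."""
    odd = data[1::2]
    even = data[0::2]
    score = 0.0
    for held_out, kept in ((even, odd), (odd, even)):
        fit = estimate(kept)
        for i, y in enumerate(held_out):
            pred = fit[min(i, len(fit) - 1)]  # nearest fitted neighbour
            score += (y - pred) ** 2
    return score
```

One would evaluate this score over a grid of candidate thresholds and keep the minimizer; the interleaving keeps both halves at a power-of-two length, matching the restriction mentioned in the abstract.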
Optimal pointwise adaptive methods in nonparametric estimation
 ANN. STATIST
, 1997
Abstract

Cited by 38 (9 self)
The problem of optimal adaptive estimation of a function at a given point from noisy data is considered. Two procedures are proved to be asymptotically optimal for different settings. First we study the problem of bandwidth selection for nonparametric pointwise kernel estimation with a given kernel. We propose a bandwidth selection procedure and prove its optimality in the asymptotic sense. Moreover, this optimality holds not only among kernel estimators with a variable bandwidth: the resulting estimator is asymptotically optimal among all feasible estimators. The important feature of this procedure is that it is fully adaptive and it “works” for a very wide class of functions obeying a mild regularity restriction. With it the attainable accuracy of estimation depends on the function itself and is expressed in terms of the “ideal adaptive bandwidth” corresponding to this function and a given kernel. The second procedure can be considered as a specialization of the first one under the qualitative assumption that the function to be estimated belongs to some Hölder class Σ(β, L) with unknown parameters β, L. This assumption allows us to choose a family of kernels in an optimal way and the resulting procedure appears to be asymptotically optimal in the adaptive sense in any range of adaptation with β ≤ 2.
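The flavour of such a pointwise bandwidth selector can be sketched in the spirit of Lepski's method: scan increasing bandwidths and keep the largest one whose estimate remains statistically consistent with all smaller-bandwidth estimates. Everything here is an illustrative assumption, a box kernel (moving average) and a simple noise band, not the paper's exact procedure:

```python
import math

def lepski_bandwidth(y, i, sigma, bandwidths):
    """Pointwise bandwidth choice in the spirit of Lepski's method (sketch).
    `y` are noisy samples, `i` the target index, `sigma` the noise level,
    `bandwidths` an increasing list of half-window widths (box kernel)."""

    def est(h):  # moving-average estimate at index i with half-width h
        window = y[max(0, i - h): i + h + 1]
        return sum(window) / len(window)

    chosen = bandwidths[0]
    for h in bandwidths[1:]:
        # accept h only if its estimate agrees with every smaller bandwidth
        # up to the sum of the two estimates' noise standard deviations
        ok = all(abs(est(h) - est(g)) <=
                 2 * sigma * (1 / math.sqrt(2 * g + 1) + 1 / math.sqrt(2 * h + 1))
                 for g in bandwidths if g < h)
        if ok:
            chosen = h
        else:
            break
    return chosen
```

On a smooth stretch all estimates agree and the largest (lowest-variance) bandwidth wins; near a jump the agreement test fails early, so the bandwidth adapts to the local regularity of the function, which is exactly the "ideal adaptive bandwidth" behaviour described above.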
Universal smoothing factor selection in density estimation: theory and practice (with discussion
 Test
, 1997
Abstract

Cited by 23 (10 self)
In earlier work with Gabor Lugosi, we introduced a method to select a smoothing factor for kernel density estimation such that, for all densities in all dimensions, the L1 error of the corresponding kernel estimate is not larger than 3 + ε times the error of the estimate with the optimal smoothing factor, plus a constant times √(log n / n), where n is the sample size and the constant only depends on the complexity of the kernel used in the estimate. The result is nonasymptotic, that is, the bound is valid for each n. The estimate uses ideas from the minimum distance estimation work of Yatracos. We present a practical implementation of this estimate, report on some comparative results, and highlight some key properties of the new method.
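The Yatracos-class idea at the heart of the method can be sketched for the two-candidate case: form the set A = {x : f1(x) > f2(x)} and keep whichever candidate assigns A a mass closer to the empirical mass of A. This is a toy one-dimensional sketch with grid-based integration, not the paper's implementation; all helper names are illustrative:

```python
def select_by_yatracos(sample, f1, f2, grid):
    """Minimum-distance selection between two candidate densities (sketch).
    Integrals over A = {x : f1(x) > f2(x)} are approximated on a uniform
    grid; the empirical mass of A is the fraction of sample points that
    fall within half a grid step of some grid point in A."""
    dx = grid[1] - grid[0]
    A = [x for x in grid if f1(x) > f2(x)]
    mass1 = sum(f1(x) for x in A) * dx
    mass2 = sum(f2(x) for x in A) * dx
    emp = sum(1 for s in sample
              if any(abs(s - x) <= dx / 2 for x in A)) / len(sample)
    return f1 if abs(mass1 - emp) <= abs(mass2 - emp) else f2
```

With many candidates (one per smoothing factor), the selection is made against the worst-case set over all candidate pairs, which is what yields the 3 + ε multiplicative guarantee cited in the abstract.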
Consistency of cross validation for comparing regression procedures
 ANNALS OF STATISTICS (accepted paper)
"... Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finitedimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about consistenc ..."
Abstract

Cited by 15 (1 self)
Theoretical developments on cross validation (CV) have mainly focused on selecting one among a list of finite-dimensional models (e.g., subset or order selection in linear regression) or selecting a smoothing parameter (e.g., bandwidth for kernel smoothing). However, little is known about the consistency of cross validation when it is applied to compare parametric with nonparametric methods, or to compare different nonparametric methods. We show that under some conditions, with an appropriate choice of data splitting ratio, cross validation is consistent in the sense of selecting the better procedure with probability approaching 1. Our results reveal interesting behavior of cross validation. When comparing two models (procedures) converging at the same nonparametric rate, in contrast to the parametric case, it turns out that the proportion of data used for evaluation in CV does not need to be dominating in size. Furthermore, it can even be of a smaller order than the proportion for estimation while not affecting the consistency property.
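The procedure-comparison setup can be sketched as a voting form of CV with an explicit splitting ratio, the quantity whose choice the paper analyzes. This is an illustrative harness with hypothetical fitting interfaces, not the paper's estimator:

```python
import random

def cv_compare(data, fit_a, fit_b, eval_fraction, trials=20):
    """Compare two regression procedures by repeated data splitting (sketch).
    `data` is a list of (x, y) pairs; `fit_a`/`fit_b` take training pairs
    and return a predictor x -> yhat; `eval_fraction` is the share of data
    held out for evaluation. Returns the more frequent winner."""
    wins_a = 0
    for _ in range(trials):
        shuffled = data[:]
        random.shuffle(shuffled)
        n_eval = max(1, int(len(shuffled) * eval_fraction))
        held_out, train = shuffled[:n_eval], shuffled[n_eval:]
        pa, pb = fit_a(train), fit_b(train)
        err_a = sum((y - pa(x)) ** 2 for x, y in held_out)
        err_b = sum((y - pb(x)) ** 2 for x, y in held_out)
        if err_a <= err_b:
            wins_a += 1
    return "A" if wins_a > trials / 2 else "B"
```

The paper's point, in these terms, is that when both procedures converge at the same nonparametric rate, `eval_fraction` need not dominate and can even shrink faster than the training share while the selection still picks the better procedure with probability tending to 1.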
Progressive Modeling
, 2002
Abstract

Cited by 8 (4 self)
Presently, inductive learning is still performed in a frustrating batch process. The user has little interaction with the system and no control over the final accuracy and training time. If the accuracy of the produced model is too low, all the computing resources are misspent. In this paper, we propose a progressive modeling framework. In progressive modeling, the learning algorithm estimates online both the accuracy of the final model and the remaining training time. If the estimated accuracy is far below expectation, the user can terminate training prior to completion without wasting further resources. If the user chooses to complete the learning process, progressive modeling will compute a model with expected accuracy in expected time. We describe one implementation of progressive modeling using an ensemble of classifiers.
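The control loop described above can be sketched as follows. The interface (`train_one`, `evaluate`) and the naive accuracy projection are assumptions for illustration, not the paper's actual estimator:

```python
def progressive_train(batches, train_one, evaluate, target_accuracy):
    """Sketch of a progressive-modeling loop. After each data batch, train
    one more base classifier, record the ensemble's estimated accuracy,
    and stop early if the projected final accuracy falls short of the
    target. The projection here is deliberately naive: it assumes the
    current accuracy persists to completion."""
    ensemble = []
    history = []
    for batch in batches:
        ensemble.append(train_one(batch))
        history.append(evaluate(ensemble))
        projected = history[-1]
        if projected < target_accuracy:
            return ensemble, history, "terminated early"
    return ensemble, history, "completed"
```

The key design point is that accuracy and remaining-time estimates are available after every batch, so the user, not the batch process, decides whether further training is worth the resources.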
Model Indexing and Smoothing Parameter Selection in Nonparametric Function Estimation
Abstract

Cited by 6 (0 self)
Smoothing parameter selection is among the most intensively studied subjects in nonparametric function estimation. A closely related issue, that of identifying a proper index for the smoothing parameter, is however largely neglected in the existing literature. Through heuristic arguments and simple simulations, we shall illustrate that most current working indices are conceptually "incorrect", in the sense that they are not interpretable across replicates in repeated experiments, and as a consequence a few popular working concepts, such as the expected mean square error and the "degrees of freedom", appear vulnerable under close scrutiny. Due to technical constraints, the arguments are mainly developed in the penalized likelihood setting, but conceptual parallels can be drawn to other settings as well. In the light of our findings, simulations and discussion are also presented to compare the relative merits of the simple cross-validation method versus the more sophisticated plug-in met...
Bayesian fluorescence in situ hybridisation signal classification
, 2004
Abstract

Cited by 5 (0 self)
Research has indicated the significance of accurate classification of fluorescence in situ hybridisation (FISH) signals for the detection of genetic abnormalities. Based on well-discriminating features and a trainable neural network (NN) classifier, a previous system enabled highly accurate classification of valid signals and artefacts of two fluorophores. However, since this system employed several features that are considered independent, the naive Bayesian classifier (NBC) is suggested here as an alternative to the NN. The NBC independence assumption permits the decomposition of the high-dimensional likelihood of the model for the data into a product of one-dimensional probability densities. The naive independence assumption together with the Bayesian methodology allow the NBC to predict a posteriori probabilities of class membership using estimated class-conditional densities in a closed and simple form. Since the probability densities are the only parameters of the NBC, the misclassification rate of the model is determined exclusively by the quality of density estimation. Densities are evaluated by three methods: single Gaussian estimation (SGE; parametric method), Gaussian mixture model assuming spherical covariance matrices (GMM; semiparametric method) and kernel density estimation (KDE; nonparametric method). For low-dimensional densities, the GMM generally outperforms the KDE that tends to overfit the training set
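The NBC decomposition can be sketched with the simplest of the three density estimators, the single-Gaussian (SGE) variant; swapping in a GMM or KDE would only replace the per-feature density function. A minimal sketch with illustrative names, not the paper's system:

```python
import math

def gaussian_pdf(x, mean, var):
    """1-D normal density, the SGE class-conditional estimate."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def nbc_predict(x, class_data, priors):
    """Naive Bayes prediction for feature vector x (sketch).
    `class_data` maps label -> list of training feature vectors;
    features are treated as independent, so the likelihood is a
    product of one-dimensional densities fitted per feature."""
    posteriors = {}
    for label, rows in class_data.items():
        like = priors[label]
        for j in range(len(x)):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) or 1e-9
            like *= gaussian_pdf(x[j], mean, var)
        posteriors[label] = like
    return max(posteriors, key=posteriors.get)
```

Since the densities are the NBC's only parameters, replacing `gaussian_pdf` with a mixture or kernel estimate changes the misclassification rate without touching the decision rule, which is exactly the comparison the abstract sets up.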
On Combining Data from Multiple Sources with Unknown Relative Weights
, 1993
Abstract

Cited by 5 (0 self)
The problem of using the "direct" variational methods in a statistical model that merges data from different sources with unknown relative weights is considered. To carry out this merging optimally, it is necessary to provide an estimate of the relative weights to be given to data from different sources. A new form of generalized cross validation (GCV) estimate for simultaneously estimating the weighting parameters and the smoothing parameters is developed here. We name this estimate GCVr, where r represents the weighting parameter. We study the properties of the GCVr estimators as well as the properties of the generalized maximum likelihood (GMLr) estimators proposed in Wahba, Johnson and Reames (1990). We prove the weak consistency and the asymptotic normality of all these estimators under a stochastic model. The convergence rates for these estimators are obtained under some conditions. Some simulation studies are carried out both to confirm the theoretical results and to compare ...
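The ordinary GCV score at the core of such estimates can be sketched as V(λ) = n‖y − ŷ‖² / (n − tr A(λ))², where ŷ = A(λ)y is the fitted smooth and tr A(λ) the effective degrees of freedom; in the GCVr extension both the residual and the trace would additionally depend on the weighting parameter r. A sketch of the ordinary core only, with the influence-matrix trace supplied by the caller:

```python
def gcv_score(y, yhat, trace_A):
    """Generalized cross-validation score V = n * RSS / (n - tr A)^2.
    `y` are observations, `yhat` the smoothed fit A(lambda) y, and
    `trace_A` the trace of the influence matrix (effective df).
    Minimizing V over lambda (and, in GCVr, over r) selects the
    smoothing and weighting parameters simultaneously."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    return n * rss / (n - trace_A) ** 2
```

One would compute `yhat` and `trace_A` for each candidate (λ, r) pair and keep the minimizer; the consistency and asymptotic normality results above concern exactly these minimizers.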