Results 1-10 of 19
De-Noising by Soft-Thresholding
, 1992
Cited by 798 (13 self)
Donoho and Johnstone (1992a) proposed a method for reconstructing an unknown function $f$ on $[0, 1]$ from noisy data $d_i = f(t_i) + z_i$, $i = 0, \ldots, n-1$, $t_i = i/n$, $z_i \overset{iid}{\sim} N(0, 1)$. The reconstruction $\hat{f}_n$ is defined in the wavelet domain by translating all the empirical wavelet coefficients of $d$ towards 0 by an amount $\sqrt{2 \log(n)} / \sqrt{n}$. We prove two results about that estimator. [Smooth]: With high probability $\hat{f}_n$ is at least as smooth as $f$, in any of a wide variety of smoothness measures. [Adapt]: The estimator comes nearly as close in mean square to $f$ as any measurable estimator can come, uniformly over balls in each of two broad scales of smoothness classes. These two properties are unprecedented in several ways. Our proof of these results develops new facts about abstract statistical inference and its connection with an optimal recovery model.
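The shrinkage rule in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the helper names (`haar_forward`, `haar_inverse`, `soft_threshold`, `denoise`), and the `sigma` parameter are assumptions made for illustration. The transform here is orthonormal and applied to the raw samples, so the threshold is written as $\sqrt{2 \log(n)}$ for unit noise; this appears to correspond to the abstract's $\sqrt{2 \log(n)} / \sqrt{n}$ once coefficients are normalized by $1/\sqrt{n}$.

```python
import numpy as np

def haar_forward(x):
    """Orthonormal Haar transform of a signal whose length is a power of two.

    Returns the coarsest approximation and a list of detail arrays
    ordered from finest to coarsest scale.
    """
    x = np.asarray(x, dtype=float).copy()
    if x.size == 0 or x.size & (x.size - 1):
        raise ValueError("signal length must be a power of two")
    details = []
    while x.size > 1:
        approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
        detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
        details.append(detail)
        x = approx
    return x, details

def haar_inverse(approx, details):
    """Invert haar_forward."""
    x = np.asarray(approx, dtype=float)
    for detail in reversed(details):  # rebuild from coarsest to finest
        out = np.empty(2 * x.size)
        out[0::2] = (x + detail) / np.sqrt(2.0)
        out[1::2] = (x - detail) / np.sqrt(2.0)
        x = out
    return x

def soft_threshold(w, t):
    """Move every coefficient towards 0 by t, setting small ones to 0."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def denoise(d, sigma=1.0):
    """Soft-threshold the detail coefficients at sigma * sqrt(2 log n)."""
    n = d.size
    t = sigma * np.sqrt(2.0 * np.log(n))  # assumed scaling, see lead-in
    approx, details = haar_forward(d)
    details = [soft_threshold(w, t) for w in details]
    return haar_inverse(approx, details)
```

For example, calling `denoise(d)` on a step function observed with unit-variance Gaussian noise removes most of the noise, because nearly all pure-noise coefficients fall below the threshold and are set to zero.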
Regularization Theory and Neural Networks Architectures
 Neural Computation
, 1995
Cited by 309 (31 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
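As a rough illustration of the simplest case mentioned in this abstract, a Radial Basis Function regularization network, the sketch below fits one Gaussian basis function per training point by solving $(K + \lambda I) c = y$. The Gaussian kernel, the `width` and `lam` values, and all function names are assumptions chosen for the example, not details taken from the paper.

```python
import numpy as np

def gaussian_kernel(A, B, width):
    """Gaussian RBF matrix between rows of A (m, d) and rows of B (n, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def fit_regularization_network(X, y, width=0.2, lam=1e-3):
    """Solve (K + lam * I) c = y for the hidden-unit coefficients c."""
    K = gaussian_kernel(X, X, width)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_train, coeffs, X_new, width=0.2):
    """Evaluate f(x) = sum_i c_i * G(x - x_i) at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, width) @ coeffs
```

Each training point acts as the center of one hidden unit, and `lam` trades fidelity to the data against smoothness of the fitted function. Inputs are expected as 2-D arrays of shape `(n_samples, n_features)`.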
Wavelet shrinkage: asymptopia
 Journal of the Royal Statistical Society, Ser. B
, 1995
Cited by 239 (35 self)
Considerable effort has been directed recently to develop asymptotically minimax methods in problems of recovering infinite-dimensional objects (curves, densities, spectral densities, images) from noisy data. A rich and complex body of work has evolved, with nearly or exactly minimax estimators being obtained for a variety of interesting problems. Unfortunately, the results have often not been translated into practice, for a variety of reasons: sometimes similarity to known methods, sometimes computational intractability, and sometimes lack of spatial adaptivity. We discuss a method for curve estimation based on $n$ noisy data; one translates the empirical wavelet coefficients towards the origin by an amount $\sqrt{2 \log(n)} / \sqrt{n}$. The method is different from methods in common use today, is computationally practical, and is spatially adaptive; thus it avoids a number of previous objections to minimax estimators. At the same time, the method is nearly minimax for a wide variety of loss functions (e.g. pointwise error, global error measured in $L^p$ norms, pointwise and global error in estimation of derivatives) and for a wide range of smoothness classes, including standard Hölder classes, Sobolev classes, and Bounded Variation. This is a much broader near-optimality than anything previously proposed in the minimax literature. Finally, the theory underlying the method is interesting, as it exploits a correspondence between statistical questions and questions of optimal recovery and information-based complexity.
Nonlinear Wavelet Methods for Recovery of Signals, Densities, and Spectra from Indirect and Noisy Data
 In Proceedings of Symposia in Applied Mathematics
, 1993
Cited by 103 (5 self)
We describe wavelet methods for recovery of objects from noisy and incomplete data. The common themes: (a) the new methods utilize nonlinear operations in the wavelet domain; (b) they accomplish tasks which are not possible by traditional linear/Fourier approaches to such problems. We attempt to indicate the heuristic principles, theoretical foundations, and possible application areas for these methods. Areas covered: (1) Wavelet De-Noising. (2) Wavelet Approaches to Linear Inverse Problems. (4) Wavelet Packet De-Noising. (5) Segmented Multi-Resolutions. (6) Nonlinear Multi-Resolutions. 1. Introduction. With the rapid development of computerized scientific instruments comes a wide variety of interesting problems for data analysis and signal processing. In fields ranging from Extragalactic Astronomy to Molecular Spectroscopy to Medical Imaging to Computer Vision, one must recover a signal, curve, image, spectrum, or density from incomplete, indirect, and noisy data. What can wavelets ...
LASSO-Patternsearch Algorithm with Application to Ophthalmology and Genomic Data
, 2008
Cited by 29 (22 self)
The LASSO-Patternsearch algorithm is proposed to efficiently identify patterns of multiple dichotomous risk factors for outcomes of interest in demographic and genomic studies. The patterns considered are those that arise naturally from the log-linear expansion of the multivariate Bernoulli density. The method is designed for the case where there is a possibly very large number of candidate patterns but it is believed that only a relatively small number are important. A LASSO is used to greatly reduce the number of candidate patterns, using a novel computational algorithm that can handle an extremely large number of unknowns simultaneously. The patterns surviving the LASSO are further pruned in the framework of (parametric) generalized linear models. A novel tuning procedure based on the GACV for Bernoulli outcomes, modified to act
Component Selection and Smoothing in Smoothing Spline Analysis of Variance Models
 COSSO. Institute of Statistics Mimeo Series 2556, NCSU
, 2003
Cited by 27 (9 self)
We propose a new method for model selection and model fitting in nonparametric regression models, in the framework of smoothing spline ANOVA. The "COSSO" is a method of regularization with the penalty functional being the sum of component norms, instead of the squared norm employed in the traditional smoothing spline method. The COSSO provides a unified framework for several recent proposals for model selection in linear models and smoothing spline ANOVA models. Theoretical properties, such as the existence and the rate of convergence of the COSSO estimator, are studied. In the special case of a tensor product design with periodic functions, a detailed analysis reveals that the COSSO applies a novel soft thresholding type operation to the function components and selects the correct model structure with probability tending to one. We give
Variable Selection and Model Building via Likelihood Basis Pursuit
 Journal of the American Statistical Association
, 2002
Cited by 22 (10 self)
This paper presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis functions. The basis functions are chosen to be compatible with variable selection and model building in the context of a smoothing spline ANOVA model. Basis pursuit is applied to obtain the optimal decomposition in terms of having the smallest $l_1$ norm on the coefficients. We use the functional $L_1$ norm to measure the importance of each component and determine the "threshold" value by a sequential Monte Carlo bootstrap test algorithm. As a generalized LASSO-type method, LBP produces shrinkage estimates for the coefficients, which greatly facilitates the variable selection process and provides highly interpretable multivariate functional estimates at the same time. To choose the regularization parameters appearing in the LBP models, generalized approximate cross validation (GACV) is derived as a tuning criterion. To make GACV widely applicable to large data sets, its randomized version is proposed as well. A technique called "slice modeling" is used to solve the optimization problem and make the computation more efficient. LBP has great potential for a wide range of research and application areas such as medical studies, and in this paper we apply it to two large ongoing epidemiological studies: the Wisconsin Epidemiological Study of Diabetic Retinopathy (WESDR) and the Beaver Dam Eye Study (BDES).
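To show how an $l_1$ penalty produces the kind of shrinkage estimates described in this abstract, here is a minimal sketch of the ordinary LASSO for linear regression, solved by cyclic coordinate descent. It is a swapped-in, simpler technique and not the LBP procedure itself: the squared-error loss, the function names (`shrink`, `lasso_cd`), and the fixed iteration count are all assumptions for illustration.

```python
import numpy as np

def shrink(z, t):
    """Scalar soft-threshold: move z towards 0 by t, or to 0 if |z| <= t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # per-column scale x_j' x_j / n
    r = y - X @ b                      # current residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]        # remove coordinate j from the fit
            rho = X[:, j] @ r / n      # correlation of x_j with the partial residual
            b[j] = shrink(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]        # put the updated coordinate back
    return b
```

Coefficients whose partial correlation with the residual stays below `lam` are set exactly to zero, which is what makes the fitted model easy to read as a variable-selection result.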
Statistical Ideas for Selecting Network Architectures
 Invited Presentation, Neural Information Processing Systems 8
, 1995
Cited by 18 (3 self)
Choosing the architecture of a neural network is one of the most important problems in making neural networks practically useful, but accounts of applications usually sweep these details under the carpet. How many hidden units are needed? Should weight decay be used, and if so how much? What type of output units should be chosen? And so on. We address these issues within the framework of statistical theory for model choice, which provides a number of workable approximate answers. This paper is principally concerned with architecture selection issues for feedforward neural networks (also known as multilayer perceptrons). Many of the same issues arise in selecting radial basis function networks, recurrent networks and more widely. These problems occur in a much wider context within statistics, and applied statisticians have been selecting and combining models for decades. Two recent discussions are [4, 5]. References [3, 20, 21, 22] discuss neural networks from a statistical perspecti...
Generalization And Regularization in Nonlinear Learning Systems
 The Handbook of Brain Theory and Neural Networks
, 1994
Cited by 10 (3 self)
In this article we will describe generalization and regularization from the point of view of multivariate function estimation in a statistical context. Multivariate function estimation is not, in principle, distinguishable from supervised machine learning. However, until fairly recently supervised machine learning and multivariate function estimation had fairly distinct groups of practitioners, and small overlap in language, literature, and in the kinds of practical problems under study. In any case, we are given a training set, consisting of pairs of input (feature) vectors and associated outputs $\{t(i), y_i\}$, for $n$ training or example subjects, $i = 1, \ldots, n$. From this data, it is desired to construct a map which generalizes well, that is, given a new value of $t$, the map will provide a reasonable prediction for the unobserved output associated with this $t$.
Matching Pursuit
, 1993
Cited by 9 (0 self)
This paper presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis functions. Basis functions are chosen to be compatible with variable selection and model building in the context of a smoothing spline ANOVA model. Basis pursuit is applied to obtain the optimal decomposition in terms of having the smallest $l_1$ norm on the coefficients. We use the functional $L_1$ norm to measure the importance of each component and determine the “threshold” value by a sequential Monte Carlo bootstrap test algorithm. As a generalized LASSO-type method, LBP produces shrinkage estimates for the coefficients, which greatly facilitates the variable selection process and provides highly interpretable multivariate functional estimates at the same time. To choose the regularization parameters appearing in the LBP models, generalized approximate cross validation (GACV) is derived as a tuning criterion. To make GACV widely applicable to large data sets, its randomized version is proposed as well. A technique “slice modeling” is