Results 1  10
of
189
Regularization Theory and Neural Networks Architectures
 Neural Computation
, 1995
"... We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Ba ..."
Abstract

Cited by 309 (31 self)
 Add to MetaCart
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
A review of image denoising algorithms, with a new one
 Simul
, 2005
"... Abstract. The search for efficient image denoising methods is still a valid challenge at the crossing of functional analysis and statistics. In spite of the sophistication of the recently proposed methods, most algorithms have not yet attained a desirable level of applicability. All show an outstand ..."
Abstract

Cited by 265 (2 self)
 Add to MetaCart
Abstract. The search for efficient image denoising methods is still a valid challenge at the crossing of functional analysis and statistics. In spite of the sophistication of the recently proposed methods, most algorithms have not yet attained a desirable level of applicability. All show an outstanding performance when the image model corresponds to the algorithm assumptions but fail in general and create artifacts or remove image fine structures. The main focus of this paper is, first, to define a general mathematical and experimental methodology to compare and classify classical image denoising algorithms and, second, to propose a nonlocal means (NLmeans) algorithm addressing the preservation of structure in a digital image. The mathematical analysis is based on the analysis of the “method noise, ” defined as the difference between a digital image and its denoised version. The NLmeans algorithm is proven to be asymptotically optimal under a generic statistical image model. The denoising performance of all considered methods are compared in four ways; mathematical: asymptotic order of magnitude of the method noise under regularity assumptions; perceptualmathematical: the algorithms artifacts and their explanation as a violation of the image model; quantitative experimental: by tables of L 2 distances of the denoised version to the original image. The most powerful evaluation method seems, however, to be the visualization of the method noise on natural images. The more this method noise looks like a real white noise, the better the method.
Nonlinear BlackBox Modeling in System Identification: a Unified Overview
 Automatica
, 1995
"... A nonlinear black box structure for a dynamical system is a model structure that is prepared to describe virtually any nonlinear dynamics. There has been considerable recent interest in this area with structures based on neural networks, radial basis networks, wavelet networks, hinging hyperplanes, ..."
Abstract

Cited by 136 (15 self)
 Add to MetaCart
A nonlinear black box structure for a dynamical system is a model structure that is prepared to describe virtually any nonlinear dynamics. There has been considerable recent interest in this area with structures based on neural networks, radial basis networks, wavelet networks, hinging hyperplanes, as well as wavelet transform based methods and models based on fuzzy sets and fuzzy rules. This paper describes all these approaches in a common framework, from a user's perspective. It focuses on what are the common features in the different approaches, the choices that have to be made and what considerations are relevant for a successful system identification application of these techniques. It is pointed out that the nonlinear structures can be seen as a concatenation of a mapping from observed data to a regression vector and a nonlinear mapping from the regressor space to the output space. These mappings are discussed separately. The latter mapping is usually formed as a basis function e...
Neural Networks and Statistical Models
, 1994
"... There has been much publicity about the ability of artificial neural networks to learn and generalize. In fact, the most commonly used artificial neural networks, called multilayer perceptrons, are nothing more than nonlinear regression and discriminant models that can be implemented with standard s ..."
Abstract

Cited by 99 (1 self)
 Add to MetaCart
There has been much publicity about the ability of artificial neural networks to learn and generalize. In fact, the most commonly used artificial neural networks, called multilayer perceptrons, are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software. This paper explains what neural networks are, translates neural network jargon into statistical jargon, and shows the relationships between neural networks and statistical models such as generalized linear models, maximum redundancy analysis, projection pursuit, and cluster analysis.
Smoothing by Local Regression: Principles and Methods
"... this paper we describe two adaptive procedures, one based on C p and the other based on crossvalidation. Still, when we have a final adaptive fit in hand, it is critical to subject it to graphical diagnostics to study its performance. The important implication of these statements is that the above c ..."
Abstract

Cited by 88 (1 self)
 Add to MetaCart
this paper we describe two adaptive procedures, one based on C p and the other based on crossvalidation. Still, when we have a final adaptive fit in hand, it is critical to subject it to graphical diagnostics to study its performance. The important implication of these statements is that the above choices must be tailored to each data set in practice; that is, the choices represent a modeling of the data. It is widely accepted that in global parametric regression there are a variety of choices that must be made  for example, the parametric family to be fitted and the form of the distribution of the response  and that we must rely on our knowledge of the mechanism generating the data, on model selection diagnostics, and on graphical diagnostic methods to make the choices. The same is true for smoothing. Cleveland (1993) presents many examples of this modeling process. For example, in one application, oxides of nitrogen from an automobile engine are fitted to the equivalence ratio, E, of the fuel and the compression ratio, C, of the engine. Coplots show that it is reasonable to use quadratics as the local parametric family but with the added assumption that given E the fitted f
Linear smoothers and additive models
 The Annals of Statistics
, 1989
"... We study linear smoothers and their use in building nonparametric regression models. In part Qfthis paper we examine certain aspects of linear smoothers for scatterplots; examples of these are the running mean and running line, kernel, and cubic spline smoothers. The eigenvalue and singular value d ..."
Abstract

Cited by 70 (2 self)
 Add to MetaCart
We study linear smoothers and their use in building nonparametric regression models. In part Qfthis paper we examine certain aspects of linear smoothers for scatterplots; examples of these are the running mean and running line, kernel, and cubic spline smoothers. The eigenvalue and singular value decompositions of the corresponding smoother matrix are used to qualitatively describe a smoother, and several other topics such as the number of degrees of freedom of a smoother are discussed. In the second part of the paper we describe how Iinearsmoothers can be used to estimate the additive model, a powerful nonparametric regression model, using the "backfitting algorithm". We study the convergence of the backfitting algorithm and prove its convergence for a class of smoothers that includes cubic e:ttJlCl€~nt jJI:::Jll<l.li:6I;:U least squares. algorithm and ' dis.cuss ev'W()r(is: Neaparametric, seanparametric, regression, GaussSeidelalgorithm,
Local polynomial kernel regression for generalized linear models and quasilikelihood functions
 Journal of the American Statistical Association,90
, 1995
"... were introduced as a means of extending the techniques of ordinary parametric regression to several commonlyused regression models arising from nonnormal likelihoods. Typically these models have a variance that depends on the mean function. However, in many cases the likelihood is unknown, but the ..."
Abstract

Cited by 57 (7 self)
 Add to MetaCart
were introduced as a means of extending the techniques of ordinary parametric regression to several commonlyused regression models arising from nonnormal likelihoods. Typically these models have a variance that depends on the mean function. However, in many cases the likelihood is unknown, but the relationship between mean and variance can be specified. This has led to the consideration of quasilikelihood methods, where the conditionalloglikelihood is replaced by a quasilikelihood function. In this article we investigate the extension of the nonparametric regression technique of local polynomial fitting with a kernel weight to these more general contexts. In the ordinary regression case local polynomial fitting has been seen to possess several appealing features in terms of intuitive and mathematical simplicity. One noteworthy feature is the better performance near the boundaries compared to the traditional kernel regression estimators. These properties are shown to carryover to the generalized linear model and quasilikelihood model. The end result is a class of kernel type estimators for smoothing in quasilikelihood models. These estimators can be viewed as a straightforward generalization of the usual parametric estimators. In addition, their simple asymptotic distributions allow for simple interpretation
Population shape regression from random design data
 IN: PROC. OF ICCV 2007
, 2007
"... Regression analysis is a powerful tool for the study of changes in a dependent variable as a function of an independent regressor variable, and in particular it is applicable to the study of anatomical growth and shape change. When the underlying process can be modeled by parameters in a Euclidean s ..."
Abstract

Cited by 51 (10 self)
 Add to MetaCart
Regression analysis is a powerful tool for the study of changes in a dependent variable as a function of an independent regressor variable, and in particular it is applicable to the study of anatomical growth and shape change. When the underlying process can be modeled by parameters in a Euclidean space, classical regression techniques [13, 34] are applicable and have been studied extensively. However, recent work suggests that attempts to describe anatomical shapes using flat Euclidean spaces undermines our ability to represent natural biological variability [9, 11]. In this paper we develop a method for regression analysis of general, manifoldvalued data. Specifically, we extend NadarayaWatson kernel regression by recasting the regression problem in terms of Fréchet expectation. Although this method is quite general, our driving problem is the study anatomical shape change as a function of age from random design image data. We demonstrate our method by analyzing shape change in the brain from a random design dataset of MR images of 89 healthy adults ranging in age from 22 to 79 years. To study the small scale changes in anatomy, we use the infinite dimensional manifold of diffeomorphic transformations, with an associated metric. We regress a representative anatomical shape, as a function of age, from this population.
The Specification of Conditional Expectations
, 1991
"... this paper was written while the author was visiting the Graduate School of Business at the University of Chicago. This paper incorporates some results previously circulated in Is the Expected Compensation for Market Volatility Constant Through Time? and On the Linearity of Conditionally Expected ..."
Abstract

Cited by 43 (6 self)
 Add to MetaCart
this paper was written while the author was visiting the Graduate School of Business at the University of Chicago. This paper incorporates some results previously circulated in Is the Expected Compensation for Market Volatility Constant Through Time? and On the Linearity of Conditionally Expected Returns. I have bene tted from the comments of Daniel Beneish, Marshall Blume, Doug Breeden, Wayne Ferson, Doug Foster, Mike Giarla, Mike Hemler, Ravi Jagannathan, Dan Nelson, Adrian Pagan, Tom Smith, Rob Stambaugh, S
Nonparametric function induction in semisupervised learning
 In Proc. Artificial Intelligence and Statistics
, 2005
"... There has been an increase of interest for semisupervised learning recently, because of the many datasets with large amounts of unlabeled examples and only a few labeled ones. This paper follows up on proposed nonparametric algorithms which provide an estimated continuous label for the given unlabe ..."
Abstract

Cited by 41 (5 self)
 Add to MetaCart
There has been an increase of interest for semisupervised learning recently, because of the many datasets with large amounts of unlabeled examples and only a few labeled ones. This paper follows up on proposed nonparametric algorithms which provide an estimated continuous label for the given unlabeled examples. First, it extends them to function induction algorithms that minimize a regularization criterion applied to an outofsample example, and happen to have the form of Parzen windows regressors. This allows to predict test labels without solving again a linear system of dimension n (the number of unlabeled and labeled training examples), which can cost O(n 3). Second, this function induction procedure gives rise to an efficient approximation of the training process, reducing the linear system to be solved to m ≪ n unknowns, using only a subset of m examples. An improvement of O(n 2 /m 2) in time can thus be obtained. Comparative experiments are presented, showing the good performance of the induction formula and approximation algorithm. 1