Results 1-10 of 28
Regularization Theory and Neural Networks Architectures
Neural Computation, 1995
Abstract

Cited by 309 (31 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well-known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
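The regularization-network solution described in this abstract (a single hidden layer of radial basis units whose coefficients solve a regularized least-squares problem) can be illustrated with a toy Gaussian-RBF fit. The kernel width, regularization strength, and test function below are illustrative choices, not taken from the paper:

```python
import numpy as np

# Toy sketch of a regularization network: one hidden layer of Gaussian
# radial basis units, centers placed at the data points, coefficients
# found by ridge-regularized least squares. Width and lam are
# illustrative hyperparameters, not values from the paper.

def rbf_fit(X, y, width=0.3, lam=1e-3):
    # Gram matrix of Gaussian kernels G(x_i, x_j)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    G = np.exp(-d2 / (2 * width ** 2))
    # Solve (G + lam * I) c = y, the classical regularization-network solution
    return np.linalg.solve(G + lam * np.eye(len(X)), y)

def rbf_predict(X_train, c, X_new, width=0.3):
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2)) @ c

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0])
c = rbf_fit(X, y)
pred = rbf_predict(X, c, X)
print(float(np.abs(pred - y).max()))  # worst-case training residual
```

The regularization parameter trades off fidelity to the data against the smoothness prior; setting `lam=0` recovers plain kernel interpolation.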
A nonparametric approach to pricing and hedging derivative securities via learning networks
Journal of Finance, 1994
Abstract

Cited by 104 (4 self)
Priors, Stabilizers and Basis Functions: from regularization to radial, tensor and additive splines
1993
Abstract

Cited by 78 (14 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular we had discussed how standard smoothness functionals lead to a subclass of regularization networks, the well-known Radial Basis Functions approximation schemes. In this paper we show that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same extension that leads from Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions and some forms of Projection Pursuit Regression. We propose to use the term Generalized Regularization Networks for this broad class of approximation schemes that follow from an extension of regularization. In the probabilistic interpretation of regularization, the different classes of basis functions correspond to different classes of prior probabilities on the approximating function spaces, and therefore to different types of smoothness assumptions. In the final part of the paper, we show the relation between activation functions of the Gaussian and sigmoidal type by considering the simple case of the kernel G(x)=x. In summary,
Regression Modeling in Back-Propagation and Projection Pursuit Learning
1994
Abstract

Cited by 65 (1 self)
We studied and compared two types of connectionist learning methods for model-free regression problems in this paper. One is the popular back-propagation learning (BPL), well known in the artificial neural networks literature; the other is the projection pursuit learning (PPL), which emerged in recent years in the statistical estimation literature. Both the BPL and the PPL are based on projections of the data in directions determined from interconnection weights. However, unlike the use of fixed nonlinear activations (usually sigmoidal) for the hidden neurons in BPL, the PPL systematically approximates the unknown nonlinear activations. Moreover, the BPL estimates all the weights simultaneously at each iteration, while the PPL estimates the weights cyclically (neuron-by-neuron and layer-by-layer) at each iteration. Although the BPL and the PPL have comparable training speed when based on a Gauss-Newton optimization algorithm, the PPL proves more parsimonious in that the PPL requires a fewer hi...
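The contrast drawn in this abstract (fixed sigmoidal activations in BPL versus activations estimated from the data in PPL) can be sketched with a single PPR-style ridge term. The coarse grid search over directions and the cubic-polynomial smoother below are illustrative simplifications, not the Gauss-Newton procedure the authors discuss:

```python
import numpy as np

# One projection pursuit regression (PPR) term: the ridge function g is
# itself estimated from the data (here by a cubic polynomial fit along
# the projection), unlike a back-propagation unit's fixed sigmoid.
# Direction search by grid over angles is a toy simplification.

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
w_true = np.array([0.8, 0.6])                 # unit-norm true direction
y = np.tanh(X @ w_true) + 0.05 * rng.normal(size=200)

best = None
for theta in np.linspace(0, np.pi, 64, endpoint=False):
    w = np.array([np.cos(theta), np.sin(theta)])
    z = X @ w
    coeffs = np.polyfit(z, y, 3)              # data-driven activation g
    resid = y - np.polyval(coeffs, z)
    sse = float(resid @ resid)
    if best is None or sse < best[0]:
        best = (sse, w, coeffs)

sse, w_hat, g_hat = best
cos_sim = abs(float(w_hat @ w_true))          # |cosine| to the true direction
print(round(cos_sim, 3))
```

A full PPR fit would add further ridge terms to the residuals and refine the directions; the single-term search above already recovers the projection that carries the signal.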
Nonlinear Black-Box Models in System Identification: Mathematical Foundations
1995
Abstract

Cited by 29 (5 self)
In this paper we discuss several aspects of the mathematical foundations of the nonlinear black-box identification problem. As we shall see, the quality of the identification procedure is always the result of a certain tradeoff between the expressive power of the model we try to identify (the larger the number of parameters used to describe the model, the more flexible the approximation) and the stochastic error (which is proportional to the number of parameters). A consequence of this tradeoff is the simple fact that a good approximation technique can be the basis of a good identification algorithm. From this point of view we consider different approximation methods, and pay special attention to spatially adaptive approximants. We introduce wavelet and "neuron" approximations and show that they are spatially adaptive. Then we apply the acquired approximation experience to estimation problems. Finally, we consider some implications of these theoretic developments for the practically...
Gibbs sampling, exponential families and orthogonal polynomials
Statistical Science, 2008
Abstract

Cited by 19 (6 self)
We give families of examples where sharp rates of convergence to stationarity of the widely used Gibbs sampler are available. The examples involve standard exponential families and their conjugate priors. In each case, the transition operator is explicitly diagonalizable with classical orthogonal polynomials as eigenfunctions.
Key words and phrases: Gibbs sampler, running time analyses, exponential families, conjugate priors, location families, orthogonal polynomials, singular value decomposition.
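A minimal instance of the kind of two-component Gibbs sampler this abstract analyzes, for a standard exponential family with its conjugate prior. The Beta-Binomial pairing and the specific hyperparameters are illustrative choices, not necessarily among the paper's examples:

```python
import numpy as np

# Two-component Gibbs sampler: x | p ~ Binomial(n, p) with conjugate
# prior p ~ Beta(a, b). Each scan draws p | x, then x | p; the
# stationary distribution of x is Beta-Binomial(n, a, b).
# n, a, b are illustrative values.

rng = np.random.default_rng(2)
n, a, b = 10, 2.0, 2.0
x = 5
xs = []
for t in range(20000):
    p = rng.beta(a + x, b + n - x)   # conjugate full conditional for p
    x = rng.binomial(n, p)           # full conditional for x
    if t >= 1000:                    # discard burn-in
        xs.append(x)

# Stationary mean of x is n * a / (a + b) = 5 for these parameters
print(round(float(np.mean(xs)), 2))
```

The paper's contribution is making the convergence rate of exactly this kind of chain explicit, via an eigenfunction expansion of the transition operator in classical orthogonal polynomials.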
Nonlinear Partial Least Squares
1995
Abstract

Cited by 16 (0 self)
We propose a new nonparametric regression method for highdimensional data, nonlinear partial least squares (NLPLS). NLPLS is motivated by projectionbased regression methods, e.g., partial least squares (PLS), projection pursuit (PPR), and feedforward neural networks. The model takes the form of a composition of two functions. The first function in the composition projects the predictor variables onto a lowerdimensional curve or surface yielding scores, and the second predicts the response variable from the scores. We implement NLPLS with feedforward neural networks. NLPLS will often produce a more parsimonious model (fewer score vectors) than projectionbased methods, and the model is well suited for detecting outliers and future covariates requiring extrapolation. The scores are also shown to have useful interpretations. We also extend the model for multiple response variables and discuss situations when multiple response variab...
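The composition this abstract describes (a projection stage yielding scores, then a nonlinear prediction stage on the scores) can be sketched in its simplest form. Here the first linear PLS-style direction and a cubic polynomial stand in for the feed-forward networks the authors actually use; the data-generating model is an illustrative choice:

```python
import numpy as np

# NLPLS-flavored sketch: stage 1 projects the predictors onto a
# direction (first PLS-style direction, w proportional to X^T y),
# stage 2 predicts the response nonlinearly from the resulting scores.
# A polynomial replaces the authors' neural-network inner mapping.

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 5))
beta = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
y = 0.5 * (X @ beta) ** 3 + 0.1 * rng.normal(size=300)

Xc, yc = X - X.mean(0), y - y.mean()
w = Xc.T @ yc
w /= np.linalg.norm(w)            # stage 1: projection direction
t = Xc @ w                        # scores
coeffs = np.polyfit(t, yc, 3)     # stage 2: nonlinear map on the scores
pred = np.polyval(coeffs, t) + y.mean()

r2 = 1 - float(((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum())
print(round(r2, 2))
```

Because the response here depends on a single direction, one score vector suffices; this is the parsimony argument the abstract makes against methods that keep adding projection terms.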
RIDGELETS: ESTIMATING WITH RIDGE FUNCTIONS
2003
Abstract

Cited by 16 (1 self)
Feedforward neural networks, projection pursuit regression, and more generally, estimation via ridge functions have been proposed as an approach to bypass the curse of dimensionality and are now becoming widely applied to approximation or prediction in applied sciences. To address problems inherent to these methods—ranging from the construction of neural networks to their efficiency and capability—Candès [Appl. Comput. Harmon. Anal. 6 (1999) 197–218] developed a new system that allows the representation of arbitrary functions as superpositions of specific ridge functions, the ridgelets. In a nonparametric regression setting, this article suggests expanding noisy data into a ridgelet series and applying a scalar nonlinearity to the coefficients (damping); this is unlike existing approaches based on stepwise additions of elements. The procedure is simple, constructive, stable and spatially adaptive—and fast algorithms have been developed to implement it. The ridgelet estimator is nearly optimal for estimating functions with certain kinds of spatial inhomogeneities. In addition, ridgelets help to identify new classes of estimands—corresponding to a new notion of smoothness— that are well suited for ridge functions estimation. While the results are stated in a decision theoretic framework, numerical experiments are also presented to illustrate the practical performance of the methodology.
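The "expand the noisy data into a series, then damp the coefficients" recipe in this abstract can be sketched in any orthonormal basis. Below, a discrete cosine basis stands in for ridgelets purely for illustration, and soft thresholding plays the role of the scalar nonlinearity; none of the specific choices are from the paper:

```python
import numpy as np

# Expand-damp-reconstruct estimator: project noisy data onto an
# orthonormal basis, apply a scalar nonlinearity (soft thresholding)
# to the coefficients, and reconstruct. DCT-II basis used in place
# of ridgelets; threshold and noise level are illustrative.

def soft_threshold(c, t):
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

rng = np.random.default_rng(3)
N = 256
grid = np.arange(N)
# Orthonormal DCT-II basis, built explicitly as a matrix
B = np.cos(np.pi * (grid[:, None] + 0.5) * grid[None, :] / N)
B[:, 0] *= np.sqrt(1.0 / N)
B[:, 1:] *= np.sqrt(2.0 / N)

signal = np.sin(2 * np.pi * 3 * grid / N)
noisy = signal + 0.3 * rng.normal(size=N)

coeffs = B.T @ noisy                  # expand noisy data in the basis
damped = soft_threshold(coeffs, 1.0)  # scalar nonlinearity on coefficients
denoised = B @ damped                 # reconstruct

err_noisy = float(np.mean((noisy - signal) ** 2))
err_denoised = float(np.mean((denoised - signal) ** 2))
print(err_denoised < err_noisy)
```

The estimator is constructive and non-iterative, which is the contrast the abstract draws with stepwise schemes that add one element at a time.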
A hybrid projection-based and radial basis function architecture: Initial values and global optimization
2001
Abstract

Cited by 13 (6 self)
We introduce a mechanism for constructing and training a hybrid architecture of projection-based units and radial basis functions. In particular, we introduce an optimization scheme which includes several steps and assures convergence to a useful solution. During network architecture construction and training, it is determined whether a unit should be removed or replaced. The resulting architecture often has a smaller number of units than competing architectures. A specific form of overfitting, resulting from shrinkage of the RBF radii, is addressed by introducing a penalty on small radii. Classification and regression results are demonstrated on various benchmark data sets and compared with several variants of RBF networks [?, ?]. A striking performance improvement is achieved on the vowel data set [?].
Keywords: Projection units, RBF units, Hybrid network architecture, SMLP, Clustering, Regularization.
Automatic model selection in a hybrid perceptron/radial network
To appear: special issue of Information Fusion on multiple experts, 2002