Smoothing Spline ANOVA for Exponential Families, with Application to the Wisconsin Epidemiological Study of Diabetic Retinopathy
 ANN. STATIST
, 1995
Cited by 101 (46 self)
Let $y_i$, $i = 1, \dots, n$, be independent observations with the density of $y_i$ of the form $h(y_i, f_i) = \exp[y_i f_i - b(f_i) + c(y_i)]$, where $b$ and $c$ are given functions and $b$ is twice continuously differentiable and bounded away from 0. Let $f_i = f(t(i))$, where $t = (t_1, \dots, t_d) \in \mathcal{T}^{(1)} \otimes \cdots \otimes \mathcal{T}^{(d)} = \mathcal{T}$, the $\mathcal{T}^{(\alpha)}$ are measurable spaces of rather general form, and $f$ is an unknown function on $\mathcal{T}$ with some assumed `smoothness' properties. Given $\{y_i, t(i), i = 1, \dots, n\}$, it is desired to estimate $f(t)$ for $t$ in some region of interest contained in $\mathcal{T}$. We develop the fitting of smoothing spline ANOVA models of the form $f(t) = C + \sum_{\alpha} f_{\alpha}(t_{\alpha}) + \sum_{\alpha < \beta} f_{\alpha\beta}(t_{\alpha}, t_{\beta}) + \cdots$ to these data. The components of the decomposition satisfy side conditions which generalize the usual side conditions for parametric ANOVA. The estimate of f is obtained as the minimizer...
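As a worked illustration of the density form $\exp[y f - b(f) + c(y)]$ quoted in the abstract, a Bernoulli outcome can be written in that shape. This is a standard textbook instance chosen here for illustration; the abstract itself does not single out any particular family, and the symbol $p$ is introduced only for this example.

```latex
% Bernoulli outcome y \in \{0, 1\} with success probability p (p is an illustrative symbol)
\[
P(y \mid p) = p^{y}(1-p)^{1-y}
            = \exp\!\Big[\, y \log\frac{p}{1-p} + \log(1-p) \Big],
\]
\[
f = \log\frac{p}{1-p}, \qquad
b(f) = \log\!\big(1 + e^{f}\big), \qquad
c(y) = 0 .
\]
```

Here $f$ is the logit of $p$, and $b''(f) = p(1-p)$.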
The Bias-Variance Tradeoff and the Randomized GACV
 Advances in Neural Information Processing Systems
, 1999
Cited by 17 (2 self)
We propose a new in-sample cross-validation based method (randomized GACV) for choosing smoothing or bandwidth parameters that govern the bias-variance or fit-complexity tradeoff in `soft' classification. Soft classification refers to a learning procedure which estimates the probability that an example with a given attribute vector is in class 1 vs class 0. The target for optimizing the tradeoff is the Kullback-Leibler distance between the estimated probability distribution and the `true' probability distribution, representing knowledge of an infinite population. The method uses a randomized estimate of the trace of a Hessian and mimics cross validation at the cost of a single relearning with perturbed outcome data.
1 INTRODUCTION We propose and test a new in-sample cross-validation based method for optimizing the bias-variance tradeoff in `soft classification' (Wahba et al 1994), called ranGACV (randomized Generalized Approximate Cross Validation). Summarizing from Wahba et al (199...
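The idea of estimating a trace from a single refit on perturbed outcomes can be sketched on a much simpler problem than the one in the abstract. The sketch below is not the ranGACV procedure itself: it assumes a linear Gaussian-kernel smoother instead of a penalized likelihood fit, and estimates the trace of the smoother matrix instead of a Hessian. All names (`kernel_smoother`, `randomized_trace`, `sigma`) are hypothetical.

```python
import math
import random


def kernel_smoother(x, bandwidth):
    """Build a linear smoother fit(y) = A y with Gaussian-kernel weights on points x."""
    weights = []
    for xi in x:
        row = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        total = sum(row)
        weights.append([w / total for w in row])

    def fit(y):
        return [sum(w * yj for w, yj in zip(row, y)) for row in weights]

    # Exact trace of A, kept only so the randomized estimate can be checked.
    exact_trace = sum(weights[i][i] for i in range(len(x)))
    return fit, exact_trace


def randomized_trace(fit, y, sigma, rng):
    """One-probe estimate of trace(A): a single refit on perturbed outcomes y + delta."""
    delta = [sigma * rng.choice((-1.0, 1.0)) for _ in y]  # random +/- sigma perturbation
    base = fit(y)
    perturbed = fit([yi + di for yi, di in zip(y, delta)])
    # delta^T (fit(y + delta) - fit(y)) / sigma^2 has expectation trace(A) for linear fits.
    return sum(d * (p - b) for d, p, b in zip(delta, perturbed, base)) / sigma ** 2


rng = random.Random(0)
x = [float(i) for i in range(30)]
y = [math.sin(0.3 * xi) + rng.gauss(0.0, 0.2) for xi in x]
fit, exact_trace = kernel_smoother(x, bandwidth=2.0)

single = randomized_trace(fit, y, sigma=0.1, rng=rng)
averaged = sum(randomized_trace(fit, y, 0.1, rng) for _ in range(500)) / 500
print(f"exact {exact_trace:.3f}, one probe {single:.3f}, mean of 500 probes {averaged:.3f}")
```

A single probe is noisy but unbiased for a linear smoother; averaging probes is used here only to make the agreement with the exact trace visible.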
Mathematical Aspects of Neural Networks
 European Symposium of Artificial Neural Networks 2003
, 2003
Cited by 6 (4 self)
In this tutorial paper about mathematical aspects of neural networks, we will focus on two directions: on the one hand, we will motivate standard mathematical questions and well-studied theory of classical neural models used in machine learning. On the other hand, we collect some recent theoretical results (as of the beginning of 2003) in the respective areas. Thereby, we follow the dichotomy offered by the overall network structure and restrict ourselves to feedforward networks, recurrent networks, and self-organizing neural systems, respectively.
Tutorial: Perspectives on Learning with RNNs
 in: Proc. ESANN, 2002
Cited by 5 (1 self)
We present an overview of current lines of research on learning with recurrent neural networks (RNNs). Topics covered are: understanding and unification of algorithms, theoretical foundations, new efforts to circumvent gradient vanishing, new architectures, and fusion with other learning methods and dynamical systems theory. The structuring guideline is to understand many new approaches as different efforts to regularize and thereby improve recurrent learning.
Combining linear discriminant functions with neural networks for supervised learning
 Neural Comput. Applicat
, 1997
Cited by 3 (1 self)
A novel supervised learning method is presented by combining linear discriminant functions with neural networks. The proposed method results in a tree-structured hybrid architecture. Due to constructive learning, the binary tree hierarchical architecture is automatically generated by a controlled growing process for a specific supervised learning task. Unlike the classic decision tree, the linear discriminant functions are merely employed in the intermediate level of the tree for heuristically partitioning a large and complicated task into several smaller and simpler subtasks in the proposed method. These subtasks are dealt with by component neural networks at the leaves of the tree accordingly. For constructive learning, growing and credit-assignment algorithms are developed to serve the hybrid architecture. The proposed architecture provides an efficient way to apply existing neural networks (e.g. multilayered perceptron) for solving a large-scale problem. We have already applied the proposed method to a universal approximation problem and several benchmark classification problems in order to evaluate its performance. Simulation results have shown that the proposed method yields better results and faster training in comparison with the multilayered perceptron.
Keywords: Divide-and-conquer, linear discriminant function, multilayered perceptron, modular and hierarchical architecture, constructive learning, supervised learning.
RBF's, SBF's, TreeBF's, Smoothing Spline ANOVA: Representers and pseudo-representers for a dictionary of basis functions for penalized likelihood estimates
, 1996
Cited by 1 (0 self)
This work in progress represents an attempt to combine radial basis functions (RBF's), sigmoidal basis functions (SBF's) and basis functions that may be useful in conjunction with tree-structured methods (TreeBF's) under a single `umbrella' of a reproducing kernel Hilbert space. Once this is done, several ways of generating a `list' of basis functions in which to solve a penalized likelihood problem suggest themselves. Support vector methods may be used to refine the list. Given such a list, regularized forward selection methods generalizing those suggested by Orr and by Luo and Wahba may be used to fit the model. Large to very large data sets are assumed (n ? 1000). It is envisioned that the approach could prove useful in building models where more than three or four but fewer than, say, ten or fifteen predictor variables are involved, and that the umbrella provides some intuition concerning how the basis functions are related and what they are doing, so as to give some interpretability...
Combining Linear Discriminant Functions with Neural Networks for Supervised Learning
A novel supervised learning method is proposed by combining linear discriminant functions with neural networks. The proposed method results in a tree-structured hybrid architecture. Due to constructive learning, the binary tree hierarchical architecture is automatically generated by a controlled growing process for a specific supervised learning task. Unlike the classic decision tree, the linear discriminant functions are merely employed in the intermediate level of the tree for heuristically partitioning a large and complicated task into several smaller and simpler subtasks in the proposed method. These subtasks are dealt with by component neural networks at the leaves of the tree accordingly. For constructive learning, growing and credit-assignment algorithms are developed to serve the hybrid architecture. The proposed architecture provides an efficient way to apply existing neural networks (e.g. multilayered perceptron) for solving a large-scale problem. We have already applied the proposed method to a universal approximation problem and several benchmark classification problems in order to evaluate its performance. Simulation results have shown that the proposed method yields better results and faster training in comparison with the multilayered perceptron.
Linking Microscopic and Macroscopic Models for Evolution: Markov Chain Network Training and Conservation Law Approximations
Levenberg-Marquardt Learning and Regularization
 in Progress in Neural Information Processing
, 1996
Levenberg-Marquardt learning was first introduced to feedforward networks to improve the speed of training. The method is a modified Gauss-Newton method with an extra term that guards against ill-conditioning. Interestingly, if we regard learning as a constrained least squares problem, that extra term becomes a regularization term that deals with additive noise in the training samples. In this paper, we look at Levenberg-Marquardt learning from the viewpoint of regularization. We show that Levenberg-Marquardt learning allows other forms of regularization operators with some simple modifications. In addition, with the inclusion of a test for validation error, the regularization parameter can be chosen in such a way that both the training error and validation error decrease, which prevents overtraining.
1 Introduction Levenberg-Marquardt learning has been used with feedforward networks for a number of years [HM94]. The primary objective...
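The role of the extra term can be seen in a small curve-fitting sketch. This is a generic Levenberg-Marquardt loop on a two-parameter toy model, not the network training scheme of the paper; the model `y = a * exp(b * x)`, the function name `fit_exponential`, and the factor-of-ten damping schedule are all assumptions made for illustration. Each step solves `(J^T J + mu I) s = -J^T r`, which is the same as minimizing `||r + J s||^2 + mu ||s||^2`, so `mu` acts as a ridge-style penalty on the step.

```python
import math


def fit_exponential(xs, ys, a=1.0, b=0.0, mu=1e-2, iters=100):
    """Fit y = a * exp(b * x) by Levenberg-Marquardt with a simple damping schedule."""

    def residuals(a, b):
        return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

    def sse(r):
        return sum(ri * ri for ri in r)

    r = residuals(a, b)
    for _ in range(iters):
        if mu > 1e12:  # damping so large that further steps are negligible
            break
        # Jacobian columns: d r_i / d a and d r_i / d b.
        ja = [math.exp(b * x) for x in xs]
        jb = [a * x * math.exp(b * x) for x in xs]
        # Damped normal equations (J^T J + mu I) s = -J^T r, solved in closed form (2x2).
        h11 = sum(u * u for u in ja) + mu
        h12 = sum(u * v for u, v in zip(ja, jb))
        h22 = sum(v * v for v in jb) + mu
        g1 = sum(u * ri for u, ri in zip(ja, r))
        g2 = sum(v * ri for v, ri in zip(jb, r))
        det = h11 * h22 - h12 * h12
        da = -(h22 * g1 - h12 * g2) / det
        db = -(h11 * g2 - h12 * g1) / det
        r_new = residuals(a + da, b + db)
        if sse(r_new) < sse(r):
            a, b, r = a + da, b + db, r_new
            mu /= 10.0  # step helped: move toward plain Gauss-Newton
        else:
            mu *= 10.0  # step hurt: damp (regularize) the next step harder
    return a, b


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-x) for x in xs]  # noise-free data generated with a = 2, b = -1
a_hat, b_hat = fit_exponential(xs, ys)
print(a_hat, b_hat)
```

Replacing the identity in `mu I` with another positive semi-definite matrix is one way to read the abstract's remark about other regularization operators, though the paper's own construction is not shown in this listing.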
STATISTICAL LIKELIHOOD REPRESENTATIONS OF PRIOR KNOWLEDGE IN MACHINE LEARNING
We show that maximum a posteriori (MAP) statistical methods can be used in nonparametric machine learning problems in the same way as their current applications in parametric statistical problems, and give some examples of applications. This MAPN (MAP for nonparametric machine learning) paradigm can also reproduce much more transparently the same results as regularization methods in machine learning, spline algorithms in continuous complexity theory, and Bayesian minimum risk methods.