Results 1-10 of 50
Regularization Theory and Neural Networks Architectures
 Neural Computation
, 1995
Abstract
Cited by 309 (31 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...
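As a concrete anchor for the regularization-network family described above, here is a minimal sketch of its best-known special case: a Gaussian Radial Basis Function network whose output weights are fit by ridge regression. The helper names (`fit_rbf`, `predict_rbf`), the center placement, and the width/regularization values are illustrative choices, not taken from the paper.

```python
import numpy as np

def fit_rbf(x_train, y_train, centers, width, reg):
    """Fit the output weights of a 1-D Gaussian RBF network by ridge
    regression -- the simplest instance of a regularization network."""
    # Design matrix: one Gaussian basis function per center.
    G = np.exp(-(x_train[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))
    # Ridge solution: (G^T G + reg * I) w = G^T y
    w = np.linalg.solve(G.T @ G + reg * np.eye(len(centers)), G.T @ y_train)
    return w

def predict_rbf(x, centers, width, w):
    G = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))
    return G @ w

# Smooth 1-D target: noisy samples of sin on [0, 2*pi].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2 * np.pi, 40)
y = np.sin(x) + 0.05 * rng.standard_normal(40)
centers = np.linspace(0.0, 2 * np.pi, 10)
w = fit_rbf(x, y, centers, width=0.7, reg=1e-3)
err = np.max(np.abs(predict_rbf(x, centers, width=0.7, w=w) - np.sin(x)))
```

The smoothness penalty enters through `reg`: larger values shrink the weights and flatten the fit, which is the mechanism the paper generalizes to other basis-function classes.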
The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems
, 1992
Abstract
Cited by 171 (2 self)
We present an analysis of how the generalization performance (expected test set error) relates to the expected training set error for nonlinear learning systems, such as multilayer perceptrons and radial basis functions. The principal result is the following relationship (computed to second order) between the expected test set and training set errors:

$$\langle E_{\text{test}}(\lambda)\rangle' \approx \langle E_{\text{train}}(\lambda)\rangle + 2\sigma_{\text{eff}}^2 \,\frac{p_{\text{eff}}(\lambda)}{n} \qquad (1)$$

Here, $n$ is the size of the training sample, $\sigma_{\text{eff}}^2$ is the effective noise variance in the response variable(s), $\lambda$ is a regularization or weight decay parameter, and $p_{\text{eff}}(\lambda)$ is the effective number of parameters in the nonlinear model. The expectations $\langle\cdot\rangle$ of training set and test set errors are taken over possible training sets $\xi$ and over training and test sets $\xi'$ respectively. The effective number of parameters $p_{\text{eff}}(\lambda)$ usually differs from the true number of model parameters $p$ for nonlinear or regularized models; this theoretical conclusion is supported by M...
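The effective number of parameters has a simple closed form in the linear special case, which makes the idea concrete: for ridge regression with weight decay, it is the trace of the hat matrix. A minimal sketch of that special case (the paper's effective-parameter count generalizes this to nonlinear models):

```python
import numpy as np

def effective_parameters(X, lam):
    """Effective number of parameters for ridge regression with weight
    decay lam: the trace of the hat matrix X (X^T X + lam I)^{-1} X^T,
    i.e. sum_i d_i / (d_i + lam) over the eigenvalues d_i of X^T X."""
    d = np.linalg.eigvalsh(X.T @ X)
    return float(np.sum(d / (d + lam)))

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 5))
p0 = effective_parameters(X, 0.0)      # no decay: equals the raw count p = 5
p_big = effective_parameters(X, 1e6)   # heavy decay: shrinks toward 0
```

As the weight decay grows, directions of the data with small eigenvalues stop counting as free parameters, which is exactly why regularized models generalize better than their raw parameter count suggests.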
Prediction risk and architecture selection for neural networks
, 1994
Abstract
Cited by 75 (2 self)
We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross-validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity based pruning (SBP) and optimal brain damage (OBD) respectively.
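The resampling route to prediction risk can be sketched in a few lines. This is plain K-fold cross-validation on an ordinary least-squares model, a simplified stand-in for the paper's nonlinear cross-validation (NCV); the helper name `cv_prediction_risk` is illustrative.

```python
import numpy as np

def cv_prediction_risk(X, y, k=5):
    """K-fold cross-validation estimate of prediction risk: the mean
    squared error on held-out data, averaged over k train/test splits."""
    n = len(y)
    idx = np.arange(n)
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errs.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return float(np.mean(errs))

# Synthetic linear problem with noise variance 0.01.
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(200)
risk = cv_prediction_risk(X, y)
```

For a well-specified model the estimate lands near the irreducible noise variance; comparing `risk` across candidate architectures is the model-selection use described above.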
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
 IEEE Transactions on Neural Networks
, 1997
Abstract
Cited by 66 (2 self)
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented.

Keywords: Constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascade-correlation, resource-allocating network, group method of data handling.

I. Introduction

A. Problems with Fixed Size Networks

In recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
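The "start small and grow" idea can be illustrated with a toy constructive loop: add one hidden unit at a time, refit the linear output layer, and stop once the training error is small enough. This is a simplified sketch, not any specific algorithm from the survey; real dynamic node creation also trains the hidden-layer weights rather than drawing them at random.

```python
import numpy as np

def grow_network(X, y, tol=0.05, max_hidden=50, seed=0):
    """Toy constructive algorithm: append tanh hidden units with random
    input weights one at a time, refit the output layer by least
    squares, and stop when training MSE drops below tol."""
    rng = np.random.default_rng(seed)
    H = np.ones((len(y), 1))  # bias column
    for n_hidden in range(1, max_hidden + 1):
        # New hidden unit with random input weights and bias.
        h = np.tanh(X @ rng.standard_normal(X.shape[1]) + rng.standard_normal())
        H = np.column_stack([H, h])
        w, *_ = np.linalg.lstsq(H, y, rcond=None)
        mse = float(np.mean((H @ w - y) ** 2))
        if mse < tol:
            return n_hidden, mse
    return max_hidden, mse

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 2))
y = np.sin(X[:, 0]) * X[:, 1]
n_hidden, mse = grow_network(X, y)
```

Each pass through the loop is one state transition in the survey's search-space formulation: the state is the current architecture, and the operator adds a node.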
Generalization Performance Of Regularized Neural Network Models
 Proceedings of the IEEE Workshop on Neural Networks for Signal Processing IV, Piscataway
, 1994
Abstract
Cited by 31 (8 self)
Architecture optimization is a fundamental problem of neural network modeling. The optimal architecture is defined as the one which minimizes the generalization error. This paper addresses estimation of the generalization performance of regularized, complete neural network models. Regularization normally improves the generalization performance by restricting the model complexity. A formula for the optimal weight decay regularizer is derived. A regularized model may be characterized by an effective number of weights (parameters); however, it is demonstrated that no simple definition is possible. A novel estimator of the average generalization error (called FPER) is suggested and compared to the Final Prediction Error (FPE) and Generalized Prediction Error (GPE) estimators. In addition, comparative numerical studies demonstrate the qualities of the suggested estimator.

INTRODUCTION

One of the fundamental problems involved in design of neural network models is architecture optimizatio...
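For reference, the classical estimator the abstract builds on, Akaike's Final Prediction Error, is a one-line formula: inflate the training MSE by (n + p) / (n - p) for a model with p parameters fit on n examples. (The paper's FPER modifies this to account for regularization; this sketch shows only the classical form.)

```python
def fpe(train_mse, n, p):
    """Akaike's Final Prediction Error estimate of generalization error
    for a model with p parameters trained on n examples."""
    return train_mse * (n + p) / (n - p)

# Example: 100 samples, 7 weights, training MSE 0.04.
est = fpe(0.04, n=100, p=7)
```

The correction grows as p approaches n, encoding the intuition that a nearly saturated model's low training error says little about its test error; with weight decay, p would be replaced by an effective parameter count.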
On Design and Evaluation of Tapped-Delay Neural Network Architectures
 IEEE International Conference on Neural Networks
, 1993
Abstract
Cited by 25 (14 self)
We address pruning and evaluation of Tapped-Delay Neural Networks for the sunspot benchmark series. It is shown that the generalization ability of the networks can be improved by pruning using the Optimal Brain Damage method of Le Cun, Denker and Solla. A stop criterion for the pruning algorithm is formulated using a modified version of Akaike's Final Prediction Error estimate. With the proposed stop criterion the pruning scheme is shown to produce successful architectures with a high yield.

I. Introduction

Needless to say, processing of time series is an important application area for neural networks, and the quest for application-specific architectures penetrates current network research. While the ultimate tool may be fully recurrent architectures, many problems arise during adaptation of these. Even worse, the generalization properties of recurrent networks are not well understood; hence, model optimization is difficult. However, the conventional Tapped-Delay Neural Net (TDNN) [11...
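The pruning step can be made concrete. In Optimal Brain Damage, each weight receives a saliency s_i = H_ii * w_i^2 / 2 from a diagonal approximation of the error Hessian taken at a local minimum, and the least salient weights are deleted first. A minimal sketch with made-up numbers:

```python
import numpy as np

def obd_saliencies(weights, hessian_diag):
    """Optimal Brain Damage saliencies: s_i = H_ii * w_i^2 / 2, the
    estimated increase in training error from deleting weight i under
    the diagonal-Hessian approximation."""
    return 0.5 * hessian_diag * weights ** 2

w = np.array([0.9, -0.05, 1.3, 0.02])   # illustrative trained weights
h = np.array([2.0, 4.0, 1.0, 3.0])      # illustrative Hessian diagonal
s = obd_saliencies(w, h)
prune_order = np.argsort(s)  # delete the least salient weights first
```

After each deletion the network is retrained and the stop criterion (here, the modified FPE estimate) decides whether pruning should continue.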
A neural network based hybrid system for detection, characterization and classification of short-duration oceanic signals
 IEEE Journal of Ocean Engineering
, 1992
Abstract
Cited by 24 (19 self)
Automated identification and classification of short-duration oceanic signals obtained from passive sonar is a complex problem because of the large variability in both temporal and spectral characteristics, even in signals obtained from the same source. This paper presents the design and evaluation of a comprehensive classifier system for such signals. We first highlight the importance of selecting appropriate signal descriptors or feature vectors for high-quality classification of realistic short-duration oceanic signals. Wavelet-based feature extractors are shown to be superior to the more commonly used autoregressive coefficients and power spectral coefficients for this purpose. A variety of static neural network classifiers are evaluated and compared favorably with traditional statistical techniques for signal classification. We concentrate on those networks that are able to tune out irrelevant input features and are less susceptible to noisy inputs, and introduce two new neural-network based classifiers. Methods for combining the outputs of several classifiers to yield a more accurate labeling are proposed and evaluated based on the interpretation of network outputs as approximating posterior class probabilities. These methods lead to higher classification accuracy and also provide a mechanism for recognizing deviant signals and false alarms. Performance results are given for signals in the DARPA standard data set I.

Keywords: Neural networks, pattern classification, passive sonar, short-duration oceanic signals, feature extraction, evidence combination.
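The evidence-combination idea, with network outputs read as posterior class probabilities, can be sketched by simple posterior averaging plus a rejection threshold for deviant signals. The helper name and the threshold value are illustrative, not the paper's; the paper evaluates several richer combination rules.

```python
import numpy as np

def combine_posteriors(posteriors):
    """Average the class-posterior estimates of several classifiers,
    pick the argmax, and flag the input as deviant when even the
    combined maximum posterior is low (illustrative threshold 0.5)."""
    p = np.mean(posteriors, axis=0)  # shape: (n_classes,)
    label = int(np.argmax(p))
    deviant = bool(p[label] < 0.5)
    return label, float(p[label]), deviant

# Three hypothetical networks, each emitting posteriors over 3 classes.
nets = np.array([[0.7, 0.2, 0.1],
                 [0.6, 0.3, 0.1],
                 [0.8, 0.1, 0.1]])
label, conf, deviant = combine_posteriors(nets)
```

Averaging reduces the variance of the individual posterior estimates, and the threshold gives the false-alarm/deviant-signal mechanism mentioned above.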
Bayesian Model Comparison and Backprop Nets
 ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 4
, 1992
Abstract
Cited by 21 (0 self)
The Bayesian model comparison framework is reviewed, and the Bayesian Occam's razor is explained. This framework can be applied to feedforward networks, making possible (1) objective comparisons between solutions using alternative network architectures; (2) objective choice of magnitude and type of weight decay terms; (3) quantified estimates of the error bars on network parameters and on network output. The framework also generates a measure of the effective number of parameters determined by the data. The relationship ...
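The Occam's razor effect falls out of the evidence automatically, which a minimal non-network example makes visible: comparing a fixed fair-coin model against a flexible unknown-bias model under a uniform prior. The flexible model spreads its probability over many possible data sets, so it is penalized unless the data actually demand it. (This is a textbook-style illustration of the framework, not an example from the paper.)

```python
from math import comb

def evidence_fair(n, k):
    """Marginal likelihood of k heads in n flips under H0: fair coin."""
    return 0.5 ** n

def evidence_biased(n, k):
    """Marginal likelihood under H1: unknown bias with a uniform prior.
    Integrating p^k (1-p)^(n-k) over [0,1] gives 1 / ((n+1) * C(n,k))."""
    return 1.0 / ((n + 1) * comb(n, k))

# Occam's razor: the simpler model wins on unremarkable data, the
# flexible model wins only when the data look biased.
ratio_balanced = evidence_fair(20, 10) / evidence_biased(20, 10)
ratio_skewed = evidence_fair(20, 18) / evidence_biased(20, 18)
```

Applied to networks, the same evidence comparison trades data fit against the volume of weight space a larger architecture must spread its prior over, which is how it ranks architectures and weight decay settings objectively.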
Design of Neural Network Filters
 Electronics Institute, Technical University of Denmark
, 1993
Abstract
Cited by 21 (12 self)
(Translated from Danish.) The subject of the present licentiate thesis is the design of neural network filters. Filters based on neural networks can be viewed as extensions of the classical linear adaptive filter, aimed at modeling nonlinear relationships. The main emphasis is placed on a neural network implementation of the non-recursive, nonlinear adaptive model with additive noise. The aim is to clarify a number of phases involved in the design of neural network architectures intended for various "black-box" modeling tasks, such as system identification, inverse modeling and time-series prediction. The principal contributions include: the formulation of a neural network based canonical filter representation, which forms the basis for the development of an architecture classification system. In essence, this concerns a distinction between global and local models. This allows a number of known neural network architectures to be classified, and it further opens the possibility of developing entirely new structures. In this context, a review of a number of well-known architectures is given. Particular emphasis is placed on the treatment of the multilayer perceptron neural network.
Plurality and resemblance in fMRI data analysis
 NeuroImage
, 1999
Abstract
Cited by 20 (5 self)
We apply nine analytic methods employed currently in imaging neuroscience to simulated and actual BOLD fMRI signals and compare their performances under each signal type. Starting with baseline time series generated by a resting subject during a null hypothesis study, we compare method performance with embedded focal activity in these series of three different types whose magnitudes and time courses are simple, convolved with spatially varying hemodynamic responses, and highly spatially interactive. We then apply these same nine methods to BOLD fMRI time series from contralateral primary motor cortex and ipsilateral cerebellum collected during a sequential finger opposition study. Paired comparisons of results across methods include a voxel-specific concordance correlation ...
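The agreement measure named at the end of the abstract is Lin's concordance correlation coefficient, which is short enough to state in code. A sketch (the paper applies it voxel-wise across pairs of analysis methods):

```python
import numpy as np

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient:
    rho_c = 2*cov(x,y) / (var(x) + var(y) + (mean(x) - mean(y))**2).
    Unlike Pearson correlation, it penalizes shifts and scale changes,
    so it measures agreement rather than mere linear association."""
    mx, my = np.mean(x), np.mean(y)
    vx, vy = np.var(x), np.var(y)
    sxy = np.mean((x - mx) * (y - my))
    return 2 * sxy / (vx + vy + (mx - my) ** 2)

x = np.array([1.0, 2.0, 3.0, 4.0])
rho_identical = concordance_correlation(x, x)      # perfect agreement
rho_shifted = concordance_correlation(x, x + 1.0)  # same shape, offset
```

Two methods producing identical maps score 1.0, while a constant offset between their outputs lowers the score even though their Pearson correlation stays at 1.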