Results 1 - 8 of 8
Prediction risk and architecture selection for neural networks
, 1994
Abstract

Cited by 75 (2 self)
We describe two important sets of tools for neural network modeling: prediction risk estimation and network architecture selection. Prediction risk is defined as the expected performance of an estimator in predicting new observations. Estimated prediction risk can be used both for estimating the quality of model predictions and for model selection. Prediction risk estimation and model selection are especially important for problems with limited data. Techniques for estimating prediction risk include data resampling algorithms such as nonlinear cross-validation (NCV) and algebraic formulae such as the predicted squared error (PSE) and generalized prediction error (GPE). We show that exhaustive search over the space of network architectures is computationally infeasible even for networks of modest size. This motivates the use of heuristic strategies that dramatically reduce the search complexity. These strategies employ directed search algorithms, such as selecting the number of nodes via sequential network construction (SNC) and pruning inputs and weights via sensitivity-based pruning (SBP) and optimal brain damage (OBD), respectively.
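The estimators named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: a least-squares linear model stands in for a trained network, the function names are hypothetical, and `pse` assumes the usual form of the predicted squared error, training error plus a complexity penalty 2σ²p/N.

```python
import numpy as np

def cv_prediction_risk(X, y, k=5, seed=0):
    """Estimate prediction risk (expected squared error on new data)
    by k-fold cross-validation of a least-squares linear model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errors.append(np.mean((X[test] @ w - y[test]) ** 2))
    return float(np.mean(errors))

def pse(train_mse, n, p, sigma2):
    """Predicted squared error: training error plus a complexity penalty
    2 * sigma2 * p / n, where sigma2 estimates the noise variance."""
    return train_mse + 2.0 * sigma2 * p / n
```

A resampling estimate of this kind is the linear-model special case that techniques such as NCV extend to nonlinear networks.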
Generalization Performance Of Regularized Neural Network Models
 Proceedings of the IEEE Workshop on Neural Networks for Signal Processing IV, Piscataway
, 1994
Abstract

Cited by 31 (8 self)
Architecture optimization is a fundamental problem of neural network modeling. The optimal architecture is defined as the one which minimizes the generalization error. This paper addresses estimation of the generalization performance of regularized, complete neural network models. Regularization normally improves the generalization performance by restricting the model complexity. A formula for the optimal weight decay regularizer is derived. A regularized model may be characterized by an effective number of weights (parameters); however, it is demonstrated that no simple definition is possible. A novel estimator of the average generalization error (called FPER) is suggested and compared to the Final Prediction Error (FPE) and Generalized Prediction Error (GPE) estimators. In addition, comparative numerical studies demonstrate the qualities of the suggested estimator. INTRODUCTION One of the fundamental problems involved in the design of neural network models is architecture optimizatio...
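As an illustration of the quantities discussed in this abstract, the sketch below computes Akaike's FPE from the training error, together with one common eigenvalue-based definition of the effective number of weights under weight decay. The paper's point is that no simple definition suffices, so `effective_parameters` is an assumption-laden stand-in, and both function names are hypothetical.

```python
import numpy as np

def fpe(train_mse, n, p):
    """Akaike's Final Prediction Error: inflates the training MSE by
    (n + p) / (n - p), where p is the (effective) number of weights."""
    assert n > p, "FPE requires more samples than parameters"
    return train_mse * (n + p) / (n - p)

def effective_parameters(hessian_eigvals, weight_decay):
    """One common definition of the effective number of weights under a
    weight-decay regularizer alpha: sum_i lambda_i / (lambda_i + alpha),
    with lambda_i the eigenvalues of the unregularized Hessian."""
    lam = np.asarray(hessian_eigvals, dtype=float)
    return float(np.sum(lam / (lam + weight_decay)))
```

With `weight_decay` near zero every eigendirection counts fully and the effective number approaches the nominal parameter count; heavy regularization drives it toward zero.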
Empirical Generalization Assessment of Neural Network Models
 Proceedings of the IEEE Workshop on Neural Networks for Signal Processing V, Piscataway
, 1995
Abstract

Cited by 16 (10 self)
This paper addresses the assessment of the generalization performance of neural network models by use of empirical techniques. We suggest using the cross-validation scheme combined with a resampling technique to obtain an estimate of the generalization performance distribution of a specific model. This enables the formulation of a bulk of new generalization performance measures. Numerical results demonstrate the viability of the approach compared to the standard technique of using algebraic estimates like the FPE [1]. Moreover, we consider the problem of comparing the generalization performance of different competing models. Since all models are trained on the same data, a key issue is to take this dependency into account. The optimal split of the data set of size N into a cross-validation set of size Nγ and a training set of size N(1 − γ) is discussed. Asymptotically (for large data sets), γ_opt → 1, such that a relatively larger amount is left for validation. INTRODUCTION Consid...
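The repeated-split scheme described in this abstract can be sketched as follows; a least-squares linear model stands in for a neural network, and the function name is hypothetical. Each random split holds out a fraction γ of the N samples for validation, so the result is a sample from the generalization-error distribution rather than a single point estimate.

```python
import numpy as np

def holdout_distribution(X, y, gamma=0.5, repeats=20, seed=0):
    """Repeated random-split hold-out: train on a fraction (1 - gamma)
    of the data, validate on the remaining fraction gamma, and collect
    the validation errors across repeats."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_val = int(round(gamma * n))
    errs = []
    for _ in range(repeats):
        idx = rng.permutation(n)
        val, train = idx[:n_val], idx[n_val:]
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errs.append(float(np.mean((X[val] @ w - y[val]) ** 2)))
    return np.array(errs)  # summarize by mean, quantiles, etc.
```

Summaries of the returned array (mean, variance, quantiles) give the kind of distribution-based generalization measures the abstract alludes to.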
Adaptive Regularization in Neural Network Modeling
, 1997
Abstract

Cited by 14 (2 self)
In this paper we address the important problem of optimizing regularization parameters in neural network modeling. The suggested optimization scheme is an extended version of the recently presented algorithm [24]. The idea is to minimize an empirical estimate, such as the cross-validation estimate, of the generalization error with respect to the regularization parameters. This is done by employing a simple iterative gradient descent scheme with virtually no additional programming overhead compared to standard training. Experiments with feedforward neural network models for time series prediction and classification tasks showed the viability and robustness of the algorithm. Moreover, we provide some simple theoretical examples in order to illustrate the potential and limitations of the proposed regularization framework. 1 Introduction Neural networks are flexible tools for time series processing and pattern recognition. By increasing the number of hidden neurons in a two-layer architec...
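The idea of descending the validation-error surface in the regularization parameter can be illustrated with a deliberately simplified stand-in: ridge regression in place of a neural network, and a finite-difference gradient in place of the analytic gradient the paper derives. All names, step sizes, and the log-parameterization are illustrative assumptions.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form weight-decay (ridge) solution for a linear model."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

def val_error(alpha, Xtr, ytr, Xval, yval):
    w = ridge_fit(Xtr, ytr, alpha)
    return float(np.mean((Xval @ w - yval) ** 2))

def adapt_regularization(Xtr, ytr, Xval, yval,
                         alpha=1.0, lr=0.1, steps=40, eps=1e-4):
    """Gradient descent on the validation error with respect to the
    log weight-decay parameter, using a central finite difference."""
    log_a = np.log(alpha)
    for _ in range(steps):
        g = (val_error(np.exp(log_a + eps), Xtr, ytr, Xval, yval)
             - val_error(np.exp(log_a - eps), Xtr, ytr, Xval, yval)) / (2 * eps)
        log_a -= lr * g
    return float(np.exp(log_a))
```

Working in log α keeps the parameter positive, mirroring the practical constraint on weight-decay strengths.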
Design and Regularization of Neural Networks: The Optimal Use of a Validation Set
, 1996
Abstract

Cited by 9 (5 self)
In this paper we derive novel algorithms for the estimation of regularization parameters and for the optimization of neural net architectures based on a validation set. Regularization parameters are estimated using an iterative gradient descent scheme. Architecture optimization is performed by approximate combinatorial search among the relevant subsets of an initial neural network architecture, employing a validation-set-based Optimal Brain Damage/Surgeon (OBD/OBS) or a mean field combinatorial optimization approach. Numerical results with linear models and feedforward neural networks demonstrate the viability of the methods. INTRODUCTION Neural networks are flexible tools for function approximation, and by expanding the network any relevant target function can be approximated [6]. The associated risk of overfitting on noisy data is of major concern in neural network design [2]. The objective of architecture optimization is to minimize the generalization error. The literature suggests a v...
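The OBD-style saliency ranking used in such pruning schemes can be sketched directly from its standard definition, s_i = ½ H_ii w_i², where H_ii is the diagonal of the Hessian of the training error; weights with the smallest saliency are deleted first. The helper names here are hypothetical.

```python
import numpy as np

def obd_saliencies(weights, hessian_diag):
    """Optimal Brain Damage saliency of each weight: s_i = 0.5 * H_ii * w_i^2.
    Low-saliency weights are the cheapest to remove."""
    w = np.asarray(weights, dtype=float)
    h = np.asarray(hessian_diag, dtype=float)
    return 0.5 * h * w ** 2

def prune_smallest(weights, hessian_diag, n_prune):
    """Return a boolean mask that zeros out the n_prune lowest-saliency
    weights; in practice the network is retrained after each pruning step."""
    s = obd_saliencies(weights, hessian_diag)
    mask = np.ones_like(s, dtype=bool)
    mask[np.argsort(s)[:n_prune]] = False
    return mask
```

Evaluating the resulting pruned candidates on a validation set, as this paper does, replaces OBD's training-error criterion with a direct generalization estimate.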
Semi-Empirical Modeling of Non-Linear Dynamic Systems through Identification of Operating Regimes and Local Models
 Neural Network Engineering in dynamic control systems
, 1995
Abstract

Cited by 7 (0 self)
An offline algorithm for semi-empirical modeling of non-linear dynamic systems is presented. The model representation is based on the interpolation of a number of simple local models, where the validity of each local model is restricted to an operating regime, but where the local models yield a complete global model when interpolated. The input to the algorithm is a sequence of empirical data and a set of candidate local model structures. The algorithm searches for an optimal decomposition into operating regimes and local model structures. The method is illustrated using simulated and real data. The transparency of the resulting model and the flexibility with respect to the incorporation of prior knowledge are discussed. 1 Introduction The problem of identifying a mathematical model of an unknown system from a sequence of empirical data is a fundamental one which arises in many branches of science and engineering. The complexity of solving such a problem depends on many factors, such as...
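The interpolation of local models can be sketched with Gaussian validity functions over the operating regimes; the specific choice of validity functions, the affine local models a_i^T x + b_i, and the function name are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def local_model_predict(x, centers, widths, local_params):
    """Global prediction as a validity-weighted interpolation of local
    affine models: y(x) = sum_i w_i(x) * (a_i @ x + b_i), where the
    validity weights w_i are normalized Gaussians over operating regimes."""
    x = np.asarray(x, dtype=float)
    rho = np.array([np.exp(-np.sum((x - c) ** 2) / (2.0 * s ** 2))
                    for c, s in zip(centers, widths)])
    w = rho / rho.sum()                       # normalized validity functions
    preds = np.array([a @ x + b for a, b in local_params])
    return float(w @ preds)
```

Because the weights sum to one, the global model reduces to each local model inside its own regime and blends smoothly between regimes.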
Neural Classifier Construction Using Regularization, Pruning and Test Error Estimation
 Neural Networks
, 1998
Abstract

Cited by 4 (3 self)
In this paper we propose a method for the construction of feedforward neural classifiers based on regularization and adaptive architectures. Using a penalized maximum likelihood scheme, we derive a modified form of the entropic error measure and an algebraic estimate of the test error. In conjunction with Optimal Brain Damage pruning, a test error estimate is used to select the network architecture. The scheme is evaluated on four classification problems. Keywords: Neural classifiers, Architecture optimization, Regularization, Generalization estimation. 1 INTRODUCTION Pattern recognition is an important aspect of most scientific fields and indeed the objective of most neural network applications. Some of the classic applications of neural networks, like Sejnowski and Rosenberg's "NetTalk", concern classification of patterns into a finite number of categories. In modern approaches to pattern recognition the objective is to produce class probabilities for a ...
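The entropic error measure referred to above is, in its unmodified form, the cross-entropy of softmax class probabilities. A minimal sketch for a single example follows; the function names are hypothetical and the paper's modified form under penalized maximum likelihood is not reproduced here.

```python
import numpy as np

def softmax(z):
    """Convert class scores (logits) to probabilities; shifting by the
    maximum avoids overflow without changing the result."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def entropic_error(logits, target):
    """Cross-entropy ('entropic') error for one example: -log p_target."""
    return float(-np.log(softmax(logits)[target]))
```

Summing this quantity over the training set gives the negative log-likelihood that the penalized maximum likelihood scheme regularizes.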
On Comparison Of Adaptive Regularization Methods
 Proceedings of the IEEE Workshop on Neural Networks for Signal Processing
, 2000
Abstract

Cited by 2 (1 self)
This paper investigates recently suggested adaptive regularization schemes.