Results 1–10 of 12
Constructive Algorithms for Structure Learning in Feedforward Neural Networks for Regression Problems
 IEEE Transactions on Neural Networks
, 1997
"... In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole ..."
Abstract

Cited by 66 (2 self)
In this survey paper, we review the constructive algorithms for structure learning in feedforward neural networks for regression problems. The basic idea is to start with a small network, then add hidden units and weights incrementally until a satisfactory solution is found. By formulating the whole problem as a state space search, we first describe the general issues in constructive algorithms, with special emphasis on the search strategy. A taxonomy, based on the differences in the state transition mapping, the training algorithm and the network architecture, is then presented.
Keywords: constructive algorithm, structure learning, state space search, dynamic node creation, projection pursuit regression, cascade-correlation, resource-allocating network, group method of data handling.
I. Introduction
A. Problems with Fixed Size Networks
In recent years, many neural network models have been proposed for pattern classification, function approximation and regression problems. Among...
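The incremental scheme this abstract describes (start small, add hidden units until the fit is satisfactory) can be sketched in a resource-allocating style: allocate a Gaussian hidden unit at the training point with the worst residual. Everything here (the sin target, the fixed unit width, the tolerance, the growth cap) is an illustrative assumption, not the survey's own algorithm:

```python
import math

# Toy target for the sketch (an assumption): one input, one output.
def target(x):
    return math.sin(x)

xs = [i * 0.1 for i in range(63)]           # training inputs on [0, 6.2]
ys = [target(x) for x in xs]

centers, heights = [], []                    # the growing hidden layer
WIDTH = 0.8                                  # fixed Gaussian width (assumed)

def predict(x):
    return sum(h * math.exp(-((x - c) / WIDTH) ** 2)
               for c, h in zip(centers, heights))

def worst_point():
    """Training point with the largest residual error."""
    errs = [y - predict(x) for x, y in zip(xs, ys)]
    i = max(range(len(errs)), key=lambda k: abs(errs[k]))
    return xs[i], errs[i]

TOL = 0.05
for _ in range(40):                          # cap on network growth
    x_star, err = worst_point()
    if abs(err) < TOL:                       # satisfactory solution found
        break
    centers.append(x_star)                   # allocate a unit at the worst input
    heights.append(err)                      # its weight cancels that residual

print(len(centers), "hidden units allocated")
```

The stopping test is the "satisfactory solution" criterion from the abstract; a real constructive algorithm would also retrain existing weights after each addition.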
Time Series Forecasting with Neural Networks: A Case Study
, 1995
"... This paper describes a case study which aims to do just that. ..."
Abstract

Cited by 31 (0 self)
This paper describes a case study which aims to do just that.
What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation
, 1996
"... One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error, and a prespecified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with r ..."
Abstract

Cited by 23 (2 self)
One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error, and a prespecified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with respect to a) the complexity of the required function approximation, b) the size of the network in relation to the size required for an optimal solution, and c) the degree of noise in the training data. In general, for a) the solution found is worse when the function to be approximated is more complex, for b) oversized networks can result in lower training and generalization error in certain cases, and for c) the use of committee or ensemble techniques can be more beneficial as the level of noise in the training data is increased. For the experiments we performed, we do not obtain the optimal solution in any case. We further support the observation that larger networks can produce better training and generalization error using a face recognition example where a network with many more parameters than training points generalizes better than smaller networks.
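The abstract's point (c), that committee or ensemble techniques become more beneficial as training noise grows, follows from variance averaging. A minimal sketch, assuming committee members that share the true function but carry independent zero-mean noise (the target, noise level and committee size are all assumptions):

```python
import random

random.seed(1)

def member(x):
    # A committee member: the true function x**2 plus independent zero-mean
    # noise, an assumed stand-in for one trained network's error.
    return x ** 2 + random.gauss(0.0, 0.5)

xs = [i / 10 for i in range(-10, 11)]

def mse(predict):
    return sum((predict(x) - x ** 2) ** 2 for x in xs) / len(xs)

single_err = mse(member)

M = 25                                     # committee size (assumed)
committee_err = mse(lambda x: sum(member(x) for _ in range(M)) / M)
# Averaging M independent members divides the noise variance by M,
# so the committee's error shrinks as the noise level rises.
```

The gap between the two errors widens as the noise standard deviation is increased, which is the effect the experiments report.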
Lessons in Neural Network Training: Overfitting May be Harder than Expected
 In Proceedings of the Fourteenth National Conference on Artificial Intelligence, AAAI-97
, 1997
"... For many reasons, neural networks have become very popular AI machine learning models. Two of the most important aspects of machine learning models are how well the model generalizes to unseen data, and how well the model scales with problem complexity. Using a controlled task with known optimal tra ..."
Abstract

Cited by 19 (1 self)
For many reasons, neural networks have become very popular AI machine learning models. Two of the most important aspects of machine learning models are how well the model generalizes to unseen data, and how well the model scales with problem complexity. Using a controlled task with known optimal training error, we investigate the convergence of the backpropagation (BP) algorithm. We find that the optimal solution is typically not found. Furthermore, we observe that networks larger than might be expected can result in lower training and generalization error. This result is supported by another real-world example. We further investigate the training behavior by analyzing the weights in trained networks (excess degrees of freedom are seen to do little harm and to aid convergence), and contrasting the interpolation characteristics of multilayer perceptron neural networks (MLPs) and polynomial models (overfitting behavior is very different: the MLP is often biased towards smoother soluti...
Pattern Recognition via Neural Networks
, 1996
"... Pattern Recognition has had a long history within electrical enginering but has recently become much more widespread as the automated capture of signals and images has become cheaper. Very many of the application of Neural Networks are to classification, and so are within the field of pattern recogn ..."
Abstract

Cited by 1 (0 self)
Pattern Recognition has had a long history within electrical engineering but has recently become much more widespread as the automated capture of signals and images has become cheaper. Very many of the applications of Neural Networks are to classification, and so are within the field of pattern recognition. In this chapter we explore how neural networks fit into the earlier framework of pattern recognition, and show by some examples that this framework can help us to make better use of neural networks for classification.
Can Statistical Theory Help Us Use Neural Networks Better?
 Interface 97. 29th Symposium of the Interface: Computing Science and Statistics
, 1997
"... If we view neural nets as a class of statistical models with highdimensional parameters, we can consider how to apply the ideas of statistical theory, in particular ideas for model choice and the concepts of predictive Bayesian inference. It turns out that these ideas give considerable insight, and ..."
Abstract

Cited by 1 (0 self)
If we view neural nets as a class of statistical models with high-dimensional parameters, we can consider how to apply the ideas of statistical theory, in particular ideas for model choice and the concepts of predictive Bayesian inference. It turns out that these ideas give considerable insight, and enable us to find more powerful solutions with reduced computational load. This is illustrated by two case studies.
1 Types of Neural Network
One type of neural network predominates, known variously as a multilayer perceptron, a backpropagation network or a feedforward network (FFNN). The other methods that seem to be used at all seriously are radial basis function networks (RBFs) and Kohonen's Self-Organizing Maps. RBFs are used in very similar ways to FFNNs, whereas SOMs are a variant of cluster analysis or multidimensional scaling. Ripley (1996) provides a wide-ranging overview of the area; more of the background philosophy may be found in Ripley (1993) and standard neural network te...
On Function Recovery by Neural Networks Based on Orthogonal Expansions
"... this paper is to discuss neural networks, which are based on orthogonal expansions (OEnets) of unknown functions. Due to the results of Donoho and Johnstone [7] one can look at Snets and RBFnets in an unifying manner, using orthogonal expansions as a mathematical tool. It is also clear that Pnet ..."
Abstract

Cited by 1 (0 self)
this paper is to discuss neural networks which are based on orthogonal expansions (OE-nets) of unknown functions. Due to the results of Donoho and Johnstone [7] one can look at S-nets and RBF-nets in a unifying manner, using orthogonal expansions as a mathematical tool. It is also clear that P-nets can be analyzed by orthogonal expansions, while W-nets fall directly into this class. In this context, it seems desirable to consider a net architecture which directly reflects orthogonal expansions. We should add that the net architecture proposed here is not intended to mimic any biological neural net. It can be treated as a tool for analyzing other networks which are closer to biological counterparts. On the other hand, we hope that the proposed net structure can be hardwired in the future, providing a useful tool for engineering applications. The second reason for which we propose to construct OE-nets lies in the well-known difficulties in learning S- and RBF-nets, which manifest themselves in a training process that usually needs hundreds of epochs. In contrast, in learning OE-nets one can use classical results from the theory of least squares. We put emphasis on a fast and reliable learning process, since in engineering applications the only reason for constructing specialized net hardware for function approximation is when an unknown function changes from time to time and one needs a fast update of its current approximation. The paper is organized as follows. In the next section we formulate a number of questions and requirements which should influence our decision concerning a proper choice of a functional net. Then, in Section 3 the problem of constructing a net based on orthogonal expansions is stated and the net architecture is discussed. In Section 4 we consider the learnin...
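The abstract's claim that OE-nets can be learned with classical least-squares results, rather than hundreds of training epochs, can be illustrated with a cosine basis, which is orthogonal on [0, π]: the "weights" are expansion coefficients computed in a single pass of inner products. The target function and truncation order are assumptions made for the sketch:

```python
import math

# Illustrative target (an assumption; the paper's functions are unspecified).
def f(x):
    return x * (math.pi - x)

N = 400                                    # quadrature points on [0, pi]
xs = [(i + 0.5) * math.pi / N for i in range(N)]

# Coefficients of f(x) ~ a[0]/2 + sum_k a[k] cos(k x), obtained directly
# from discretised inner products -- no iterative training involved.
K = 8
a = [2.0 / math.pi * sum(f(x) * math.cos(k * x) for x in xs) * (math.pi / N)
     for k in range(K + 1)]

def oe_net(x):
    """The 'OE-net' output: a weighted sum of orthogonal basis functions."""
    return a[0] / 2 + sum(a[k] * math.cos(k * x) for k in range(1, K + 1))

err = max(abs(f(x) - oe_net(x)) for x in xs)
```

Because the basis is orthogonal, each coefficient is found independently; refitting after the target changes costs one pass over the data, which is the fast-update property the abstract emphasizes.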
Regularisation Theory Applied to Neurofuzzy Modelling
"... A desirable property of any empirical model is the ability to generalise well throughout the models input space. Recent work has seen the development of neurofuzzy model construction algorithms which identify neurofuzzy models from available empirical data and expert knowledge. By matching the model ..."
Abstract
A desirable property of any empirical model is the ability to generalise well throughout the model's input space. Recent work has seen the development of neurofuzzy model construction algorithms which identify neurofuzzy models from available empirical data and expert knowledge. By matching the model's structure to the underlying process represented by the data, parsimonious models are produced. Such parsimonious models do generalise better, but, owing to the structural symmetry enforced in these models by the need for model transparency, and the often sparse distribution of real data, they are still prone to poor generalisation. This report reviews and develops regularisation techniques that can be applied to identified neurofuzzy models to aid their ability to generalise. Essentially, regularisation places a prior probability distribution on the weight values, which consequently constrains the model output. One of the major overheads encountered when performing regulari...
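The report's core idea, a prior on the weights acting as a penalty that constrains the output, can be shown on a toy linear model trained by gradient descent with weight decay (the quadratic penalty corresponding to a zero-mean Gaussian prior). The data, noise level and penalty strength are assumptions for the sketch, not values from the report:

```python
import random

random.seed(0)
# Noisy samples of y = 2x + 1 (assumed toy data).
data = [(x / 10, 2 * (x / 10) + 1 + random.gauss(0.0, 0.3))
        for x in range(21)]

def fit(lam, steps=2000, lr=0.05):
    """Minimise mean squared error plus lam/2 * w**2,
    i.e. impose a zero-mean Gaussian prior on the weight w."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        gw = sum((w * x + b - y) * x for x, y in data) / n + lam * w
        gb = sum((w * x + b - y) for x, y in data) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

w_plain, _ = fit(0.0)      # unregularised fit recovers a slope near 2
w_reg, _ = fit(5.0)        # a heavy prior shrinks the weight towards zero
```

In the neurofuzzy setting the same penalty acts on many basis-function weights at once, smoothing the model in sparsely sampled regions of the input space.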