Results 11-20 of 149
Finding the Embedding Dimension and Variable Dependencies in Time Series
, 1994
Cited by 33 (3 self)
We present a general method, the δ-test, which establishes functional dependencies given a sequence of measurements. The approach is based on calculating conditional probabilities from vector component distances. Imposing the requirement of continuity of the underlying function, the obtained values of the conditional probabilities carry information on the embedding dimension and variable dependencies. The power of the method is illustrated on synthetic time series with different time-lag dependencies and noise levels and on the sunspot data. The virtue of the method for preprocessing data in the context of feedforward neural networks is demonstrated. Also, its applicability for tracking residual errors in output units is stressed.
Author emails: pihong@thep.lu.se, carsten@thep.lu.se
Introduction: The behaviour of a dynamical system is often modeled by analyzing a time series record of certain system variables. Using artificial neural networks (ANN) to model such systems has recently attr...
Accelerated Learning By Active Example Selection
 International Journal of Neural Systems
, 1994
Cited by 32 (10 self)
Much previous work on training multilayer neural networks has attempted to speed up the backpropagation algorithm using more sophisticated weight modification rules, whereby all the given training examples are used in a random or predetermined sequence. In this paper we investigate an alternative approach in which the learning proceeds on an increasing number of selected training examples, starting with a small training set. We derive a measure of criticality of examples and present an incremental learning algorithm that uses this measure to select a critical subset of given examples for solving the particular task. Our experimental results suggest that the method can significantly improve training speed and generalization performance in many real applications of neural networks. This method can be used in conjunction with other variations of gradient descent algorithms.
1 Introduction: One of the most widely used methods for training multilayer feedforward neural networks is the erro...
Optimal Ensemble Averaging of Neural Networks
 Network
, 1997
Cited by 30 (4 self)
Based on an observation about the different effect of ensemble averaging on the bias and variance portion of the prediction error, we discuss training methodologies for ensembles of networks. We demonstrate the effect of variance reduction and present a method of extrapolation to the limit of an infinite ensemble. A significant reduction of variance is obtained by averaging just over initial conditions of the neural networks, without varying architectures or training sets. The minimum of the ensemble prediction error is reached later than that of a single network. In the vicinity of the minimum, the ensemble prediction error appears to be flatter than that of the single network, thus simplifying optimal stopping decision. The results are demonstrated on the sunspots data, where the predictions are among the best obtained, and on the 1993 energy prediction competition dataset B.
1 Introduction: In recent years, the use of artificial neural networks (NN) for time series prediction has g...
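The abstract's claim that ensemble averaging acts on the variance part of the error but not on the bias can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not the paper's method: each "network" is modelled as the true value plus a shared bias plus independent Gaussian noise, and the names `simulate`, `bias` and `noise_sd` are hypothetical. Real networks trained from different initial conditions produce correlated errors, so the reduction would be smaller in practice.

```python
import random
import statistics


def simulate(n_members, n_trials=2000, bias=0.5, noise_sd=1.0, seed=0):
    """Return (estimated bias, variance) of an averaged toy ensemble.

    Assumption: every member predicts target + bias + independent noise.
    """
    rng = random.Random(seed)
    target = 0.0
    ensemble_predictions = []
    for _ in range(n_trials):
        # One prediction per member; the ensemble output is their mean.
        members = [target + bias + rng.gauss(0.0, noise_sd)
                   for _ in range(n_members)]
        ensemble_predictions.append(sum(members) / n_members)
    estimated_bias = statistics.mean(ensemble_predictions) - target
    variance = statistics.pvariance(ensemble_predictions)
    return estimated_bias, variance


single_bias, single_var = simulate(n_members=1)
ens_bias, ens_var = simulate(n_members=10)

print(f"single network: bias={single_bias:.3f}, variance={single_var:.3f}")
print(f"ensemble of 10: bias={ens_bias:.3f}, variance={ens_var:.3f}")
```

In this idealised setting the variance shrinks roughly in proportion to the number of members while the bias stays where it was, which is why averaging alone cannot repair a systematically wrong model.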
Connectionist theory refinement: Genetically searching the space of network topologies
 Journal of Artificial Intelligence Research
, 1997
Cited by 29 (1 self)
An algorithm that learns from a set of examples should ideally be able to exploit the available resources of (a) abundant computing power and (b) domain-specific knowledge to improve its ability to generalize. Connectionist theory-refinement systems, which use background knowledge to select a neural network's topology and initial weights, have proven to be effective at exploiting domain-specific knowledge; however, most do not exploit available computing power. This weakness occurs because they lack the ability to refine the topology of the neural networks they produce, thereby limiting generalization, especially when given impoverished domain theories. We present the REGENT algorithm, which uses (a) domain-specific knowledge to help create an initial population of knowledge-based neural networks and (b) genetic operators of crossover and mutation (specifically designed for knowledge-based networks) to continually search for better network topologies. Experiments on three real-world domains indicate that our new algorithm is able to significantly increase generalization compared to a standard connectionist theory-refinement system, as well as our previous algorithm for growing knowledge-based networks.
An iterative pruning algorithm for feedforward neural networks
 IEEE Trans. Neural Networks
, 1997
Cited by 28 (0 self)
The problem of determining the proper size of an artificial neural network is recognized to be crucial, especially for its practical implications in such important issues as learning and generalization. One popular approach tackling this problem is commonly known as pruning and consists of training a larger than necessary network and then removing unnecessary weights/nodes. In this paper, a new pruning method is developed, based on the idea of iteratively eliminating units and adjusting the remaining weights in such a way that the network performance does not worsen over the entire training set. The pruning problem is formulated in terms of solving a system of linear equations, and a very efficient conjugate gradient algorithm is used for solving it, in the least-squares sense. The algorithm also provides a simple criterion for choosing the units to be removed, which has proved to work well in practice. The results obtained over various test problems demonstrate the effectiveness of the proposed approach.
Index Terms: Feedforward neural networks, generalization, hidden neurons, iterative methods, least-squares methods, network pruning, pattern recognition, structure simplification.
Cost-Sensitive Learning with Neural Networks
 Proceedings of the 13th European Conference on Artificial Intelligence (ECAI-98)
, 1998
Cited by 27 (1 self)
In the usual setting of Machine Learning, classifiers are typically evaluated by estimating their error rate (or equivalently, the classification accuracy) on the test data. However, this makes sense only if all errors have equal (uniform) costs. When the costs of errors differ between each other, the classifiers should be evaluated by comparing the total costs of the errors.
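The difference between ranking classifiers by error rate and by total cost can be seen in a small worked example. The confusion matrices and the cost matrix below are invented purely for illustration (they are not taken from the paper), and the helper names `error_rate` and `total_cost` are hypothetical. Rows are indexed by the true class and columns by the predicted class.

```python
def error_rate(confusion):
    """Fraction of misclassified examples (off-diagonal counts / all counts)."""
    total = sum(sum(row) for row in confusion)
    wrong = sum(count
                for i, row in enumerate(confusion)
                for j, count in enumerate(row)
                if i != j)
    return wrong / total


def total_cost(confusion, cost):
    """Sum of counts weighted by cost[true_class][predicted_class]."""
    return sum(count * cost[i][j]
               for i, row in enumerate(confusion)
               for j, count in enumerate(row))


# Assumed costs: missing class 1 is ten times worse than a false alarm.
cost = [[0, 1],
        [10, 0]]

# Made-up confusion matrices for two classifiers on 200 test examples.
clf_a = [[90, 10],
         [5, 95]]
clf_b = [[80, 20],
         [1, 99]]

for name, confusion in (("A", clf_a), ("B", clf_b)):
    print(name, error_rate(confusion), total_cost(confusion, cost))
```

With these numbers classifier A makes fewer errors overall, yet classifier B incurs the lower total cost, so the two criteria rank the classifiers in opposite order.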
The Maintenance of Uncertainty
 in Control Systems
, 1997
Cited by 27 (6 self)
It is important to remain uncertain, of observation, model and law. For the Fermi Summer School. Criticisms requested; email: lenny@maths.ox.ac.uk
Auto‐teaching: Networks that Develop their own Teaching Input
 In
, 1993
Cited by 27 (7 self)
Backpropagation learning (Rumelhart, Hinton and Williams, 1986) is a useful research tool but it has a number of undesirable features such as having the experimenter decide from outside what should be learned. We describe a number of simulations of neural networks that internally generate their own teaching input. The networks generate the teaching input by transforming the network input through connection weights that are evolved using a form of genetic algorithm. What results is an innate (evolved) capacity not to behave efficiently in an environment but to learn to behave efficiently. The analysis of what these networks evolve to learn shows some interesting results.
Computing Second Derivatives in Feed-Forward Networks: a Review
 IEEE Transactions on Neural Networks
, 1994
Cited by 27 (4 self)
The calculation of second derivatives is required by recent training and analysis techniques of connectionist networks, such as the elimination of superfluous weights, and the estimation of confidence intervals both for weights and network outputs. We here review and develop exact and approximate algorithms for calculating second derivatives. For networks with |w| weights, simply writing the full matrix of second derivatives requires O(|w|^2) operations. For networks of radial basis units or sigmoid units, exact calculation of the necessary intermediate terms requires of the order of 2h + 2 backward/forward-propagation passes where h is the number of hidden units in the network. We also review and compare three approximations (ignoring some components of the second derivative, numerical differentiation, and scoring). Our algorithms apply to arbitrary activation functions, networks, and error functions (for instance, with connections that skip layers, or radial basis functions, or ...
Operating regime based process modeling and identification (Ph.D. thesis)
Cited by 26 (12 self)
the Department of Engineering Cybernetics, who has been of great inspiration and support. Thanks. Moreover, I would like to thank Prof. Petros Ioannou at the University of Southern California for hosting my six-month visit at USC. My interactions with him and his students improved my mathematical precision and resulted in some adaptive control results that are partially reported in this thesis. Two chapters in this thesis are based on manuscripts that are coauthored with Aage V. Sørensen at