Results 1 - 6 of 6
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
, 1989
Abstract

Cited by 413 (4 self)
The exact form of a gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal supervised learning tasks. These algorithms have: (1) the advantage that they do not require a precisely defined training interval, operating while the network runs; and (2) the disadvantage that they require nonlocal communication in the network being trained and are computationally expensive. These algorithms are shown to allow networks having recurrent connections to learn complex tasks requiring the retention of information over time periods having either fixed or indefinite length.
1 Introduction
A major problem in connectionist theory is to develop learning algorithms that can tap the full computational power of neural networks. Much progress has been made with feedforward networks, and attention has recently turned to developing algorithms for networks with recurrent connections, wh...
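The gradient-following algorithm this abstract describes (real-time recurrent learning, RTRL) can be pictured as carrying a sensitivity tensor P[k, i, j] = dy_k/dW[i, j] forward alongside the network state, so gradients are available while the network runs. A minimal NumPy sketch under assumed tanh units; the shapes and names are illustrative, not code from the paper:

```python
import numpy as np

def rtrl_step(W, y, x, P):
    """One forward step of a fully recurrent net plus the RTRL
    sensitivity update P[k, i, j] = d y_k / d W[i, j].
    Illustrative sketch: tanh units, bias folded into the input."""
    n = y.size
    z = np.concatenate([y, x, [1.0]])   # recurrent state, input, bias
    net = W @ z
    y_new = np.tanh(net)
    fprime = 1.0 - y_new ** 2           # tanh'
    # dy_k/dW_ij flows through the recurrent weights, plus the
    # direct term delta_{ki} * z_j for the weight being perturbed
    direct = np.zeros_like(P)
    direct[np.arange(n), np.arange(n), :] = z
    P_new = fprime[:, None, None] * (np.tensordot(W[:, :n], P, axes=1) + direct)
    return y_new, P_new

# toy usage: 3 recurrent units, 2 inputs
rng = np.random.default_rng(0)
n, m = 3, 2
W = 0.1 * rng.standard_normal((n, n + m + 1))
y = np.zeros(n)
P = np.zeros((n, n, n + m + 1))
for t in range(5):
    x = rng.standard_normal(m)
    y, P = rtrl_step(W, y, x, P)
# P now holds the gradient of every unit's activity w.r.t. every
# weight, updated online with no fixed training interval
```

Storing and updating P costs O(n^3 (n + m)) per step, which is the nonlocal, computationally expensive bookkeeping the abstract warns about.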
Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity
, 1995
Abstract

Cited by 115 (4 self)
Introduction
1.1 Learning in Recurrent Networks
Connectionist networks having feedback connections are interesting for a number of reasons. Biological neural networks are highly recurrently connected, and many authors have studied recurrent network models of various types of perceptual and memory processes. The general property making such networks interesting and potentially useful is that they manifest highly nonlinear dynamical behavior. One such type of dynamical behavior that has received much attention is that of settling to a fixed stable state, but probably of greater importance both biologically and from an engineering viewpoint are time-varying behaviors. Here we consider algorithms for training recurrent networks to perform temporal supervised learning tasks, in which the specification of desired behavior is in the form of specific examples of input and desired output trajectories. One example of such a task is sequence classification, where...
Neural Net Architectures for Temporal Sequence Processing
, 1994
Abstract

Cited by 106 (0 self)
I present a general taxonomy of neural net architectures for processing time-varying patterns. This taxonomy subsumes many existing architectures in the literature, and points to several promising architectures that have yet to be examined. Any architecture that processes time-varying patterns requires two conceptually distinct components: a short-term memory that holds on to relevant past events and an associator that uses the short-term memory to classify or predict. My taxonomy is based on a characterization of short-term memory models along the dimensions of form, content, and adaptability. Experiments on predicting future values of a financial time series (US dollar-Swiss franc exchange rates) are presented using several alternative memory models. The results of these experiments serve as a baseline against which more sophisticated architectures can be compared. Neural networks have proven to be a promising alternative to traditional techniques for nonlinear temporal prediction t...
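The memory-plus-associator decomposition in this abstract can be illustrated with the simplest point in such a taxonomy: a fixed tapped delay line as the short-term memory, feeding a linear least-squares associator that predicts the next value. A hedged sketch; the series, depth, and model below are illustrative, not the paper's experiments:

```python
import numpy as np

def delay_line_features(series, depth):
    """Short-term memory as a tapped delay line: the feature vector at
    time t is simply the last `depth` values.  This is a fixed,
    non-adaptive memory in the form/content/adaptability sense."""
    X = np.array([series[t - depth:t] for t in range(depth, len(series))])
    y = np.array(series[depth:])
    return X, y

# associator: ordinary least squares predicting the next value of a
# toy deterministic series (a sinusoid stands in for real data)
series = np.sin(0.3 * np.arange(200))
X, y = delay_line_features(series, depth=4)
A = np.c_[X, np.ones(len(X))]          # add a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ w                            # one-step-ahead predictions
```

Swapping in exponential-trace or adaptive (gamma-style) memories changes only `delay_line_features`, which is exactly the kind of comparison the taxonomy organizes.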
Training Recurrent Networks Using the Extended Kalman Filter
 In Proceedings of the International Joint Conference on Neural Networks
, 1992
Abstract

Cited by 23 (0 self)
The extended Kalman filter (EKF) can be used as an online algorithm to determine the weights in a recurrent network given target outputs as it runs. This paper notes some relationships between the EKF as applied to recurrent net learning and some simpler techniques that are more widely used. In particular, making certain simplifications to the EKF gives rise to an algorithm essentially identical to the real-time recurrent learning (RTRL) algorithm. Since the EKF involves adjusting unit activity in the network, it also provides a principled generalization of the teacher forcing technique. Preliminary simulation experiments on simple finite-state Boolean tasks indicate that the EKF can provide substantial speedup in number of time steps required for training on such problems when compared with simpler online gradient algorithms. The computational requirements of the EKF are steep, but turn out to scale with network size at the same rate as RTRL. These observations are intended to provid...
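One way to picture the EKF-as-learning-algorithm idea in this abstract: treat the weight vector as the state to be estimated and the network output as the measurement. A minimal weight-only sketch with illustrative names; the paper's version also augments the state with the unit activities (which is where the teacher-forcing generalization comes from), and a linear model is used below only so the update is easy to check:

```python
import numpy as np

def ekf_update(w, P, H, err, R):
    """One EKF measurement update treating the weights as the state.
    H is the Jacobian of the network output w.r.t. the weights at the
    current operating point; err is target minus output."""
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    w_new = w + K @ err               # weight correction
    P_new = P - K @ H @ P             # covariance shrink
    return w_new, P_new

# toy usage: for a linear "network" y = w . x this reduces to
# recursive least squares -- the linear analogue of the abstract's
# point that simplifying the EKF recovers familiar algorithms
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
w = np.zeros(2)
P = 10.0 * np.eye(2)                  # initial weight uncertainty
R = 1e-6 * np.eye(1)                  # assumed measurement noise
for _ in range(50):
    x = rng.standard_normal(2)
    H = x[None, :]                    # Jacobian of w . x w.r.t. w
    err = np.array([w_true @ x - w @ x])
    w, P = ekf_update(w, P, H, err, R)
```

For a recurrent net, H would be the RTRL-style sensitivity of the outputs with respect to the weights, which is why the two algorithms share both data structures and per-step cost.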
Adaptive State Representation and Estimation Using Recurrent Connectionist Networks
 In Miller, Sutton, and Werbos (eds.), Neural Networks for Control
, 1990
Abstract

Cited by 9 (1 self)
Introduction
The purpose of this chapter is to provide an introductory overview of some of the current research efforts directed toward adapting the weights in connectionist networks having feedback connections. While much of the recent emphasis in the field has been on multilayer networks having no such feedback connections, it is likely that the use of recurrently connected networks will be of particular importance for applications to the control of dynamical systems. Following the approach taken in the previous chapter by Andy Barto, this chapter will emphasize the relationship of connectionist research in this area to strategies used in more conventional engineering circles for modelling and controlling dynamical systems, while at the same time noting what there is in the connectionist approach that is novel. In particular, I will argue that while much of the connectionist approach to adapting the weights in recurrent networks having interesting dynamics rests on the same...
Some Observations on the Use of the Extended Kalman Filter as a Recurrent Network Learning Algorithm
, 1992
Abstract

Cited by 4 (1 self)
The extended Kalman filter (EKF) can be used as an online algorithm to determine the weights in a recurrent network given target outputs as it runs. This involves forming an augmented network state vector consisting of all unit activities and weights. This report notes some relationships between the EKF as applied to recurrent net learning and some simpler techniques that are more widely used. In particular, it is shown that making certain simplifications to the EKF gives rise to an algorithm essentially identical to the real-time recurrent learning (RTRL) algorithm. That is, the resulting algorithm both maintains the RTRL data structure and prescribes identical weight changes. In addition, because the EKF also involves adjusting unit activity in the network, it provides a principled generalization of the useful "teacher forcing" technique. Very preliminary experiments on simple finite-state Boolean tasks indicate that the EKF works well for these, generally giving substantial speedup...