Results 1 - 10 of 16
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
, 1989
Cited by 413 (4 self)
The exact form of a gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal supervised learning tasks. These algorithms have: (1) the advantage that they do not require a precisely defined training interval, operating while the network runs; and (2) the disadvantage that they require nonlocal communication in the network being trained and are computationally expensive. These algorithms are shown to allow networks having recurrent connections to learn complex tasks requiring the retention of information over time periods having either fixed or indefinite length.
1 Introduction
A major problem in connectionist theory is to develop learning algorithms that can tap the full computational power of neural networks. Much progress has been made with feedforward networks, and attention has recently turned to developing algorithms for networks with recurrent connections, wh...
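A minimal sketch of the kind of gradient computation this abstract describes, assuming a tanh network in which every unit sees all unit outputs, the external inputs, and a bias. The function name `rtrl_gradient`, the data layout, and the squared-error loss are illustrative assumptions, not taken from the paper. Sensitivities are carried forward in time alongside the network state, so no training interval has to be fixed in advance; the nested loops over units and weights suggest why the approach is nonlocal and computationally expensive.

```python
import math


def rtrl_gradient(W, inputs, targets):
    """Forward-propagated gradient for a small fully recurrent network.

    W       : n rows, each of length n + m + 1 (unit outputs, inputs, bias).
    inputs  : list of length-m lists, one per time step.
    targets : list of length-n lists; None marks "no target at this step".
    Returns (loss, grad) with grad[i][j] = d loss / d W[i][j].
    All names and the layout are hypothetical choices for this sketch.
    """
    n = len(W)
    m = len(inputs[0])
    cols = n + m + 1
    y = [0.0] * n
    # p[k][i][j] holds d y_k / d W[i][j] at the current time step.
    p = [[[0.0] * cols for _ in range(n)] for _ in range(n)]
    grad = [[0.0] * cols for _ in range(n)]
    loss = 0.0
    for x, d in zip(inputs, targets):
        z = y + list(x) + [1.0]  # what every unit sees at this step
        s = [sum(W[k][l] * z[l] for l in range(cols)) for k in range(n)]
        y_new = [math.tanh(sk) for sk in s]
        p_new = [[[0.0] * cols for _ in range(n)] for _ in range(n)]
        for k in range(n):
            fprime = 1.0 - y_new[k] ** 2  # derivative of tanh
            for i in range(n):
                for j in range(cols):
                    # Influence through the recurrent connections ...
                    rec = sum(W[k][l] * p[l][i][j] for l in range(n))
                    # ... plus the direct effect when W[i][j] feeds unit k.
                    direct = z[j] if i == k else 0.0
                    p_new[k][i][j] = fprime * (rec + direct)
        y, p = y_new, p_new
        for k in range(n):
            if d[k] is not None:
                e = d[k] - y[k]
                loss += 0.5 * e * e
                for i in range(n):
                    for j in range(cols):
                        grad[i][j] -= e * p[k][i][j]
    return loss, grad
```

With the weights held fixed over the sequence, this accumulates the gradient of the summed squared error. An online variant would instead apply a small weight change at every time step while the network keeps running.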
Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity
, 1995
Cited by 115 (4 self)
Introduction 1.1 Learning in Recurrent Networks Connectionist networks having feedback connections are interesting for a number of reasons. Biological neural networks are highly recurrently connected, and many authors have studied recurrent network models of various types of perceptual and memory processes. The general property making such networks interesting and potentially useful is that they manifest highly nonlinear dynamical behavior. One such type of dynamical behavior that has received much attention is that of settling to a fixed stable state, but probably of greater importance both biologically and from an engineering viewpoint are time-varying behaviors. Here we consider algorithms for training recurrent networks to perform temporal supervised learning tasks, in which the specification of desired behavior is in the form of specific examples of input and desired output trajectories. One example of such a task is sequence classification, where
Generalization Performance Of Backpropagation Learning On A Syllabification Task
 ENSCHEDE. TWENTE UNIVERSITY
, 1992
Cited by 67 (43 self)
We investigated the generalization capabilities of backpropagation learning in feedforward and recurrent feedforward connectionist networks on the assignment of syllable boundaries to orthographic representations in Dutch (hyphenation). This is a difficult task because phonological and morphological constraints interact, leading to ambiguity in the input patterns. We compared the results to different symbolic pattern matching approaches, and to an exemplar-based generalization scheme, related to a k-nearest neighbour approach, but using a similarity metric weighted by the relative information entropy of positions in the training patterns. Our results indicate that the generalization performance of backpropagation learning for this task is not better than that of the best symbolic pattern matching approaches, and of exemplar-based generalization.
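One possible reading of the exemplar-based scheme mentioned above can be sketched as follows. The abstract does not give the metric, so this assumes that each pattern position is weighted by its information gain with respect to the class label and that distance is a weighted count of mismatching positions. The names `position_weights` and `classify` and the toy patterns in the usage note are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def position_weights(patterns, labels):
    """Information gain of each position (an assumed weighting scheme)."""
    base = entropy(labels)
    weights = []
    for pos in range(len(patterns[0])):
        groups = defaultdict(list)
        for pattern, label in zip(patterns, labels):
            groups[pattern[pos]].append(label)
        # Expected label entropy once the symbol at this position is known.
        remainder = sum(len(g) / len(labels) * entropy(g)
                        for g in groups.values())
        weights.append(base - remainder)
    return weights


def classify(query, patterns, labels, weights, k=1):
    """Majority label among the k stored patterns closest to the query."""
    def distance(pattern):
        # Sum the weights of the positions where the symbols differ.
        return sum(w for w, a, b in zip(weights, pattern, query) if a != b)

    nearest = sorted(range(len(patterns)),
                     key=lambda i: distance(patterns[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]
```

For example, with the stored patterns `["aba", "cbc", "aca", "ccc"]` and labels `[1, 1, 0, 0]`, only the middle position predicts the label, so it receives all of the weight and mismatches elsewhere are ignored when a query is classified.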
Rethinking Eliminative Connectionism
, 1998
Cited by 65 (4 self)
Humans routinely generalize universal relationships to unfamiliar instances. If we are told "if glork then frum," and "glork," we can infer "frum"; any name that serves as the subject of a sentence can appear as the object of a sentence. These universals are pervasive in language and reasoning. One account of how they are generalized holds that humans possess mechanisms that manipulate symbols and variables; an alternative account holds that symbol-manipulation can be eliminated from scientific theories in favor of descriptions couched in terms of networks of interconnected nodes. Can these "eliminative" connectionist models offer a genuine alternative? This article shows that eliminative connectionist models cannot account for how we extend universals to arbitrary items. The argument runs as follows. First, if these models, as currently conceived, were to extend universals to arbitrary instances, they would have to generalize outside the space of training examples. Next, it is shown that the class of eliminative connectionist models that is currently popular cannot learn to extend universals outside the training space. This limitation might be avoided through the use of an architecture that implements symbol manipulation.
Local feedback multilayered networks
 Neural Computation
, 1992
Cited by 40 (11 self)
In this paper, we investigate the capabilities of Local Feedback MultiLayered Networks, a particular class of recurrent networks, in which feedback connections are only allowed from neurons to themselves. In this class, learning can be accomplished by an algorithm which is local in both space and time. We describe the limits and properties of these networks and give some insights on their use for solving practical problems.
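To illustrate what "local in both space and time" can mean, here is a minimal sketch for a single unit whose only feedback is from its own previous output. The update rule `a(t) = w*x(t) + v*y(t-1)`, `y(t) = tanh(a(t))`, the squared-error loss, and the name `local_feedback_gradient` are assumptions made for this example; the architecture in the paper may differ. The gradient recursion uses only quantities the unit itself held one step earlier.

```python
import math


def local_feedback_gradient(w, v, xs, targets):
    """Loss and gradient for one self-recurrent tanh unit (hypothetical setup).

    w : weight on the external input x(t)
    v : weight on the unit's own previous output y(t-1)
    Returns (loss, d loss / d w, d loss / d v).
    """
    y = 0.0       # y(t-1), starting from rest
    dy_dw = 0.0   # d y(t-1) / d w
    dy_dv = 0.0   # d y(t-1) / d v
    loss = grad_w = grad_v = 0.0
    for x, d in zip(xs, targets):
        y_new = math.tanh(w * x + v * y)
        fprime = 1.0 - y_new ** 2
        # Each derivative depends only on this unit's own previous values.
        dy_dw = fprime * (x + v * dy_dw)
        dy_dv = fprime * (y + v * dy_dv)
        y = y_new
        e = y - d
        loss += 0.5 * e * e
        grad_w += e * dy_dw
        grad_v += e * dy_dv
    return loss, grad_w, grad_v
```

Because the unit receives no feedback from other units, it stores one derivative per weight rather than a sensitivity for every weight in the network.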
Learning Sequential Tasks by Incrementally Adding Higher Orders
 Advances in Neural Information Processing Systems 5
, 1993
Cited by 28 (6 self)
An incremental, higher-order, non-recurrent network combines two properties found to be useful for learning sequential tasks: higher-order connections and incremental introduction of new units. The network adds higher orders when needed by adding new units that dynamically modify connection weights. Since the new units modify the weights at the next time-step with information from the previous step, temporal tasks can be learned without the use of feedback, thereby greatly simplifying training. Furthermore, a theoretically unlimited number of units can be added to reach into the arbitrarily distant past. Experiments with the Reber grammar have demonstrated speedups of two orders of magnitude over recurrent networks.
1 INTRODUCTION
Second-order recurrent networks have proven to be very powerful [8], especially when trained using complete back propagation through time [1, 6, 14]. It has also been demonstrated by Fahlman that a recurrent network that incrementally adds nodes during traini...
Challenging the widespread assumption that connectionism and distributed representations go hand-in-hand
 COGNITIVE PSYCHOLOGY
, 2002
Adaptive State Representation and Estimation Using Recurrent Connectionist Networks
 Miller, Sutton, Werbos, NN for Control
, 1990
Cited by 9 (1 self)
Introduction The purpose of this chapter is to provide an introductory overview of some of the current research efforts directed toward adapting the weights in connectionist networks having feedback connections. While much of the recent emphasis in the field has been on multilayer networks having no such feedback connections, it is likely that the use of recurrently connected networks will be of particular importance for applications to the control of dynamical systems. Following the approach taken in the previous chapter by Andy Barto, this chapter will emphasize the relationship of connectionist research in this area to strategies used in more conventional engineering circles for modelling and controlling dynamical systems, while at the same time noting what there is in the connectionist approach that is novel. In particular, I will argue that while much of the connectionist approach to adapting the weights in recurrent networks having interesting dynamics rests on the same
Syntactic Category Formation with Vector Space Grammars
 In Proceedings from the Thirteenth Annual Conference of the Cognitive Science Society (pp. 908-912)
, 1991
Cited by 7 (0 self)
A method for deriving phrase structure categories from structured samples of a context-free language is presented. The learning algorithm is based on adaptation and competition, as well as error backpropagation in a continuous vector space. These connectionist-style techniques become applicable to grammars as the traditional grammar formalism is generalized to use vectors instead of symbols as category labels. More generally, it is argued that the conversion of symbolic formalisms to continuous representations is a promising way of combining the connectionist learning techniques with the structures and theoretical insights embodied in classical models.
TRAINREC: A System for Training Feedforward & Simple Recurrent Networks Efficiently and Correctly
, 1993
Cited by 5 (4 self)
TRAINREC is a system for training feedforward and recurrent neural networks that incorporates several ideas. It uses the conjugate-gradient method which is demonstrably more efficient than traditional backward error propagation. We assume epoch-based training and derive a new error function having several desirable properties absent from the traditional sum-of-squared-error function. We argue for skip (shortcut) connections where appropriate and the preference for a sigmoidal yielding values over the [-1,1] interval. The input feature space is often overanalyzed, but by using singular value decomposition, input patterns can be conditioned for better learning often with a reduced number of input units. Recurrent networks, in their most general form, require special handling and cannot be simply a rewiring of the architecture without a corresponding revision of the derivative calculations. There is a careful balance required among the network architecture (specifically, hidden and feed...