Results 1–10 of 327
Self-improving reactive agents based on reinforcement learning, planning and teaching
 Machine Learning
, 1992
Cited by 275 (2 self)
Abstract. To date, reinforcement learning has mostly been studied solving simple learning tasks. Reinforcement learning methods that have been studied so far typically converge slowly. The purpose of this work is thus twofold: 1) to investigate the utility of reinforcement learning in solving much more complicated learning tasks than previously studied, and 2) to investigate methods that will speed up reinforcement learning. This paper compares eight reinforcement learning frameworks: adaptive heuristic critic (AHC) learning due to Sutton, Q-learning due to Watkins, and three extensions to both basic methods for speeding up learning. The three extensions are experience replay, learning action models for planning, and teaching. The frameworks were investigated using connectionism as an approach to generalization. To evaluate the performance of different frameworks, a dynamic environment was used as a testbed. The environment is moderately complex and nondeterministic. This paper describes these frameworks and algorithms in detail and presents empirical evaluation of the frameworks.
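The experience-replay extension named in this abstract can be sketched in a few lines: rather than learning only from the latest transition, the agent stores transitions in a buffer and repeatedly re-applies the Q-learning update to random past samples. The toy 5-state chain environment and all constants below are illustrative assumptions, not the paper's dynamic testbed.

```python
import random
from collections import defaultdict, deque

# Toy 5-state chain: states 0..4, actions 0 (left) / 1 (right);
# reward 1 only on reaching state 4, which ends the episode.
N, GAMMA, ALPHA, EPS = 5, 0.9, 0.5, 0.2

def env_step(s, a):
    s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

def train(episodes=150, batch=32, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)            # Q[(state, action)]
    buffer = deque(maxlen=1000)       # replay memory of past transitions
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < EPS:    # epsilon-greedy exploration
                a = rng.randrange(2)
            else:                     # greedy with random tie-breaking
                a = max((0, 1), key=lambda act: (Q[(s, act)], rng.random()))
            s2, r, done = env_step(s, a)
            buffer.append((s, a, r, s2, done))
            # experience replay: re-learn from a random batch of old transitions
            for bs, ba, br, bs2, bd in rng.sample(list(buffer),
                                                  min(batch, len(buffer))):
                target = br if bd else br + GAMMA * max(Q[(bs2, 0)], Q[(bs2, 1)])
                Q[(bs, ba)] += ALPHA * (target - Q[(bs, ba)])
            s = s2
    return Q
```

After training, the greedy policy moves right from every non-terminal state; replay lets each stored transition contribute many updates, which is the speed-up the abstract refers to.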
Learning Long-Term Dependencies with Gradient Descent is Difficult
 IEEE Transactions on Neural Networks (Special Issue on Recurrent Networks)
Cited by 256 (24 self)
Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered.
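The difficulty described here shows up even in a one-unit RNN: the gradient of the final state with respect to the initial state is a product of per-step Jacobian factors w·(1 − h_t²), which decays geometrically once those factors fall below 1 in magnitude. The constants below are arbitrary illustrative choices, not taken from the paper.

```python
import math

# Scalar RNN h_t = tanh(w*h_{t-1} + u*x_t) with constant input x_t = 1.
# d h_T / d h_0 is the product of per-step Jacobians w * (1 - h_t**2);
# when these factors are below 1 in magnitude, it shrinks geometrically.
def jacobian_product(w=0.9, u=0.5, steps=50, h0=0.0):
    h, g = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h + u)
        g *= w * (1.0 - h * h)   # one Jacobian factor per time step
    return g
```

Over 50 steps the product is vanishingly small, which is why error signals from long-range dependencies barely reach early time steps.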
The Helmholtz Machine
, 1995
Cited by 194 (22 self)
Discovering the structure inherent in a set of patterns is a fundamental aim of statistical inference or learning. One fruitful approach is to build a parameterized stochastic generative model, independent draws from which are likely to produce the patterns. For all but the simplest generative models, each pattern can be generated in exponentially many ways. It is thus intractable to adjust the parameters to maximize the probability of the observed patterns. We describe a way of finessing this combinatorial explosion by maximizing an easily computed lower bound on the probability of the observations. Our method can be viewed as a form of hierarchical self-supervised learning that may relate to the function of bottom-up and top-down cortical processing pathways.
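The "easily computed lower bound" mentioned here is, in modern notation, a variational bound: for hidden explanations h and any recognition distribution q(h|x), Jensen's inequality gives (the symbols below are the now-standard ones, chosen for illustration rather than quoted from the abstract):

```latex
\log p(\mathbf{x}\mid\theta)
  = \log \sum_{\mathbf{h}} p(\mathbf{x},\mathbf{h}\mid\theta)
  \;\ge\; \sum_{\mathbf{h}} q(\mathbf{h}\mid\mathbf{x})
          \bigl[\log p(\mathbf{x},\mathbf{h}\mid\theta)
                - \log q(\mathbf{h}\mid\mathbf{x})\bigr]
```

Maximizing the right-hand side over both the generative parameters θ and the recognition distribution q sidesteps the exponentially many ways each pattern can be generated; equality holds exactly when q(h|x) is the true posterior p(h|x, θ).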
An Application of Recurrent Nets to Phone Probability Estimation
 IEEE Transactions on Neural Networks
, 1994
Cited by 193 (8 self)
This paper presents an application of recurrent networks for phone probability estimation in large vocabulary speech recognition. The need for efficient exploitation of context information is discussed
On The Computational Power Of Neural Nets
 Journal of Computer and System Sciences
, 1995
Cited by 156 (26 self)
This paper deals with finite size networks which consist of interconnections of synchronously evolving processors. Each processor updates its state by applying a "sigmoidal" function to a linear combination of the previous states of all units. We prove that one may simulate all Turing Machines by such nets. In particular, one can simulate any multi-stack Turing Machine in real time, and there is a net made up of 886 processors which computes a universal partial recursive function. Products (high order nets) are not required, contrary to what had been stated in the literature. Nondeterministic Turing Machines can be simulated by nondeterministic rational nets, also in real time. The simulation result has many consequences regarding the decidability, or more generally the complexity, of questions about recursive nets.
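One ingredient of such a simulation can be made concrete: with the saturated-linear "sigmoidal" σ(x) = min(1, max(0, x)), a binary stack can be held in a single unit's state as the fraction q = Σᵢ (2bᵢ+1)/4ⁱ, and push, pop, and top each become one affine map followed by σ, i.e. exactly the kind of update such a net computes. This is a simplified sketch in the spirit of the paper's construction, not its full Turing machine simulation.

```python
# Saturated-linear "sigmoidal" activation used by this style of construction.
def sigma(x):
    return min(1.0, max(0.0, x))

# A binary stack (top bit first) stored in one unit's state as
# q = sum_i (2*b_i + 1) / 4**i; each stack operation is affine + sigma.
def push(q, bit):
    return sigma(q / 4.0 + (2 * bit + 1) / 4.0)

def top(q):
    return sigma(4.0 * q - 2.0)              # 1.0 if the top bit is 1, else 0.0

def pop(q):
    return sigma(4.0 * q - (2 * top(q) + 1))
```

The base-4 encoding leaves gaps between code words, so σ can decide the top bit with a single linear threshold; an empty stack is q = 0.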
Learning Machines
, 1965
Cited by 150 (0 self)
This book is about machines that learn to discover hidden relationships in data. A constant stream of data bombards our senses and millions of sensory channels carry information into our brains. Brains are also learning machines that condition,
Neuro-fuzzy modeling and control
 IEEE Proceedings
, 1995
Cited by 147 (1 self)
Abstract — Fundamental and advanced developments in neuro-fuzzy synergisms for modeling and control are reviewed. The essential part of neuro-fuzzy synergisms comes from a common framework called adaptive networks, which unifies both neural networks and fuzzy models. The fuzzy models under the framework of adaptive networks are called ANFIS (Adaptive-Network-based Fuzzy Inference System), which possess certain advantages over neural networks. We introduce the design methods for ANFIS in both modeling and control applications. Current problems and future directions for neuro-fuzzy approaches are also addressed. Keywords: fuzzy logic, neural networks, fuzzy modeling, neuro-fuzzy modeling, neuro-fuzzy control, ANFIS.
Gradient calculation for dynamic recurrent neural networks: a survey
 IEEE Transactions on Neural Networks
, 1995
Cited by 135 (3 self)
Abstract — We survey learning algorithms for recurrent neural networks with hidden units, and put the various techniques into a common framework. We discuss fixed-point learning algorithms, namely recurrent backpropagation and deterministic Boltzmann Machines, and non-fixed-point algorithms, namely backpropagation through time, Elman's history cutoff, and Jordan's output feedback architecture. Forward propagation, an online technique that uses adjoint equations, and variations thereof, are also discussed. In many cases, the unified presentation leads to generalizations of various sorts. We discuss advantages and disadvantages of temporally continuous neural networks in contrast to clocked ones, continue with some "tricks of the trade" for training, using, and simulating continuous-time and recurrent neural networks. We present some simulations, and at the end, address issues of computational complexity and learning speed.
An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories
 Neural Computation
, 1990
Cited by 117 (3 self)
A novel variant of a familiar recurrent network learning algorithm is described. This algorithm is capable of shaping the behavior of an arbitrary recurrent network as it runs, and it is specifically designed to execute efficiently on serial machines.

1 Introduction

Artificial neural networks having feedback connections can implement a wide variety of dynamical systems. The problem of training such a network is the problem of finding a particular dynamical system from among a parameterized family of such systems which best fits the desired specification. This paper proposes a specific learning algorithm for temporal supervised learning tasks, in which the specification of desired behavior is in the form of specific examples of input and desired output trajectories. One example of such a task is sequence classification, where the input is the sequence to be classified and the desired output is the correct classification, which is to be produced at the end of the sequence. Another examp...
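The abstract does not spell the algorithm out; the sketch below shows one member of this family, truncated backpropagation through time for a one-unit RNN, which caps the per-step cost by backpropagating each error at most k steps and so runs efficiently online on a serial machine. Treat it as an illustrative assumption, not necessarily the paper's exact method.

```python
import math

def run_rnn(w, u, xs, h0=0.0):
    """Scalar RNN h_t = tanh(w*h_{t-1} + u*x_t); returns [h_0, ..., h_T]."""
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def truncated_bptt_grad(w, u, xs, ys, k):
    """d/dw of sum_t (h_t - y_t)**2, backpropagating each error term
    at most k steps into the past; k >= len(xs) recovers full BPTT."""
    hs = run_rnn(w, u, xs)
    gw = 0.0
    for t in range(1, len(hs)):
        delta = 2.0 * (hs[t] - ys[t - 1])     # error signal at time t
        for s in range(t, max(t - k, 0), -1): # walk back at most k steps
            local = 1.0 - hs[s] ** 2          # tanh' at step s
            gw += delta * local * hs[s - 1]   # contribution of w at step s
            delta *= local * w                # carry the error one step back
    return gw
```

Choosing a small k trades gradient fidelity for a constant per-step cost, which is the kind of efficiency/accuracy trade-off an online trajectory-training algorithm has to make.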
Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity
, 1995
Cited by 115 (4 self)
Introduction

1.1 Learning in Recurrent Networks

Connectionist networks having feedback connections are interesting for a number of reasons. Biological neural networks are highly recurrently connected, and many authors have studied recurrent network models of various types of perceptual and memory processes. The general property making such networks interesting and potentially useful is that they manifest highly nonlinear dynamical behavior. One such type of dynamical behavior that has received much attention is that of settling to a fixed stable state, but probably of greater importance both biologically and from an engineering viewpoint are time-varying behaviors. Here we consider algorithms for training recurrent networks to perform temporal supervised learning tasks, in which the specification of desired behavior is in the form of specific examples of input and desired output trajectories. One example of such a task is sequence classification, where