Results 1 - 3 of 3
Designing (Approximate) Optimal Controllers via DHP Adaptive Critics & Neural Networks
in The Handbook of Applied Computational Intelligence, Karayiannis, Padgett, 2000
Abstract

Cited by 2 (2 self)
this paper, however, more important is the general view of what Backpropagation is, namely, it is (ingeniously) an implementation of the chain rule for taking derivatives. The order in which the associated operations are performed is important, and this prompted its inventor [Werbos, 1974] to use the term `ordered derivatives' for this context. We will return to this more general aspect of what Backpropagation is a little later, but first, to help understand why taking derivatives is so important in the neural-network learning context, we will go through some background conceptual developments. We consider again the NN as an input-output device, and create some criterion function (CF) to assess the quality of the NN's output(s) in the given problem context. For the context of learning a specified I/O mapping, the CF
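The view above, that Backpropagation is an ordered application of the chain rule, can be illustrated with a minimal sketch. The composition f and the criterion function CF below are illustrative choices, not taken from the cited paper: a forward pass stores intermediate quantities, and a backward pass multiplies local derivatives in reverse order, exactly the "ordered derivatives" idea.

```python
import math

def forward(x):
    # Forward pass of an illustrative composition CF = 3 * sin(x^2),
    # storing each intermediate quantity for the backward pass.
    a = x * x          # a = x^2
    b = math.sin(a)    # b = sin(a)
    cf = 3.0 * b       # criterion function CF = 3 * b
    return a, b, cf

def backward(x, a, b):
    # Backward pass: apply the chain rule in reverse (ordered) fashion.
    dcf_db = 3.0             # dCF/db
    db_da = math.cos(a)      # db/da
    da_dx = 2.0 * x          # da/dx
    return dcf_db * db_da * da_dx  # dCF/dx by the chain rule

x = 0.5
a, b, cf = forward(x)
grad = backward(x, a, b)

# Sanity check against a finite-difference approximation of dCF/dx.
eps = 1e-6
_, _, cf_eps = forward(x + eps)
assert abs((cf_eps - cf) / eps - grad) < 1e-4
```

The ordering matters because each local derivative is evaluated at the intermediate values produced by the forward pass; computing them in any other order would require recomputing those intermediates.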
Primitive Adaptive Critics, 1997
Abstract

Cited by 1 (1 self)
We propose a simple framework for critic-based training of recurrent neural networks and feedback controllers. We term the critics that are used primitive adaptive critics, since we represent them with the simplest possible architecture (a bias weight only). We derive this framework from two main premises. The first of these is a natural similarity between a form of approximate dynamic programming, called Dual Heuristic Programming (DHP), and backpropagation through time (BPTT), as we will discuss. The second premise is our emphasis on the development of a truly online critic-based training procedure competitive in performance and computational cost with truncated BPTT. Three examples illustrate the main features of the proposed framework. DHP and BPTT: A family of designs for approximate dynamic programming in continuous domains has been proposed by Werbos [1]. It includes the following steps. First, the well-known Bellman equation of dynamic programming is written as J(t) = e(t) + γ J(t+1) ...
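The Bellman equation referenced in the abstract can be sketched numerically. The notation here is an assumption reconstructed from the garbled extraction: e(t) as an instantaneous cost, γ as a discount factor, and a finite horizon N, so that the cost-to-go satisfies J(t) = e(t) + γ J(t+1) with J(N) = e(N).

```python
def bellman_values(e, gamma):
    """Finite-horizon cost-to-go under the assumed recursion
    J(t) = e(t) + gamma * J(t+1), with boundary J(N) = e(N)."""
    J = [0.0] * len(e)
    J[-1] = e[-1]
    # Sweep backward from the horizon, as dynamic programming requires.
    for t in range(len(e) - 2, -1, -1):
        J[t] = e[t] + gamma * J[t + 1]
    return J

# Illustrative costs e(0)..e(3) and discount factor (both hypothetical).
e = [1.0, 0.5, 0.25, 2.0]
gamma = 0.9
J = bellman_values(e, gamma)

# The recursion agrees with the equivalent closed form
# J(0) = sum over j of gamma^j * e(j).
J0 = sum(gamma ** j * e[j] for j in range(len(e)))
assert abs(J[0] - J0) < 1e-12
```

DHP then trains a critic to approximate the derivatives of J with respect to the state rather than J itself, which is where the similarity to gradient propagation in BPTT arises.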
unknown title
Abstract
We propose a simple framework for critic-based training of recurrent neural networks and feedback controllers. We term the critics that are used primitive adaptive critics, since we represent them with the simplest possible architecture (a bias weight only). We derive this framework from two main premises. The first of these is a natural similarity between a form of approximate dynamic programming, called Dual Heuristic Programming (DHP), and backpropagation through time (BPTT), as we will discuss. The second premise is our emphasis on the development of a truly online critic-based training procedure competitive in performance and computational cost with truncated BPTT. Three examples illustrate the main features of the proposed framework. DHP and BPTT: A family of designs for approximate dynamic programming in continuous domains has been proposed by Werbos [1]. It includes the following steps. First, the well-known Bellman equation of dynamic programming is written as ...