Matters Temporal (2002)
http://www.cs.washington.edu/homes/rao/dayan_nv.pd
http://www.gatsby.ucl.ac.uk/~dayan/papers/d02.pdf
Abstract:
rence learning. What we will see is that although prediction is relatively straightforward at a systems level, it poses some interesting and tricky conceptual, architectural and mechanistic problems at the level of single neurons. Many of these problems were first discussed in a seminal paper on single cell prediction by Sutton and Barto [9], which is one of the main precursors to their later work on temporal difference learning [3,4]. Temporal difference learning was originally developed as such in the context of modeling classical conditioning [4], and this provides a convenient backdrop for our discussion. Consider a set of separate trials, in each of which a set of $m$ stimuli is provided, the absence or presence of the $i$th of which at time $t$ is marked by $x_t(i) \in \{0,1\}$. Further suppose that a sequence of rewards is also provided, with $r_t$ delivered at time $t$. Temporal difference learning solves the particular prediction problem that Sutton and Barto suggested arises in classic
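The setup described in the abstract (binary stimuli $x_t(i)$ and rewards $r_t$ within a trial) can be illustrated with a minimal sketch of a TD(0) update. Everything beyond the symbols $x_t(i)$, $r_t$ and $m$ is an assumption made for illustration and is not taken from the paper: the linear prediction $v_t = \sum_i w_i x_t(i)$, the prediction target (the sum of rewards from time $t$ to the end of the trial), the function name `td_learn`, and the parameters `alpha`, `gamma` and `n_trials`.

```python
import numpy as np

def td_learn(x, r, alpha=0.1, gamma=1.0, n_trials=200):
    """Illustrative TD(0) sketch (assumed form, not the paper's exact rule).

    x: array of shape (T, m), x[t, i] in {0, 1} marks stimulus i at time t.
    r: array of shape (T,), r[t] is the reward delivered at time t.
    Returns the weights w of an assumed linear prediction v_t = w . x[t].
    """
    T, m = x.shape
    w = np.zeros(m)
    for _ in range(n_trials):          # repeat the same trial many times
        for t in range(T):
            v_t = w @ x[t]             # current prediction at time t
            # prediction at the next step; taken as 0 after the trial ends
            v_next = w @ x[t + 1] if t + 1 < T else 0.0
            # temporal difference error: reward plus change in prediction
            delta = r[t] + gamma * v_next - v_t
            w += alpha * delta * x[t]  # adjust only the active stimuli
    return w

# Hypothetical trial: one distinct stimulus unit per time step, reward at the end.
x = np.eye(3)
r = np.array([0.0, 0.0, 1.0])
w = td_learn(x, r)
```

In this hypothetical trial each time step has its own stimulus unit, so each weight is the learned prediction of total remaining reward at that step, and all three weights approach 1.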
Citations
| 1 | The book of Hebb. Neuron 24 – Sejnowski - 1999 |
| 1 | A framework for mesencephalic dopamine systems based on predictive Hebbian learning – Neuroscience - 1996 |

