Long-Term Reward Prediction in TD Models of the Dopamine System (2002)

by Nathaniel D. Daw , David S. Touretzky
Citations:26 - 2 self
BibTeX

@MISC{Daw02long-termreward,
    author = {Nathaniel D. Daw and David S. Touretzky},
    title = {Long-Term Reward Prediction in TD Models of the Dopamine System},
    year = {2002}
}


Abstract

This article addresses the relationship between long-term reward predictions and slow-timescale neural activity in temporal difference (TD) models of the dopamine system. Such models attempt to explain how the activity of dopamine (DA) neurons relates to errors in the prediction of future rewards. Previous models have been mostly restricted to short-term predictions of rewards expected during a single, somewhat artificially defined trial. Also, the models focused exclusively on the phasic pause-and-burst activity of primate DA neurons; the neurons' slower, tonic background activity was assumed to be constant. This has led to difficulty in explaining the results of neurochemical experiments that measure indications of DA release on a slow timescale, results that seem at first glance inconsistent with a reward prediction model. In this article, we investigate a TD model of DA activity modified so as to enable it to make longer-term predictions about rewards expected far in the future. We show that these predictions manifest themselves as slow changes in the baseline error signal, which we associate with tonic DA activity. Using this model, we make new predictions about the behavior of the DA system in a number of experimental situations. Some of these predictions suggest new computational explanations for previously puzzling data, such as indications from microdialysis studies of elevated DA activity triggered by aversive events.
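
The abstract does not spell out how the TD model is modified. As a rough illustration only, the sketch below assumes an average-reward form of TD learning, in which a slowly updated estimate of the reward rate (here called rho) is subtracted from every error. The function names, the learning rate eta, and the two event schedules are hypothetical and are not taken from the article. With a single uninformative state, the error on steps where nothing happens reduces to -rho, so the baseline of the error signal drifts slowly with the long-run reward rate.

```python
def td_errors_average_reward(rewards, eta=0.01):
    """Average-reward TD errors for a task with one uninformative state.

    With a single state, V(s_{t+1}) - V(s_t) is always zero, so the error
    reduces to delta_t = r_t - rho_t, where rho_t is a running estimate of
    the average reward per step. (Assumed formulation, for illustration.)
    """
    rho = 0.0
    deltas = []
    for r in rewards:
        delta = r - rho      # the value-difference term vanishes here
        deltas.append(delta)
        rho += eta * delta   # slow update of the average-reward estimate
    return deltas, rho


def baseline(deltas, rewards):
    """Mean TD error over steps on which no reward or punishment occurred."""
    quiet = [d for d, r in zip(deltas, rewards) if r == 0]
    return sum(quiet) / len(quiet)


# Hypothetical schedules: one event every 5 steps, rewarding or aversive.
appetitive = [1.0 if t % 5 == 4 else 0.0 for t in range(5000)]
aversive = [-1.0 if t % 5 == 4 else 0.0 for t in range(5000)]

d_app, rho_app = td_errors_average_reward(appetitive)
d_av, rho_av = td_errors_average_reward(aversive)

# Baseline error once the reward-rate estimate has settled.
base_app = baseline(d_app[-500:], appetitive[-500:])
base_av = baseline(d_av[-500:], aversive[-500:])
print(f"appetitive: rho={rho_app:.3f}, baseline error={base_app:.3f}")
print(f"aversive:   rho={rho_av:.3f}, baseline error={base_av:.3f}")
```

Under these assumptions the baseline error sits below zero when rewards are frequent and above zero when the events are aversive, which is one way a reward prediction model could yield an elevated slow signal after aversive events.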

Keyphrases

long-term reward prediction, TD model, dopamine system, DA release, defined trial, slow change, microdialysis study, tonic DA activity, elevated DA activity, temporal difference, new computational explanation, first glance inconsistent, phasic pause-and-burst activity, DA activity, short-term prediction, primate DA neuron, previous model, future reward, DA system, new prediction, neurochemical experiment, experimental situation, slow-timescale neural activity, baseline error signal, tonic background activity, longer-term prediction, reward prediction model, slow timescale, aversive event
