CiteSeerX
A framework for mesencephalic dopamine systems based on predictive Hebbian learning (1996)

by P. R. Montague, P. Dayan, T. J. Sejnowski
Venue: J. Neurosci.
Results 1 - 10 of 385

Reinforcement Learning I: Introduction

by Richard S. Sutton, Andrew G. Barto, 1998
Cited by 5614 (118 self)
In which we try to give a basic intuitive sense of what reinforcement learning is and how it differs and relates to other fields, e.g., supervised learning and neural networks, genetic algorithms and artificial life, control theory. Intuitively, RL is trial and error (variation and selection, search) plus learning (association, memory). We argue that RL is the only field that seriously addresses the special features of the problem of learning from interaction to achieve long-term goals.
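The "trial and error plus learning" formula can be made concrete with a tiny tabular example. Below is a minimal sketch: a toy k-armed bandit with epsilon-greedy selection, where exploration supplies the variation/selection and an incremental value estimate supplies the association/memory. All names and parameter values are illustrative, not taken from the book.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, alpha=0.1, seed=0):
    """Trial and error (epsilon-greedy variation/selection) plus
    learning (incremental value estimates, i.e., association/memory)."""
    rng = random.Random(seed)
    q = [0.0] * len(true_means)  # learned value estimate per action
    for _ in range(steps):
        if rng.random() < epsilon:               # explore: random variation
            a = rng.randrange(len(q))
        else:                                    # exploit: best action so far
            a = max(range(len(q)), key=q.__getitem__)
        reward = rng.gauss(true_means[a], 1.0)   # noisy payoff from the world
        q[a] += alpha * (reward - q[a])          # move the estimate toward the sample
    return q

estimates = run_bandit([0.0, 1.0, 0.3])
# with enough interaction, the estimate for action 1 should come out highest
```

Nothing here requires labeled examples: the agent only ever sees the rewards its own choices produce, which is the sense in which RL addresses learning from interaction rather than supervised learning.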

The neural basis of human error processing: Reinforcement learning, dopamine, and the error-related negativity

by Clay B. Holroyd, Michael G. H. Coles - Psychological Review 109:679–709, 2002
Cited by 430 (20 self)
The authors present a unified account of 2 neural systems concerned with the development and expression of adaptive behaviors: a mesencephalic dopamine system for reinforcement learning and a “generic” error-processing system associated with the anterior cingulate cortex. The existence of the error-processing system has been inferred from the error-related negativity (ERN), a component of the event-related brain potential elicited when human participants commit errors in reaction-time tasks. The authors propose that the ERN is generated when a negative reinforcement learning signal is conveyed to the anterior cingulate cortex via the mesencephalic dopamine system and that this signal is used by the anterior cingulate cortex to modify performance on the task at hand. They provide support for this proposal using both computational modeling and psychophysiological experimentation. Human beings learn from the consequences of their actions. Thorndike (1911/1970) originally described this phenomenon with his law of effect, which made explicit the commonsense notion that actions that are followed by feelings of satisfaction are more likely to be generated again in the future, whereas actions that are followed by negative outcomes are less likely to reoccur. This ...

Citation Context

...receive input from the mesencephalic dopamine system (Schultz et al., 1995). Several groups of investigators (Barto, 1995; Friston, Tononi, Reeke, Sporns, & Edelman, 1994; Houk, Adams, & Barto, 1995; Montague, Dayan, & Sejnowski, 1996; Schultz, Dayan, & Montague, 1997; Suri & Schultz, 1998; cf. J. Brown, Bullock, & Grossberg, 1999) have noted similarities between the phasic activity of the mesencephalic dopamine system and a parti...
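The parallel drawn in these citations, phasic dopamine activity resembling the temporal-difference (TD) prediction error, can be shown in a few lines of toy code. This is a sketch with hand-picked values, not any of these authors' simulations.

```python
def td_errors(values, rewards, gamma=1.0):
    """Temporal-difference prediction error at each step:
    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    deltas = []
    for t in range(len(rewards)):
        v_next = values[t + 1] if t + 1 < len(values) else 0.0  # terminal V = 0
        deltas.append(rewards[t] + gamma * v_next - values[t])
    return deltas

# Before learning: nothing is predicted, so the reward itself is surprising.
print(td_errors(values=[0.0, 0.0, 0.0], rewards=[0.0, 0.0, 1.0]))  # [0.0, 0.0, 1.0]
# After learning: the predictive state absorbs the error, and the reward
# itself elicits none -- the reported signature of phasic dopamine.
print(td_errors(values=[0.0, 1.0, 1.0], rewards=[0.0, 0.0, 1.0]))  # [1.0, 0.0, 0.0]
```

Before learning the error fires at the reward; once the value function has learned, it shifts to the earliest predictive state, matching the reported shift of phasic dopamine responses from reward delivery to the conditioned stimulus.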

Motivated Reinforcement Learning

by Peter Dayan, 2001
Cited by 332 (15 self)
The standard reinforcement learning view of the involvement of neuromodulatory systems in instrumental conditioning includes a rather straightforward conception of motivation as prediction of summed future reward. Competition between actions is based on the motivating characteristics of their consequent states in this sense. Substantial, careful experiments into the neurobiology and psychology of motivation, reviewed by Dickinson & Balleine, show that this view is incomplete. In many cases, animals are faced with the choice not between many different actions at a given state, but rather of whether a single response is worth executing at all. Evidence suggests that the motivational process underlying this choice has different psychological and neural properties from that underlying action choice. We describe and model these motivational systems, and consider the way they interact.

Making working memory work: A computational model of learning in the prefrontal cortex and basal ganglia

by Randall C. O’Reilly, Michael J. Frank, 2003
Cited by 174 (19 self)
The prefrontal cortex has long been thought to subserve both working memory (the holding of information online for processing) and “executive” functions (deciding how to manipulate working memory and perform processing). Although many computational models of working memory have been developed, the mechanistic basis of executive function remains elusive, often amounting to a homunculus. This paper presents an attempt to deconstruct this homunculus through powerful learning mechanisms that allow a computational model of the prefrontal cortex to control both itself and other brain areas in a strategic, task-appropriate manner. These learning mechanisms are based on subcortical structures in the midbrain, basal ganglia, and amygdala, which together form an actor/critic architecture. The critic system learns which prefrontal representations are task-relevant and trains the actor, which in turn provides a dynamic gating mechanism for controlling working memory updating. Computationally, the learning mechanism is designed to simultaneously solve the temporal and structural credit assignment problems. The model’s performance compares favorably with standard backpropagation-based temporal learning mechanisms on the challenging 1-2-AX working memory task, and other benchmark working memory tasks.
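The 1-2-AX task mentioned in this abstract demands exactly the gated maintenance the model implements: the last digit seen must be held in working memory across intervening letters. The task's target rule can be sketched in a few lines (the 'R'/'L' response labels and function name are illustrative, not the authors' code):

```python
def one_two_ax_targets(stream):
    """Label each symbol of a 1-2-AX stream.
    'R' (target) when an X completes A-X while the stored digit is 1,
    or a Y completes B-Y while the stored digit is 2; otherwise 'L'."""
    cue = None    # outer-loop digit, held in working memory
    prev = None   # previous inner-loop letter
    labels = []
    for s in stream:
        if s in "12":
            cue, prev = s, None   # update the maintained digit
            labels.append("L")
        else:
            target = (cue == "1" and prev == "A" and s == "X") or \
                     (cue == "2" and prev == "B" and s == "Y")
            labels.append("R" if target else "L")
            prev = s
    return labels

print(one_two_ax_targets("1AXBY2BYAX"))
# ['L', 'L', 'R', 'L', 'L', 'L', 'L', 'R', 'L', 'L']
```

Note that the same inner pair (e.g., B-Y) is a target under one digit and a non-target under the other, which is why the digit must be selectively maintained while the letters are allowed to update freely.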

Intrinsically motivated learning of hierarchical collections of skills

by Andrew G. Barto, 2004
Cited by 173 (16 self)
Humans and other animals often engage in activities for their own sakes rather than as steps toward solving practical problems. Psychologists call these intrinsically motivated behaviors. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. In this paper we present initial results from a computational study of intrinsically motivated learning aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that are needed for competent autonomy. At the core of the model are recent theoretical and algorithmic advances in computational reinforcement learning, specifically, new concepts related to skills and new learning algorithms for learning with skill hierarchies.
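The "reusable skills" referred to here are commonly formalized in RL as options: a policy packaged with an initiation set and a termination condition. A minimal sketch of that structure follows; the corridor example and all names are illustrative, not from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Option:
    """A reusable skill: where it may start, how it acts, when it ends."""
    name: str
    can_start: Callable[[int], bool]   # initiation set I
    policy: Dict[int, str]             # state -> primitive action
    is_done: Callable[[int], bool]     # termination condition beta

def run_option(option, state, step):
    """Follow the option's policy until its termination condition holds."""
    assert option.can_start(state), "option invoked outside its initiation set"
    while not option.is_done(state):
        state = step(state, option.policy[state])
    return state

# Toy corridor with states 0..3 and a skill that walks to the right end.
go_right = Option(
    name="go-right-to-end",
    can_start=lambda s: 0 <= s < 3,
    policy={0: "right", 1: "right", 2: "right"},
    is_done=lambda s: s == 3,
)
final = run_option(go_right, 0, step=lambda s, a: s + 1 if a == "right" else s - 1)
# final == 3
```

Because an option looks like a single temporally extended action from the outside, a higher-level learner can choose among options exactly as it would among primitive actions, which is what makes skill hierarchies composable.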

Citation Context

...e—The neuromodulator dopamine has long been associated with reward learning and rewarded behavior, partly because of clear evidence of its key role in drugs of addiction [6]. The original observation [12, 8, 18, 26] that the activity of dopamine cells in the monkey midbrain in reward-learning tasks closely follows the form of a key training signal in RL (the temporal difference prediction error) is an important...

A framework for studying the neurobiology of value-based decision making.

by Antonio Rangel, Colin Camerer, P. Read Montague - Nat. Rev. Neurosci., 2008
Cited by 164 (14 self)
Value-based decision making is pervasive in nature. It occurs whenever an animal makes a choice from several alternatives on the basis of a subjective value that it places on them. Examples include basic animal behaviours, such as bee foraging, and complicated human decisions, such as trading in the stock market. Neuroeconomics is a relatively new discipline that studies the computations that the brain carries out in order to make value-based decisions, as well as the neural implementation of those computations. It seeks to build a biologically sound theory of how humans make decisions that can be applied in both the natural and the social sciences. The field brings together models, tools and techniques from several disciplines. Economics provides a rich class of choice paradigms, formal models of the subjective variables that the brain needs to compute to make decisions, and some experimental protocols for how to measure these variables. Psychology provides a wealth of behavioural data that shows how animals learn and choose under different conditions, as well as theories about the nature of those processes. Neuroscience provides the knowledge of the brain and the tools to study the neural events that attend decision making. Finally, computer science provides computational models of machine learning and decision making. Ultimately, it is the computations that are central to uniting these disparate levels of description, as computational models identify the kinds of signals and signal dynamics that are required by different value-dependent learning and decision problems. However, a full understanding of choice will require a description at all these levels. In this Review we propose a framework for thinking about decision making. 
It has three components: first, it divides decision-making computations into five types; second, it shows that there are multiple types of valuation systems; and third, it incorporates modulating variables that affect the different valuation processes. This framework will allow us to bring together recent findings in the field, highlight some of the most important outstanding problems, define a common lexicon that bridges the different disciplines that inform neuroeconomics, and point the way to future applications. The development of a common lexicon is important because a lot of confusion has been introduced into the literature on the neurobiology of decision making by the use of the unqualified terms 'reward' and 'value'; as shown in the Review, these terms could apply to very different computations. The first part of the framework divides the computations that are required for value-based decision making into five basic processes. The first process in decision making involves the computation of a representation of the decision problem. This entails identifying internal states (for example, hunger level) and external states (for example, threat level) ...

Interactions Between Frontal Cortex and Basal Ganglia in Working Memory: A Computational Model

by Michael J. Frank, Bryan Loughry, Randall C. O'Reilly, 2000
Cited by 152 (18 self)
The frontal cortex and basal ganglia interact via a relatively well-understood and elaborate system of interconnections. In the context of motor function, these interconnections can be understood as disinhibiting or "releasing the brakes" on frontal motor action plans --- the basal ganglia detect appropriate contexts for performing motor actions, and enable the frontal cortex to execute such actions at the appropriate time. We build on this idea in the domain of working memory through the use of computational neural network models of this circuit. In our model, the frontal cortex exhibits robust active maintenance, while the basal ganglia contribute a selective, dynamic gating function that enables frontal memory representations to be rapidly updated in a task-relevant manner. We apply the model to a novel version of the continuous performance task (CPT) that requires subroutine-like selective working memory updating, and compare and contrast our model with other existing models and th...

Decision making, the P3, and the locus coeruleus-norepinephrine system

by S. Nieuwenhuis, G. Aston-Jones, J. D. Cohen - Psychol. Bull., 2005

Cited by 101 (3 self)
Abstract not found

Citation Context

... neuromodulatory mechanisms in cognitive function on the one hand and cognition and electrophysiological brain activity on the other hand (e.g., Braver & Cohen, 2000; Holroyd & Coles, 2002; Li, 2003; Montague, Dayan, & Sejnowski, 1996; Nieuwenhuis et al., 2002; Robbins, 1997; Usher, Cohen, Servan-Schreiber, Rajkowski, & Aston-Jones, 1999; Yeung, Botvinick, & Cohen, 2004). This, together with new insights into the role of locus coer...

How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues

by Joshua Brown, Daniel Bullock, Stephen Grossberg - Journal of Neuroscience, 1999
Cited by 98 (13 self)
After classically conditioned learning, dopaminergic cells in the substantia nigra pars compacta (SNc) respond immediately to unexpected conditioned stimuli (CS) but omit formerly seen responses to expected unconditioned stimuli, notably rewards. These cells play an important role in reinforcement learning. A neural model explains the key neurophysiological properties of these cells before, during, and after conditioning, as well as related anatomical and neurophysiological data about the pedunculopontine tegmental nucleus (PPTN), lateral hypothalamus, ventral striatum, and striosomes. The model proposes how two parallel learning pathways from limbic cortex to the SNc, one devoted to excitatory conditioning (through the ventral striatum, ventral pallidum, and PPTN) and the other to adaptively timed inhibitory conditioning (through the striosomes), control SNc responses. The excitatory pathway generates CS-induced excitatory SNc dopamine bursts. The inhibitory pathway prevents dopamine bursts in response to ... Humans and animals can learn to predict both the amounts and times of expected rewards. The dopaminergic cells of the substantia nigra pars compacta (SNc) have unique firing patterns related to the predicted and actual times of reward (Ljungberg et ...

Citation Context

...ent or “liking” of a reward once consumed (Berridge and Robinson, 1998). The liking may be mediated by areas other than the basal ganglia (McDonald and White, 1993). Recent models (Houk et al., 1995; Montague et al., 1996; Contreras-Vidal & Schultz, 1997; Schultz et al., 1997; Berns and Sejnowski, 1998; Suri and Schultz, 1998) of the nigral dopamine cells have noted similarities between dopamine cell properties and we...

Metalearning and neuromodulation

by Kenji Doya, 2002
Cited by 96 (4 self)
This paper presents a computational theory on the roles of the ascending neuromodulatory systems from the viewpoint that they mediate the global signals that regulate the distributed learning mechanisms in the brain. Based on a review of experimental data and theoretical models, it is proposed that dopamine signals the error in reward prediction, serotonin controls the time scale of reward prediction, noradrenaline controls the randomness in action selection, and acetylcholine controls the speed of memory update. The possible interactions between those neuromodulators and the environment are predicted on the basis of computational theory of metalearning.
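The four proposed mappings line up with the standard metaparameters of a TD learner. The sketch below shows where each one enters the equations, following the abstract's mapping; the code itself is illustrative, not Doya's implementation.

```python
import math
import random

def td_update(v, s, s_next, reward, alpha, gamma):
    """One TD(0) value update. In the proposed mapping:
    delta (reward prediction error)         <-> dopamine
    gamma (time scale of reward prediction) <-> serotonin
    alpha (speed of memory update)          <-> acetylcholine"""
    delta = reward + gamma * v[s_next] - v[s]
    v[s] += alpha * delta
    return delta

def softmax_action(q_values, beta, rng=random):
    """Boltzmann action selection. The inverse temperature beta, which sets
    the randomness of the choice, maps to noradrenaline in the proposal."""
    weights = [math.exp(beta * q) for q in q_values]
    r = rng.random() * sum(weights)
    for a, w in enumerate(weights):
        r -= w
        if r <= 0:
            return a
    return len(q_values) - 1

v = [0.0, 1.0]
delta = td_update(v, 0, 1, reward=0.5, alpha=0.5, gamma=0.9)
# delta = 0.5 + 0.9 * 1.0 - 0.0 = 1.4, and v[0] moves halfway toward it: 0.7
```

On this reading, "metalearning" means tuning alpha, beta, and gamma themselves in response to the environment, rather than learning the values v alone.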

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University