Shaping Robot Behavior Using Principles from Instrumental Conditioning
, 1997
Cited by 36 (1 self)
Shaping by successive approximations is an important animal training technique in which behavior is gradually adjusted in response to strategically timed reinforcements. We describe a computational model of this shaping process and its implementation on a mobile robot. Innate behaviors in our model are sequences of actions and enabling conditions, and shaping is a behavior editing process realized by multiple editing mechanisms. The model replicates some fundamental phenomena associated with instrumental learning in animals, and allows an RWI B21 robot to learn several distinct tasks derived from the same innate behavior.
1. Introduction
Service dogs trained to assist a disabled person will respond to over 60 verbal commands to, for example, turn on lights, open a refrigerator door, or retrieve a dropped object [9]. Chicks can be taught to play a toy piano (peck out a key sequence until a reinforcement is received at the end of the tune) [6], and rats have been conditioned to perform c...
A bottom up approach towards the acquisition and expression of sequential representations applied to a behaving real-world device: Distributed Adaptive Control III.
, 1999
Cited by 20 (5 self)
Biological systems display a high degree of flexibility in problem solving. In this paper a model is presented, Distributed Adaptive Control III (DACIII), which is aimed at understanding these forms of behavior. DACIII is part of a larger modeling series directed at understanding how biological systems acquire, retain, and express knowledge of the world. This modeling series has its roots, on one hand, in the methodological consideration that brain and behavior need to be modeled from a multi-level perspective. On the other, the importance of the acquisition of representations of events in the world, as opposed to an a priori specification, is emphasized. DACIII is presented against the background of the paradigms of classical and operant conditioning. On the basis of an analysis of these experimental approaches towards the study of animal behavior a theoretical framework is defined aimed at identifying the minimal requirements of a control structure which could display these behaviors...
Operant conditioning in skinnerbots
- Adaptive Behavior
, 1997
Cited by 13 (1 self)
Instrumental (or operant) conditioning, a form of animal learning, is similar to reinforcement learning (Watkins, 1989) in that it allows an agent to adapt its actions to gain maximally from the environment while only being rewarded for correct performance. But animals learn much more complicated behaviors through instrumental conditioning than robots presently acquire through reinforcement learning. We describe a new computational model of the conditioning process that attempts to capture some of the aspects that are missing from simple reinforcement learning: conditioned reinforcers, shifting reinforcement contingencies, explicit action sequencing, and state space refinement. We apply our model to a task commonly used to study working memory in rats and monkeys: the DMTS (Delayed Match to Sample) task. Animals learn this task in stages. In simulation, our model also acquires the task in stages, in a similar manner. We have used the model to train an RWI B21 robot.
Anticipations control behavior: animal behavior in an anticipatory learning classifier system
- Adaptive Behavior
, 2002
The Shared Circuits Model: How Control, Mirroring and Simulation Can Enable Imitation, Deliberation, and Mindreading
Cited by 3 (0 self)
To be published in Behavioral and Brain Sciences (in press)
Skinnerbots
, 1996
Cited by 2 (1 self)
Instrumental (or operant) conditioning, a form of animal learning, is similar to reinforcement learning in that it allows an agent to adapt its actions to gain maximally from the environment while only being rewarded for correct performance. But animals learn much more complicated behaviors through instrumental conditioning than robots presently acquire through reinforcement learning. We describe a new computational model of the conditioning process; our discussion focuses on a training technique called chaining. Four aspects of our model distinguish it from simple reinforcement learning: conditional reinforcers, shifting reinforcement contingencies, explicit action sequencing, and state space refinement. We apply our model to a task commonly used to study working memory in rats and monkeys: the DMTS (Delayed Match to Sample) task. Animals learn this task in stages. Our model also acquires the task in stages, in a similar manner. We have also used our learning program to control a B21 r...
The Control of Instrumental Action Following Outcome Devaluation in Young Children Aged Between 1 and 4 Years
Cited by 2 (1 self)
To determine the role of action–outcome learning in the control of young children’s instrumental behavior, the authors trained 18- to 48-month-olds to manipulate visual icons on a touch-sensitive display to obtain different types of video clips as outcomes. Subsequently, one of the outcomes was devalued by repeated exposure, and children’s propensity to perform the trained actions was tested in extinction. On test, children with a mean age greater than 2.5 years performed the action trained with the devalued outcome less than those trained with the still-valued outcome, thereby demonstrating that their actions were mediated by action–outcome learning. By contrast, the instrumental responses of younger children (mean age �2 years) were resistant to outcome devaluation and may have been elicited directly by the icons associated with each response, rather than mediated by a specific action–outcome expectation.
From Animals to Animats 4: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior (SAB96), pp. 285-
Cited by 1 (0 self)
Instrumental (or operant) conditioning, a form of animal learning, is similar to reinforcement learning in that it allows an agent to adapt its actions to gain maximally from the environment while only being rewarded for correct performance. But animals learn much more complicated behaviors through instrumental conditioning than robots presently acquire through reinforcement learning. We describe a new computational model of the conditioning process; our discussion focuses on a training technique called chaining. Four aspects of our model distinguish it from simple reinforcement learning: conditional reinforcers, shifting reinforcement contingencies, explicit action sequencing, and state space refinement. We apply our model to a task commonly used to study working memory in rats and monkeys: the DMTS (Delayed Match to Sample) task. Animals learn this task in stages. Our model also acquires the task in stages, in a similar manner. We have also used our learning program to control a B21 robot.
The 28th Bartlett Memorial Lecture
The concordance between performance and judgements of the causal effectiveness of an instrumental action suggests that such actions are mediated by causal knowledge. Although causal learning exhibits many associative phenomena—blocking, inhibitory or preventative learning, and super-learning—judgements of the causal status of a cue can be changed retrospectively as a result of learning episodes that do not directly involve the cue. In order to explain retrospective revaluation, a modified associative theory is described in which the learning processes for retrieved cue representations are the opposite to those for presented cues, and this theory is evaluated by studies of the role of within-compound associations in retrospective revaluation and blocking. However, this modified theory only applies when the within-compound association represents a contiguous rather than a causal cue relationship. Causal learning and representation is a fundamental form of cognition, if not the fundamental form. Without the capacity to learn about and represent the causal relationships between our actions and their consequences, the mind would be radically disconnected from the world. However detailed and rich our knowledge, however sophisticated and complex our inferences and planning, cognition would be impotent if our thoughts could not be
BMC Neuroscience BioMed Central
, 2005
Research article Hippocampal lesions facilitate instrumental learning with delayed reinforcement but induce impulsive choice in rats

