Shaping Robot Behavior Using Principles from Instrumental Conditioning
, 1997
Cited by 36 (1 self)
Shaping by successive approximations is an important animal training technique in which behavior is gradually adjusted in response to strategically timed reinforcements. We describe a computational model of this shaping process and its implementation on a mobile robot. Innate behaviors in our model are sequences of actions and enabling conditions, and shaping is a behavior editing process realized by multiple editing mechanisms. The model replicates some fundamental phenomena associated with instrumental learning in animals, and allows an RWI B21 robot to learn several distinct tasks derived from the same innate behavior.
1. Introduction
Service dogs trained to assist a disabled person will respond to over 60 verbal commands to, for example, turn on lights, open a refrigerator door, or retrieve a dropped object [9]. Chicks can be taught to play a toy piano (peck out a key sequence until a reinforcement is received at the end of the tune) [6], and rats have been conditioned to perform c...
Operant conditioning in skinnerbots
- Adaptive Behavior
, 1997
Cited by 13 (1 self)
Instrumental (or operant) conditioning, a form of animal learning, is similar to reinforcement learning (Watkins, 1989) in that it allows an agent to adapt its actions to gain maximally from the environment while only being rewarded for correct performance. But animals learn much more complicated behaviors through instrumental conditioning than robots presently acquire through reinforcement learning. We describe a new computational model of the conditioning process that attempts to capture some of the aspects that are missing from simple reinforcement learning: conditioned reinforcers, shifting reinforcement contingencies, explicit action sequencing, and state space refinement. We apply our model to a task commonly used to study working memory in rats and monkeys: the DMTS (Delayed Match to Sample) task. Animals learn this task in stages. In simulation, our model also acquires the task in stages, in a similar manner. We have used the model to train an RWI B21 robot.
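As a rough illustration of the task structure only (not the authors' model), a single Delayed Match to Sample trial could be sketched as below. The function `run_dmts_trial`, the class `RememberingAgent`, the stimulus names, and the delay length are all hypothetical choices made for this sketch.

```python
import random


def run_dmts_trial(agent, stimuli=("left_light", "right_light"), delay_steps=3, rng=None):
    """Run one hypothetical DMTS trial and return the reward (1.0 or 0.0)."""
    rng = rng or random.Random()
    sample = rng.choice(stimuli)
    agent.observe(sample)              # sample phase: the cue is visible
    for _ in range(delay_steps):       # delay phase: nothing is visible
        agent.observe(None)
    choice = agent.choose(list(stimuli))  # comparison phase: pick one option
    return 1.0 if choice == sample else 0.0


class RememberingAgent:
    """Toy agent that keeps the last non-empty stimulus in working memory."""

    def __init__(self):
        self.memory = None

    def observe(self, stimulus):
        if stimulus is not None:
            self.memory = stimulus

    def choose(self, options):
        # Fall back to the first option if nothing was remembered.
        return self.memory if self.memory in options else options[0]


if __name__ == "__main__":
    rewards = [run_dmts_trial(RememberingAgent(), rng=random.Random(i)) for i in range(5)]
    print(rewards)
```

The reward arrives only after the delay, so an agent that does not carry the sample across the blank steps cannot reliably match it. This is why the task is used as a probe of working memory.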
Hierarchical learning of efficient skill application for autonomous robots
- SIRS '95
, 1995
Cited by 4 (3 self)
This paper presents a novel hierarchical approach to learning the efficient application of robot skills in order to solve complex tasks. By using the idea of skills and elementary operations as a means to "discretize" the continuous perception and action space of a robot, methods such as Watkins' Q-Learning [41] can be employed without the need to artificially restrict the tasks to be solved. Moreover, the method allows for detection of missing skills, thereby providing a way to realize long-term learning in robots that can be supported through autonomous experimentation as well as by means of user interaction and traditional knowledge-based robot programming techniques.
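To make the role of discretization concrete, here is a minimal sketch of the standard tabular Q-learning update applied to a small set of discrete skills. It is not the paper's hierarchical method. The skill names, state labels, and the `alpha` and `gamma` values are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical discrete skills standing in for the robot's action space.
SKILLS = ["approach", "grasp", "retreat"]


def q_update(q, state, skill, reward, next_state, skills, alpha=0.5, gamma=0.9):
    """One Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, s)] for s in skills)
    q[(state, skill)] += alpha * (reward + gamma * best_next - q[(state, skill)])
    return q[(state, skill)]


if __name__ == "__main__":
    q = defaultdict(float)  # unseen (state, skill) pairs start at 0.0
    q_update(q, "far", "approach", 0.0, "near", SKILLS)
    q_update(q, "near", "grasp", 1.0, "done", SKILLS)
    # The reward earned by "grasp" now propagates back to "approach".
    print(q_update(q, "far", "approach", 0.0, "near", SKILLS))
```

Because states and skills are discrete labels, the value table stays finite, and that is the property a tabular method needs.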
Adaptive Behavior Navigation of a Mobile Robot
, 2001
Cited by 1 (0 self)
This paper describes a neural network model for the reactive behavioral navigation of a mobile robot. From the information received through the sensors the robot can elicit one of several behaviors (e.g. stop, avoid, stroll, wall following), through a competitive neural network. The robot is able to develop a control strategy depending on sensor information and learning operation. Reinforcement learning improves the navigation of the robot by adapting the eligibility of the behaviors and determining the linear and angular robot velocities.
Keywords: mobile robots, obstacle avoidance, learning control, neural networks, robot navigation, adaptive behavior
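One simple way to picture competitive behavior selection is a winner-take-all layer: each behavior gets an activation from the sensor readings, and the most active behavior is chosen. The sketch below assumes a single linear layer with hand-picked weights and a two-sensor input plus a bias term. None of these details come from the paper, and no learning is shown.

```python
BEHAVIORS = ["stop", "avoid", "stroll", "wall_following"]

# Hypothetical weights, one row per behavior.
# Input layout (also an assumption): [front_proximity, side_proximity, bias=1.0]
WEIGHTS = [
    [4.0, 0.0, -3.2],    # stop: only wins when an obstacle is very close ahead
    [1.0, 0.0, -0.3],    # avoid: grows with front proximity
    [-1.0, -1.0, 0.5],   # stroll: preferred when nothing is nearby
    [-0.5, 1.0, 0.0],    # wall_following: driven by side proximity
]


def select_behavior(sensors, weights=WEIGHTS):
    """Return the winning behavior and all activations (winner-take-all)."""
    activations = [sum(w * x for w, x in zip(row, sensors)) for row in weights]
    winner = max(range(len(activations)), key=activations.__getitem__)
    return BEHAVIORS[winner], activations


if __name__ == "__main__":
    for reading in ([0.0, 0.0, 1.0], [0.6, 0.0, 1.0], [0.1, 0.9, 1.0], [1.0, 0.0, 1.0]):
        print(reading, "->", select_behavior(reading)[0])
```

In a learning setting, the rows of `WEIGHTS` would be adapted from reinforcement rather than fixed by hand.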
In: MLnet Workshop on Industrial Applications of Machine Learning, Dourdan, France
This paper describes methodologies applied and results achieved in the framework of the ESPRIT Basic Research Action B-Learn II (project no. 7274). B-Learn II is one of the first projects working towards an application of Machine Learning techniques in fields of industrial relevance, which are much more complex than the domains usually treated in ML research. In particular, B-Learn II aims at easing the programming of robots and enhancing their ability to cooperate with humans. The paper gives a short introduction to learning in robotics and to the three applications under consideration in B-Learn II. Afterwards, some examples of learning methodologies employed and results achieved in B-Learn II are presented, and the original references for the work going on in B-Learn II are given. A more thorough description of B-Learn II can be found in [22].

