Reward functions for accelerated learning. (1994)

by M. J. Mataric
Venue: In ICML.

Results 1 - 10 of 195

Reinforcement learning: a survey

by Leslie Pack Kaelbling, Michael L. Littman, Andrew W. Moore - Journal of Artificial Intelligence Research, 1996
"... This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem ..."
Abstract - Cited by 1714 (25 self) - Add to MetaCart
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
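
Learning from delayed reinforcement, one of the central issues named above, is typically handled by bootstrapping future value estimates. The canonical instance (a standard textbook formula, not specific to this survey) is the Q-learning update:

    Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]

where α is the learning rate and γ the discount factor; the bootstrapped max term propagates delayed reward backward through the state space.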

Citation Context

...of this work, mentioned in Section 6.3, was a pre-programmed breakdown of the monolithic task description into a set of lower level tasks to be learned. 3. Mataric [73] describes a robotics experiment with, from the viewpoint of theoretical reinforcement learning, an unthinkably high-dimensional state space, containing many dozens of degrees of freedom. Four mobile ...

Cooperative mobile robotics: Antecedents and directions

by Y. Uny Cao, Alex S. Fukunaga, Andrew B. Kahng, Frank Meng, 1995
"... There has been increased research interest in systems composed of multiple autonomous mobile robots exhibiting collective behavior. Groups of mobile robots are constructed, with an aim to studying such issues as group architecture, resource conflict, origin of cooperation, learning, and geometric pr ..."
Abstract - Cited by 385 (3 self) - Add to MetaCart
There has been increased research interest in systems composed of multiple autonomous mobile robots exhibiting collective behavior. Groups of mobile robots are constructed, with the aim of studying such issues as group architecture, resource conflict, origin of cooperation, learning, and geometric problems. As yet, few applications of collective robotics have been reported, and supporting theory is still in its formative stages. In this paper, we give a critical survey of existing works and discuss open problems in this field, emphasizing the various theoretical issues that arise in the study of cooperative robotics. We describe the intellectual heritages that have guided early research, as well as possible additions to the set of existing motivations.

Behavior-Based Control: Examples from Navigation, Learning, and Group Behavior

by Maja J. Mataric - Journal of Experimental and Theoretical Artificial Intelligence, 1997
"... This paper describes the main properties of behavior-based approaches to control. Different approaches to designing and using behaviors as basic units for control, representation, and learning are illustrated on three empirical examples of robots performing navigation and path-finding, group behavio ..."
Abstract - Cited by 224 (38 self) - Add to MetaCart
This paper describes the main properties of behavior-based approaches to control. Different approaches to designing and using behaviors as basic units for control, representation, and learning are illustrated on three empirical examples of robots performing navigation and path-finding, group behaviors, and learning behavior selection. An architecture provides a set of principles for organizing control systems. In addition to supplying structure, it imposes constraints on the way control problems can be solved. In this paper we explore the constraints of behavior-based approaches to control, and demonstrate them on three architectures that were used to implement robots that successfully performed navigation and pathfinding, group behaviors, and learning of behavior selection. In each case, we focus on the different ways behaviors are defined, modularized, and combined. This paper is organized as follows. Section 2 gives an overview of basic approaches to autonomous agent...

Citation Context

...process of choosing the set of basis behaviors for a given domain: from the bottom-up by the dynamics of the agent and the environment, and from the top-down by the agent's goals as specified by the task (Matarić 1994a). The combination of the two types of constraints helps the designer prune the agent's behavior space and find an efficient basis set. The basis behavior methodology was demonstrated on the Nerd Her...

Cooperative Multi-Agent Learning: The State of the Art

by Liviu Panait, Sean Luke - Autonomous Agents and Multi-Agent Systems, 2005
"... Cooperative multi-agent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. ..."
Abstract - Cited by 182 (8 self) - Add to MetaCart
Cooperative multi-agent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multi-agent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multi-agent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multi-agent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multi-agent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multi-agent learning problem domains, and a list of multi-agent learning resources.

Citation Context

... obvious way to tackle this is to use domain knowledge to simplify the state space, often by providing a smaller set of more “powerful” actions customized for the problem domain. For example, Mataric [162, 160] applies Q-learning to select from hand-coded reactive behaviors such as avoid, head-home, search or disperse for robot foraging tasks. An alternative has been to reduce complexity by heuristically de...
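
As a rough sketch of the behavior-level learning described in this context (an illustration, not Mataric's implementation: the condition-state encoding, hyperparameters, and epsilon-greedy schedule below are assumptions), Q-learning can select among a handful of hand-coded behaviors instead of low-level motor actions:

    import random
    from collections import defaultdict

    # Hand-coded reactive behaviors serve as the "action" set,
    # shrinking the learning problem (names from the context above).
    BEHAVIORS = ["avoid", "head-home", "search", "disperse"]

    # Q-table over (condition-state, behavior) pairs; states are abstract
    # conditions (e.g., "carrying-puck"), not raw sensor readings.
    Q = defaultdict(float)

    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # assumed hyperparameters

    def select_behavior(state):
        """Epsilon-greedy choice over the behavior set."""
        if random.random() < EPSILON:
            return random.choice(BEHAVIORS)
        return max(BEHAVIORS, key=lambda b: Q[(state, b)])

    def update(state, behavior, reward, next_state):
        """One-step Q-learning update at the behavior level."""
        best_next = max(Q[(next_state, b)] for b in BEHAVIORS)
        Q[(state, behavior)] += ALPHA * (reward + GAMMA * best_next
                                         - Q[(state, behavior)])

Because the action set has four members rather than a continuum of motor commands, the table stays small enough to learn on a physical robot.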

Reinforcement Learning In Continuous Time and Space

by Kenji Doya - Neural Computation, 2000
"... This paper presents a reinforcement learning framework for continuoustime dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinitehorizon, discounted reward problems, we derive algorithms for estimating value f ..."
Abstract - Cited by 176 (7 self) - Add to MetaCart
This paper presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinite-horizon, discounted reward problems, we derive algorithms for estimating value functions and for improving policies with the use of function approximators. The process of value function estimation is formulated as the minimization of a continuous-time form of the temporal difference (TD) error. Update methods based on backward Euler approximation and exponential eligibility traces are derived and their correspondences with the conventional residual gradient, TD(0), and TD(λ) algorithms are shown. For policy improvement, two methods, namely, a continuous actor-critic method and a value-gradient based greedy policy, are formulated. As a special case of the latter, a nonlinear feedback control law using the value gradient and the model of the input gain is derived....
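
The continuous-time TD error minimized in this framework is, as derived in the paper from the HJB optimality condition (sketched here; τ denotes the time constant of discounting):

    \delta(t) = r(t) - \frac{1}{\tau} V(t) + \dot{V}(t)

Driving δ(t) to zero enforces the HJB equation, and a backward Euler discretization of the time derivative recovers the conventional TD(0) error as a special case.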

Citation Context

...successful applications to large-scale problems, such as board games (Tesauro, 1994), dispatch problems (Crites and Barto, 1996; Zhang and Dietterich, 1996; Singh and Bertsekas, 1997), and robot navigation (Mataric, 1994) have been reported (see, e.g., Kaelbling et al. (1996) and Sutton and Barto (1998) for a review). The progress of RL research so far, however, has been mostly constrained to the discrete formulation...

Challenges in Evolving Controllers for Physical Robots

by Maja Mataric, Dave Cliff, 1996
"... This paper discusses the feasibility of applying evolutionary methods to automatically generating controllers for physical mobile robots. We overview the state of the art in the field, describe some of the main approaches, discuss the key challenges, unanswered problems, and some promising direction ..."
Abstract - Cited by 156 (6 self) - Add to MetaCart
This paper discusses the feasibility of applying evolutionary methods to automatically generating controllers for physical mobile robots. We overview the state of the art in the field, describe some of the main approaches, discuss the key challenges, unanswered problems, and some promising directions. This paper is concerned with the distant goal of automated synthesis of robot controllers. Specifically, we focus on the problems of evolving controllers for physically embodied and embedded systems that deal with all of the noise and uncertainty present in the world. We will also address some systems that evolve both the morphology and the controller of a robot. Within the scope of this paper we define morphology as the physical, embodied characteristics of the robot, such as its mechanics and sensor organization. Given that definition, the only examples of evolving both morphology and control exist in simulation. Evolutionary methods for automated hardware design are an ...

Issues and Approaches in Design of Collective Autonomous Agents

by Maja J. Mataric - Robotics and Autonomous Systems, 1994
"... The problem of synthesizing and analyzing collective autonomous agents has only recently begun to be practically studied by the robotics community. This paper overviews the most prominent directions of research, defines key terms, and summarizes the main issues. Finally, it briefly describes our app ..."
Abstract - Cited by 152 (14 self) - Add to MetaCart
The problem of synthesizing and analyzing collective autonomous agents has only recently begun to be practically studied by the robotics community. This paper overviews the most prominent directions of research, defines key terms, and summarizes the main issues. Finally, it briefly describes our approach to controlling group behavior and its relation to the field as a whole.

Citation Context

...necessary and sufficient subsets of state required for triggering the behavior set. Conditions are many fewer than states, so their use diminishes the agent's learning space and speeds up any RL algorithm (Matarić 1994c). In addition to the use of behaviors and conditions, we introduced two ways of shaping the reinforcement function to aid the agent-learner in the nondeterministic, noisy, and dynamic environment. ...
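
A minimal sketch of the condition idea (the sensor names and thresholds below are illustrative assumptions, not taken from the cited work): raw sensory state is collapsed into a few boolean conditions, so the learner indexes a handful of condition tuples rather than the full state space.

    def conditions(sonar_min, have_puck, at_home):
        """Collapse a raw sensory state into the small condition
        tuple that triggers behavior selection (thresholds are
        illustrative)."""
        return (
            sonar_min < 0.3,   # obstacle-near
            bool(have_puck),   # carrying-puck
            bool(at_home),     # at-home-region
        )

    # 2**3 = 8 condition states replace a continuous sensor space,
    # which is the reduction in learning space described above.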

Designing and Understanding Adaptive Group Behavior

by Maja J. Mataric - Adaptive Behavior, 1995
"... This paper proposes the concept of basis behaviors as ubiquitous general building blocks for synthesizing artificial group behavior in multi--agent systems, and for analyzing group behavior in nature. We demonstrate the concept through examples implemented both in simulation and on a group of physic ..."
Abstract - Cited by 148 (32 self) - Add to MetaCart
This paper proposes the concept of basis behaviors as ubiquitous general building blocks for synthesizing artificial group behavior in multi-agent systems, and for analyzing group behavior in nature. We demonstrate the concept through examples implemented both in simulation and on a group of physical mobile robots. The basis behavior set we propose, consisting of avoidance, safe-wandering, following, aggregation, dispersion, and homing, is constructed from behaviors commonly observed in a variety of species in nature. The proposed behaviors are manifested spatially, but have an effect on more abstract modes of interaction, including the exchange of information and cooperation. We demonstrate how basis behaviors can be combined into higher-level group behaviors commonly observed across species. The combination mechanisms we propose are useful for synthesizing a variety of new group behaviors, as well as for analyzing naturally occurring ones. Key words: group behavior, robotics, eth...
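
One common way to realize the combination mechanisms mentioned above is weighted summation of the concurrently active basis behaviors' motor outputs (a sketch under assumptions: the 2-D velocity-vector interface and the weights are illustrative, not the paper's specification):

    import math

    def combine(behavior_outputs, weights):
        """Sum weighted 2-D velocity vectors produced by active basis
        behaviors (e.g., dispersion + homing yields flocking-like
        motion) and return a normalized (vx, vy) command."""
        vx = sum(w * out[0] for out, w in zip(behavior_outputs, weights))
        vy = sum(w * out[1] for out, w in zip(behavior_outputs, weights))
        norm = math.hypot(vx, vy) or 1.0
        return (vx / norm, vy / norm)

Switching which behaviors receive nonzero weight, or sequencing them over time, yields the higher-level group behaviors the abstract refers to.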

Citation Context

...behavior set should be sufficient for accomplishing the goals in a given domain so no other basis behaviors are necessary. Finally, basis behaviors should be simple, local, stable, robust, and scalable (Matarić 1994a). To evaluate our selected behaviors, we applied the above criteria to implementations on physical robots interacting in the real world, with all of the present error, noise, and uncertainty. In ord...

Purposive behavior acquisition on a real robot by vision-based reinforcement learning

by Minoru Asada, Shoichi Noda, Sukoya Tawaratsumida, Koh Hosoda - Machine Learning, 1996
"... This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with vision sensor by which the robot can obtain information about the changes in an envi ..."
Abstract - Cited by 130 (30 self) - Add to MetaCart
This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with a vision sensor by which the robot can obtain information about the changes in an environment. First, we construct a state space in terms of the size, position, and orientation of a ball and a goal in an image, and an action space is designed in terms of the action commands to be sent to the left and right motors of a mobile robot. This causes a “state-action deviation” problem in constructing state and action spaces that reflect the outputs from physical sensors and actuators, respectively. To deal with this issue, an action set is constructed such that one action consists of a series of the same action primitive, executed successively until the current state changes. Next, to speed up learning, a mechanism of Learning from Easy Missions (or LEM) is implemented. LEM reduces the learning time from exponential to almost linear order in the size of the state space. The results of computer simulations and real robot experiments are given.
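
The "one action = the same primitive repeated until the state changes" construction can be sketched as follows (the robot interface and discretizer are hypothetical placeholders, not the authors' API):

    def execute_action(robot, primitive, discretize, max_steps=100):
        """Apply one action primitive repeatedly until the discretized
        state changes, keeping state transitions aligned with action
        boundaries (the remedy for the state-action deviation problem)."""
        start = discretize(robot.observe())
        for _ in range(max_steps):
            robot.step(primitive)  # e.g., same left/right motor command
            if discretize(robot.observe()) != start:
                break
        return discretize(robot.observe())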

Citation Context

...action is forward, backward, left, or right, and the states are encoded by the locations of the agent). However, this is not always the case in the real world, where everything changes asynchronously (Mataric, 1994). Thus, we need to have the following principles for the construction of state and action spaces. • Natural segmentation of the state and action spaces: The state (action) space should reflect the co...

Planning, learning and coordination in multiagent decision processes

by Craig Boutilier - In Proceedings of the Sixth Conference on Theoretical Aspects of Rationality and Knowledge (TARK96), 1996
"... There has been a growing interest in AI in the design of multiagent systems, especially in multiagent cooperative planning. In this paper, we investigate the extent to which methods from single-agent planning and learning can be applied in multiagent settings. We survey a number of different techniq ..."
Abstract - Cited by 121 (1 self) - Add to MetaCart
There has been a growing interest in AI in the design of multiagent systems, especially in multiagent cooperative planning. In this paper, we investigate the extent to which methods from single-agent planning and learning can be applied in multiagent settings. We survey a number of different techniques from decision-theoretic planning and reinforcement learning and describe a number of interesting issues that arise with regard to coordinating the policies of individual agents. To this end, we describe multiagent Markov decision processes as a general model in which to frame this discussion. These are special n-person cooperative games in which agents share the same utility function. We discuss coordination mechanisms based on imposed conventions (or social laws) as well as learning methods for coordination. Our focus is on the decomposition of sequential decision processes so that coordination can be learned (or imposed) locally, at the level of individual states. We also discuss the use of structured problem representations and their role in the generalization of learned conventions and in approximation.
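
The multiagent MDP model framing this discussion can be written, following the abstract's description of an n-person cooperative game with a shared utility function (a sketch of the model, with notation assumed here):

    \langle \alpha, \{A_i\}_{i \in \alpha}, S, \Pr, R \rangle

where α is a finite set of agents, A_i the actions available to agent i, Pr(s, a_1, ..., a_n, s') the joint-action transition function, and R the single reward function shared by every agent; cooperation arises because all agents maximize the same expected reward.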

Citation Context

...other agents. In other words, other agents are simply treated as part of the environment. Recent work in applying Q-learning to multiagent systems seems to adopt just this approach. For instance, Mataric [34] describes experiments with mobile robots in which Q-learning is applied to a cooperative task with good results. In a similar vein, Yanco and Stein [58] also reported experimental results with hierar...
