Results 1 - 10 of 195
Reinforcement learning: a survey
Journal of Artificial Intelligence Research, 1996
"... This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem ..."
Cited by 1714 (25 self)
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
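As a concrete illustration of the trial-and-error learning and the exploration/exploitation trade-off this survey covers, here is a minimal sketch of tabular Q-learning with epsilon-greedy action selection and optimistic initialization. The corridor task and all constants are invented for illustration; they are not from the paper.

```python
import random
from collections import defaultdict

# Minimal sketch: tabular Q-learning with epsilon-greedy exploration on a
# one-dimensional corridor. Task and constants are illustrative assumptions.

N_STATES = 6                    # states 0..5; state 5 is the goal
ACTIONS = (-1, +1)              # move left or move right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Optimistic initial values encourage systematic early exploration.
Q = defaultdict(lambda: 1.0)    # Q[(state, action)] -> estimated return

def step(s, a):
    """Deterministic corridor dynamics: reward only at the right end."""
    s2 = max(0, min(N_STATES - 1, s + a))
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

def policy(s):
    """Epsilon-greedy: explore with probability EPS, else exploit."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

for episode in range(500):
    s = 0
    for _ in range(200):        # step cap keeps episodes bounded
        a = policy(s)
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the greedy value of s2.
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2
        if done:
            break

# Learned greedy action in each non-terminal state (expect +1 everywhere).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)})
```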
Cooperative mobile robotics: Antecedents and directions
1995
"... There has been increased research interest in systems composed of multiple autonomous mobile robots exhibiting collective behavior. Groups of mobile robots are constructed, with an aim to studying such issues as group architecture, resource conflict, origin of cooperation, learning, and geometric pr ..."
Cited by 385 (3 self)
There has been increased research interest in systems composed of multiple autonomous mobile robots exhibiting collective behavior. Groups of mobile robots are constructed, with an aim to studying such issues as group architecture, resource conflict, origin of cooperation, learning, and geometric problems. As yet, few applications of collective robotics have been reported, and supporting theory is still in its formative stages. In this paper, we give a critical survey of existing works and discuss open problems in this field, emphasizing the various theoretical issues that arise in the study of cooperative robotics. We describe the intellectual heritages that have guided early research, as well as possible additions to the set of existing motivations.
Behavior-Based Control: Examples from Navigation, Learning, and Group Behavior
Journal of Experimental and Theoretical Artificial Intelligence, 1997
"... This paper describes the main properties of behavior-based approaches to control. Different approaches to designing and using behaviors as basic units for control, representation, and learning are illustrated on three empirical examples of robots performing navigation and path-finding, group behavio ..."
Cited by 224 (38 self)
This paper describes the main properties of behavior-based approaches to control. Different approaches to designing and using behaviors as basic units for control, representation, and learning are illustrated on three empirical examples of robots performing navigation and path-finding, group behaviors, and learning behavior selection.

1 Introduction
An architecture provides a set of principles for organizing control systems. In addition to supplying structure, it imposes constraints on the way control problems can be solved. In this paper we explore the constraints of behavior-based approaches to control, and demonstrate them on three architectures that were used to implement robots that successfully performed navigation and pathfinding, group behaviors, and learning of behavior selection. In each case, we focus on the different ways behaviors are defined, modularized, and combined. This paper is organized as follows. Section 2 gives an overview of basic approaches to autonomous agent...
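One common way behaviors are combined in behavior-based control is fixed-priority arbitration in the subsumption style, where the highest-priority behavior that wants control wins. The sketch below illustrates the idea under invented assumptions: the behavior set, sensor dictionary, and command format are not the specific architectures described in the paper.

```python
# Minimal sketch of fixed-priority behavior arbitration. Behaviors either
# return a motor command or abstain (None), deferring to lower priority.

def avoid(sensors):
    """Highest priority: turn away when an obstacle is close."""
    if sensors["obstacle_dist"] < 0.3:
        return {"turn": 1.0, "speed": 0.0}
    return None

def follow_wall(sensors):
    """Middle priority: track a nearby wall."""
    if sensors["wall_dist"] < 1.0:
        return {"turn": 0.1, "speed": 0.5}
    return None

def wander(sensors):
    """Lowest priority: default exploratory motion."""
    return {"turn": 0.0, "speed": 0.8}

BEHAVIORS = [avoid, follow_wall, wander]   # ordered by priority

def arbitrate(sensors):
    """The first behavior that produces a command wins."""
    for behavior in BEHAVIORS:
        command = behavior(sensors)
        if command is not None:
            return command

print(arbitrate({"obstacle_dist": 0.2, "wall_dist": 0.8}))  # avoid wins
print(arbitrate({"obstacle_dist": 2.0, "wall_dist": 0.8}))  # follow_wall wins
```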
Cooperative Multi-Agent Learning: The State of the Art
Autonomous Agents and Multi-Agent Systems, 2005
"... Cooperative multi-agent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. ..."
Cited by 182 (8 self)
Cooperative multi-agent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multi-agent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multi-agent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multi-agent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multi-agent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multi-agent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multi-agent learning problem domains, and a list of multi-agent learning resources.
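The survey's team-learning versus concurrent-learning distinction can be made concrete on a tiny cooperative matrix game. In this sketch the payoff table and learning constants are illustrative assumptions; note that the two concurrent learners can settle on mismatched actions, which is exactly the kind of coordination issue the survey discusses.

```python
import random

# A 2x2 fully cooperative game: both agents receive the same payoff.
PAYOFF = {("a", "a"): 1.0, ("a", "b"): 0.0,
          ("b", "a"): 0.0, ("b", "b"): 1.0}
ACTIONS, ALPHA, EPS = ("a", "b"), 0.2, 0.1

# Team learning: a single learner searches the joint action space directly.
q_team = {ja: 0.0 for ja in PAYOFF}
for _ in range(2000):
    if random.random() < EPS:
        ja = random.choice(list(PAYOFF))
    else:
        ja = max(q_team, key=q_team.get)
    q_team[ja] += ALPHA * (PAYOFF[ja] - q_team[ja])

# Concurrent learning: one learner per agent; each faces a moving target
# because its reward depends on what the other learner happens to play.
def eps_greedy(q):
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=q.get)

q1 = dict.fromkeys(ACTIONS, 0.0)
q2 = dict.fromkeys(ACTIONS, 0.0)
for _ in range(2000):
    a1, a2 = eps_greedy(q1), eps_greedy(q2)
    r = PAYOFF[(a1, a2)]          # shared reward: fully cooperative
    q1[a1] += ALPHA * (r - q1[a1])
    q2[a2] += ALPHA * (r - q2[a2])

print(max(q_team, key=q_team.get), (max(q1, key=q1.get), max(q2, key=q2.get)))
```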
Reinforcement Learning in Continuous Time and Space
Neural Computation, 2000
"... This paper presents a reinforcement learning framework for continuoustime dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinitehorizon, discounted reward problems, we derive algorithms for estimating value f ..."
Cited by 176 (7 self)
This paper presents a reinforcement learning framework for continuous-time dynamical systems without a priori discretization of time, state, and action. Based on the Hamilton-Jacobi-Bellman (HJB) equation for infinite-horizon, discounted reward problems, we derive algorithms for estimating value functions and for improving policies with the use of function approximators. The process of value function estimation is formulated as the minimization of a continuous-time form of the temporal difference (TD) error. Update methods based on backward Euler approximation and exponential eligibility traces are derived and their correspondences with the conventional residual gradient, TD(0), and TD(λ) algorithms are shown. For policy improvement, two methods, namely, a continuous actor-critic method and a value-gradient based greedy policy, are formulated. As a special case of the latter, a nonlinear feedback control law using the value gradient and the model of the input gain is derived....
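For orientation, the continuous-time TD error the abstract refers to can be stated as follows, assuming the standard setup with a discount time constant τ (the continuous analogue of the discrete discount factor). The notation here is a common convention and not necessarily the paper's own.

```latex
% Discounted value of the state trajectory x(t) under control u(t),
% with reward rate r and discount time constant tau (assumed notation):
\[
  V(x(t)) = \int_t^{\infty} e^{-(s-t)/\tau}\, r(x(s), u(s))\, ds .
\]
% Differentiating with respect to t gives the self-consistency condition
% \dot V(x(t)) = V(x(t))/\tau - r(t); the degree to which a value estimate
% violates it is the continuous-time TD error:
\[
  \delta(t) = r(t) - \frac{1}{\tau}\, V(x(t)) + \dot V(x(t)) ,
\]
% the analogue of the discrete TD error r_{t+1} + \gamma V(s_{t+1}) - V(s_t).
```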
Challenges in Evolving Controllers for Physical Robots
1996
"... This paper discusses the feasibility of applying evolutionary methods to automatically generating controllers for physical mobile robots. We overview the state of the art in the field, describe some of the main approaches, discuss the key challenges, unanswered problems, and some promising direction ..."
Cited by 156 (6 self)
This paper discusses the feasibility of applying evolutionary methods to automatically generating controllers for physical mobile robots. We overview the state of the art in the field, describe some of the main approaches, discuss the key challenges, unanswered problems, and some promising directions.

1 Introduction
This paper is concerned with the distant goal of automated synthesis of robot controllers. Specifically, we focus on the problems of evolving controllers for physically embodied and embedded systems that deal with all of the noise and uncertainty present in the world. We will also address some systems that evolve both the morphology and the controller of a robot. Within the scope of this paper we define morphology as the physical, embodied characteristics of the robot, such as its mechanics and sensor organization. Given that definition, the only examples of evolving both morphology and control exist in simulation. Evolutionary methods for automated hardware design are an ...
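To make the approach concrete, here is a minimal sketch of the kind of evolutionary loop such methods use to search controller parameters. The genome encoding, truncation selection scheme, and stand-in fitness function are illustrative assumptions; on a physical robot, each fitness evaluation would be an expensive trial on real hardware, which is one of the key challenges the paper discusses.

```python
import random

# Minimal sketch of an evolutionary loop over controller parameters.
# The "controller" is just a weight vector; fitness() is a stand-in.

GENOME_LEN, POP_SIZE, GENERATIONS, MUT_STD = 8, 20, 50, 0.1

def fitness(genome):
    """Stand-in objective: prefer weights near an arbitrary target value."""
    return -sum((g - 0.5) ** 2 for g in genome)

def mutate(genome):
    """Gaussian mutation of every controller weight."""
    return [g + random.gauss(0, MUT_STD) for g in genome]

population = [[random.uniform(-1, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for generation in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    elite = population[: POP_SIZE // 2]              # truncation selection
    population = elite + [mutate(random.choice(elite)) for _ in elite]

print(round(fitness(max(population, key=fitness)), 4))  # best fitness found
```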
Issues and Approaches in Design of Collective Autonomous Agents
Robotics and Autonomous Systems, 1994
"... The problem of synthesizing and analyzing collective autonomous agents has only recently begun to be practically studied by the robotics community. This paper overviews the most prominent directions of research, defines key terms, and summarizes the main issues. Finally, it briefly describes our app ..."
Cited by 152 (14 self)
The problem of synthesizing and analyzing collective autonomous agents has only recently begun to be practically studied by the robotics community. This paper overviews the most prominent directions of research, defines key terms, and summarizes the main issues. Finally, it briefly describes our approach to controlling group behavior and its relation to the field as a whole.
Designing and Understanding Adaptive Group Behavior
Adaptive Behavior, 1995
"... This paper proposes the concept of basis behaviors as ubiquitous general building blocks for synthesizing artificial group behavior in multi--agent systems, and for analyzing group behavior in nature. We demonstrate the concept through examples implemented both in simulation and on a group of physic ..."
Cited by 148 (32 self)
This paper proposes the concept of basis behaviors as ubiquitous general building blocks for synthesizing artificial group behavior in multi-agent systems, and for analyzing group behavior in nature. We demonstrate the concept through examples implemented both in simulation and on a group of physical mobile robots. The basis behavior set we propose, consisting of avoidance, safe-wandering, following, aggregation, dispersion, and homing, is constructed from behaviors commonly observed in a variety of species in nature. The proposed behaviors are manifested spatially, but have an effect on more abstract modes of interaction, including the exchange of information and cooperation. We demonstrate how basis behaviors can be combined into higher-level group behaviors commonly observed across species. The combination mechanisms we propose are useful for synthesizing a variety of new group behaviors, as well as for analyzing naturally occurring ones. Key words: group behavior, robotics, eth...
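One simple combination mechanism consistent with the basis-behavior idea is to sum weighted motion vectors produced by concurrently active behaviors. The sketch below uses invented behaviors and weights for illustration; it is not the paper's specific mechanism.

```python
# Minimal sketch: combine spatial basis behaviors by weighted vector sum.

def dispersion(pos, neighbors):
    """Basis behavior: move away from nearby neighbors."""
    return (sum(pos[0] - nx for nx, _ in neighbors),
            sum(pos[1] - ny for _, ny in neighbors))

def homing(pos, goal):
    """Basis behavior: move toward a fixed goal location."""
    return goal[0] - pos[0], goal[1] - pos[1]

def combine(vectors, weights):
    """Weighted sum of behavior outputs gives the commanded motion."""
    return (sum(w * vx for w, (vx, _) in zip(weights, vectors)),
            sum(w * vy for w, (_, vy) in zip(weights, vectors)))

pos, goal = (0.0, 0.0), (5.0, 5.0)
neighbors = [(1.0, 0.0), (0.0, 1.0)]
motion = combine([dispersion(pos, neighbors), homing(pos, goal)],
                 weights=[0.5, 1.0])
print(motion)  # net motion: toward the goal while keeping clear of neighbors
```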
Purposive behavior acquisition on a real robot by vision-based reinforcement learning
Machine Learning, 1996
"... This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with vision sensor by which the robot can obtain information about the changes in an envi ..."
Cited by 130 (30 self)
This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with a vision sensor by which the robot can obtain information about the changes in an environment. First, we construct a state space in terms of size, position, and orientation of a ball and a goal in an image, and an action space is designed in terms of the action commands to be sent to the left and right motors of a mobile robot. This causes a “state-action deviation” problem in constructing the state and action spaces that reflect the outputs from physical sensors and actuators, respectively. To deal with this issue, an action set is constructed in a way that one action consists of a series of the same action primitive which is successively executed until the current state changes. Next, to speed up the learning time, a mechanism of Learning from Easy Missions (or LEM) is implemented. LEM reduces the learning time from exponential to almost linear order in the size of the state space. The results of computer simulations and real robot experiments are given.
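The abstract's remedy for the state-action deviation problem can be sketched as a wrapper that repeats one action primitive until the coarse, image-based state changes. The environment interface and toy dynamics below are invented for illustration.

```python
# Minimal sketch: one learning-level action repeats a single action
# primitive until the coarse, vision-based state changes.

def execute_action(env, state, primitive, max_steps=100):
    """Apply `primitive` each motor step until the discretized state
    changes; return the new state and the accumulated reward."""
    total_reward = 0.0
    for _ in range(max_steps):
        next_state, reward = env.step(primitive)
        total_reward += reward
        if next_state != state:          # coarse image-based state changed
            return next_state, total_reward
    return state, total_reward           # state never changed; give up

class ToyEnv:
    """Stand-in dynamics: the coarse state changes only after several
    motor steps in the same direction."""
    def __init__(self):
        self.fine = 0
    def step(self, primitive):
        self.fine += primitive           # fine-grained motion per motor step
        coarse = self.fine // 5          # coarse, vision-like discretization
        return coarse, (1.0 if coarse >= 2 else 0.0)

env, state = ToyEnv(), 0
state, reward = execute_action(env, state, primitive=+1)
print(state, reward)   # one learning-level action spanned five motor steps
```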
Planning, learning and coordination in multiagent decision processes
In Proceedings of the Sixth Conference on Theoretical Aspects of Rationality and Knowledge (TARK96), 1996
"... There has been a growing interest in AI in the design of multiagent systems, especially in multiagent cooperative planning. In this paper, we investigate the extent to which methods from single-agent planning and learning can be applied in multiagent settings. We survey a number of different techniq ..."
Cited by 121 (1 self)
There has been a growing interest in AI in the design of multiagent systems, especially in multiagent cooperative planning. In this paper, we investigate the extent to which methods from single-agent planning and learning can be applied in multiagent settings. We survey a number of different techniques from decision-theoretic planning and reinforcement learning and describe a number of interesting issues that arise with regard to coordinating the policies of individual agents. To this end, we describe multiagent Markov decision processes as a general model in which to frame this discussion. These are special n-person cooperative games in which agents share the same utility function. We discuss coordination mechanisms based on imposed conventions (or social laws) as well as learning methods for coordination. Our focus is on the decomposition of sequential decision processes so that coordination can be learned (or imposed) locally, at the level of individual states. We also discuss the use of structured problem representations and their role in the generalization of learned conventions and in approximation.
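For reference, the multiagent MDP model the abstract describes can be written down as follows; the notation is a standard convention assumed here, not necessarily the paper's own.

```latex
% Assumed (standard) notation for the multiagent MDP the abstract describes.
A multiagent Markov decision process for $n$ agents is a tuple
\[
  \langle\, n,\; S,\; \{A_i\}_{i=1}^{n},\; T,\; R \,\rangle ,
\]
where $S$ is a finite set of states, $A_i$ is agent $i$'s action set,
$T : S \times (A_1 \times \cdots \times A_n) \times S \to [0,1]$ gives the
transition probability under a \emph{joint} action, and a single reward
function $R : S \to \mathbb{R}$ is shared by every agent, which is what
makes the process an $n$-person fully cooperative game.
```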