Results 1–10 of 137
Cooperative Multi-Agent Learning: The State of the Art
Autonomous Agents and Multi-Agent Systems, 2005
Abstract

Cited by 182 (8 self)
Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multiagent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multiagent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multiagent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multiagent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multiagent learning problem domains, and a list of multiagent learning resources.
Nash Q-Learning for General-Sum Stochastic Games
Journal of Machine Learning Research, 2003
Abstract

Cited by 138 (0 self)
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restrictions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learning performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
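The update rule described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: it restricts the equilibrium computation to pure strategies (the paper's Nash-Q solves for a mixed-strategy equilibrium of each stage game), and the table shapes, function names, and the max-value fallback are assumptions for the sketch.

```python
import numpy as np

def pure_nash_value(Q1, Q2):
    """Return the payoff pair at a pure-strategy Nash equilibrium of the
    stage game given by joint-action tables Q1, Q2, or None if none exists."""
    n_a, n_b = Q1.shape
    for a in range(n_a):
        for b in range(n_b):
            # a is a best response to b, and b is a best response to a
            if Q1[a, b] >= Q1[:, b].max() and Q2[a, b] >= Q2[a, :].max():
                return Q1[a, b], Q2[a, b]
    return None

def nash_q_update(Q1, Q2, s, a, b, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """One Nash-Q update of both agents' joint-action Q-tables after
    observing joint action (a, b) in state s with rewards (r1, r2)."""
    eq = pure_nash_value(Q1[s_next], Q2[s_next])
    if eq is None:  # fallback assumption: stage game has no pure equilibrium
        eq = Q1[s_next].max(), Q2[s_next].max()
    v1, v2 = eq
    Q1[s, a, b] += alpha * (r1 + gamma * v1 - Q1[s, a, b])
    Q2[s, a, b] += alpha * (r2 + gamma * v2 - Q2[s, a, b])
```

The key difference from single-agent Q-learning is visible in the backup: the next-state value is not a max over own actions but each agent's payoff at an equilibrium of the stage game defined by both agents' Q-values.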
A Polynomial-Time Nash Equilibrium Algorithm for Repeated Games
Proceedings of the ACM Conference on Electronic Commerce (ACM-EC), 2004
Abstract

Cited by 74 (5 self)
With the increasing reliance on game theory as a foundation for auctions and electronic commerce, efficient algorithms for computing equilibria in multiplayer general-sum games are of great theoretical and practical interest. The computational complexity of finding a Nash equilibrium for a one-shot bimatrix game is a well-known open problem. This paper treats a related but distinct problem, that of finding a Nash equilibrium for an average-payoff repeated bimatrix game, and presents a polynomial-time algorithm. Our approach draws on the well-known "folk theorem" from game theory and shows how finite-state equilibrium strategies can be found efficiently and expressed succinctly.
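The folk-theorem idea behind this abstract can be illustrated compactly: any joint payoff that weakly dominates both players' security (minimax) levels can be sustained in the average-payoff repeated game by threatening minimax punishment. The sketch below illustrates that feasibility check only; it is not the paper's algorithm, and it uses pure-strategy minimax for simplicity (the actual construction involves mixed minimax strategies, computable by linear programming).

```python
import numpy as np

def pure_minimax_value(R):
    """Row player's pure-strategy security level in payoff matrix R:
    the opponent picks the column minimizing R, then the row player
    best-responds."""
    return R.min(axis=1).max()

def enforceable_joint_actions(R1, R2):
    """Joint actions whose payoffs weakly dominate both players' threat
    points; the folk theorem says each can be sustained as an average-payoff
    equilibrium of the repeated game, backed by minimax punishment."""
    v1 = pure_minimax_value(R1)    # player 1's threat point
    v2 = pure_minimax_value(R2.T)  # player 2 chooses columns of R2
    return [(a, b)
            for a in range(R1.shape[0]) for b in range(R1.shape[1])
            if R1[a, b] >= v1 and R2[a, b] >= v2]
```

In the Prisoner's Dilemma, for instance, both mutual cooperation and mutual defection clear the threat points, so repeated cooperation backed by punishment is an equilibrium even though it is not one in the one-shot game.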
Multiagent reinforcement learning: a critical survey
2003
Abstract

Cited by 68 (1 self)
We survey the recent work in AI on multiagent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is unclarity about the problem or problems being addressed. After tracing a representative sample of the recent literature, we identify four well-defined problems in multiagent reinforcement learning, single out the problem that in our view is most suitable for AI, and make some remarks about how we believe progress is to be made on this problem.
Implicit Negotiation in Repeated Games
In Proceedings of the Eighth International Workshop on Agent Theories, Architectures, and Languages (ATAL-2001), 2001
Abstract

Cited by 44 (11 self)
In business-related interactions such as the ongoing high-stakes FCC spectrum auctions, explicit communication among participants is regarded as collusion, and is therefore illegal. In this paper, we consider the possibility of autonomous agents engaging in implicit negotiation via their tacit interactions. In repeated general-sum games, our testbed for studying this type of interaction, an agent using a "best-response" strategy maximizes its own payoff assuming its behavior has no effect on its opponent. This notion of best response requires some degree of learning to determine the fixed opponent behavior. Against an unchanging opponent, the best-response agent performs optimally, and can be thought of as a "follower," since it adapts to its opponent. However, pairing two best-response agents in a repeated game can result in suboptimal behavior. We demonstrate this suboptimality in several different games using variants of Q-learning as an example of a best-response strategy. We then examine two "leader" strategies that induce better performance from opponent followers via stubbornness and threats. These tactics are forms of implicit negotiation in that they aim to achieve a mutually beneficial outcome without using explicit communication outside of the game.
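The leader/follower dynamic can be illustrated with a small construction of our own (the paper itself uses Q-learning variants): compute a follower's best response by value iteration against a Tit-for-Tat leader in the repeated Prisoner's Dilemma. A farsighted follower learns that the leader's threat makes cooperation optimal, while a myopic best-responder defects.

```python
# Prisoner's Dilemma rewards for the learner: (my_action, opp_action) -> payoff.
# Actions: 0 = cooperate, 1 = defect.
R = {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 5.0, (1, 1): 1.0}

def best_response_to_tft(gamma=0.9, sweeps=500):
    """Value-iterate the 2-state MDP a follower faces against a Tit-for-Tat
    leader: the state is the opponent's upcoming move, which mirrors the
    follower's previous move. Returns the optimal action in each state."""
    V = [0.0, 0.0]
    for _ in range(sweeps):
        # Bellman backup: acting a in state s yields R[(a, s)] and moves
        # the TFT opponent into state a for the next round.
        V = [max(R[(a, s)] + gamma * V[a] for a in (0, 1)) for s in (0, 1)]
    return [max((0, 1), key=lambda a: R[(a, s)] + gamma * V[a]) for s in (0, 1)]
```

With a discount factor of 0.9 the optimal policy cooperates in both states (the threat works); with a discount factor of 0 the follower reverts to the dominant one-shot action, defection, which is exactly the suboptimality the abstract describes.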
Extending Q-learning to general adaptive multi-agent systems
In Advances in Neural Information Processing Systems 16, 2004
Abstract

Cited by 39 (0 self)
Recent multi-agent extensions of Q-learning require knowledge of other agents' payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This paper proposes a fundamentally different approach, dubbed "Hyper-Q" learning, in which values of mixed strategies rather than base actions are learned, and in which other agents' strategies are estimated from observed actions via Bayesian inference. Hyper-Q may be effective against many different types of adaptive agents, even if they are persistently dynamic. Against certain broad categories of adaptation, it is argued that Hyper-Q may converge to exact optimal time-varying policies. In tests using Rock-Paper-Scissors, Hyper-Q learns to significantly exploit an Infinitesimal Gradient Ascent (IGA) player, as well as a Policy Hill Climber (PHC) player. Preliminary analysis of Hyper-Q against itself is also presented.
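A toy version of the Bayesian opponent-estimation step can be sketched as follows. This is a heavy simplification for illustration only: real Hyper-Q learns a value function over the continuous mixed-strategy space, whereas this sketch maintains a posterior over a finite set of candidate opponent strategies and plays a myopic best response to the posterior mean. The function names and the candidate-set discretization are our assumptions.

```python
import numpy as np

# Rock-Paper-Scissors payoff matrix for the row player (rows/cols: R, P, S).
RPS = np.array([[ 0., -1.,  1.],
                [ 1.,  0., -1.],
                [-1.,  1.,  0.]])

def bayes_update(prior, strategies, observed_action):
    """Posterior over candidate opponent mixed strategies after one observation."""
    likelihood = np.array([s[observed_action] for s in strategies])
    posterior = prior * likelihood
    return posterior / posterior.sum()

def myopic_response(posterior, strategies):
    """Pure best response to the posterior-mean opponent strategy -- a myopic
    stand-in for Hyper-Q's learned value over mixed strategies."""
    mean_opp = sum(p * s for p, s in zip(posterior, strategies))
    return int(np.argmax(RPS @ mean_opp))
```

Observing a Rock-heavy opponent for a few rounds shifts the posterior toward the Rock-heavy candidate, after which the best response switches to Paper; Hyper-Q replaces the myopic step with a learned value over (estimated opponent strategy, own strategy) pairs.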
Playing is believing: The role of beliefs in multiagent learning
In Advances in Neural Information Processing Systems 14, 2001
Abstract

Cited by 32 (1 self)
We propose a new classification for multiagent learning algorithms, with each league of players characterized by both their possible strategies and possible beliefs. Using this classification, we review the optimality of existing algorithms, including the case of interleague play. We propose an incremental improvement to the existing algorithms that seems to achieve average payoffs that are at least the Nash equilibrium payoffs in the long run against fair opponents.
Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey
2004
Abstract

Cited by 28 (0 self)
Multiagent reinforcement learning for multi-robot systems is a challenging issue in both robotics and artificial intelligence. With ever-increasing interest in both theoretical research and practical applications, there have been many efforts toward meeting this challenge. However, many difficulties remain in scaling multiagent reinforcement learning up to multi-robot systems. The main objective of this paper is to provide a survey, though not a complete one, of multiagent reinforcement learning in multi-robot systems. After reviewing important advances in this field, some challenging problems and promising research directions are analyzed, and concluding remarks are made from the authors' perspectives.