Results 1–10 of 137
Cooperative Multi-Agent Learning: The State of the Art
Autonomous Agents and Multi-Agent Systems, 2005
Abstract

Cited by 182 (8 self)
Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multiagent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multiagent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multiagent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multiagent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multiagent learning problem domains, and a list of multiagent learning resources.
Nash Q-Learning for General-Sum Stochastic Games
Journal of Machine Learning Research, 2003
Abstract

Cited by 138 (0 self)
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restrictions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learning performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
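The update rule described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: it restricts the equilibrium computation to pure strategies (the paper's Nash-Q solves for a mixed-strategy equilibrium of each stage game), and the table shapes, function names, and the max-value fallback are assumptions for the sketch.

```python
import numpy as np

def pure_nash_value(Q1, Q2):
    """Return the payoff pair at a pure-strategy Nash equilibrium of the
    stage game given by joint-action tables Q1, Q2, or None if none exists."""
    n_a, n_b = Q1.shape
    for a in range(n_a):
        for b in range(n_b):
            # a is a best response to b, and b is a best response to a
            if Q1[a, b] >= Q1[:, b].max() and Q2[a, b] >= Q2[a, :].max():
                return Q1[a, b], Q2[a, b]
    return None

def nash_q_update(Q1, Q2, s, a, b, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """One Nash-Q update of both agents' joint-action Q-tables after
    observing joint action (a, b) in state s with rewards (r1, r2)."""
    eq = pure_nash_value(Q1[s_next], Q2[s_next])
    if eq is None:  # fallback assumption: stage game has no pure equilibrium
        eq = Q1[s_next].max(), Q2[s_next].max()
    v1, v2 = eq
    Q1[s, a, b] += alpha * (r1 + gamma * v1 - Q1[s, a, b])
    Q2[s, a, b] += alpha * (r2 + gamma * v2 - Q2[s, a, b])
```

The key difference from single-agent Q-learning is visible in the backup: the next-state value is not a max over own actions but each agent's payoff at an equilibrium of the stage game defined by both agents' Q-values.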
A Polynomial-Time Nash Equilibrium Algorithm for Repeated Games
Proceedings of the ACM Conference on Electronic Commerce (ACM-EC), 2004
Abstract

Cited by 74 (5 self)
With the increasing reliance on game theory as a foundation for auctions and electronic commerce, efficient algorithms for computing equilibria in multiplayer general-sum games are of great theoretical and practical interest. The computational complexity of finding a Nash equilibrium for a one-shot bimatrix game is a well-known open problem. This paper treats a related but distinct problem, that of finding a Nash equilibrium for an average-payoff repeated bimatrix game, and presents a polynomial-time algorithm. Our approach draws on the well-known "folk theorem" from game theory and shows how finite-state equilibrium strategies can be found efficiently and expressed succinctly.
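The folk-theorem idea behind this abstract can be illustrated compactly: any joint payoff that weakly dominates both players' security (minimax) levels can be sustained in the average-payoff repeated game by threatening minimax punishment. The sketch below illustrates that feasibility check only; it is not the paper's algorithm, and it uses pure-strategy minimax for simplicity (the actual construction involves mixed minimax strategies, computable by linear programming).

```python
import numpy as np

def pure_minimax_value(R):
    """Row player's pure-strategy security level in payoff matrix R:
    the opponent picks the column minimizing R, then the row player
    best-responds."""
    return R.min(axis=1).max()

def enforceable_joint_actions(R1, R2):
    """Joint actions whose payoffs weakly dominate both players' threat
    points; the folk theorem says each can be sustained as an average-payoff
    equilibrium of the repeated game, backed by minimax punishment."""
    v1 = pure_minimax_value(R1)    # player 1's threat point
    v2 = pure_minimax_value(R2.T)  # player 2 chooses columns of R2
    return [(a, b)
            for a in range(R1.shape[0]) for b in range(R1.shape[1])
            if R1[a, b] >= v1 and R2[a, b] >= v2]
```

In the Prisoner's Dilemma, for instance, both mutual cooperation and mutual defection clear the threat points, so repeated cooperation backed by punishment is an equilibrium even though it is not one in the one-shot game.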
Multiagent reinforcement learning: a critical survey
2003
Abstract

Cited by 68 (1 self)
We survey the recent work in AI on multiagent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is unclarity about the problem or problems being addressed. After tracing a representative sample of the recent literature, we identify four well-defined problems in multiagent reinforcement learning, single out the problem that in our view is most suitable for AI, and make some remarks about how we believe progress is to be made on this problem.
Implicit Negotiation in Repeated Games
In Proceedings of the Eighth International Workshop on Agent Theories, Architectures, and Languages (ATAL-2001), 2001
Abstract

Cited by 44 (11 self)
In business-related interactions such as the ongoing high-stakes FCC spectrum auctions, explicit communication among participants is regarded as collusion, and is therefore illegal. In this paper, we consider the possibility of autonomous agents engaging in implicit negotiation via their tacit interactions. In repeated general-sum games, our testbed for studying this type of interaction, an agent using a "best-response" strategy maximizes its own payoff assuming its behavior has no effect on its opponent. This notion of best response requires some degree of learning to determine the fixed opponent behavior. Against an unchanging opponent, the best-response agent performs optimally, and can be thought of as a "follower," since it adapts to its opponent. However, pairing two best-response agents in a repeated game can result in suboptimal behavior. We demonstrate this suboptimality in several different games using variants of Q-learning as an example of a best-response strategy. We then examine two "leader" strategies that induce better performance from opponent followers via stubbornness and threats. These tactics are forms of implicit negotiation in that they aim to achieve a mutually beneficial outcome without using explicit communication outside of the game.
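The leader/follower dynamic can be illustrated with a small construction of our own (the paper itself uses Q-learning variants): compute a follower's best response by value iteration against a Tit-for-Tat leader in the repeated Prisoner's Dilemma. A farsighted follower learns that the leader's threat makes cooperation optimal, while a myopic best-responder defects.

```python
# Prisoner's Dilemma rewards for the learner: (my_action, opp_action) -> payoff.
# Actions: 0 = cooperate, 1 = defect.
R = {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 5.0, (1, 1): 1.0}

def best_response_to_tft(gamma=0.9, sweeps=500):
    """Value-iterate the 2-state MDP a follower faces against a Tit-for-Tat
    leader: the state is the opponent's upcoming move, which mirrors the
    follower's previous move. Returns the optimal action in each state."""
    V = [0.0, 0.0]
    for _ in range(sweeps):
        # Bellman backup: acting a in state s yields R[(a, s)] and moves
        # the TFT opponent into state a for the next round.
        V = [max(R[(a, s)] + gamma * V[a] for a in (0, 1)) for s in (0, 1)]
    return [max((0, 1), key=lambda a: R[(a, s)] + gamma * V[a]) for s in (0, 1)]
```

With a discount factor of 0.9 the optimal policy cooperates in both states (the threat works); with a discount factor of 0 the follower reverts to the dominant one-shot action, defection, which is exactly the suboptimality the abstract describes.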
Extending Q-learning to general adaptive multi-agent systems
In Advances in Neural Information Processing Systems 16, 2004
Abstract

Cited by 39 (0 self)
Recent multi-agent extensions of Q-learning require knowledge of other agents' payoffs and Q-functions, and assume game-theoretic play at all times by all other agents. This paper proposes a fundamentally different approach, dubbed "Hyper-Q" learning, in which values of mixed strategies rather than base actions are learned, and in which other agents' strategies are estimated from observed actions via Bayesian inference. Hyper-Q may be effective against many different types of adaptive agents, even if they are persistently dynamic. Against certain broad categories of adaptation, it is argued that Hyper-Q may converge to exact optimal time-varying policies. In tests using Rock-Paper-Scissors, Hyper-Q learns to significantly exploit an Infinitesimal Gradient Ascent (IGA) player, as well as a Policy Hill Climber (PHC) player. Preliminary analysis of Hyper-Q against itself is also presented.
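A toy version of the Bayesian opponent-estimation step can be sketched as follows. This is a heavy simplification for illustration only: real Hyper-Q learns a value function over the continuous mixed-strategy space, whereas this sketch maintains a posterior over a finite set of candidate opponent strategies and plays a myopic best response to the posterior mean. The function names and the candidate-set discretization are our assumptions.

```python
import numpy as np

# Rock-Paper-Scissors payoff matrix for the row player (rows/cols: R, P, S).
RPS = np.array([[ 0., -1.,  1.],
                [ 1.,  0., -1.],
                [-1.,  1.,  0.]])

def bayes_update(prior, strategies, observed_action):
    """Posterior over candidate opponent mixed strategies after one observation."""
    likelihood = np.array([s[observed_action] for s in strategies])
    posterior = prior * likelihood
    return posterior / posterior.sum()

def myopic_response(posterior, strategies):
    """Pure best response to the posterior-mean opponent strategy -- a myopic
    stand-in for Hyper-Q's learned value over mixed strategies."""
    mean_opp = sum(p * s for p, s in zip(posterior, strategies))
    return int(np.argmax(RPS @ mean_opp))
```

Observing a Rock-heavy opponent for a few rounds shifts the posterior toward the Rock-heavy candidate, after which the best response switches to Paper; Hyper-Q replaces the myopic step with a learned value over (estimated opponent strategy, own strategy) pairs.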
Playing is believing: The role of beliefs in multiagent learning
In Advances in Neural Information Processing Systems 14, 2001
Abstract

Cited by 32 (1 self)
We propose a new classification for multiagent learning algorithms, with each league of players characterized by both their possible strategies and possible beliefs. Using this classification, we review the optimality of existing algorithms, including the case of interleague play. We propose an incremental improvement to the existing algorithms that seems to achieve average payoffs that are at least the Nash equilibrium payoffs in the long run against fair opponents.
Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey
2004
Abstract

Cited by 28 (0 self)
Multiagent reinforcement learning for multi-robot systems is a challenging issue in both robotics and artificial intelligence. With ever-increasing interest in both theoretical research and practical applications, there have been many efforts toward meeting this challenge. However, many difficulties remain in scaling multiagent reinforcement learning up to multi-robot systems. The main objective of this paper is to provide a survey, though not a complete one, of multiagent reinforcement learning in multi-robot systems. After reviewing important advances in this field, some challenging problems and promising research directions are analyzed, and concluding remarks are made from the authors' perspectives.