Results 1–10 of 62
Multiagent Learning Using a Variable Learning Rate
Artificial Intelligence, 2002
"... Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on the policies of the other agents and so creates a situation of learning a moving target. Previous learning algorithms hav ..."
Cited by 180 (8 self)
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on the policies of the other agents and so creates a situation of learning a moving target. Previous learning algorithms have one of two shortcomings depending on their approach. They either converge to a policy that may not be optimal against the specific opponents' policies, or they may not converge at all. In this article we examine this learning problem in the framework of stochastic games. We look at a number of previous learning algorithms showing how they fail at one of the above criteria. We then contribute a new reinforcement learning technique using a variable learning rate to overcome these shortcomings. Specifically, we introduce the WoLF principle, "Win or Learn Fast", for varying the learning rate. We examine this technique theoretically, proving convergence in self-play on a restricted class of iterated matrix games. We also present empirical results on a variety of more general stochastic games, in situations of self-play and otherwise, demonstrating the wide applicability of this method.
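The variable-learning-rate idea in this abstract can be illustrated with a toy policy hill-climber whose step size depends on whether the learner is currently "winning" (its policy outperforms its historical average policy). This is only a hedged sketch in the spirit of the WoLF principle, not the paper's WoLF-PHC algorithm; all constants, the two-action game, and the random opponent are illustrative assumptions.

```python
import random

def wolf_phc(payoff, episodes=20000, alpha=0.1, delta_win=0.01, delta_lose=0.04):
    """Train one WoLF-style player on a 2-action matrix game against a
    uniformly random opponent. payoff[a][b] is the row player's reward.
    The policy step is small when winning, large when losing."""
    q = [0.0, 0.0]            # action-value estimates
    pi = [0.5, 0.5]           # current mixed policy
    avg_pi = [0.5, 0.5]       # running average policy
    count = 0
    for _ in range(episodes):
        a = 0 if random.random() < pi[0] else 1
        b = random.randrange(2)                # random opponent (an assumption)
        q[a] += alpha * (payoff[a][b] - q[a])  # Q-learning update
        count += 1
        for i in range(2):                     # update the average policy
            avg_pi[i] += (pi[i] - avg_pi[i]) / count
        # "Winning" = current policy scores better than the average policy
        winning = (sum(p * v for p, v in zip(pi, q))
                   > sum(p * v for p, v in zip(avg_pi, q)))
        delta = delta_win if winning else delta_lose
        best = max(range(2), key=lambda i: q[i])
        for i in range(2):                     # hill-climb toward the greedy action
            pi[i] += delta if i == best else -delta
            pi[i] = min(1.0, max(0.0, pi[i]))
        s = sum(pi)
        pi = [p / s for p in pi]               # renormalize after clipping
    return pi

# Matching pennies for the row player; the learned policy is a distribution.
pennies = [[1, -1], [-1, 1]]
policy = wolf_phc(pennies)
print(policy)
```

The win/lose asymmetry (`delta_win < delta_lose`) is the essence of "Win or Learn Fast": cautious steps while ahead, aggressive adaptation while behind.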
Cooperative Multi-Agent Learning: The State of the Art
Autonomous Agents and Multi-Agent Systems, 2005
"... Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. ..."
Cited by 113 (6 self)
Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multiagent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multiagent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multiagent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multiagent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multiagent learning problem domains, and a list of multiagent learning resources.
Run the GAMUT: A comprehensive approach to evaluating game-theoretic algorithms
In AAMAS-04, 2004
"... We present GAMUT 1, a suite of game generators designed for testing gametheoretic algorithms. We explain why such a generator is necessary, offer a way of visualizing relationships between the sets of games supported by GAMUT, and give an overview of GAMUTâ€™s architecture. We highlight the importanc ..."
Cited by 65 (8 self)
We present GAMUT, a suite of game generators designed for testing game-theoretic algorithms. We explain why such a generator is necessary, offer a way of visualizing relationships between the sets of games supported by GAMUT, and give an overview of GAMUT's architecture. We highlight the importance of using comprehensive test data by benchmarking existing algorithms. We show surprisingly large variation in algorithm performance across different sets of games for two widely studied problems: computing Nash equilibria and multiagent learning in repeated games.
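The benchmarking workflow the abstract describes (generate games, run an algorithm across them) can be sketched with a trivial random-game "generator" and a brute-force pure-Nash search. GAMUT itself is a Java suite with many structured generators; none of the code below is from GAMUT, and the uniform-payoff generator is a deliberately simplistic stand-in.

```python
import itertools
import random

def random_game(n_actions, seed=None):
    """One trivial generator: independent uniform payoffs for both players."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1, 1) for _ in range(n_actions)] for _ in range(n_actions)]
    b = [[rng.uniform(-1, 1) for _ in range(n_actions)] for _ in range(n_actions)]
    return a, b

def pure_nash(a, b):
    """Enumerate joint actions; keep those where neither player gains
    by unilaterally deviating (pure-strategy Nash equilibria)."""
    n = len(a)
    eqs = []
    for i, j in itertools.product(range(n), repeat=2):
        if (a[i][j] >= max(a[k][j] for k in range(n))
                and b[i][j] >= max(b[i][k] for k in range(n))):
            eqs.append((i, j))
    return eqs

a, b = random_game(3, seed=42)
print(pure_nash(a, b))  # may be empty: a random game need not have a pure equilibrium
```

Running the same solver over games from several structurally different generators, as GAMUT advocates, is what exposes the performance variation the abstract reports.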
Multiagent reinforcement learning: a critical survey
2003
"... We survey the recent work in AI on multiagent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is unclarity about the problem or problems being addressed. After tracing a representative sample of the recent ..."
Cited by 52 (0 self)
We survey the recent work in AI on multiagent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is unclarity about the problem or problems being addressed. After tracing a representative sample of the recent literature, we identify four well-defined problems in multiagent reinforcement learning, single out the problem that in our view is most suitable for AI, and make some remarks about how we believe progress is to be made on this problem.
Accelerating Reinforcement Learning through Implicit Imitation
Journal of Artificial Intelligence Research, 2003
"... Imitation can be viewed as a means of enhancing learning in multiagent environments. It augments ..."
Cited by 51 (0 self)
Imitation can be viewed as a means of enhancing learning in multiagent environments. It augments
Coordination in Multiagent Reinforcement Learning: A Bayesian Approach
In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, 2003
"... Much emphasis in multiagent reinforcement learning (MARL) research is placed on ensuring that MARL algorithms (eventually) converge to desirable equilibria. As in standard reinforcement learning, convergence generally requires sufficient exploration of strategy space. However, exploration often com ..."
Cited by 48 (6 self)
Much emphasis in multiagent reinforcement learning (MARL) research is placed on ensuring that MARL algorithms (eventually) converge to desirable equilibria. As in standard reinforcement learning, convergence generally requires sufficient exploration of strategy space. However, exploration often comes at a price in the form of penalties or foregone opportunities. In multiagent settings, the problem is exacerbated by the need for agents to "coordinate" their policies on equilibria. We propose a Bayesian model for optimal exploration in MARL problems that allows these exploration costs to be weighed against their expected benefits using the notion of value of information. Unlike standard RL models, this model requires reasoning about how one's actions will influence the behavior of other agents. We develop tractable approximations to optimal Bayesian exploration, and report on experiments illustrating the benefits of this approach in identical interest games.
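The cost-benefit trade-off this abstract describes can be illustrated in a much simpler single-agent setting: a myopic value-of-information test for a Gaussian bandit, where the expected future gain from observing an uncertain arm is weighed against the expected immediate regret of trying it. This is a toy analogue under hypothetical assumptions (Gaussian prior, unit observation noise, one-step lookahead), not the paper's Bayesian MARL model.

```python
import math

def should_explore(known_mean, prior_mean, prior_var, horizon):
    """Explore the uncertain arm once if horizon * (expected per-step gain
    from the updated belief) exceeds the expected one-step cost."""
    post_var = 1.0 / (1.0 / prior_var + 1.0)   # posterior variance after one obs.
    s = math.sqrt(prior_var - post_var)        # std. dev. of the updated mean
    d = prior_mean - known_mean
    z = d / s
    pdf = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    gain = d * cdf + s * pdf                   # E[max(0, new_mean - known_mean)]
    cost = max(0.0, -d)                        # expected one-step regret of exploring
    return horizon * gain > cost

print(should_explore(0.5, 0.0, 1.0, horizon=100))  # long horizon: worth exploring
print(should_explore(0.5, 0.0, 1.0, horizon=1))    # short horizon: just exploit
```

The multiagent version in the paper is harder precisely because, as the abstract notes, exploration also changes what the other agents learn and do.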
Correlated-Q learning
In NIPS Workshop on Multiagent Learning, 2002
"... Bowling named two desiderata for multiagent learning algorithms: rationality and convergence. This paper introduces co~elatedQ learning, a natural generalization of NashQ and FFQ that satisfies these criteria. NashoQ satisfies rationality, but in general it does not converge. FFQ satisfies conve ..."
Cited by 44 (1 self)
Bowling named two desiderata for multiagent learning algorithms: rationality and convergence. This paper introduces correlated-Q learning, a natural generalization of Nash-Q and FF-Q that satisfies these criteria. Nash-Q satisfies rationality, but in general it does not converge. FF-Q satisfies convergence, but in general it is not rational. Correlated-Q satisfies rationality by construction. This paper demonstrates the empirical convergence of correlated-Q on a standard testbed of general-sum Markov games.
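The equilibrium concept behind correlated-Q can be made concrete with a small checker that verifies the incentive constraints of a correlated equilibrium: no player should profit by deviating from the mediator's recommended action. The "chicken" payoffs and the 1/3-1/3-1/3 distribution below are the standard textbook traffic-light example, not material from this paper.

```python
def is_correlated_eq(u1, u2, p, tol=1e-9):
    """p[i][j] is the probability a mediator recommends row action i and
    column action j; u1/u2 are the two players' payoff matrices. Returns
    True iff no player gains by deviating from any recommendation."""
    n, m = len(u1), len(u1[0])
    for i in range(n):                    # row player's incentive constraints
        for dev in range(n):
            if sum(p[i][j] * (u1[i][j] - u1[dev][j]) for j in range(m)) < -tol:
                return False
    for j in range(m):                    # column player's incentive constraints
        for dev in range(m):
            if sum(p[i][j] * (u2[i][j] - u2[i][dev]) for i in range(n)) < -tol:
                return False
    return True

# Game of chicken, actions (Dare, Chicken):
u1 = [[0, 7], [2, 6]]
u2 = [[0, 2], [7, 6]]
p = [[0, 1/3], [1/3, 1/3]]   # classic correlated equilibrium ("traffic light")
print(is_correlated_eq(u1, u2, p))  # → True
```

A correlated-Q learner selects such a distribution over joint actions at each state (via a linear program) and backs up its value, in place of the Nash computation used by Nash-Q.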
Scalable Learning in Stochastic Games
In AAAI Workshop on Game Theoretic and Decision Theoretic Agents, 2002
"... Stochastic games are a general model of interaction between multiple agents. ..."
Cited by 17 (1 self)
Stochastic games are a general model of interaction between multiple agents.
Best-Response Multiagent Learning in Non-Stationary Environments
2004
"... This paper investigates a relatively new direction in Multiagent Reinforcement Learning. Most multiagent learning techniques focus on Nash equilibria as elements of both the learning algorithm and its evaluation criteria. In contrast, we propose a multiagent learning algorithm that is optimal in the ..."
Cited by 16 (1 self)
This paper investigates a relatively new direction in Multiagent Reinforcement Learning. Most multiagent learning techniques focus on Nash equilibria as elements of both the learning algorithm and its evaluation criteria. In contrast, we propose a multiagent learning algorithm that is optimal in the sense of finding a best-response policy, rather than in reaching an equilibrium. We present the first learning algorithm that is provably optimal against restricted classes of non-stationary opponents. The algorithm infers an accurate model of the opponent's non-stationary strategy, and simultaneously creates a best-response policy against that strategy. Our learning algorithm works within the very general framework of n-player, general-sum stochastic games, and learns both the game structure and its associated optimal policy.
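The model-then-respond loop the abstract describes can be sketched in its simplest form: estimate a non-stationary opponent's mixed strategy from a sliding window of recent observations, then play a best response to that estimate in a repeated matrix game. This is an illustrative sketch, not the paper's provably optimal algorithm; the window size and the rock-paper-scissors example are assumptions.

```python
from collections import deque

def best_response(payoff, opp_dist):
    """Row action maximizing expected payoff against the modeled opponent."""
    n = len(payoff)
    return max(range(n),
               key=lambda a: sum(q * payoff[a][b] for b, q in enumerate(opp_dist)))

class WindowedModeler:
    """Track only the last `window` opponent actions, so the estimate
    follows an opponent whose strategy drifts over time."""
    def __init__(self, n_opp_actions, window=50):
        self.n = n_opp_actions
        self.history = deque(maxlen=window)

    def observe(self, opp_action):
        self.history.append(opp_action)

    def estimate(self):
        if not self.history:                   # uniform guess before any data
            return [1.0 / self.n] * self.n
        counts = [0] * self.n
        for a in self.history:
            counts[a] += 1
        return [c / len(self.history) for c in counts]

# Rock-paper-scissors: payoff[a][b] for the row player.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
m = WindowedModeler(3, window=10)
for _ in range(10):
    m.observe(0)                               # opponent currently fixated on rock
print(best_response(rps, m.estimate()))        # → 1 (play paper)
```

The paper's contribution is doing this with guarantees: for restricted opponent classes, the inferred model is provably accurate enough that the induced policy is a true best response.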