Results 1–10 of 86
Dynamic Programming for Partially Observable Stochastic Games
 In Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004
Abstract

Cited by 119 (23 self)
We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterated elimination of dominated strategies in normal form games.
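One of the two ingredients named above, iterated elimination of dominated strategies, can be illustrated in a few lines. This is a sketch only: the paper's algorithm interleaves such pruning with POMDP-style dynamic programming over policy trees, which is not reproduced here, and the payoff matrices below are hypothetical.

```python
def eliminate_dominated(payoffs_row, payoffs_col):
    """Repeatedly remove strictly dominated pure strategies.

    payoffs_row[i][j] / payoffs_col[i][j]: payoffs when the row player
    plays i and the column player plays j.
    Returns the surviving row and column strategy indices.
    """
    rows = list(range(len(payoffs_row)))
    cols = list(range(len(payoffs_row[0])))
    changed = True
    while changed:
        changed = False
        # Remove a row strictly dominated by another surviving row.
        for i in rows:
            if any(all(payoffs_row[k][j] > payoffs_row[i][j] for j in cols)
                   for k in rows if k != i):
                rows.remove(i)
                changed = True
                break
        # Remove a column strictly dominated by another surviving column.
        for j in cols:
            if any(all(payoffs_col[i][k] > payoffs_col[i][j] for i in rows)
                   for k in cols if k != j):
                cols.remove(j)
                changed = True
                break
    return rows, cols

# Example: the second row and second column are strictly dominated
# and get pruned away, leaving only the first joint strategy.
R = [[3, 2], [1, 0]]
C = [[3, 1], [2, 0]]
print(eliminate_dominated(R, C))  # ([0], [0])
```

In the POSG setting the "strategies" being pruned are policy trees rather than single actions, which is what keeps the exact algorithm tractable only for small horizons.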
Cooperative Multi-Agent Learning: The State of the Art
 Autonomous Agents and Multi-Agent Systems, 2005
Abstract

Cited by 115 (6 self)
Cooperative multiagent systems are ones in which several agents attempt, through their interaction, to jointly solve tasks or to maximize utility. Due to the interactions among the agents, multiagent problem complexity can rise rapidly with the number of agents or their behavioral sophistication. The challenge this presents to the task of programming solutions to multiagent systems problems has spawned increasing interest in machine learning techniques to automate the search and optimization process. We provide a broad survey of the cooperative multiagent learning literature. Previous surveys of this area have largely focused on issues common to specific subareas (for example, reinforcement learning or robotics). In this survey we attempt to draw from multiagent learning work in a spectrum of areas, including reinforcement learning, evolutionary computation, game theory, complex systems, agent modeling, and robotics. We find that this broad view leads to a division of the work into two categories, each with its own special issues: applying a single learner to discover joint solutions to multiagent problems (team learning), or using multiple simultaneous learners, often one per agent (concurrent learning). Additionally, we discuss direct and indirect communication in connection with learning, plus open issues in task decomposition, scalability, and adaptive dynamics. We conclude with a presentation of multiagent learning problem domains, and a list of multiagent learning resources.
Correlated Q-Learning
 In Proceedings of the Twentieth International Conference on Machine Learning, 2003
Abstract

Cited by 56 (2 self)
There have been several attempts to design multiagent Q-learning algorithms capable of learning equilibrium policies in general-sum Markov games, just as Q-learning learns optimal policies in Markov decision processes. We introduce correlated Q-learning, one such algorithm based on the correlated equilibrium solution concept. Motivated by a fixed point proof of the existence of stationary correlated equilibrium policies in Markov games, we present a generic multiagent Q-learning algorithm of which many popular algorithms are immediate special cases. We also prove that certain variants of correlated (and Nash) Q-learning are guaranteed to converge to stationary correlated (and Nash) equilibrium policies in two special classes of Markov games, namely zero-sum and common-interest. Finally, we show empirically that correlated Q-learning outperforms Nash Q-learning, further justifying the former beyond noting that it is less computationally expensive than the latter.
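The generic template mentioned in the abstract can be sketched as follows: each agent keeps a Q-table over *joint* actions, and the value of the next state is computed by a pluggable equilibrium-selection function. The selection rule below simply maximizes over joint actions (the "Friend-Q" special case, sensible only for common-interest games); correlated Q-learning would instead solve a linear program for a correlated equilibrium at each step. All environment details here are hypothetical.

```python
def joint_value(Q_s, select=max):
    """Value of a state given a dict Q_s: joint_action -> payoff.

    `select` is the pluggable equilibrium-selection function; max over
    joint payoffs is the Friend-Q special case.
    """
    return select(Q_s.values()) if Q_s else 0.0

def q_update(Q, state, joint_action, reward, next_state,
             alpha=0.1, gamma=0.9):
    """One temporal-difference backup on a joint-action Q-table."""
    target = reward + gamma * joint_value(Q.get(next_state, {}))
    Q.setdefault(state, {})
    old = Q[state].get(joint_action, 0.0)
    Q[state][joint_action] = old + alpha * (target - old)
    return Q[state][joint_action]

# Tiny usage example on a two-state chain with hypothetical values.
Q = {"s1": {("a", "b"): 1.0}}
q_update(Q, "s0", ("a", "a"), reward=0.0, next_state="s1")
print(Q["s0"][("a", "a")])  # 0.09
```

Swapping `select` for a Nash or correlated equilibrium solver recovers the other special cases the abstract refers to, at correspondingly higher per-step cost.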
Best-Response Multiagent Learning in Non-Stationary Environments
2004
Abstract

Cited by 16 (1 self)
This paper investigates a relatively new direction in multiagent reinforcement learning. Most multiagent learning techniques focus on Nash equilibria as elements of both the learning algorithm and its evaluation criteria. In contrast, we propose a multiagent learning algorithm that is optimal in the sense of finding a best-response policy, rather than in reaching an equilibrium. We present the first learning algorithm that is provably optimal against restricted classes of non-stationary opponents. The algorithm infers an accurate model of the opponent's non-stationary strategy, and simultaneously creates a best-response policy against that strategy. Our learning algorithm works within the very general framework of n-player, general-sum stochastic games, and learns both the game structure and its associated optimal policy.
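The model-then-best-respond loop above has a minimal fictitious-play-style instance: estimate the opponent's (possibly drifting) mixed strategy from observed actions, then best-respond to the estimate. The paper's algorithm is far more general (stochastic games, provable optimality against restricted opponent classes); this single-state game with a hypothetical payoff matrix only illustrates the loop.

```python
from collections import Counter

def best_response(payoff, opponent_counts):
    """Best response to the empirical distribution of opponent play.

    payoff[my_action][opp_action] is my payoff; opponent_counts is a
    Counter of observed opponent actions.
    """
    total = sum(opponent_counts.values()) or 1
    def expected(a):
        return sum(payoff[a][b] * n
                   for b, n in opponent_counts.items()) / total
    return max(payoff, key=expected)

# Matching pennies from the row player's view (hypothetical numbers).
payoff = {"H": {"H": 1, "T": -1}, "T": {"H": -1, "T": 1}}
history = Counter({"H": 7, "T": 3})   # opponent has leaned towards H
print(best_response(payoff, history))  # H
```

Against a truly non-stationary opponent one would window or discount the counts so the model tracks the drift; a full-history average, as here, is only a best response to a stationary strategy.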
Multiagent learning for engineers
 Artificial Intelligence, 2007
Abstract

Cited by 15 (5 self)
As suggested by the title of Shoham, Powers, and Grenager’s position paper [34], the ultimate lens through which the multiagent learning framework should be assessed is “what is the question?”. In this paper, we address this question by presenting challenges motivated by engineering applications and discussing the potential appeal of multiagent learning to meet these challenges. Moreover, we highlight various differences in the underlying assumptions and issues of concern that generally distinguish engineering applications from models that are typically considered in the economic game theory literature.
An intrusion detection game with limited observations
 Proceedings of the 12th Int. Symp. on Dynamic Games and Applications, Sophia Antipolis, 2006
Abstract

Cited by 14 (6 self)
We present a 2-player zero-sum stochastic (Markov) security game that models the interaction between malicious attackers on a system and the IDS, which allocates system resources for detection and response. We capture the operation of a sensor network observing and reporting attack information to the IDS as a finite Markov chain. Thus, we extend the game-theoretic framework in [1] to a stochastic and dynamic one. We analyze the outcomes and evolution of an example game numerically for various game parameters. Furthermore, we study limited-information cases where players optimize their strategies offline or online depending on the type of information available, using methods based on Markov decision processes and Q-learning.
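A zero-sum Markov game of this kind reduces, at each stage, to solving small zero-sum matrix games. The sketch below solves a 2x2 zero-sum game in closed form: it first checks for a pure saddle point, and otherwise computes the indifference-based mixed strategy. Larger games need a linear program, and the payoffs here are hypothetical (row = defender/IDS, column = attacker, entries are row payoffs); this is not the paper's game.

```python
def solve_2x2_zero_sum(A):
    """Return (value, p), where p = P(row player plays row 0)."""
    (a, b), (c, d) = A
    # Pure saddle point exists iff maximin equals minimax.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        p = 1.0 if min(a, b) >= min(c, d) else 0.0
        return maximin, p
    # Otherwise mix so the column player is indifferent between columns:
    # p*a + (1-p)*c = p*b + (1-p)*d  =>  p = (d - c) / (a - b - c + d).
    p = (d - c) / (a - b - c + d)
    value = p * a + (1 - p) * c
    return value, p

# Matching-pennies-like detection game: no pure saddle point,
# so the defender should randomize 50/50.
value, p = solve_2x2_zero_sum([[1, -1], [-1, 1]])
print(value, p)  # 0.0 0.5
```

Plugging such a stage-game solver into value iteration over the Markov chain's states gives the offline solution; the Q-learning variant mentioned in the abstract replaces the known payoffs with learned Q-values.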
Multiagent Reinforcement Learning for Multi-Robot Systems: A Survey
2004
Abstract

Cited by 11 (0 self)
Multiagent reinforcement learning for multi-robot systems is a challenging issue in both robotics and artificial intelligence. With ever increasing interest in both theoretical research and practical applications, there have been many efforts to address this challenge. However, many difficulties remain in scaling up multiagent reinforcement learning to multi-robot systems. The main objective of this paper is to provide a survey, though not exhaustive, of multiagent reinforcement learning in multi-robot systems. After reviewing important advances in this field, some challenging problems and promising research directions are analyzed. We close with concluding remarks from the authors' perspective.
Learning from Multiple Sources
 In AAMAS 2004: Proceedings of the Third International Joint Conference on Autonomous Agents and Multi-Agent Systems, 2004
Abstract

Cited by 10 (1 self)
This work aims at defining and testing a set of techniques that enable agents to use information from several sources during learning. In multiagent systems (MAS) it is common for several agents to need to learn similar concepts in parallel. In this type of environment there are more possibilities for learning than in classical machine learning. Exchange of information between teams of agents that are attempting to solve similar problems may increase accuracy and learning speed at the expense of communication. One of the most interesting possibilities to explore is when the agents have different structures and learning algorithms, thus providing different perspectives on the problems they are facing. In this paper the authors report the results of experiments made in a traffic control simulation with and without exchange of information between learning agents.
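One simple instantiation of exchanging information between parallel learners: agents learning the same task periodically blend their value estimates. The paper studies richer schemes, including agents with different learner structures in a traffic simulation; the blending rule and weight below are illustrative assumptions, not the authors' method.

```python
def blend_q_tables(q_mine, q_other, weight=0.5):
    """Merge a peer's Q-table into mine.

    For states only one agent has visited, its estimate is adopted
    outright; for shared states, estimates are mixed by `weight`.
    """
    merged = dict(q_mine)
    for state, value in q_other.items():
        if state in merged:
            merged[state] = (1 - weight) * merged[state] + weight * value
        else:
            merged[state] = value
    return merged

# Two agents that explored partially overlapping state sets
# (hypothetical values).
agent_a = {"s0": 1.0, "s1": 0.0}
agent_b = {"s1": 2.0, "s2": 4.0}
print(blend_q_tables(agent_a, agent_b))
# {'s0': 1.0, 's1': 1.0, 's2': 4.0}
```

The communication cost the abstract mentions shows up here as the size of the exchanged table; real systems would trade it off by exchanging only recently updated or high-confidence entries.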