Results 11 - 20 of 331
AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents, 2003
"... A satisfactory multiagent learning algorithm should, at a minimum, learn to play optimally against stationary opponents and converge to a Nash equilibrium in self-play. The algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action repeated games— as ..."
Abstract
-
Cited by 97 (5 self)
A satisfactory multiagent learning algorithm should, at a minimum, learn to play optimally against stationary opponents and converge to a Nash equilibrium in self-play. The algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action repeated games, assuming that the opponent's (mixed) strategy is observable. In this paper we present AWESOME, the first algorithm that is guaranteed to have these two properties in all repeated (finite) games. It requires only that the other players' actual actions (not their strategies) can be observed at each step. It also learns to play optimally against opponents that eventually become stationary. The basic idea behind AWESOME (Adapt When Everybody is Stationary, Otherwise Move to Equilibrium) is to try to adapt to the others' strategies when they appear stationary, but otherwise to retreat to a precomputed equilibrium strategy. The techniques used to prove the properties of AWESOME are fundamentally different from those used for previous algorithms, and may also help in analyzing other multiagent learning algorithms.
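For intuition, here is a minimal sketch of the adapt-or-retreat loop described in the abstract, for a repeated two-player matrix game. It is not the AWESOME algorithm itself: the real algorithm uses a carefully scheduled sequence of epochs and hypothesis tests, while the frequency-based stationarity check and the helper names below (`awesome_like`, `appears_stationary`) are illustrative assumptions.

```python
import numpy as np

def best_response(payoff, opp_mix):
    """Pure best response of the row player to an opponent mixed strategy."""
    strat = np.zeros(payoff.shape[0])
    strat[np.argmax(payoff @ opp_mix)] = 1.0
    return strat

def appears_stationary(actions, n_actions, tol):
    """Crude stationarity check: compare empirical frequencies in the two halves
    of the recent window (AWESOME itself uses scheduled hypothesis tests)."""
    half = len(actions) // 2
    f1 = np.bincount(actions[:half], minlength=n_actions) / half
    f2 = np.bincount(actions[half:], minlength=n_actions) / (len(actions) - half)
    return np.max(np.abs(f1 - f2)) < tol

def awesome_like(payoff, eq_strategy, opponent, rounds=2000, window=200, tol=0.1, seed=0):
    """Adapt when the opponent looks stationary, otherwise retreat to equilibrium."""
    rng = np.random.default_rng(seed)
    n_opp = payoff.shape[1]
    strategy = eq_strategy.copy()
    history = []
    for t in range(rounds):
        my_action = rng.choice(len(strategy), p=strategy)   # play the current mixed strategy
        history.append(opponent(t, rng))                    # observe the opponent's action only
        recent = history[-window:]
        if len(recent) == window and appears_stationary(recent, n_opp, tol):
            freq = np.bincount(recent, minlength=n_opp) / window
            strategy = best_response(payoff, freq)          # adapt to the apparent stationary strategy
        else:
            strategy = eq_strategy.copy()                   # retreat to the precomputed equilibrium
    return strategy

# Matching pennies for the row player; the precomputed equilibrium is uniform play.
payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])
opponent = lambda t, rng: rng.choice(2, p=[0.8, 0.2])       # an (eventually) stationary opponent
print(awesome_like(payoff, np.array([0.5, 0.5]), opponent)) # settles on the best response [1, 0]
```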
Rational and Convergent Learning in Stochastic Games, 2001
"... This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a learning agent when in the presence of other learning agents, namely rationality and convergence. We e ..."
Abstract
-
Cited by 91 (5 self)
This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a learning agent when in the presence of other learning agents, namely rationality and convergence. We examine existing reinforcement learning algorithms according to these two properties and notice that they fail to simultaneously meet both criteria. We then contribute a new learning algorithm, WoLF policy hill-climbing, that is based on a simple principle: "learn quickly while losing, slowly while winning." The algorithm is proven to be rational, and we present empirical results for a number of stochastic games showing that the algorithm converges.
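As a rough illustration of the "learn quickly while losing, slowly while winning" principle, the sketch below runs a single-state (matrix-game) policy hill-climber in self-play and switches between two step sizes depending on whether the current policy is doing better than the agent's historical average policy. The parameter values and the crude clip-and-renormalize projection are illustrative choices, not the paper's WoLF-PHC specification.

```python
import numpy as np

def wolf_phc(payoff_row, payoff_col, rounds=50000, alpha=0.1,
             delta_win=0.01, delta_lose=0.04, seed=0):
    """Single-state WoLF policy hill-climbing sketch for a two-player matrix game.
    Each payoff matrix is indexed [own action, other player's action]."""
    rng = np.random.default_rng(seed)
    n = payoff_row.shape[0]
    payoffs = [payoff_row, payoff_col]
    players = [{"Q": np.zeros(n), "pi": np.full(n, 1.0 / n),
                "avg_pi": np.full(n, 1.0 / n), "count": 0} for _ in range(2)]
    for t in range(rounds):
        actions = [rng.choice(n, p=p["pi"]) for p in players]
        for i, p in enumerate(players):
            a = actions[i]
            reward = payoffs[i][a, actions[1 - i]]
            p["Q"][a] += alpha * (reward - p["Q"][a])        # stateless Q-learning update
            p["count"] += 1
            p["avg_pi"] += (p["pi"] - p["avg_pi"]) / p["count"]
            # WoLF step size: small while winning, large while losing.
            winning = p["pi"] @ p["Q"] > p["avg_pi"] @ p["Q"]
            delta = delta_win if winning else delta_lose
            greedy = np.argmax(p["Q"])
            p["pi"] -= delta / (n - 1)                       # shift probability mass...
            p["pi"][greedy] += delta / (n - 1) + delta       # ...toward the greedy action
            p["pi"] = np.clip(p["pi"], 0.0, None)
            p["pi"] /= p["pi"].sum()
    return [p["pi"] for p in players]

# Matching pennies in self-play: the policies tend toward the mixed equilibrium (0.5, 0.5).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(wolf_phc(A, -A))
```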
Reinforcement Learning to Play an Optimal Nash Equilibrium in Team Markov Games, in Advances in Neural Information Processing Systems, 2002
"... Multiagent learning is a key problem in game theory and AI. It involves two interrelated learning problems: identifying the game and learning to play. These two problems prevail even in team games where the agents' interests do not conflict. Even team games can have multiple Nash equilibria, on ..."
Abstract
-
Cited by 88 (3 self)
Multiagent learning is a key problem in game theory and AI. It involves two interrelated learning problems: identifying the game and learning to play. These two problems prevail even in team games where the agents' interests do not conflict. Even team games can have multiple Nash equilibria, only some of which are optimal. We present optimal adaptive learning (OAL), the first algorithm that converges to an optimal Nash equilibrium for any team Markov game. We provide a convergence proof, and show that the algorithm's parameters are easy to set so that the convergence conditions are met. Our experiments show that existing algorithms do not converge in many of these problems while OAL does. We also demonstrate the importance of the fundamental ideas behind OAL: incomplete history sampling and biased action selection.
Convergence and no-regret in multiagent learning, in Advances in Neural Information Processing Systems 17, 2005
"... Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning then the environment is no longer stationary, thus undermining convergence guarantees. Second, learning is often susceptible to deception, where the other agents may be ..."
Abstract
-
Cited by 85 (0 self)
Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning then the environment is no longer stationary, thus undermining convergence guarantees. Second, learning is often susceptible to deception, where the other agents may be able to exploit a learner's particular dynamics. In the worst case, this could result in poorer performance than if the agent were not learning at all. These challenges are identifiable in the two most common evaluation criteria for multiagent learning algorithms: convergence and regret. Algorithms focusing on convergence or regret in isolation are numerous. In this paper, we seek to address both criteria in a single algorithm by introducing GIGA-WoLF, a learning algorithm for normal-form games. We prove that the algorithm guarantees at most zero average regret, while demonstrating that it converges in many situations of self-play. We prove convergence in a limited setting and give empirical results in a wider variety of situations. These results also suggest a third new learning criterion combining convergence and regret, which we call negative non-convergence regret (NNR).
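A hedged sketch of the flavor of GIGA-WoLF's update, as suggested by the abstract: a projected-gradient step for the played strategy, a more slowly updated baseline strategy, and a pullback of the played strategy toward the baseline that plays the WoLF role of changing the effective step size. The specific constants, the diminishing step size, and the use of exact expected payoffs (rather than payoffs estimated from observed play) are simplifying assumptions; consult the paper for the exact update rule.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def giga_wolf_like_step(x, z, reward_vec, eta):
    """One update of the two-strategy scheme: a gradient step for the played strategy x,
    a slower gradient step for the baseline z, then a pull of x back toward z."""
    x_hat = project_simplex(x + eta * reward_vec)
    z_new = project_simplex(z + eta * reward_vec / 3.0)
    gap = np.linalg.norm(z_new - x_hat)
    delta = 1.0 if gap == 0 else min(1.0, np.linalg.norm(z_new - z) / gap)
    return x_hat + delta * (z_new - x_hat), z_new

# Self-play on matching pennies with exact expected-payoff gradients.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x1, z1 = np.array([0.9, 0.1]), np.array([0.9, 0.1])
x2, z2 = np.array([0.2, 0.8]), np.array([0.2, 0.8])
for t in range(1, 20001):
    eta = 1.0 / np.sqrt(t)
    r1, r2 = A @ x2, -A.T @ x1               # each player's expected payoff per action
    x1, z1 = giga_wolf_like_step(x1, z1, r1, eta)
    x2, z2 = giga_wolf_like_step(x2, z2, r2, eta)
print(x1, x2)                                 # both should drift toward the mixed equilibrium
```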
Accelerating Reinforcement Learning through Implicit Imitation, Journal of Artificial Intelligence Research, 2003
"... Imitation can be viewed as a means of enhancing learning in multiagent environments. It augments ..."
Abstract
-
Cited by 79 (0 self)
Imitation can be viewed as a means of enhancing learning in multiagent environments. It augments ...
Analyzing complex strategic interactions in multi-agent systems, in Proceedings of the 2002 Workshop on Game-Theoretic and Decision-Theoretic Agents (GTDT-02), 2002
"... Abstract We develop a model for analyzing complex games with repeated interactions, for which a full game-theoretic analysis is intractable. Our approach treats exogenously specified, heuristic strategies, rather than the atomic actions, as primitive, and computes a heuristic-payoff table specifyin ..."
Abstract
-
Cited by 76 (3 self)
We develop a model for analyzing complex games with repeated interactions, for which a full game-theoretic analysis is intractable. Our approach treats exogenously specified heuristic strategies, rather than the atomic actions, as primitive, and computes a heuristic-payoff table specifying the expected payoffs over the joint heuristic strategy space. We analyze two games based on (i) automated dynamic pricing and (ii) the continuous double auction. For each game we compute Nash equilibria of previously published heuristic strategies. To determine the most plausible equilibria, we study the replicator dynamics of a large population playing the strategies. To account for errors in the estimation of payoffs or improvements in strategies, we also analyze the dynamics and equilibria based on perturbed payoffs.
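The replicator-dynamics analysis mentioned in the abstract can be sketched in a few lines: treat the heuristic-payoff table as the fitness function of a large population and follow the standard replicator equation. The 3-strategy payoff table below is made-up illustrative data, not numbers from the paper.

```python
import numpy as np

def replicator_trajectory(payoff, x0, steps=5000, dt=0.01):
    """Euler-discretised replicator dynamics:  dx_i/dt = x_i * (f_i(x) - x.f(x)),
    where payoff[i, j] is the expected payoff to heuristic i against heuristic j
    (in the paper this would come from the heuristic-payoff table)."""
    x = np.array(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(steps):
        fitness = payoff @ x                         # fitness of each heuristic in population x
        x = x + dt * x * (fitness - x @ fitness)     # strategies above the population average grow
        x = np.clip(x, 0.0, None)
        x /= x.sum()                                 # guard against Euler-step drift
        trajectory.append(x.copy())
    return np.array(trajectory)

# Toy 3-heuristic payoff table (illustrative only).
payoff = np.array([[1.0, 2.0, 0.5],
                   [0.5, 1.0, 2.0],
                   [2.0, 0.5, 1.0]])
trajectory = replicator_trajectory(payoff, x0=[0.5, 0.3, 0.2])
print(trajectory[-1])   # the population mix the dynamics settle into (or cycle around)
```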
Multi-agent reinforcement learning: a critical survey, 2003
"... We survey the recent work in AI on multi-agent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is unclarity about the problem or problems being addressed. After tracing a representative sample of the recent ..."
Abstract
-
Cited by 68 (1 self)
We survey the recent work in AI on multi-agent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is a lack of clarity about the problem or problems being addressed. After tracing a representative sample of the recent literature, we identify four well-defined problems in multi-agent reinforcement learning, single out the problem that in our view is most suitable for AI, and make some remarks about how we believe progress is to be made on this problem.
Coordination in Multiagent Reinforcement Learning: A Bayesian Approach, in Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, 2003
"... Much emphasis in multiagent reinforcement learning (MARL) research is placed on ensuring that MARL algorithms (eventually) converge to desirable equilibria. As in standard reinforcement learning, convergence generally requires sufficient exploration of strategy space. However, exploration often com ..."
Abstract
-
Cited by 66 (6 self)
Much emphasis in multiagent reinforcement learning (MARL) research is placed on ensuring that MARL algorithms (eventually) converge to desirable equilibria. As in standard reinforcement learning, convergence generally requires sufficient exploration of strategy space. However, exploration often comes at a price in the form of penalties or foregone opportunities. In multiagent settings, the problem is exacerbated by the need for agents to "coordinate" their policies on equilibria. We propose a Bayesian model for optimal exploration in MARL problems that allows these exploration costs to be weighed against their expected benefits using the notion of value of information. Unlike standard RL models, this model requires reasoning about how one's actions will influence the behavior of other agents. We develop tractable approximations to optimal Bayesian exploration, and report on experiments illustrating the benefits of this approach in identical interest games.
Correlated-Q learning, in ICML ’03: Proceedings of the Twentieth International Conference on Machine Learning, 2003
"... Abstract This paper introduces Correlated-Q (CE-Q) learning, a multiagent Q-learning algorithm based on the correlated equilibrium (CE) solution concept. CE-Q generalizes both Nash-Q and Friend-and-Foe-Q: in general-sum games, the set of correlated equilibria contains the set of Nash equilibria; in ..."
Abstract
-
Cited by 65 (3 self)
This paper introduces Correlated-Q (CE-Q) learning, a multiagent Q-learning algorithm based on the correlated equilibrium (CE) solution concept. CE-Q generalizes both Nash-Q and Friend-and-Foe-Q: in general-sum games, the set of correlated equilibria contains the set of Nash equilibria; in constant-sum games, the set of correlated equilibria contains the set of minimax equilibria. This paper describes experiments with four variants of CE-Q, demonstrating empirical convergence to equilibrium policies on a testbed of general-sum Markov games.
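To make the CE solution concept concrete, here is a sketch of the stage-game computation underlying CE-Q: a correlated equilibrium of a bimatrix game found by linear programming, using the utilitarian selection (maximize the sum of payoffs), which is one of the paper's four variants. In CE-Q proper the payoff matrices would be the players' Q-values at the current state; the game and code below are only an illustration.

```python
import numpy as np
from scipy.optimize import linprog

def utilitarian_ce(u1, u2):
    """Correlated equilibrium of a bimatrix game maximizing the sum of payoffs.
    Decision variable: a probability distribution p over joint actions (i, j)."""
    n, m = u1.shape
    idx = lambda i, j: i * m + j
    A_ub, b_ub = [], []
    # Row player: following the recommended action i must beat deviating to i2.
    for i in range(n):
        for i2 in range(n):
            if i2 != i:
                row = np.zeros(n * m)
                for j in range(m):
                    row[idx(i, j)] = u1[i2, j] - u1[i, j]
                A_ub.append(row); b_ub.append(0.0)
    # Column player: following the recommended action j must beat deviating to j2.
    for j in range(m):
        for j2 in range(m):
            if j2 != j:
                row = np.zeros(n * m)
                for i in range(n):
                    row[idx(i, j)] = u2[i, j2] - u2[i, j]
                A_ub.append(row); b_ub.append(0.0)
    # Probabilities sum to one; minimize the negative total payoff.
    res = linprog(c=-(u1 + u2).reshape(-1),
                  A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.ones((1, n * m)), b_eq=[1.0],
                  bounds=[(0, 1)] * (n * m))
    return res.x.reshape(n, m)

# Chicken: the utilitarian CE mixes (swerve, swerve) with the two asymmetric outcomes.
u1 = np.array([[6.0, 2.0], [7.0, 0.0]])
print(utilitarian_ce(u1, u1.T))
```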
Economics and Electronic Commerce: Survey and Directions for Research, International Journal of Electronic Commerce, 2001
"... This article reviews the growing body of research on electronic commerce from the perspective of economic analysis. It begins by constructing a new framework for understanding electronic commerce research, then identifies the range of applicable theory and current research in the context of the new ..."
Abstract
-
Cited by 60 (11 self)
This article reviews the growing body of research on electronic commerce from the perspective of economic analysis. It begins by constructing a new framework for understanding electronic commerce research, then identifies the range of applicable theory and current research in the context of the new conceptual model. It goes on to assess the state of the art of knowledge about electronic commerce phenomena in terms of the levels of analysis proposed here. Finally, it charts the directions along which useful work in this area might be developed. This survey and framework are intended to induce researchers in the field of information systems, the authors' reference discipline, and other areas in schools of business and management to recognize that research on electronic commerce is business-school research, broadly defined. As such, developments in this research area in the next several years will occur across multiple business-school disciplines, and there will be a growing impetus for greater interdisciplinary communication and interaction.