The dynamics of reinforcement learning in cooperative multiagent systems
- In Proceedings of National Conference on Artificial Intelligence (AAAI-98)
, 1998
Cited by 249 (1 self)
Reinforcement learning can provide a robust and natural means for agents to learn how to coordinate their action choices in multiagent systems. We examine some of the factors that can influence the dynamics of the learning process in such a setting. We first distinguish reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts. We study (a simple form of) Q-learning in cooperative multiagent systems under these two perspectives, focusing on the influence of game structure and exploration strategies on convergence to (optimal and suboptimal) Nash equilibria. We then propose alternative optimistic exploration strategies that increase the likelihood of convergence to an optimal equilibrium.
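As a rough illustration of the first kind of learner this abstract describes (agents that ignore the presence of others), the sketch below runs two independent, stateless Q-learners in a repeated 2x2 cooperative game. The payoff matrix, the epsilon-greedy exploration rule, the parameter values, and the function names are all assumptions made for illustration; they are not taken from the paper.

```python
import random

# Hypothetical shared payoff for a 2x2 cooperative game: both agents receive
# PAYOFF[a1][a2]. Joint action (0, 0) is the optimal equilibrium and (1, 1)
# is a suboptimal one. These numbers are illustrative, not from the paper.
PAYOFF = [[10.0, 0.0],
          [0.0, 5.0]]


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over one agent's own actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


def run_independent_learners(steps=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Each agent keeps Q-values over its OWN actions only, never joint actions."""
    rng = random.Random(seed)
    q1 = [0.0, 0.0]
    q2 = [0.0, 0.0]
    for _ in range(steps):
        a1 = choose(q1, epsilon, rng)
        a2 = choose(q2, epsilon, rng)
        r = PAYOFF[a1][a2]  # common reward observed by both agents
        # Stateless Q-update: move the estimate toward the observed reward.
        q1[a1] += alpha * (r - q1[a1])
        q2[a2] += alpha * (r - q2[a2])
    return q1, q2
```

Calling `run_independent_learners()` returns each agent's learned action values. Which equilibrium the pair settles on can depend on the seed and the exploration rate, which is the kind of sensitivity the abstract refers to.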
Learning in Extensive-Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term
- GAMES AND ECONOMIC BEHAVIOR 8, 164--212 (1995)
, 1995
Cited by 163 (9 self)
We use simple learning models to track the behavior observed in experiments concerning three extensive form games with similar perfect equilibria. In only two of the games does observed behavior approach the perfect equilibrium as players gain experience. We examine a family of learning models which possess some of the robust properties of learning noted in the psychology literature. The intermediate term predictions of these models track well the observed behavior in all three games, even though the models considered differ in their very long term predictions. We argue that for predicting observed behavior the intermediate term predictions of dynamic learning models may be even more important than their asymptotic properties.
Sequential optimality and coordination in multiagent systems
- In International Joint Conference on Artificial Intelligence
, 1999
Cited by 120 (3 self)
Coordination of agent activities is a key problem in multiagent systems. Set in a larger decision theoretic context, the existence of coordination problems leads to difficulty in evaluating the utility of a situation. This in turn makes defining optimal policies for sequential decision processes problematic. We propose a method for solving sequential multiagent decision problems by allowing agents to reason explicitly about specific coordination mechanisms. We define an extension of value iteration in which the system’s state space is augmented with the state of the coordination mechanism adopted, allowing agents to reason about the short and long term prospects for coordination, the long term consequences of (mis)coordination, and make decisions to engage or avoid coordination problems based on expected value. We also illustrate the benefits of mechanism generalization.
Economic analysis of social interactions
- Journal of Economic Perspectives
, 2000
Cited by 101 (0 self)
Economists have long been ambivalent about whether the discipline should focus on the analysis of markets or should be concerned with social interactions more generally. Recently the discipline has sought to broaden its scope while maintaining the rigor of modern economic analysis. Major theoretical developments in game theory, the economics of the family, and endogenous growth theory have taken place. Economists have also performed new empirical research on social interactions, but the empirical literature does not show progress comparable to that achieved in economic theory. This paper examines why and discusses how economists might make sustained contributions to the empirical analysis of social interactions.
Nash Convergence of Gradient Dynamics in General-Sum Games
- In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence
, 2000
Cited by 77 (0 self)
Multi-agent games are becoming an increasingly prevalent formalism for the study of electronic commerce and auctions. The speed at which transactions can take place and the growing complexity of electronic marketplaces make the study of computationally simple agents an appealing direction. In this work, we analyze the behavior of agents that incrementally adapt their strategy through gradient ascent on expected payoff, in the simple setting of two-player, two-action, iterated general-sum games, and present a surprising result. We show that either the agents will converge to a Nash equilibrium, or if the strategies themselves do not converge, then their average payoffs will nevertheless converge to the payoffs of a Nash equilibrium. 1 Introduction It is widely expected that in the near future, software agents will act on behalf of humans in many electronic marketplaces based on auctions, barter, and other forms of trading. This makes multi-agent game theory (Owen, 199...
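To make "gradient ascent on expected payoff" in a two-player, two-action game concrete, here is a minimal sketch. It assumes each player's mixed strategy is a single probability of playing its first action, uses a small finite step size with clipping to [0, 1], and picks a Prisoner's Dilemma payoff table as the example. The step size, the clipping, the example game, and all names are assumptions for illustration and may differ from the paper's formulation.

```python
def gradient_step(alpha, beta, R, C, eta):
    """One simultaneous gradient-ascent step.

    alpha: probability that the row player plays its first action.
    beta:  probability that the column player plays its first action.
    R, C:  2x2 payoff tables for the row and column player.
    eta:   step size (a finite value, chosen here for illustration).
    """
    # Partial derivative of the row player's expected payoff w.r.t. alpha.
    d_alpha = beta * (R[0][0] - R[0][1] - R[1][0] + R[1][1]) + (R[0][1] - R[1][1])
    # Partial derivative of the column player's expected payoff w.r.t. beta.
    d_beta = alpha * (C[0][0] - C[0][1] - C[1][0] + C[1][1]) + (C[1][0] - C[1][1])

    def clip(x):
        # Keep the strategy a valid probability.
        return min(1.0, max(0.0, x))

    return clip(alpha + eta * d_alpha), clip(beta + eta * d_beta)


def run_gradient_ascent(R, C, alpha=0.5, beta=0.5, eta=0.01, steps=1000):
    """Iterate the step above from an initial strategy pair."""
    for _ in range(steps):
        alpha, beta = gradient_step(alpha, beta, R, C, eta)
    return alpha, beta


# Illustrative Prisoner's Dilemma: action 0 = cooperate, action 1 = defect.
PD_ROW = [[3.0, 0.0], [5.0, 1.0]]
PD_COL = [[3.0, 5.0], [0.0, 1.0]]
```

In this example both gradients are negative everywhere, so `run_gradient_ascent(PD_ROW, PD_COL)` drives both probabilities of cooperating to zero, which is the game's unique Nash equilibrium (mutual defection).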
Planning, learning and coordination in multiagent decision processes
- In Proceedings of the Sixth Conference on Theoretical Aspects of Rationality and Knowledge (TARK96)
, 1996
Cited by 72 (1 self)
There has been a growing interest in AI in the design of multiagent systems, especially in multiagent cooperative planning. In this paper, we investigate the extent to which methods from single-agent planning and learning can be applied in multiagent settings. We survey a number of different techniques from decision-theoretic planning and reinforcement learning and describe a number of interesting issues that arise with regard to coordinating the policies of individual agents. To this end, we describe multiagent Markov decision processes as a general model in which to frame this discussion. These are special n-person cooperative games in which agents share the same utility function. We discuss coordination mechanisms based on imposed conventions (or social laws) as well as learning methods for coordination. Our focus is on the decomposition of sequential decision processes so that coordination can be learned (or imposed) locally, at the level of individual states. We also discuss the use of structured problem representations and their role in the generalization of learned conventions and in approximation.
Bayesian Learning in Negotiation
, 1996
Cited by 71 (6 self)
Recent growing interest in autonomous interacting software agents and their potential application in areas such as electronic commerce [Sandholm & Lesser 1995] has given increased importance to automated negotiation. Much DAI and game theoretic research [Rosenschein & Zlotkin 1994; Osborne & Rubinstein 1994] deals with coordination and negotiation issues by giving pre-computed solutions to specific problems. There has been much research reported on developing theoretical models in which learning plays an eminent role, especially in the area of adaptive dynamics of games (e.g., [Jordan 1992; Kalai & Lehrer 1993]). However, to build autonomous agents that improve their negotiation competence based on learning from their interactions with other agents is still an emerging area. We are interested in developing autonomous agents capable of reasoning based on experience and improving their negotiation behavior incrementally. Learning in negotiation is closely coupled with...
AWESOME: A General Multiagent Learning Algorithm that Converges in Self-Play and Learns a Best Response against Stationary Opponents
- IN PROCEEDINGS OF THE 20TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING
, 2006
Cited by 57 (5 self)
Two minimal requirements for a satisfactory multiagent learning algorithm are that it 1. learns to play optimally against stationary opponents and 2. converges to a Nash equilibrium in self-play. The previous algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action (repeated) games -- assuming that the opponent's mixed strategy is observable. Another algorithm, ReDVaLeR (which was introduced after the algorithm described in this paper), achieves the two properties in games with arbitrary numbers of actions and players, but still requires that the opponents' mixed strategies are observable. In this paper we present AWESOME, the first algorithm that is guaranteed to have the two properties in games with arbitrary numbers of actions and players. It is still the only algorithm that does so while only relying on observing the other players' actual actions (not their mixed strategies). It also learns to play optimally against opponents that eventually become stationary. The basic idea behind AWESOME (Adapt When Everybody is Stationary, Otherwise Move to Equilibrium) is to try to adapt to the others' strategies when they appear stationary, but otherwise to retreat to a precomputed equilibrium strategy. We provide experimental results that suggest that AWESOME converges fast in practice. The techniques used to prove the properties of AWESOME are fundamentally different from those used for previous algorithms, and may help in analyzing future multiagent learning algorithms as well.
Logarithmic Market Scoring Rules for Modular Combinatorial Information Aggregation
- Journal of Prediction Markets
, 2002
Cited by 44 (4 self)
In practice, scoring rules elicit good probability estimates from individuals, while betting markets elicit good consensus estimates from groups. Market scoring rules combine these features, eliciting estimates from individuals or groups, with groups costing no more than individuals.

