Learning in Extensive-Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term
- Games and Economic Behavior 8, 164-212 (1995)
, 1995
Cited by 163 (9 self)
We use simple learning models to track the behavior observed in experiments concerning three extensive form games with similar perfect equilibria. In only two of the games does observed behavior approach the perfect equilibrium as players gain experience. We examine a family of learning models which possess some of the robust properties of learning noted in the psychology literature. The intermediate term predictions of these models track well the observed behavior in all three games, even though the models considered differ in their very long term predictions. We argue that for predicting observed behavior the intermediate term predictions of dynamic learning models may be even more important than their asymptotic properties.
Calibrated Learning and Correlated Equilibrium
- Games and Economic Behavior
, 1996
Cited by 63 (3 self)
Suppose two players meet each other in a repeated game where: 1. each uses a learning rule with the property that it is a calibrated forecast of the other's plays, and 2. each plays a best response to this forecast distribution.
AWESOME: A General Multiagent Learning Algorithm that Converges in Self-Play and Learns a Best Response against Stationary Opponents
- In Proceedings of the 20th International Conference on Machine Learning
, 2006
Cited by 57 (5 self)
Two minimal requirements for a satisfactory multiagent learning algorithm are that it 1. learns to play optimally against stationary opponents and 2. converges to a Nash equilibrium in self-play. The previous algorithm that has come closest, WoLF-IGA, has been proven to have these two properties in 2-player 2-action (repeated) games -- assuming that the opponent's mixed strategy is observable. Another algorithm, ReDVaLeR (which was introduced after the algorithm described in this paper), achieves the two properties in games with arbitrary numbers of actions and players, but still requires that the opponents' mixed strategies are observable. In this paper we present AWESOME, the first algorithm that is guaranteed to have the two properties in games with arbitrary numbers of actions and players. It is still the only algorithm that does so while only relying on observing the other players' actual actions (not their mixed strategies). It also learns to play optimally against opponents that eventually become stationary. The basic idea behind AWESOME (Adapt When Everybody is Stationary, Otherwise Move to Equilibrium) is to try to adapt to the others' strategies when they appear stationary, but otherwise to retreat to a precomputed equilibrium strategy. We provide experimental results that suggest that AWESOME converges fast in practice. The techniques used to prove the properties of AWESOME are fundamentally different from those used for previous algorithms, and may help in analyzing future multiagent learning algorithms as well.
Online Ascending Auctions for Gradually Expiring Items
- In SODA
, 2004
Cited by 46 (6 self)
In this paper we consider online auction mechanisms for the allocation of M items that are identical to each other except for the fact that the items have different expiration times, and each item must be allocated before it expires. A computational application is the allocation of time slots in a scheduling problem, and an economic application is the allocation of transportation tickets.
Conditional Universal Consistency
, 1997
Cited by 25 (0 self)
Each period, a player must choose an action without knowing the outcome that will be chosen by "Nature," according to an unknown and possibly history-dependent stochastic rule. We discuss a class of procedures that assign observations to categories, and prescribe a simple randomized variation of fictitious play within each category. These procedures are "conditionally consistent," in the sense of yielding almost as high a time-average payoff as could be obtained if the player chose knowing the conditional distributions of actions given categories. Moreover, given any alternative procedure, there is a conditionally consistent procedure whose performance is no more than epsilon worse regardless of the discount factor. Cycles can persist if all players classify histories in the same way; however, in an example where players classify histories differently, the system converges to a Nash equilibrium. We also argue that in the long run the time-average of play should resemble a correlated equilibrium.
Evolutionary games on graphs
, 2007
Cited by 24 (0 self)
Game theory is one of the key paradigms behind many scientific disciplines from biology to behavioral sciences to economics. In its evolutionary form and especially when the interacting agents are linked in a specific social network the underlying solution concepts and methods are very similar to those applied in non-equilibrium statistical physics. This review gives a tutorial-type overview of the field for physicists. The first four sections introduce the necessary background in classical and evolutionary game theory from the basic definitions to the most important results. The fifth section surveys the topological complications implied by non-mean-field-type social network structures in general. The next three sections discuss in detail the dynamic behavior of three prominent classes of models: the Prisoner’s Dilemma, the Rock–Scissors–Paper game, and Competing Associations. The major theme of the review is in what sense and how the graph structure of interactions can modify and enrich the picture of long term behavioral patterns emerging in evolutionary games.
Computing best-response strategies in infinite games of incomplete information
- In Uncertainty in artificial intelligence
, 2004
Cited by 19 (4 self)
We describe an algorithm for computing best-response strategies in a class of two-player infinite games of incomplete information, defined by payoffs piecewise linear in agents' types and actions, conditional on linear comparisons of agents' actions. We show that this class includes many well-known games including a variety of auctions and a novel allocation game. In some cases, the best-response algorithm can be iterated to compute Bayes-Nash equilibria. We demonstrate the efficacy of our approach on existing and new games.
Computing the minimal covering set
- In Proceedings of the 11th Conference on Theoretical Aspects of Rationality and Knowledge
, 2007
Cited by 14 (11 self)
We present the first polynomial-time algorithm for computing the minimal covering set of a (weak) tournament. The algorithm draws upon a linear programming formulation of a subset of the minimal covering set known as the essential set. On the other hand, we show that no efficient algorithm exists for two variants of the minimal covering set, the minimal upward covering set and the minimal downward covering set, unless P equals NP. Finally, we observe a strong relationship between von Neumann-Morgenstern stable sets and upward covering on the one hand, and the Banks set and downward covering on the other.
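As a small illustration of the covering relation that these set-valued solution concepts build on, the sketch below computes the uncovered set of a tournament, which is a simpler relative of the minimal covering set and not the polynomial-time algorithm from the paper. It assumes the standard definition for tournaments (x covers y when x beats y and x also beats everything y beats); the names `covers`, `uncovered_set`, and the `beats` dictionary are hypothetical.

```python
from typing import Dict, Set


def covers(beats: Dict[str, Set[str]], x: str, y: str) -> bool:
    """Assumed definition: x covers y if x beats y and x beats
    every alternative that y beats."""
    return y in beats[x] and beats[y] <= beats[x]


def uncovered_set(beats: Dict[str, Set[str]]) -> Set[str]:
    """Alternatives that no other alternative covers."""
    return {
        y
        for y in beats
        if not any(covers(beats, x, y) for x in beats if x != y)
    }


# Hypothetical tournament: w beats everyone, while a, b, c form a cycle.
tournament = {
    "w": {"a", "b", "c"},
    "a": {"b"},
    "b": {"c"},
    "c": {"a"},
}
print(uncovered_set(tournament))
```

In a pure three-cycle no alternative covers another, so every alternative stays uncovered; adding an alternative that beats all others leaves only that alternative.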
On the convergence of fictitious play
- Mathematics of Operations Research
, 1998
Cited by 11 (0 self)
We study the continuous-time Brown-Robinson fictitious play process for non-zero sum games. We show that, in general, fictitious play cannot converge cyclically to a mixed strategy equilibrium in which both players use more than two pure strategies.
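For readers unfamiliar with the process, the following is a minimal sketch of discrete-time fictitious play for a two-player bimatrix game. It is not the continuous-time process analyzed in the paper. The function names, the choice of action 0 as the initial play, and the lowest-index tie-breaking rule are all assumptions made for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def best_response(payoff: Matrix, opp_counts: List[float]) -> int:
    """Own action with the highest payoff against the opponent's
    empirical action counts (ties go to the lowest index)."""
    values = [sum(p * c for p, c in zip(row, opp_counts)) for row in payoff]
    return max(range(len(values)), key=lambda i: values[i])


def fictitious_play(
    a: Matrix, b: Matrix, rounds: int
) -> Tuple[List[float], List[float]]:
    """Run discrete-time fictitious play and return empirical frequencies.

    a[i][j] and b[i][j] are the row and column player's payoffs when
    row plays i and column plays j. Requires rounds >= 1.
    """
    n_rows, n_cols = len(a), len(a[0])
    # Transpose b so the column player's own actions index the rows.
    b_t = [[b[i][j] for i in range(n_rows)] for j in range(n_cols)]
    row_counts = [0.0] * n_rows
    col_counts = [0.0] * n_cols
    # Assumption: both players open with action 0.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(rounds - 1):
        # Each player best-responds to the other's history so far.
        i = best_response(a, col_counts)
        j = best_response(b_t, row_counts)
        row_counts[i] += 1
        col_counts[j] += 1
    return (
        [c / rounds for c in row_counts],
        [c / rounds for c in col_counts],
    )


# Matching pennies: the row player wants to match, the column player to mismatch.
pennies_row = [[1, -1], [-1, 1]]
pennies_col = [[-1, 1], [1, -1]]
print(fictitious_play(pennies_row, pennies_col, 10000))
```

Matching pennies is zero-sum, so the empirical frequencies are known to approach the equilibrium mixture even though actual play keeps cycling. The non-zero sum games studied in the paper carry no such general guarantee.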
Non-cooperative dynamics of multi-agent teams
- In Proceedings of the First International Joint Conference on Autonomous Agents and Multi-Agent Systems
, 2002
Cited by 9 (2 self)
Results on the formation of multi-agent teams are reviewed and extended. Conditions are specified under which it is individually rational for agents to spontaneously form coalitions in order to engage in collective action. In a cooperative setting the formation of such groups is to be expected. Here we show that in non-cooperative environments (presumably a more realistic context for a variety of both human and software agents) self-organized coalitions are capable of extracting welfare improvements. The Nash equilibria of these coalitional formation games are demonstrated to always exist and be unique. Certain free rider problems in such group formation dynamics lead to the possibility of dynamically unstable Nash equilibria, depending on the nature of intra-group compensation and coalition size. Yet coherent groups can still form, if only temporarily, as demonstrated by computational experiments. Such groups of agents can be either long-lived or transient. The macroscopic structure of these emergent 'bands' of agents is stationary in sufficiently large populations, despite constant adaptation at the agent level. It is argued that assumptions concerning attainment of agent-level (Nash) equilibrium, so ubiquitous in conventional economics and game theory, are difficult to justify behaviorally and highly restrictive theoretically, and are thus unlikely to serve either as fertile design objectives or robust operating principles for realistic multi-agent systems.

