## A Polynomial-time Nash Equilibrium Algorithm for Repeated Games (2004)

### Download Links

- [www.cs.utexas.edu]
- DBLP

Venue: Proceedings of the ACM Conference on Electronic Commerce (ACM-EC

Citations: 66 (4 self)

### BibTeX

@INPROCEEDINGS{Littman04apolynomial-time,

author = {Michael L. Littman and Peter Stone},

title = {A Polynomial-time Nash Equilibrium Algorithm for Repeated Games},

booktitle = {Proceedings of the ACM Conference on Electronic Commerce (ACM-EC},

year = {2004},

pages = {48--54}

}

### Abstract

With the increasing reliance on game theory as a foundation for auctions and electronic commerce, efficient algorithms for computing equilibria in multiplayer general-sum games are of great theoretical and practical interest. The computational complexity of finding a Nash equilibrium for a one-shot bimatrix game is a well-known open problem. This paper treats a related but distinct problem, that of finding a Nash equilibrium for an average-payoff repeated bimatrix game, and presents a polynomial-time algorithm. Our approach draws on the well-known "folk theorem" from game theory and shows how finite-state equilibrium strategies can be found efficiently and expressed succinctly.
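The role that threats play in the folk theorem can be illustrated with a small simulation. The following is a minimal sketch, not the paper's algorithm: the Prisoner's Dilemma payoff values, the `GRIM` and `ALWAYS_DEFECT` machines, and the `average_payoffs` helper are all assumptions introduced here for illustration. It encodes strategies as finite-state machines and compares the average payoff of mutual cooperation backed by a threat with the average payoff of a unilateral deviation.

```python
from typing import Dict, Tuple

# Assumed Prisoner's Dilemma payoffs (player 1, player 2); illustrative values,
# not taken from the paper.
C, D = 0, 1
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

# A finite-state strategy: state -> (action to play, {opponent action: next state}).
Machine = Dict[str, Tuple[int, Dict[int, str]]]

# Hypothetical threat strategy: cooperate until the opponent defects, then defect forever.
GRIM: Machine = {
    "coop": (C, {C: "coop", D: "punish"}),
    "punish": (D, {C: "punish", D: "punish"}),
}

# Hypothetical deviation: defect on every round.
ALWAYS_DEFECT: Machine = {"d": (D, {C: "d", D: "d"})}


def average_payoffs(m1: Machine, start1: str, m2: Machine, start2: str,
                    rounds: int) -> Tuple[float, float]:
    """Play two finite-state strategies against each other and average the payoffs."""
    s1, s2 = start1, start2
    total1 = total2 = 0
    for _ in range(rounds):
        a1, next1 = m1[s1]
        a2, next2 = m2[s2]
        p1, p2 = PAYOFF[(a1, a2)]
        total1 += p1
        total2 += p2
        # Each machine transitions on the action the opponent just played.
        s1, s2 = next1[a2], next2[a1]
    return total1 / rounds, total2 / rounds


if __name__ == "__main__":
    # Both players use the threat strategy: cooperation is sustained.
    print(average_payoffs(GRIM, "coop", GRIM, "coop", 1000))
    # Player 1 deviates: one round of gain, then punishment for the rest of the game.
    print(average_payoffs(ALWAYS_DEFECT, "d", GRIM, "coop", 1000))
```

Under these assumed payoffs, the deviator's average payoff falls toward the mutual-defection value as the number of rounds grows, so deviating from cooperation does not pay. Unlike this sketch, the abstract's approach concerns computing such finite-state equilibrium strategies for arbitrary bimatrix games in polynomial time.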

### Citations

2298 | The Evolution of Cooperation
- Axelrod
- 1984

Citation Context ...tion 2 until q chooses Action 1. At this point, p returns to the left node again. The strategy expressed in the figure can be thought of as "two tits for a tat" in the context of the Prisoner's Dilemma [1]; the player defects at least twice in response to defection, but otherwise cooperates. While finite-state machines provide a simple and broad language for expressing strategies, some basic strategies b...

1867 | A course in game theory
- Osborne, Rubinstein
- 1994

Citation Context ...ing a Nash equilibrium in a one-shot game, the complexity of which remains an important and long-standing open problem [12]. The idea behind our algorithm echoes that of the well-known "folk theorem" [11], which shows how the notion of threats can stabilize a wide range of payoff profiles in repeated games. While the folk theorem provides a constructive method for identifying Nash equilibria in repeated...

923 | Non-cooperative games
- Nash
- 1951

Citation Context ... always chooses the column. 2 strategies is a Nash equilibrium if each strategy is optimized with respect to the other; neither player can improve its average payoff by changing strategies unilaterally [9]. As a running example in this paper, we use the well-known Iterated Prisoner's Dilemma to illustrate and motivate our algorithm. In this repeated bimatrix game, on each round, each player can either ...

819 | The bargaining problem
- Nash
- 1950

Citation Context ...n Player 1 chooses $i_1$ and Player 2 chooses $i_2$, the payoffs for the two players can be visualized as a point $x = (A^1_{i_1 i_2}, A^2_{i_2 i_1}) = (x_1, x_2)$ in a two-dimensional space. Following Nash [10], we consider the set of all pairs of actions for the two players, $X = \{(A^1_{i_1 i_2}, A^2_{i_2 i_1}) \mid 1 \le i_1 \le n_1,\ 1 \le i_2 \le n_2\}$. All the points $x \in X$ can be achieved as average payoffs for the two...

296 | Algorithms, games, and the internet
- Papadimitriou
- 2001

Citation Context ... average-payoff criterion. This result stands in contrast to the problem of computing a Nash equilibrium in a one-shot game, the complexity of which remains an important and long-standing open problem [12]. The idea behind our algorithm echoes that of the well-known "folk theorem" [11], which shows how the notion of threats can stabilize a wide range of payoff profiles in repeated games. While the folk t...

226 | Graphical Models for Game Theory
- Kearns, Littman, et al.
- 2001

Citation Context ..., there are a number of other contexts in which we are applying the ideas in this paper, such as computing equilibria in repeated Markov games [6] and n-player games expressed in a graphical notation [5], and using reinforcement learning to issue and recognize threats [7]. The relative simplicity of the threat-based approach makes it a promising direction for future work in computational game theory ...

132 | Complexity Results about Nash Equilibria
- Conitzer, Sandholm
- 2003

Citation Context ...argaining axioms, although other criteria such as maximizing the average combined payoff of the two players present no additional difficulty; in one-shot games, identifying such an equilibrium is NP-hard [3]. The fact that punishment is of finite duration after any history makes the equilibrium subgame perfect [11]. For symmetric games, the resulting payoffs are socially optimal and symmetric. On the negati...

124 | Friend-or-Foe Q-learning in general-sum games
- Littman
- 2001

Citation Context ...vantage case in a computationally efficient way. Nonetheless, there are a number of other contexts in which we are applying the ideas in this paper, such as computing equilibria in repeated Markov games [6] and n-player games expressed in a graphical notation [5], and using reinforcement learning to issue and recognize threats [7]. The relative simplicity of the threat-based approach makes it a promisin...

48 | Efficient learning equilibrium
- Brafman, Tennenholtz

Citation Context ... theoretical treatment presented here. Our use of threats in this context echoes their use in folk theorems [11]. It is also similar to the efficient learning equilibrium work of Brafman and Tennenholtz [2], which seeks punishments that achieve their influence after a polynomial number of rounds. Folk theorems can be considered algorithms, since they constructively prove the existence of Nash equilibria. Ho...

42 | Implicit negotiation in repeated games
- Littman, Stone
- 2001

Citation Context ...deas in this paper, such as computing equilibria in repeated Markov games [6] and n-player games expressed in a graphical notation [5], and using reinforcement learning to issue and recognize threats [7]. The relative simplicity of the threat-based approach makes it a promising direction for future work in computational game theory and electronic commerce more generally. Acknowledgments We thank Alle...

20 | FAucS: An FCC spectrum auction simulator for autonomous bidding agents
- Csirik, Littman, et al.
- 2001

Citation Context ...at least $2^b$ states. 5 Conclusions Our interest in repeated games and in the use of threats to spur mutually beneficial behavior came about during a study of automated bidders in simultaneous auctions [4]. When human bidders participate in auctions, they have been known to use threats to influence the behavior of other bidders [16]. We were able to show experimentally that threats can be a valuable too...

9 | Self-Enforcing Strategic Demand Reduction
- Reitsma, Stone, et al.
- 2002

Citation Context ... auctions, they have been known to use threats to influence the behavior of other bidders [16]. We were able to show experimentally that threats can be a valuable tool for automated agents in auctions [13], and undertook the more theoretical treatment presented here. Our use of threats in this context echoes their use in folk theorems [11]. It is also similar to the efficient learning equilibrium work of ...