## Regret testing: A simple payoff-based procedure for learning Nash equilibrium (2004)

Venue: Games Econ. Behav

Citations: 7 (1 self)

### BibTeX

@TECHREPORT{Foster04regrettesting:,

author = {Dean P. Foster and H. Peyton Young},

title = {Regret testing: A simple payoff-based procedure for learning Nash equilibrium},

institution = {Games Econ. Behav},

year = {2004}

}

### Abstract

A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if the player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a simple class of radically uncoupled learning rules, patterned after aspiration learning models, whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.

#### 1 Payoff-based learning rules

In this paper we propose a class of simple, adaptive learning rules that depend only on players’ realized payoffs, such that when two players employ a rule from this class their period-by-period strategic behavior approximates Nash equilibrium behavior. Like reinforcement and aspiration models, this type of rule depends only on summary statistics that are derived from the players’ received payoffs; indeed the players do not even need to know they are involved in a game for them to learn equilibrium eventually.

To position our contribution with respect to the recent literature, we need to consider three separate issues: i) the amount of information needed to implement a learning rule; ii) the type of equilibrium to which the learning process tends (Nash, correlated, etc.); iii) the sense in which the process can be said to “approximate” the type of equilibrium behavior in question. (For a further discussion of these issues see Young, 2004.)

Consider, for example, the recently discovered regret matching rules of Hart and Mas-Colell (2000, 2001). The essential idea is that players randomize among actions in proportion to their regrets from not having played those actions in the past. Like the regret-testing rules we introduce here,