[...] agents who recurrently play a [...] against each other agent in the [...] a classical result, Kandori et al. [...] the bilateral game is a 2×2 [...] is such that strategies leading to [...] coordinate on risk-dominant [...]

The KMR model can be readily interpreted as a model of imitation (see KMR, p. 31; Rhode and Stegeman, 1996; Sandholm, 1998) where agents mimic the actions which led to the highest payoffs in the last period. In this note we consider exactly such a framework and endow agents with bounded memory, hence allowing them to make use of the information gained in the most recent periods of play. Agents remember all actions and payoffs observed in the last K≥0 periods of play in addition to the current one.

[...] such that a>c, d>b, a>d, and a+b<c+d. Hence, (P,P) and (R,R) are strict Nash equilibria, (P,P) is Pareto efficient and (R,R) is risk dominant. This is the most interesting case.

The imitation rule used in KMR can be described as "imitate the best", where simply the action leading to the highest payoffs is mimicked.¹ In a framework with memory, the rule specifies imitating the action which has led to the highest payoffs in remembered experience.² A standard element in learning models is the presence of exogenous inertia (see e.g. Samuelson, 1994 or Kandori and Rob, 1995), defined as an exogenously given probability 0≤ρ<1 that each single agent is not [...]