Neglect Tolerant Teaming: Issues and Dilemmas
- In Proceedings of the 2003 AAAI Spring Symposium on Human Interaction with Autonomous Systems in Complex Environments, 2003
- Cited by 9 (0 self)
In this paper, a brief overview of neglect-tolerant human-robot interaction is presented. Recent results of a neglect-tolerance study are then summarized. The problem is then posed of how neglect tolerance affects how a human interacts with multiple robots, and a scenario is constructed that illustrates how multiple robot management can produce a problem with the form of a prisoner’s dilemma. An abstraction of this prisoner’s dilemma problem is then presented, and two robot learning algorithms are outlined that may address key points in this abstracted dilemma.
The efficiency of adapting aspiration levels
- Proceedings of the Royal Society of London Series B-Biological Sciences 266, 1999
- Cited by 9 (0 self)
review. Views or opinions expressed herein do not necessarily represent those of the Institute, its National Member Organizations, or other organizations supporting the work. IIASA Studies in Adaptive Dynamics No. 33. The Adaptive Dynamics Network at IIASA fosters the development of new mathematical and conceptual techniques for understanding the evolution of complex adaptive systems. Focusing on these long-term implications of adaptive processes in systems of limited growth, the Adaptive Dynamics Network brings together scientists and institutions from around the world with IIASA acting as the central node. Scientific progress within the network
Regret testing: A simple payoff-based procedure for learning Nash equilibrium
- Games Econ. Behav., 2004
- Cited by 6 (1 self)
constructive comments on an earlier draft.
A learning rule is uncoupled if a player does not condition his strategy on the opponent’s payoffs. It is radically uncoupled if the player does not condition his strategy on the opponent’s actions or payoffs. We demonstrate a simple class of radically uncoupled learning rules, patterned after aspiration learning models, whose period-by-period behavior comes arbitrarily close to Nash equilibrium behavior in any finite two-person game.
1 Payoff-based learning rules
In this paper we propose a class of simple, adaptive learning rules that depend only on players’ realized payoffs, such that when two players employ a rule from this class their period-by-period strategic behavior approximates Nash equilibrium behavior. Like reinforcement and aspiration models, this type of rule depends only on summary statistics that are derived from the players’ received payoffs; indeed the players do not even need to know they are involved in a game for them to learn equilibrium eventually. To position our contribution with respect to the recent literature, we need to consider three separate issues: i) the amount of information needed to implement a learning rule; ii) the type of equilibrium to which the learning process tends (Nash, correlated, etc.); iii) the sense in which the process can be said to “approximate” the type of equilibrium behavior in question. (For a further discussion of these issues see Young, 2004.) Consider, for example, the recently discovered regret matching rules of Hart and Mas-Colell (2000, 2001). The essential idea is that players randomize among actions in proportion to their regrets from not having played those actions in the past. Like the regret-testing rules we introduce here,
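The regret matching idea mentioned in the entry above (randomizing among actions in proportion to regret) can be illustrated with a small Python sketch. This is an assumption-laden simplification, not Hart and Mas-Colell's exact procedure: it uses the simpler unconditional variant, in which each action is played with probability proportional to its positive cumulative regret. The function names and the `payoffs_vs_opponent` argument (what each of the player's actions would have earned this period) are hypothetical.

```python
import random

def regret_matching_probs(cum_regret):
    """Mixed strategy proportional to the positive part of each cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        # No action is regretted: fall back to uniform play.
        return [1.0 / len(cum_regret)] * len(cum_regret)
    return [p / total for p in positive]

def update_regrets(cum_regret, played, payoffs_vs_opponent):
    """Add, for each action, what it would have earned minus what was actually earned."""
    realized = payoffs_vs_opponent[played]
    return [r + (u - realized) for r, u in zip(cum_regret, payoffs_vs_opponent)]

def sample_action(cum_regret, rng=random):
    """Draw one action from the regret-matching mixed strategy."""
    probs = regret_matching_probs(cum_regret)
    return rng.choices(range(len(probs)), weights=probs)[0]
```

Note that `update_regrets` needs the counterfactual payoff of every action, so this sketch uses more information than a purely payoff-based rule, which sees only the realized payoff.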
Aspiration-Based Reinforcement Learning In Repeated Interaction Games: An Overview
- 2001
- Cited by 6 (1 self)
In models of aspiration-based... This paper provides an informal overview of a range of such theories applied to repeated interaction games. We describe different models of aspiration formation, in which (1) aspirations are fixed but required to be consistent with long-run average payoffs; (2) aspirations evolve based on past personal experience or that of previous generations of players; and (3) aspirations are based on the experience of peers. Convergence to non-Nash outcomes may result in any of these formulations. Indeed, cooperative behaviour can emerge and survive in the long run, even though it may be a strictly dominated strategy in the stage game, and despite the myopic adaptation of stage game strategies. Differences between reinforcement learning and evolutionary game theory are also discussed.
Learning by Trial and Error
- 2008
- Cited by 3 (1 self)
A person learns by trial and error if he occasionally tries out new strategies, rejecting choices that are erroneous in the sense that they do not lead to higher payoffs. In a game, however, strategies can become erroneous due to a change of behavior by someone else. We introduce a learning rule in which behavior is conditional on whether a player experiences an error of the first or second type. This rule, called interactive trial and error learning, implements Nash equilibrium behavior in any game with generic payoffs and at least one pure Nash equilibrium. JEL Classification: C72, D83
Learning Efficient Nash Equilibria in Distributed Systems
- 2010
- Cited by 1 (1 self)
Abstract. An individual’s learning rule is completely uncoupled if it does not depend on the actions or payoffs of anyone else. We propose a variant of log linear learning that is completely uncoupled and that selects an efficient pure Nash equilibrium in all generic n-person games that possess at least one pure Nash equilibrium. In games that do not have such an equilibrium, there is a simple formula that expresses the long-run probability of the various disequilibrium states in terms of two factors: i) the sum of payoffs over all agents, and ii) the maximum payoff gain that results from a unilateral deviation by some agent. This welfare/stability trade-off criterion provides a novel framework for analyzing the selection of disequilibrium as well as equilibrium states in n-person games. JEL: C72, C73
1. Learning equilibrium in complex interactive systems
Game theory has traditionally focussed on situations that involve a small number of players. In these environments it makes sense to assume that players know the structure of the game and can predict the strategic behavior of their opponents. But there are many situations involving huge numbers of players where these assumptions are not particularly persuasive.
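The two factors named in the abstract above, the sum of payoffs over all agents and the maximum payoff gain from a unilateral deviation, can be computed for a small game as in the following Python sketch. It only evaluates the two quantities; the paper's formula combining them is not given in the abstract and is not reproduced here. The game representation (a dict from action profiles to payoff tuples) and all function names are assumptions made for illustration.

```python
from itertools import product

def welfare(payoffs, profile):
    """Factor (i): sum of payoffs over all agents at an action profile."""
    return sum(payoffs[profile])

def max_unilateral_gain(payoffs, profile, n_actions):
    """Factor (ii): largest payoff gain any single agent gets by deviating alone."""
    best = float("-inf")
    for i, k in enumerate(n_actions):
        for a in range(k):
            if a == profile[i]:
                continue
            deviated = profile[:i] + (a,) + profile[i + 1:]
            best = max(best, payoffs[deviated][i] - payoffs[profile][i])
    return best

def pure_nash_profiles(payoffs, n_actions):
    """Profiles at which no agent gains from a unilateral deviation."""
    return [p for p in product(*(range(k) for k in n_actions))
            if max_unilateral_gain(payoffs, p, n_actions) <= 0]

# Illustrative Prisoner's Dilemma: action 0 = cooperate, 1 = defect.
pd = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}
print(welfare(pd, (0, 0)), max_unilateral_gain(pd, (0, 0), (2, 2)))  # 6 2
print(pure_nash_profiles(pd, (2, 2)))  # [(1, 1)]
```

In this example mutual cooperation has the highest welfare but a positive deviation gain, while mutual defection is stable with lower welfare, which is the kind of tension a welfare/stability criterion has to weigh.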
Learning to Play a Satisfaction Equilibrium
In real-life problems, agents are generally faced with situations where they have only partial or no knowledge about their environment and the other agents evolving in it. In this case all an agent can do is reason about its own payoffs; it cannot rely on reaching the classical equilibria through deliberation. To mitigate this difficulty, we introduce the satisfaction principle, from which an equilibrium can arise as the result of the agents’ individual learning experiences. We define such an equilibrium and then present different algorithms that can be used to reach it. Finally, we present experimental results and theoretical proofs showing that, using learning strategies based on this specific equilibrium, agents will generally coordinate on a Pareto-optimal joint strategy, which is not always a Nash equilibrium, even though each agent is individually rational in the sense that it tries to maximize its own satisfaction.
Learning To Cooperate in a Social Dilemma: A Satisficing Approach to Bargaining
Learning in many multi-agent settings is inherently repeated play. This calls into question the naive application of single play Nash equilibria in multi-agent learning and suggests, instead, the application of give-and-take principles of bargaining. We modify and analyze a satisficing algorithm based on (Karandikar et al., 1998) that is compatible with the bargaining perspective. This algorithm is a form of relaxation search that converges to a satisficing equilibrium without knowledge of game payoffs or other agents’ actions. We then develop an M action, N player social dilemma that encodes the key elements of the Prisoner’s Dilemma. This game is instructive because it characterizes social dilemmas with more than two agents and more than two choices. We show how several different multi-agent learning algorithms behave in this social dilemma, and demonstrate that the satisficing algorithm converges, with high probability, to a Pareto efficient solution in self play and to the single play Nash equilibrium against selfish agents. Finally, we present theoretical results that characterize the behavior of the algorithm.
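The abstract above does not spell out the satisficing algorithm, so the following Python sketch is only a generic aspiration-based rule of the kind it describes, under stated assumptions: each agent repeats its action while its payoff meets its aspiration, picks a random action otherwise, and relaxes its aspiration toward the payoffs it receives. The learning rate `lam`, the initial aspiration, the payoff table, and the function name are all hypothetical choices, not the authors' parameters.

```python
import random

def satisficing_self_play(payoffs, n_actions, aspiration0,
                          lam=0.99, rounds=2000, seed=0):
    """Run identical satisficing agents against each other in a repeated game.

    Each agent sees only its own payoff, never the payoff table or others' actions.
    """
    rng = random.Random(seed)
    n = len(n_actions)
    actions = [rng.randrange(k) for k in n_actions]
    aspirations = [float(aspiration0)] * n
    for _ in range(rounds):
        rewards = payoffs[tuple(actions)]
        for i in range(n):
            # Not satisfied: search by switching to a random action.
            if rewards[i] < aspirations[i]:
                actions[i] = rng.randrange(n_actions[i])
            # Relax the aspiration toward the payoff just received.
            aspirations[i] = lam * aspirations[i] + (1 - lam) * rewards[i]
    return actions, aspirations

# Illustrative Prisoner's Dilemma: action 0 = cooperate, 1 = defect.
pd = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}
final_actions, final_aspirations = satisficing_self_play(pd, (2, 2), aspiration0=6.0)
print(final_actions, final_aspirations)
```

Starting aspirations above the highest payoff forces an initial period of search; where play settles depends on the parameters and the random seed, so the sketch makes no claim about reproducing the convergence results reported in the paper.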
Satisficing Multi-Agent Learning: A Simple But Powerful Algorithm
- 2008
Learning in the presence of adaptive, possibly antagonistic, agents presents special challenges to algorithm designers, especially in environments with limited information. We consider situations in which an agent knows its own set of actions and observes its own payoffs, but does not know or observe the actions and payoffs of the other agents. Despite this limited information, a robust learning algorithm must have two properties: security, which requires the algorithm to avoid exploitation by antagonistic agents, and efficiency, which requires the algorithm to find nearly Pareto efficient solutions when associating with agents who are inclined to cooperate. However, no learning algorithm in the literature has both of these properties when playing repeated general-sum games in these limited-information environments. In this paper, we present and analyze a variation of Karandikar et al.’s learning algorithm [19]. The algorithm is conceptually very simple, but has surprising power given this simplicity. It is provably secure in all matrix games, regardless of the play of its associates, and it is efficient in self play in a very large set of matrix games. Additionally, the algorithm performs well when associating with representative, state-of-the-art learning algorithms with similar representational capabilities in general-sum games. These properties make the algorithm highly robust, more so than representative best-response and regret-minimizing algorithms with similar reasoning capabilities.

