Results 1–10 of 28
Autonomous vehicle-target assignment: a game-theoretical formulation
ASME JOURNAL OF DYNAMIC SYSTEMS, MEASUREMENT AND CONTROL, 2007
Abstract

Cited by 89 (22 self)
We consider an autonomous vehicle-target assignment problem where a group of vehicles are expected to optimally assign themselves to a set of targets. We introduce a game-theoretical formulation of the problem in which the vehicles are viewed as self-interested decision makers. Thus, we seek the optimization of a global utility function through autonomous vehicles that are capable of making individually rational decisions to optimize their own utility functions. The first important aspect of the problem is to choose the utility functions of the vehicles in such a way that the objectives of the vehicles are localized to each vehicle yet aligned with a global utility function. The second important aspect of the problem is to equip the vehicles with an appropriate negotiation mechanism by which each vehicle pursues the optimization of its own utility function. We present several design procedures and accompanying caveats for vehicle utility design. We present two new negotiation mechanisms, namely, “generalized regret monitoring with fading memory and inertia” and “selective spatial adaptive play,” and provide accompanying proofs of their convergence. Finally, we present simulations that illustrate how vehicle negotiations can consistently lead to near-optimal assignments provided that the utilities of the vehicles are designed appropriately.
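The utility-alignment idea in this abstract can be sketched with a wonderful-life (marginal-contribution) utility. This is a minimal illustration of the general design principle, not the paper's own construction; the target names and values are hypothetical.

```python
# Hypothetical global utility: total value of targets covered by at least one
# vehicle (target names and values are illustrative, not from the paper).
TARGET_VALUE = {"t1": 3.0, "t2": 2.0, "t3": 1.0}

def global_utility(assignment):
    """assignment maps vehicle -> target; each covered target is counted once."""
    return sum(TARGET_VALUE[t] for t in set(assignment.values()))

def wlu(vehicle, assignment):
    """Wonderful-life (marginal-contribution) utility: the global utility minus
    the global utility with this vehicle removed. Such utilities are aligned:
    a unilateral change that raises a vehicle's WLU raises the global utility
    by exactly the same amount."""
    without = {v: t for v, t in assignment.items() if v != vehicle}
    return global_utility(assignment) - global_utility(without)

assignment = {"v1": "t1", "v2": "t1"}
# v2 duplicates v1's target, so its marginal contribution is zero.
print(wlu("v2", assignment))  # 0.0
assignment["v2"] = "t2"
print(wlu("v2", assignment))  # 2.0
```

The local objective is fully computable from the vehicle's own assignment plus the covered-target set, yet improving it can never decrease the global utility, which is the alignment property the abstract emphasizes.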
Revisiting Log-Linear Learning: Asynchrony, Completeness and Payoff-Based Implementation
2008
Abstract

Cited by 42 (11 self)
Log-linear learning is a learning algorithm with equilibrium selection properties. Log-linear learning provides guarantees on the percentage of time that the joint action profile will be at a potential maximizer in potential games. The traditional analysis of log-linear learning has centered around explicitly computing the stationary distribution. This analysis relied on a highly structured setting: (i) players' utility functions constitute a potential game; (ii) players update their strategies one at a time, which we refer to as asynchrony; (iii) at any stage, a player can select any action in the action set, which we refer to as completeness; and (iv) each player is endowed with the ability to assess the utility he would have received for any alternative action provided that the actions of all other players remain fixed. Since the appeal of log-linear learning is not solely the explicit form of the stationary distribution, we seek to address to what degree one can relax the structural assumptions while maintaining that only potential function maximizers are the stochastically stable action profiles. In this paper, we introduce slight variants of log-linear learning to include both synchronous updates and incomplete action sets. In both settings, we prove that only potential function maximizers are stochastically stable. Furthermore, we introduce a payoff-based version of log-linear learning, in which players are only aware of the utility they received and the action that they played. Note that log-linear learning in its original form is not a payoff-based learning algorithm. In payoff-based log-linear learning, we also prove that only potential maximizers are stochastically stable. The key enabler for these results is to change the focus of the analysis away from deriving the explicit form of the stationary distribution of the learning process towards characterizing the stochastically stable states. The resulting analysis uses the theory of resistance trees for regular perturbed Markov decision processes, thereby allowing a relaxation of the aforementioned structural assumptions.
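The learning rule discussed in this abstract can be sketched as follows. This is a simplified binary-comparison variant on a toy two-player coordination game; the game, the temperature, and all names are our own illustration, not the paper's implementation.

```python
import math
import random

def log_linear_step(utility, action_set, joint, player, temperature=0.2):
    """One step of a binary variant of log-linear learning (a sketch under our
    own simplifications): the updating player compares its current action with
    one uniformly drawn trial action and switches with Boltzmann probability.
    As temperature -> 0, potential maximizers become stochastically stable."""
    current = joint[player]
    trial = random.choice(action_set)
    u_cur = utility(player, joint)
    u_try = utility(player, {**joint, player: trial})
    p_try = math.exp(u_try / temperature) / (
        math.exp(u_try / temperature) + math.exp(u_cur / temperature)
    )
    return {**joint, player: trial if random.random() < p_try else current}

def u(player, joint):
    # toy identical-interest coordination game; the potential maximizer is (1, 1)
    a, b = joint["p1"], joint["p2"]
    return 1.0 if a == b == 1 else (0.5 if a == b == 0 else 0.0)

random.seed(0)
joint = {"p1": 0, "p2": 0}
hits = 0
for step in range(5000):
    updater = "p1" if step % 2 == 0 else "p2"   # asynchronous updates
    joint = log_linear_step(u, [0, 1], joint, updater)
    hits += joint == {"p1": 1, "p2": 1}
frac = hits / 5000
# frac is close to 1: play concentrates on the potential maximizer (1, 1),
# even though the inferior coordination point (0, 0) is also a Nash equilibrium
```

Note the equilibrium-selection behavior: ordinary best-reply dynamics could lock onto the inferior equilibrium (0, 0), whereas the noisy Boltzmann choice lets play escape it and spend most of its time at the potential maximizer.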
Connections Between Cooperative Control and Potential Games Illustrated on the Consensus Problem
2007
Abstract

Cited by 36 (12 self)
This paper presents a view of cooperative control using the language of learning in games. We review the game-theoretic concepts of potential games and weakly acyclic games and demonstrate how the specific cooperative control problem of consensus can be formulated in these settings. Motivated by this connection, we build upon game-theoretic concepts to better accommodate a broader class of cooperative control problems. In particular, we introduce sometimes weakly acyclic games for time-varying objective functions and action sets, and provide distributed algorithms for convergence to an equilibrium. Finally, we illustrate how to implement these algorithms for the consensus problem in a variety of settings, most notably, in an environment with non-convex obstructions.
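The consensus-as-potential-game connection can be illustrated by checking the defining potential-game property on a toy graph. This is our own construction for illustration, not the paper's exact formulation.

```python
# Sketch: consensus on a graph as an exact potential game. Each agent's utility
# is its negative disagreement with its neighbors; the potential is the negative
# disagreement summed over edges, so a unilateral move changes the mover's
# utility and the potential by the same amount. Graph and positions are toy data.
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}  # a path graph a - b - c

def utility(agent, positions):
    return -sum(abs(positions[agent] - positions[n]) for n in NEIGHBORS[agent])

def potential(positions):
    # each undirected edge counted exactly once
    edges = {tuple(sorted((u, v))) for u in NEIGHBORS for v in NEIGHBORS[u]}
    return -sum(abs(positions[u] - positions[v]) for u, v in edges)

before = {"a": 0, "b": 4, "c": 1}
after = dict(before, b=1)   # agent b moves unilaterally toward its neighbors
du = utility("b", after) - utility("b", before)
dphi = potential(after) - potential(before)
assert du == dphi  # the defining property of an exact potential game
```

Because the potential is (up to sign) the total disagreement, any learning dynamic that climbs the potential drives the group toward consensus, which is what licenses applying potential-game algorithms to this cooperative control problem.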
Payoff-based dynamics for multi-player weakly acyclic games
SIAM J. CONTROL OPT, 2009
Abstract

Cited by 33 (12 self)
We consider repeated multi-player games in which players repeatedly and simultaneously choose strategies from a finite set of available strategies according to some strategy adjustment process. We focus on the specific class of weakly acyclic games, which is particularly relevant for multi-agent cooperative control problems. A strategy adjustment process determines how players select their strategies at any stage as a function of the information gathered over previous stages. Of particular interest are “payoff-based” processes in which, at any stage, players know only their own actions and (noise corrupted) payoffs from previous stages. In particular, players do not know the actions taken by other players and do not know the structural form of payoff functions. We introduce three different payoff-based processes for increasingly general scenarios and prove that, after a sufficiently large number of stages, player actions constitute a Nash equilibrium at any stage with arbitrarily high probability. We also show how to modify player utility functions through tolls and incentives in so-called congestion games, a special class of weakly acyclic games, to guarantee that a centralized objective can be realized as a Nash equilibrium. We illustrate the methods with a simulation of distributed routing over a network.
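The flavor of a payoff-based process can be sketched with a simplified experimentation rule in which players observe only their own actions and realized payoffs. The rule, its parameters, and the test game are our own simplification, not one of the paper's three processes verbatim.

```python
import random

def safe_experimentation(payoff, action_sets, T=5000, eps=0.1, seed=1):
    """Payoff-based play (simplified sketch): each player keeps a baseline
    action and its best payoff seen so far, occasionally experiments with a
    random action, and adopts the played action as the new baseline only if
    the received payoff strictly improves on the baseline payoff. Players
    never observe others' actions or the payoff functions themselves."""
    random.seed(seed)
    players = list(action_sets)
    baseline = {p: random.choice(action_sets[p]) for p in players}
    base_pay = {p: payoff(p, baseline) for p in players}
    for _ in range(T):
        joint = {}
        for p in players:
            if random.random() < eps:                 # explore
                joint[p] = random.choice(action_sets[p])
            else:                                     # exploit the baseline
                joint[p] = baseline[p]
        for p in players:
            received = payoff(p, joint)
            if received > base_pay[p]:                # keep only improvements
                baseline[p], base_pay[p] = joint[p], received
    return baseline

def team_payoff(player, joint):
    # toy identical-interest game: payoff 1 only when both players pick 1
    return 1.0 if all(a == 1 for a in joint.values()) else 0.0

result = safe_experimentation(team_payoff, {"p1": [0, 1], "p2": [0, 1]})
# with high probability the baselines settle on the efficient profile (1, 1)
```

The point of the sketch is informational, not algorithmic sophistication: each player's update uses only its own played action and received payoff, matching the "payoff-based" setting of the abstract.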
Cooperative control and potential games
IEEE Trans. Syst., Man, Cybern. B, 2009
Abstract

Cited by 32 (6 self)
We present a view of cooperative control using the language of learning in games. We review the game-theoretic concepts of potential and weakly acyclic games, and demonstrate how several cooperative control problems, such as consensus and dynamic sensor coverage, can be formulated in these settings. Motivated by this connection, we build upon game-theoretic concepts to better accommodate a broader class of cooperative control problems. In particular, we extend existing learning algorithms to accommodate restricted action sets caused by the limitations of agent capabilities and group-based decision making. Furthermore, we also introduce a new class of games called sometimes weakly acyclic games for time-varying objective functions and action sets, and provide distributed algorithms for convergence to an equilibrium.
Index Terms—Cooperative control, game theory, learning in games, multi-agent systems.
Payoff-Based Dynamics for Multi-Player Weakly Acyclic Games
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, SPECIAL ISSUE ON CONTROL AND OPTIMIZATION IN COOPERATIVE NETWORKS, 2007
Abstract

Cited by 28 (15 self)
We consider repeated multi-player games in which players repeatedly and simultaneously choose strategies from a finite set of available strategies according to some strategy adjustment process. We focus on the specific class of weakly acyclic games, which is particularly relevant for multi-agent cooperative control problems. A strategy adjustment process determines how players select their strategies at any stage as a function of the information gathered over previous stages. Of particular interest are “payoff-based” processes, in which, at any stage, players only know their own actions and (noise corrupted) payoffs from previous stages. In particular, players do not know the actions taken by other players and do not know the structural form of payoff functions. We introduce three different payoff-based processes for increasingly general scenarios and prove that after a sufficiently large number of stages, player actions constitute a Nash equilibrium at any stage with arbitrarily high probability. We also show how to modify player utility functions through tolls and incentives in so-called congestion games, a special class of weakly acyclic games, to guarantee that a centralized objective can be realized as a Nash equilibrium. We illustrate the methods with a simulation of distributed routing over a network.
Distributed Welfare Games
Abstract

Cited by 20 (7 self)
We consider a variation of the resource allocation problem. In the traditional problem, there is a global planner who would like to assign a set of players to a set of resources so as to maximize welfare. We consider the situation where the global planner does not have the authority to assign players to resources; rather, players are self-interested. The question that emerges is how the global planner can entice the players to settle on a desirable allocation with respect to the global welfare. To study this question, we focus on a class of games that we refer to as distributed welfare games. Within this context, we investigate how the global planner should distribute the welfare to the players. We measure the efficacy of a distribution rule in two ways: (i) Does a pure Nash equilibrium exist? (ii) How does the welfare associated with a pure Nash equilibrium compare to the global welfare associated with the optimal allocation? In this paper we explore the applicability of cost sharing methodologies for distributing welfare in such resource allocation problems. We demonstrate that obtaining desirable distribution rules, such as distribution rules that are budget balanced and guarantee the existence of a pure Nash equilibrium, often comes at a significant informational and computational cost. In light of this, we derive a systematic procedure for designing desirable distribution rules with a minimal informational and computational cost for a special class of distributed welfare games. Furthermore, we derive a bound on the price of anarchy for distributed welfare games in a variety of settings. Lastly, we highlight the implications of these results using the problem of sensor coverage.
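The tension between budget balance and equilibrium guarantees can be sketched by comparing two standard distribution rules on a toy coverage instance. Resource names and welfare values are illustrative, not from the paper.

```python
# Toy distributed-welfare instance: the welfare of a resource is earned once
# if at least one player selects it (a sensor-coverage-style welfare function).
RESOURCE_WELFARE = {"r1": 6.0, "r2": 2.0}

def welfare(allocation):
    return sum(RESOURCE_WELFARE[r] for r in set(allocation.values()))

def equal_share(player, allocation):
    """Budget balanced: each resource's welfare is split evenly among the
    players covering it, so the shares always sum to the welfare generated."""
    r = allocation[player]
    covering = [p for p in allocation if allocation[p] == r]
    return RESOURCE_WELFARE[r] / len(covering)

def marginal_contribution(player, allocation):
    """Guarantees a potential game (hence a pure Nash equilibrium exists),
    but is not budget balanced in general."""
    rest = {p: r for p, r in allocation.items() if p != player}
    return welfare(allocation) - welfare(rest)

alloc = {"p1": "r1", "p2": "r1"}
# Equal share distributes exactly the welfare generated (budget balanced) ...
assert sum(equal_share(p, alloc) for p in alloc) == welfare(alloc)
# ... while marginal contribution may pay out less: redundant coverage earns
# nothing here, so 0 of the 6.0 welfare is distributed.
assert sum(marginal_contribution(p, alloc) for p in alloc) == 0.0
```

This toy instance makes the abstract's trade-off concrete: one rule satisfies budget balance, the other buys equilibrium structure, and the two properties pull in different directions.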
An Architectural View of Game Theoretic Control
Abstract

Cited by 16 (9 self)
Game-theoretic control is a promising new approach for distributed resource allocation. In this paper, we describe how game-theoretic control can be viewed as having an intrinsic layered architecture, which provides a modularization that simplifies the control design. We illustrate this architectural view by presenting details about one particular instantiation using potential games as an interface. This example serves to highlight the strengths and limitations of the proposed architecture while also illustrating the relationship between game-theoretic control and other existing approaches to distributed resource allocation.
Decoupling Coupled Constraints Through Utility Design
Abstract

Cited by 10 (2 self)
The central goal in multi-agent systems is to engineer a decision making architecture where agents make independent decisions in response to local information while ensuring that the emergent global behavior is desirable with respect to a given system-level objective. In many systems this control design is further complicated by coupled constraints on the agents' behavior. This paper seeks to address the design of such algorithms using the field of game theory. In particular, we derive a systematic methodology for designing local agent utility functions such that (i) all resulting pure Nash equilibria of the designed game optimize the given system-level objective and satisfy the given coupled constraint, and (ii) the resulting game possesses an inherent structure that can be exploited in distributed learning, e.g., potential games. Such developments would greatly simplify the control design by eliminating the need to explicitly consider the constraint. One key to this realization is introducing an estimate of the coupled constraint and incorporating exterior penalty functions and barrier functions into the design of the agents' utility functions.
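The exterior-penalty idea can be sketched as follows. The coupled constraint, the penalty weight, and the base utility are hypothetical stand-ins chosen for illustration, not the paper's design.

```python
# Sketch of utility design with an exterior penalty: augment each agent's base
# utility so that violating a coupled constraint sum(x) <= CAP is penalized.
# CAP, MU, and the base utility below are toy assumptions.
CAP = 10.0
MU = 100.0  # penalty weight, assumed large relative to the base utilities

def constraint_violation(actions):
    """Exterior penalty term: zero whenever the coupled constraint holds."""
    return max(0.0, sum(actions.values()) - CAP)

def penalized_utility(agent, actions, base_utility):
    """Because the penalty vanishes on the feasible set, feasible equilibria of
    the original design are preserved, while infeasible profiles become
    unattractive to every agent."""
    return base_utility(agent, actions) - MU * constraint_violation(actions)

base = lambda agent, actions: actions[agent]          # toy base utility
feasible = {"a": 4.0, "b": 5.0}                       # sum 9  <= CAP
infeasible = {"a": 4.0, "b": 8.0}                     # sum 12 >  CAP
assert penalized_utility("b", feasible, base) == 5.0            # no penalty
assert penalized_utility("b", infeasible, base) == 8.0 - MU * 2.0
```

This mirrors the abstract's point: once the penalty is folded into the utilities, agents can run unconstrained distributed learning, and the coupled constraint no longer needs to be handled explicitly by the algorithm.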