Results 1-10 of 39
Intrinsic Robustness of the Price of Anarchy
Cited by 55 (11 self)
The price of anarchy (POA) is a worst-case measure of the inefficiency of selfish behavior, defined as the ratio of the objective function value of a worst Nash equilibrium of a game and that of an optimal outcome. This measure implicitly assumes that players successfully reach some Nash equilibrium. This drawback motivates the search for inefficiency bounds that apply more generally to weaker notions of equilibria, such as mixed Nash and correlated equilibria; or to sequences of outcomes generated by natural experimentation strategies, such as successive best responses or simultaneous regret-minimization. We prove a general and fundamental connection between the price of anarchy and its seemingly stronger relatives in classes of games with a sum objective. First, we identify a “canonical sufficient condition” for an upper bound of the POA for pure Nash equilibria, which we call a smoothness argument. Second, we show that every bound derived via a smoothness argument extends automatically, with no quantitative degradation in the bound, to mixed Nash equilibria, correlated equilibria, and the average objective function value of regret-minimizing players (or “price of total anarchy”). Smoothness arguments also have automatic implications for the inefficiency of approximate and Bayesian-Nash equilibria and, under mild additional assumptions, for bicriteria bounds and for polynomial-length best-response sequences. We also identify classes of games (most notably, congestion games with cost functions restricted to an arbitrary fixed set) that are tight, in the sense that smoothness arguments are guaranteed to produce an optimal worst-case upper bound on the POA, even for the smallest set of interest (pure Nash equilibria). Byproducts of our proof of this result include the first tight bounds on the POA in congestion games with nonpolynomial cost functions, and the first
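For orientation, the smoothness condition this abstract refers to is commonly written as below for cost-minimization games with a sum objective. The listing does not give the formula, so the notation (λ, μ, s, s*, C_i) is assumed here, and the two-line derivation is only a sketch of why such a condition bounds the POA of pure Nash equilibria.

```latex
% Assumed notation: C_i is player i's cost and C(s) = \sum_i C_i(s) is the sum objective.
% A game is (\lambda,\mu)-smooth, with \lambda > 0 and \mu < 1, if for all outcomes s, s^*:
\[
  \sum_{i} C_i(s_i^*, s_{-i}) \;\le\; \lambda\, C(s^*) + \mu\, C(s).
\]
% For a pure Nash equilibrium s and an optimal outcome s^*, no player gains by
% deviating to s_i^*, which gives the first inequality; smoothness gives the second:
\[
  C(s) = \sum_i C_i(s) \;\le\; \sum_i C_i(s_i^*, s_{-i}) \;\le\; \lambda\, C(s^*) + \mu\, C(s)
  \quad\Longrightarrow\quad
  \frac{C(s)}{C(s^*)} \;\le\; \frac{\lambda}{1-\mu}.
\]
```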
Fast convergence to Wardrop equilibria by adaptive sampling methods
In Proc. 38th Ann. ACM Symp. on Theory of Comput. (STOC)
, 2006
Cited by 41 (6 self)
We study rerouting policies in a dynamic round-based variant of a well-known game-theoretic traffic model due to Wardrop. Previous analyses (mostly in the context of selfish routing) based on Wardrop’s model focus mostly on the static analysis of equilibria. In this paper, we ask the question whether the population of agents responsible for routing the traffic can jointly compute or better learn a Wardrop equilibrium efficiently. The rerouting policies that we study are of the following kind. In each round, each agent samples an alternative routing path and compares the latency on this path with its current latency. If the agent observes that it can improve its latency then it switches with some probability depending on the possible improvement to the better path. We can show various positive results based on a rerouting policy using an adaptive sampling rule that implicitly amplifies paths that carry a large amount of traffic in the Wardrop equilibrium. For general asymmetric games, we show that a simple replication protocol in which agents adopt strategies of more successful agents reaches a certain kind of bicriteria equilibrium within a time bound that is independent of the size and the structure of the network but only depends on a parameter of the latency functions, that we call the relative slope. For symmetric games, this result has an intuitive interpretation: Replication approximately satisfies almost everyone very quickly. In order to achieve convergence to a Wardrop equilibrium besides replication one also needs an exploration component discovering possibly unused strategies. We present a
Joint Strategy Fictitious Play with Inertia for Potential Games
2005
Cited by 38 (21 self)
We consider finite multiplayer repeated games involving a large number of players with large strategy spaces and enmeshed utility structures. In these “large-scale” games, players are inherently faced with limitations in both their observational and computational capabilities. Accordingly, players in large-scale games need to make their decisions using algorithms that accommodate limitations in information gathering and processing. A motivating example is a congestion game in a complex transportation system, in which a large number of vehicles make daily routing decisions to optimize their own objectives in response to their observations. In this setting, observing and responding to the individual actions of all vehicles on a daily basis would be a formidable task for any individual driver. This disqualifies some of the well-known decision-making models such as “Fictitious Play” (FP) as
Distributed selfish load balancing
2006
Cited by 29 (1 self)
Suppose that a set of m tasks are to be shared as equally as possible amongst a set of n resources. A game-theoretic mechanism to find a suitable allocation is to associate each task with a “selfish agent”, and require each agent to select a resource, with the cost of a resource being the number of agents to select it. Agents would then be expected to migrate from overloaded to underloaded resources, until the allocation becomes balanced. Recent work has studied the question of how this can take place within a distributed setting in which agents migrate selfishly without any centralized control. In this paper we discuss a natural protocol for the agents which combines the following desirable features: It can be implemented in a strongly distributed setting, uses no central control, and has good convergence properties. For m ≫ n, the system becomes approximately balanced (an ε-Nash equilibrium) in expected time O(log log m). We show using a martingale technique that the process converges to a perfectly balanced allocation in expected time O(log log m + n^4). We also give a lower bound of Ω(max{log log m, n}) for the convergence time.
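The abstract does not spell out the migration protocol, so the following is only a sketch of one protocol in this spirit. It assumes that, in each synchronous round, every task samples a resource uniformly at random and moves there with probability 1 - load(candidate)/load(current) when the candidate is strictly less loaded. The function names and the default parameters are hypothetical.

```python
import random

def migration_round(assignment, n_resources, rng):
    """Run one synchronous round; returns the new task -> resource assignment."""
    # Loads are measured once, at the start of the round (no central control
    # is needed beyond each task reading two load values).
    loads = [0] * n_resources
    for resource in assignment:
        loads[resource] += 1
    new_assignment = []
    for current in assignment:
        candidate = rng.randrange(n_resources)
        # Assumed rule: move only to a strictly less loaded resource,
        # with probability 1 - load(candidate) / load(current).
        if loads[candidate] < loads[current]:
            if rng.random() < 1.0 - loads[candidate] / loads[current]:
                current = candidate
        new_assignment.append(current)
    return new_assignment

def simulate(m_tasks=1000, n_resources=10, rounds=20, seed=0):
    """Start with every task on resource 0 and return the final load vector."""
    rng = random.Random(seed)
    assignment = [0] * m_tasks
    for _ in range(rounds):
        assignment = migration_round(assignment, n_resources, rng)
    return [assignment.count(r) for r in range(n_resources)]
```

Calling simulate() returns the per-resource loads after 20 rounds; printing max(loads) - min(loads) for increasing round counts gives a rough picture of how quickly such a rule balances the system.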
Revisiting Log-Linear Learning: Asynchrony, Completeness and Payoff-Based Implementation
2008
Cited by 25 (10 self)
Log-linear learning is a learning algorithm with equilibrium selection properties. Log-linear learning provides guarantees on the percentage of time that the joint action profile will be at a potential maximizer in potential games. The traditional analysis of log-linear learning has centered around explicitly computing the stationary distribution. This analysis relied on a highly structured setting: i) players' utility functions constitute a potential game, ii) players update their strategies one at a time, which we refer to as asynchrony, iii) at any stage, a player can select any action in the action set, which we refer to as completeness, and iv) each player is endowed with the ability to assess the utility he would have received for any alternative action provided that the actions of all other players remain fixed. Since the appeal of log-linear learning is not solely the explicit form of the stationary distribution, we seek to address to what degree one can relax the structural assumptions while maintaining that only potential function maximizers are the stochastically stable action profiles. In this paper, we introduce slight variants of log-linear learning to include both synchronous updates and incomplete action sets. In both settings, we prove that only potential function maximizers are stochastically stable. Furthermore, we introduce a payoff-based version of log-linear learning, in which players are only aware of the utility they received and the action that they played. Note that log-linear learning in its original form is not a payoff-based learning algorithm. In payoff-based log-linear learning, we also prove that only potential maximizers are stochastically stable. The key enabler for these results is to change the focus of the analysis away from deriving the explicit form of the stationary distribution of the learning process towards characterizing the stochastically stable states. The resulting analysis uses the theory of resistance trees for regular perturbed Markov decision processes, thereby allowing a relaxation of the aforementioned structural assumptions.
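A minimal sketch of the standard (asynchronous, complete-action-set) form of log-linear learning described in conditions i) to iv) may help make the setting concrete. The 2x2 identical-interest game, the temperature value, and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
import math
import random

# Identical-interest 2x2 coordination game: both players receive
# PAYOFF[a0][a1], so the common payoff is itself a potential function.
# The potential maximizer is the profile (0, 0).
PAYOFF = [[2.0, 0.0],
          [0.0, 1.0]]

def utility(profile):
    """Common utility of both players at the given joint action profile."""
    return PAYOFF[profile[0]][profile[1]]

def log_linear_step(profile, temperature, rng):
    """One asynchronous update: a single random player resamples its action."""
    player = rng.randrange(2)
    weights = []
    for action in (0, 1):
        trial = list(profile)
        trial[player] = action
        # The player evaluates every alternative action with the other
        # player's action held fixed (condition iv in the abstract).
        weights.append(math.exp(utility(trial) / temperature))
    updated = list(profile)
    updated[player] = rng.choices((0, 1), weights=weights)[0]
    return tuple(updated)

def fraction_at_maximizer(steps=20000, temperature=0.25, seed=0):
    """Fraction of steps spent at (0, 0), starting from the worse equilibrium."""
    rng = random.Random(seed)
    profile = (1, 1)
    hits = 0
    for _ in range(steps):
        profile = log_linear_step(profile, temperature, rng)
        hits += profile == (0, 0)
    return hits / steps
```

Lowering the temperature in this sketch concentrates play more heavily on the potential maximizer, at the price of slower escape from the inferior equilibrium (1, 1).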
Regret based dynamics: Convergence in weakly acyclic games
In Proceedings of the 2007 International Conference on Autonomous Agents and Multiagent Systems (AAMAS)
, 2007
Cited by 19 (9 self)
Regret-based algorithms have been proposed to control a wide variety of multiagent systems. The appeal of regret-based algorithms is that (1) these algorithms are easily implementable in large-scale multiagent systems and (2) there are existing results proving that the behavior will asymptotically converge to a set of points of “no-regret” in any game. We illustrate, through a simple example, that no-regret points need not reflect desirable operating conditions for a multiagent system. Multiagent systems often exhibit an additional structure (i.e. being “weakly acyclic”) that has not been exploited in the context of regret-based algorithms. In this paper, we introduce a modification of regret-based algorithms by (1) exponentially discounting the memory and (2) bringing in a notion of inertia in players' decision process. We show how these modifications can lead to an entire class of regret-based algorithms that provide almost sure convergence to a pure Nash equilibrium in any weakly acyclic game.
Multiplicative Updates Outperform Generic No-Regret . . .
2009
Cited by 15 (4 self)
We study the outcome of natural learning algorithms in atomic congestion games. Atomic congestion games have a wide variety of equilibria often with vastly differing social costs. We show that in almost all such games, the well-known multiplicative-weights learning algorithm results in convergence to pure equilibria. Our results show that natural learning behavior can avoid bad outcomes predicted by the price of anarchy in atomic congestion games such as the load-balancing game introduced by Koutsoupias and Papadimitriou, which has superconstant price of anarchy and has correlated equilibria that are exponentially worse than any mixed Nash equilibrium. Our results identify a set of mixed Nash equilibria that we call weakly stable equilibria. Our notion of weakly stable is defined game-theoretically, but we show that this property holds whenever a stability criterion from the theory of dynamical systems is satisfied. This allows us to show that in every congestion game, the distribution of play converges to the set of weakly stable equilibria. Pure Nash equilibria are weakly stable, and we show using techniques from algebraic geometry that the converse is true with probability 1 when congestion costs are selected at random independently on each edge (from any monotonically parametrized distribution). We further extend our results to show that players can use algorithms with different (sufficiently small) learning rates, i.e. they can trade off convergence speed and long term average regret differently.
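To illustrate the kind of dynamics this abstract describes, here is a small sketch of a multiplicative-weights update in a two-player, two-machine load-balancing game. The abstract does not state the paper's exact update rule or information model, so the form used here (each player reweights by (1 - epsilon) raised to the expected cost of each machine) and every name and parameter value are assumptions.

```python
def mw_update(prob_a, opponent_prob_a, epsilon):
    """Update one player's probability of machine A given the opponent's mix."""
    # Expected load seen on a machine = own job + opponent's expected presence.
    cost_a = 1.0 + opponent_prob_a
    cost_b = 1.0 + (1.0 - opponent_prob_a)
    # Multiplicative-weights step: penalize each machine by its expected cost,
    # then renormalize so the two probabilities sum to 1.
    weight_a = prob_a * (1.0 - epsilon) ** cost_a
    weight_b = (1.0 - prob_a) * (1.0 - epsilon) ** cost_b
    return weight_a / (weight_a + weight_b)

def run(p=0.4, q=0.6, epsilon=0.1, rounds=500):
    """Simultaneous updates; p and q are the players' probabilities of machine A."""
    for _ in range(rounds):
        # The right-hand side is evaluated with the old p and q.
        p, q = mw_update(p, q, epsilon), mw_update(q, p, epsilon)
    return p, q
```

From a slightly asymmetric start the two players separate onto different machines, which is a pure Nash equilibrium of this game, while the exactly symmetric start (0.5, 0.5) is a fixed point of the update and stays at the mixed equilibrium.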
No-Regret Learning and a Mechanism for Distributed Multiagent Planning
2008
Cited by 10 (6 self)
We develop a novel mechanism for coordinated, distributed multiagent planning. We consider problems stated as a collection of single-agent planning problems coupled by common soft constraints on resource consumption. (Resources may be real or fictitious, the latter introduced as a tool for factoring the problem.) A key idea is to recast the distributed planning problem as learning in a repeated game between the original agents and a newly introduced group of adversarial agents who influence prices for the resources. The adversarial agents benefit from arbitrage: that is, their incentive is to uncover violations of the resource usage constraints and, by selfishly adjusting prices, encourage the original agents to avoid plans that cause such violations. If all agents employ no-regret learning algorithms in the course of this repeated interaction, we are able to show that our mechanism can achieve design goals such as social optimality (efficiency), budget balance, and Nash-equilibrium convergence to within an error which approaches zero as the agents gain experience. In particular, the agents’ average plans converge to a socially optimal solution for the original planning task. We present experiments in a simulated network routing domain demonstrating our method’s ability to reliably generate sound plans.
Weighted congestion games: Price of anarchy, universal worst-case examples, and tightness
In Proceedings of the 18th Annual European Symposium on Algorithms (ESA)
, 2010
Cited by 8 (4 self)
We characterize the price of anarchy in weighted congestion games, as a function of the allowable resource cost functions. Our results provide as thorough an understanding of this quantity as is already known for nonatomic and unweighted congestion games, and take the form of universal (cost function-independent) worst-case examples. One noteworthy byproduct of our proofs is the fact that weighted congestion games are “tight”, which implies that the worst-case price of anarchy with respect to pure Nash, mixed Nash, correlated, and coarse correlated equilibria are always equal (under mild conditions on the allowable cost functions). Another is the fact that, like nonatomic but unlike atomic (unweighted) congestion games, weighted congestion games with trivial structure already realize the worst-case POA, at least for polynomial cost functions. We also prove a new result about unweighted congestion games: the worst-case price of anarchy in symmetric games is, as the number of players goes to infinity, as large as in their more general asymmetric counterparts.