Results 1–10 of 13
Intrinsic Robustness of the Price of Anarchy
Cited by 56 (11 self)
The price of anarchy (POA) is a worst-case measure of the inefficiency of selfish behavior, defined as the ratio between the objective function value of a worst Nash equilibrium of a game and that of an optimal outcome. This measure implicitly assumes that players successfully reach some Nash equilibrium. This drawback motivates the search for inefficiency bounds that apply more generally to weaker notions of equilibria, such as mixed Nash and correlated equilibria, or to sequences of outcomes generated by natural experimentation strategies, such as successive best responses or simultaneous regret-minimization. We prove a general and fundamental connection between the price of anarchy and its seemingly stronger relatives in classes of games with a sum objective. First, we identify a “canonical sufficient condition” for an upper bound on the POA for pure Nash equilibria, which we call a smoothness argument. Second, we show that every bound derived via a smoothness argument extends automatically, with no quantitative degradation in the bound, to mixed Nash equilibria, correlated equilibria, and the average objective function value of regret-minimizing players (the “price of total anarchy”). Smoothness arguments also have automatic implications for the inefficiency of approximate and Bayesian-Nash equilibria and, under mild additional assumptions, for bicriteria bounds and for polynomial-length best-response sequences. We also identify classes of games — most notably, congestion games with cost functions restricted to an arbitrary fixed set — that are tight, in the sense that smoothness arguments are guaranteed to produce an optimal worst-case upper bound on the POA, even for the smallest set of interest (pure Nash equilibria). Byproducts of our proof of this result include the first tight bounds on the POA in congestion games with non-polynomial cost functions, and the first ...
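As a concrete statement of the condition described in this abstract (our notation, following the standard presentation): a cost-minimization game with social cost C(s) = Σ_i C_i(s) is (λ, μ)-smooth, for λ > 0 and μ < 1, if for every pair of outcomes s and s*,

```latex
\sum_{i=1}^{n} C_i(s_i^{\ast}, s_{-i}) \;\le\; \lambda\, C(s^{\ast}) + \mu\, C(s).
```

At a pure Nash equilibrium s, each player's cost satisfies C_i(s) ≤ C_i(s_i*, s_{-i}); summing over players and applying the inequality gives C(s) ≤ λ C(s*) + μ C(s), hence C(s)/C(s*) ≤ λ/(1 − μ). This ratio λ/(1 − μ) is the smoothness bound on the POA that, per the abstract, carries over to the weaker equilibrium notions.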
Routing without regret: On convergence to Nash equilibria of regret-minimizing algorithms in routing games
In PODC, 2006
Cited by 47 (6 self)
There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems, including online routing. These are adaptive strategies an individual can use that give strong guarantees on performance even in adversarially changing environments. There has also been substantial work on analyzing properties of Nash equilibria in routing games. In this paper, we consider the question: if each player in a routing game uses a no-regret strategy, will behavior converge to a Nash equilibrium? In general games the answer to this question is known to be no in a strong sense, but routing games have substantially more structure. In this paper we show that in the Wardrop setting of multicommodity flow and infinitesimal agents, behavior will approach Nash equilibrium (formally, on most days, the cost of the flow will be close to the cost of the cheapest paths possible given that flow) at a rate that depends polynomially on the players' regret bounds and the maximum slope of any latency function. We also show that price-of-anarchy results may be applied to these approximate equilibria, and also consider the finite-size (non-infinitesimal) load-balancing model of Azar [2].
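The no-regret property referenced here can be illustrated with a minimal multiplicative-weights (Hedge) sketch over a fixed set of routes. The function name, loss matrix, and learning rate below are our illustrative choices, not the paper's algorithm:

```python
import math

def multiplicative_weights(loss_rows, eta=0.1):
    """Hedge / multiplicative-weights over a fixed action set.

    loss_rows: one loss vector per round, entries in [0, 1].
    Returns (algorithm's expected cumulative loss,
             cumulative loss of the best fixed action in hindsight).
    """
    k = len(loss_rows[0])
    weights = [1.0] * k
    alg_loss = 0.0
    for losses in loss_rows:
        total = sum(weights)
        probs = [w / total for w in weights]   # play action j with prob probs[j]
        alg_loss += sum(p * l for p, l in zip(probs, losses))
        # exponentially down-weight actions that incurred loss this round
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    best_fixed = min(sum(row[j] for row in loss_rows) for j in range(k))
    return alg_loss, best_fixed
```

The regret, alg_loss − best_fixed, is O(ln k / η + ηT) over T rounds, so the per-round average regret vanishes as T grows; this is the sense in which such strategies give guarantees even against adversarially changing losses.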
Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret
Cited by 10 (0 self)
Abstract—The problem of distributed learning and channel access is considered in a cognitive network with multiple secondary users. The availability statistics of the channels are initially unknown to the secondary users and are estimated using sensing decisions. There is no explicit information exchange or prior agreement among the secondary users. We propose policies for distributed learning and access which achieve order-optimal cognitive system throughput (number of successful secondary transmissions) under self-play, i.e., when implemented at all the secondary users. Equivalently, our policies minimize the regret in distributed learning and access. We first consider the scenario when the number of secondary users is known to the policy, and prove that the total regret is logarithmic in the number of transmission slots. Our distributed learning and access policy achieves order-optimal regret by comparison to an asymptotic lower bound for regret under any uniformly good learning and access policy. We then consider the case when the number of secondary users is fixed but unknown, and is estimated through feedback. We propose a policy in this scenario whose asymptotic sum regret grows slightly faster than logarithmically in the number of transmission slots. Index Terms—Cognitive medium access control, multi-armed bandits, distributed algorithms, logarithmic regret.
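The logarithmic-regret rate referenced here is the classical multi-armed-bandit rate. A minimal single-user UCB1 sketch on Bernoulli "channels" (a much-simplified, centralized analogue of the paper's distributed policies; all names and parameters below are ours) looks like:

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """UCB1 index policy on Bernoulli arms (e.g., channel availabilities).
    means holds the true, unknown success probabilities; the policy only
    sees sampled rewards. Returns realized regret vs. the best arm."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    reward = 0.0
    for t in range(horizon):
        if t < k:
            arm = t          # pull each arm once to initialize the estimates
        else:
            # empirical mean plus an exploration bonus that shrinks as an
            # arm is sampled more often, yielding O(log T) regret
            arm = max(range(k), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t + 1) / counts[a]))
        x = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += x
        reward += x
    return horizon * max(means) - reward
```

Since the exploration bonus forces each suboptimal arm to be sampled only O(log T) times, the regret grows logarithmically in the horizon, matching the order of the guarantee the abstract claims for the (harder) distributed setting.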
Circumventing the Price of Anarchy: Leading Dynamics to Good Behavior
Cited by 10 (4 self)
Abstract: Many natural games can have a dramatic difference between the quality of their best and worst Nash equilibria, even in pure strategies. Yet, nearly all work to date on dynamics shows only convergence to some equilibrium, especially within a polynomial number of steps. In this work we study how agents with some knowledge of the game might be able to quickly (within a polynomial number of steps) find their way to states of quality close to the best equilibrium. We consider two natural learning models in which players choose between greedy behavior and following a proposed good but untrusted strategy, and analyze two important classes of games in this context: fair cost-sharing and consensus games. Both games have extremely high Price of Anarchy, and yet we show that behavior in these models can efficiently reach low-cost states. Keywords: Dynamics in Games, Price of Anarchy, Price of Stability, Cost-sharing games, Consensus games, Learning from untrusted experts
On the Inefficiency Ratio of Stable Equilibria in Congestion Games
Cited by 5 (0 self)
Price of anarchy and price of stability are the primary notions for measuring the efficiency (i.e., the social welfare) of the outcome of a game. Both of these notions focus on extreme cases: one is defined as the inefficiency ratio of the worst-case equilibrium and the other as that of the best one. Therefore, studying these notions often results in discovering equilibria that are not necessarily the most likely outcomes of the dynamics of selfish and non-coordinating agents. The current paper studies the inefficiency of the equilibria that are most stable in the presence of noise. In particular, we study two variations of non-cooperative games: atomic congestion games and selfish load balancing. The noisy best-response dynamics in these games keeps the joint action profile around a particular set of equilibria that minimize the potential function. The inefficiency ratio in the neighborhood of these “stable” equilibria is much better than the price of anarchy. Furthermore, the dynamics reaches these equilibria in polynomial time. Our observations show that in game environments where a small amount of noise is present, the system as a whole works better than what a pessimist may predict. They also suggest that in congestion games, introducing a small noise in the payoffs of the agents may improve the social welfare.
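Noisy best response can be sketched with logit dynamics in a toy two-machine load-balancing game: at low noise (large β) the dynamics concentrates on the balanced assignments, which minimize the potential. The game, parameters, and function below are our illustration, not the paper's model:

```python
import math
import random

def logit_dynamics(n_players=10, beta=4.0, steps=20000, seed=0):
    """Logit (noisy best-response) dynamics in a two-machine
    load-balancing game with linear latency: a player on a machine
    with load L pays L. Returns the fraction of steps spent in the
    balanced, potential-minimizing state."""
    rng = random.Random(seed)
    assign = [0] * n_players          # all players start on machine 0
    balanced = 0
    for _ in range(steps):
        i = rng.randrange(n_players)  # a random player revises her choice
        load0 = assign.count(0)
        loads = (load0, n_players - load0)
        # cost on machine m if player i were (or stayed) there
        costs = [loads[m] + (0 if assign[i] == m else 1) for m in (0, 1)]
        # noisy best response: choose machine m with prob ~ exp(-beta * cost_m)
        w0 = math.exp(-beta * costs[0])
        w1 = math.exp(-beta * costs[1])
        assign[i] = 0 if rng.random() < w0 / (w0 + w1) else 1
        if abs(assign.count(0) - assign.count(1)) <= 1:
            balanced += 1
    return balanced / steps
```

With β = 4 the chain still occasionally deviates, but it spends the large majority of its time in the balanced state, illustrating how small noise selects the potential minimizers among the equilibria.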
Load Balancing Without Regret in the Bulletin Board Model
Cited by 4 (2 self)
We analyze the performance of protocols for load balancing in distributed systems based on no-regret algorithms from online learning theory. These protocols treat load balancing as a repeated game and apply algorithms whose average performance over time is guaranteed to match or exceed the average performance of the best strategy in hindsight. Our approach captures two major aspects of distributed systems. First, in our setting of atomic load balancing, every single process can have a significant impact on the performance and behavior of the system. Second, although in distributed systems participants can query the current state of the system, they cannot reliably predict the effect of their actions on it. We address this issue by considering load-balancing games in the bulletin board model, where players can find out the delay on all machines, but do not have information on what their experienced delay would have been if they had selected another machine. We show that under these more realistic assumptions, if all players use the well-known multiplicative-weights algorithm, then the quality of the resulting solution is exponentially better than that of the worst correlated equilibrium, and almost as good as that of the worst Nash equilibrium. These tighter bounds are derived from analyzing the dynamics of a multi-agent learning system.
Near Optimality in Covering and Packing Games by Exposing Global Information, CoRR
Cited by 2 (2 self)
Covering and packing problems can be modeled as games to encapsulate interesting social and engineering settings. These games have a high Price of Anarchy in their natural formulation. However, existing research applicable to specific instances of these games has only been able to prove fast convergence to arbitrary equilibria. This paper studies general classes of covering and packing games with learning-dynamics models that incorporate a central authority who broadcasts weak, socially beneficial signals to agents that otherwise only use local information in their decision-making. Rather than illustrating convergence to an arbitrary equilibrium that may have very high social cost, we show that these systems quickly achieve near-optimal performance. In particular, we show that in the public service advertising model of [1], reaching a small constant fraction of the agents is enough to bring the system to a state within a log n factor of optimal in a broad class of set cover and set packing games, or a constant factor of optimal in the special cases of vertex cover and maximum independent set, circumventing the social inefficiency of bad local equilibria that could arise without a central authority. We extend these results to the learn-then-decide model of [2], in which agents use any of a broad class of learning algorithms to decide in a given round whether to behave according to locally optimal behavior or the behavior prescribed by the broadcast signal. The new techniques we use for analyzing these games could be of broader interest for analyzing more general classic optimization problems in a distributed fashion.
Discrete price updates yield fast convergence in ongoing markets with finite warehouses
2010
Cited by 1 (0 self)
This paper shows that in suitable markets, even with out-of-equilibrium trade allowed, a simple price update rule leads to rapid convergence toward the equilibrium. In particular, this paper considers a Fisher market repeated over an unbounded number of time steps, with the addition of finite-sized warehouses to enable non-equilibrium trade. The main result is that suitable tatonnement-style price updates lead to convergence in a significant subset of markets satisfying the Weak Gross Substitutes property. Throughout this process the warehouses are always able to store or meet demand imbalances (the needed capacity depends on the initial imbalances). Our price update rule is robust in a variety of regards:
• The updates for each good depend only on information about that good (its current price, its excess demand since its last update) and occur asynchronously from updates to other prices.
• The process is resilient to error in the excess-demand data.
• Likewise, the process is resilient to discreteness, i.e., a limit to divisibility, both of goods and money.
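A minimal sketch of such a per-good multiplicative price update, assuming Cobb-Douglas buyers (a simple market satisfying weak gross substitutes; the model, names, and step sizes are our illustration, not the paper's exact setting):

```python
def tatonnement(budgets, weights, lam=0.2, steps=200):
    """Tatonnement-style price updates in a Fisher market with
    Cobb-Douglas buyers. weights[i][j] (summing to 1 over j) is buyer
    i's spending share on good j; one unit of each good is supplied.
    The equilibrium price of good j is sum_i budgets[i] * weights[i][j]."""
    m = len(weights[0])
    prices = [1.0] * m
    for _ in range(steps):
        for j in range(m):  # per-good updates, independent of other goods
            demand = sum(b * w[j] / prices[j]
                         for b, w in zip(budgets, weights))
            excess = demand - 1.0        # supply is one unit per good
            # bounded multiplicative step keeps prices strictly positive
            prices[j] *= 1.0 + lam * max(-0.5, min(1.0, excess))
    return prices
```

Note that each good's update uses only that good's own price and excess demand, mirroring the asynchrony and locality properties listed in the abstract; in this toy market the update is a geometric contraction toward the equilibrium prices.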
Research Statement
The central theme of my research is to explore the impact of combining learning algorithms and game theory. Game theory attempts to mathematically capture behavior in strategic situations, in which an individual’s success depends on the choices of others. In practice, the interacting entities may be numerous and entangled via complex networks of interdependencies. Over the last decade, the prevalence of these issues has risen dramatically following a number of paradigm-shifting events, such as the cataclysmic rise of the Internet as a social networking tool and the painful realization of the extent of interconnectivity of the global economy, as well as the necessity of international cooperation for addressing global sustainability concerns. As a result, there has been a swift increase in interest in a more detailed, realistic, and quantitative understanding of such networked interactions. Algorithmic game theory (AGT) employs analytic tools from computer science, such as worst-case analysis and complexity theory, to characterize behavioral solutions to strategic situations prescribed by (classical) game theory. Research in this area tends to focus on one of the following challenges:
• Price of Anarchy: characterize the inefficiency of equilibria versus the global optimum.
• Algorithmic Mechanism Design: design games with desirable properties that are efficiently implementable.
• Computational Complexity of Equilibria.
Many strategic interactions are by their nature recurrent (e.g., financial markets), with the agents participating in them repeatedly. These agents learn over time to adapt to their environment as it is defined by the game and the dynamic behavior of the other agents. Understanding how agents can learn in the presence of other agents that are simultaneously learning constitutes a research problem that is as expansive as it is challenging. Such questions have fueled research endeavors both in economics and within computer science (e.g., multi-agent learning). My research interests concentrate on the intersection of algorithmic game theory and learning. Incorporating the rather natural assumption that agents learn to adapt to their environment can lead to exciting new insights into long-standing questions. For each subarea of AGT, I will present current results and directions for future research. My plan is to examine how far these implications reach.
Opportunistic Spectrum Access with Multiple Users: Learning under Competition
Abstract—The problem of cooperative allocation among multiple secondary users to maximize cognitive system throughput is considered. The channel availability statistics are initially unknown to the secondary users and are learnt via sensing samples. Two distributed learning and allocation schemes which maximize the cognitive system throughput, or equivalently minimize the total regret in distributed learning and allocation, are proposed. The first scheme assumes minimal prior information in terms of pre-allocated ranks for secondary users, while the second scheme is fully distributed and assumes no such prior information. The two schemes have sum regret which is provably logarithmic in the number of sensing time slots. A lower bound is derived for any learning scheme, which is asymptotically logarithmic in the number of slots. Hence, our schemes achieve asymptotic order optimality in terms of regret in distributed learning and allocation. Index Terms—Cognitive medium access, learning, multi-armed bandits, logarithmic regret, distributed algorithms.