## Routing without regret: On convergence to Nash equilibria of regret-minimizing algorithms in routing games (2006)

Venue: PODC

Citations: 48 (6 self)

### BibTeX

@INPROCEEDINGS{Blum06routingwithout,

author = {Avrim Blum and Eyal Even-Dar and Katrina Ligett},

title = {Routing without regret: On convergence to {N}ash equilibria of regret-minimizing algorithms in routing games},

booktitle = {PODC},

year = {2006},

pages = {45--52},

publisher = {ACM}

}

### Abstract

There has been substantial work developing simple, efficient no-regret algorithms for a wide class of repeated decision-making problems including online routing. These are adaptive strategies an individual can use that give strong guarantees on performance even in adversarially-changing environments. There has also been substantial work on analyzing properties of Nash equilibria in routing games. In this paper, we consider the question: if each player in a routing game uses a no-regret strategy, will behavior converge to a Nash equilibrium? In general games the answer to this question is known to be no in a strong sense, but routing games have substantially more structure. In this paper we show that in the Wardrop setting of multicommodity flow and infinitesimal agents, behavior will approach Nash equilibrium (formally, on most days, the cost of the flow will be close to the cost of the cheapest paths possible given that flow) at a rate that depends polynomially on the players' regret bounds and the maximum slope of any latency function. We also show that price-of-anarchy results may be applied to these approximate equilibria, and also consider the finite-size (non-infinitesimal) load-balancing model of Azar [2].
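The question posed in the abstract can be illustrated with a small simulation. The sketch below is not the paper's construction or proof. It assumes a hypothetical two-link network carrying one unit of flow, illustrative latency functions `2x` and `1 + x`, and a population of identical infinitesimal agents who all run the same Hedge-style exponential-weights update (one well-known no-regret algorithm), so that the population flow equals their common mixed strategy. The names `LATENCIES` and `hedge_routing`, the step size `eta`, and the number of steps are all assumptions made for illustration. The quantity tracked each day is the gap named in the abstract: the average cost of the flow minus the cost of the cheapest path given that flow.

```python
import math

# Hypothetical two-link network: one unit of flow from source to sink.
# Both latency functions are illustrative assumptions (continuous,
# nondecreasing, bounded slope); they are not taken from the paper.
LATENCIES = [
    lambda x: 2.0 * x,   # link 0: latency grows with its own load
    lambda x: 1.0 + x,   # link 1: fixed delay plus load-dependent delay
]


def hedge_routing(steps=2000, eta=0.1):
    """Simulate identical infinitesimal agents running a Hedge-style update.

    Returns the final flow split and the per-day gap between the average
    latency experienced and the cheapest link's latency under that flow.
    """
    n = len(LATENCIES)
    flow = [1.0 / n] * n          # start with the flow split evenly
    gaps = []
    for _ in range(steps):
        costs = [lat(f) for lat, f in zip(LATENCIES, flow)]
        avg_cost = sum(f * c for f, c in zip(flow, costs))
        gaps.append(avg_cost - min(costs))
        # Exponential-weights step: shift mass away from costly links.
        weights = [f * math.exp(-eta * c) for f, c in zip(flow, costs)]
        total = sum(weights)
        flow = [w / total for w in weights]
    return flow, gaps


if __name__ == "__main__":
    flow, gaps = hedge_routing()
    print("final flow split:", [round(f, 4) for f in flow])
    print("first-day gap:", gaps[0])
    print("last-day gap:", gaps[-1])
```

In this toy instance the two latencies are equal when two thirds of the flow uses link 0, and the simulated flow settles near that split. The paper's guarantee is stated for "most days" under arbitrary no-regret strategies; convergence on every late day here is a feature of this particular example and update rule, not something the sketch establishes in general.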

### Citations

2460 | A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting
- Freund, Schapire
- 1997
Citation Context ... best fixed strategy in hindsight—the “per time step regret”—is linear in the size of the game N. This was reduced to logarithmic in N in more recent exponential-weighting algorithms for this problem =-=[28, 8, 20]-=- (also known as the problem of “combining expert advice”). Most recently, a number of algorithms have been developed for achieving such guarantees in a computationally efficient manner in many setting... |

697 | The weighted majority algorithm
- Littlestone, Warmuth
- 1994
Citation Context ... to the best fixed strategy in hindsight— the “per time step regret”—is linear in the size of the game N. This was reduced to O(log N) in more recent exponential-weighting algorithms for this problem =-=[26, 7, 18]-=- (also called the problem of “combining expert advice”). Most recently, a number of algorithms have been developed for achieving such guarantees efficiently in many settings where the number of choice... |

660 | Worst-case equilibria
- Koutsoupias, Papadimitriou
- 1999
Citation Context ... revealed to the algorithm at the end of each time step t. Along a very different line of inquiry, there has also been much recent work on the price of anarchy in games. Koutsoupias and Papadimitriou =-=[25]-=- defined the price of anarchy, which is the ratio of the cost of an optimal global objective function to the cost of the worst Nash equilibrium. Many subsequent results have studied the price of anarc... |

541 | How bad is selfish routing
- Roughgarden, Tardos
- 2002
Citation Context ...scheduling to facility location to network creation games, and especially to problems of routing in the Wardrop model, where the cost of an edge is a function of the amount of traffic using that edge =-=[8, 9, 25, 32, 13]-=-. Such work implicitly assumes that selfish individual behavior results in Nash equilibria. In this work we consider the question: if all players in a routing game use no-regret algorithms to choose t... |

517 | Some theoretical aspects of road traffic research - Wardrop - 1952 |

321 | How to use expert advice
- Cesa-Bianchi, Freund, et al.
- 1997
Citation Context ... to the best fixed strategy in hindsight— the “per time step regret”—is linear in the size of the game N. This was reduced to O(log N) in more recent exponential-weighting algorithms for this problem =-=[26, 7, 18]-=- (also called the problem of “combining expert advice”). Most recently, a number of algorithms have been developed for achieving such guarantees efficiently in many settings where the number of choice... |

240 | A Simple Adaptive Procedure leading to Correlated Equilibrium
- Hart, Mas-Colell
- 2000
Citation Context ... where standard algorithms will have this property with arbitrarily high probability [38]. 1.2 Regret and Correlated equilibria It is known that certain algorithms such as that of Hart and Mas-Colell =-=[22]-=-, as well as any algorithms satisfying the stronger property of “no internal regret” [17], have the property that the empirical distribution of play approaches a correlated equilibrium. On the positiv... |

194 | On a network creation game
- Fabrikant, Luthra, et al.
- 2003
Citation Context ...scheduling to facility location to network creation games, and especially to problems of routing in the Wardrop model, where the cost of an edge is a function of the amount of traffic using that edge =-=[8, 9, 25, 32, 13]-=-. Such work implicitly assumes that selfish individual behavior results in Nash equilibria. In this work we consider the question: if all players in a routing game use no-regret algorithms to choose t... |

194 | Online convex programming and generalized infinitesimal gradient ascent
- Zinkevich
Citation Context ...n developed for achieving such guarantees in a computationally efficient manner in many settings where the number of possible actions N is exponential in the natural description-length of the problem =-=[25, 38, 39]-=-. One specific setting where efficient regret-minimizing algorithms can be applied is online routing. Given a graph G = (V,E) and two distinguished nodes vstart and vend, the game for an individual pl... |

167 | Tight bounds for worst-case equilibria
- Czumaj, Vöcking
- 2002
Citation Context ...o problems of routing in the Wardrop model such as that described above, where the cost of an edge is a function of the amount of traffic using that edge, and the individual players are infinitesimal =-=[10, 11, 27, 34, 15]-=-. Such work implicitly assumes that selfish individual behavior results in Nash equilibria. THEORY OF COMPUTING, Volume 6 (2010), pp. 179–199 180ROUTING WITHOUT REGRET Our Contribution We consider th... |

149 | Congestion games with player-specific payoff functions
- Milchtaich
- 1996
Citation Context ...ng potential functions, with the limitation that only one player is allowed to move in each time step; the convergence times derived depended on the appropriate potential functions of the exact model =-=[28, 10]-=-. The work of Goldberg [20] studied a randomized model in which each user can select a random delay over continuous time. This implies that only one user tries to reroute at each specific time; theref... |

147 | Approximation to Bayes risk in repeated plays
- Hannan
- 1957
Citation Context ...setting, their average loss per time step approaches that of the best fixed strategy in hindsight (or better) over time. Moreover, the convergence rates are quite good: in Hannan’s original algorithm =-=[21]-=-, the number of time steps needed to achieve a gap of ε with respect to the best fixed strategy in hindsight—the “per time step regret”—is linear in the size of the game N. This was reduced to O(log ... |

138 | Adaptive game playing using multiplicative weights
- Freund, Schapire
- 1999
Citation Context ...ning algorithms settle at all, they will have to settle at a Nash equilibrium. In fact, for zero-sum games, no-regret algorithms when played against each other will approach a minimax optimal solution =-=[19]-=-. However, it is known that even in small 2-player general-sum games, no-regret algorithms need not approach a Nash equilibrium and can instead cycle, achieving performance substantially worse than an... |

137 | Efficient algorithms for online decision problems
- Kalai, Vempala
- 2003
Citation Context ...ly, a number of algorithms have been developed for achieving such guarantees efficiently in many settings where the number of choices N is exponential in the natural description-length of the problem =-=[23, 36, 37]-=-. One specific setting where these efficient algorithms apply is online routing. Given a graph G = (V,E) and two distinguished nodes vstart and vend, the game for an individual player is defined as fo... |

106 | Approximation to Bayes risk in repeated plays, in Contributions to the Theory of Games
- Hannan
- 1957
Citation Context ...setting, their average loss per time step approaches that of the best fixed strategy in hindsight (or better) over time. Moreover, the convergence rates are quite good: in Hannan’s original algorithm =-=[19]-=-, the number of time steps needed to achieve a gap of ε with respect to the best fixed strategy in hindsight—the “per time step regret”—is linear in the size of the game N. This was reduced to O(log N... |

99 | Equilibrium points of nonatomic games
- Schmeidler
- 1973
Citation Context ...s for players of that type. In addition, given our assumption that all latency functions are continuous and non-decreasing, one can prove the existence of Nash equilibria: Proposition 2.3 (Schmeidler =-=[35]-=-, generalization of Beckman et al. [4]) Every nonatomic congestion game admits a flow at equilibrium. We define the social cost of a flow to be the average cost incurred by the players: Definition 2.4... |

89 | Calibrated learning and correlated equilibrium - Foster, Vohra - 1997 |

84 | Potential games with continuous player sets
- Sandholm
- 2001
Citation Context ...ated equilibrium in atomic congestion games is the unique Nash equilibrium, there is no known efficient implementation for internal regret minimization for routing problems. 1.3 Related work Sandholm =-=[34]-=- considers convergence in potential games (which include routing games), and shows that a very broad class of evolutionary dynamics is guaranteed to converge to Nash equilibrium. Fischer and Vöcking... |

83 | Convergence time to Nash equilibria
- Even-Dar, Kesselman, et al.
- 2003
Citation Context ...ng potential functions, with the limitation that only one player is allowed to move in each time step; the convergence times derived depended on the appropriate potential functions of the exact model =-=[28, 10]-=-. The work of Goldberg [20] studied a randomized model in which each user can select a random delay over continuous time. This implies that only one user tries to reroute at each specific time; theref... |

80 | Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches
- Awerbuch, Kleinberg
- 2004
Citation Context ... achieve running time and convergence rates (to the cost of the best fixed path in hindsight) which are polynomial in the size of the graph and the maximum edge cost. Moreover, a number of extensions =-=[2, 27]-=- have shown how these algorithms can be applied even to the “bandit” setting where only the cost of edges actually traversed (or even just the total cost of Pt) is revealed to the algorithm at the end... |

79 | Selfish traffic allocation for server farms
- Czumaj, Krysta, et al.
- 2002
Citation Context ...scheduling to facility location to network creation games, and especially to problems of routing in the Wardrop model, where the cost of an edge is a function of the amount of traffic using that edge =-=[8, 9, 25, 32, 13]-=-. Such work implicitly assumes that selfish individual behavior results in Nash equilibria. In this work we consider the question: if all players in a routing game use no-regret algorithms to choose t... |

64 | Path kernels and multiplicative updates
- Takimoto, Warmuth
Citation Context ...ly, a number of algorithms have been developed for achieving such guarantees efficiently in many settings where the number of choices N is exponential in the natural description-length of the problem =-=[23, 36, 37]-=-. One specific setting where these efficient algorithms apply is online routing. Given a graph G = (V,E) and two distinguished nodes vstart and vend, the game for an individual player is defined as fo... |

64 | How bad is selfish routing
- Roughgarden, Tardos
Citation Context ...o problems of routing in the Wardrop model such as that described above, where the cost of an edge is a function of the amount of traffic using that edge, and the individual players are infinitesimal =-=[10, 11, 27, 34, 15]-=-. Such work implicitly assumes that selfish individual behavior results in Nash equilibria. Our Contribution We consider th... |

62 | Online geometric optimization in the bandit setting against an adaptive adversary
- McMahan, Blum
Citation Context ... achieve running time and convergence rates (to the cost of the best fixed path in hindsight) which are polynomial in the size of the graph and the maximum edge cost. Moreover, a number of extensions =-=[2, 27]-=- have shown how these algorithms can be applied even to the “bandit” setting where only the cost of edges actually traversed (or even just the total cost of Pt) is revealed to the algorithm at the end... |

62 | Intrinsic robustness of the price of anarchy
- Roughgarden
- 2009
Citation Context ... the player strategies is that they are all no-regret. Subsequent work Since the initial publication of these results, a number of publications have built on our work. Blum et al. [6] and Roughgarden =-=[31]-=- explore the outcomes of regret-minimizing behavior in a variety of classes of games; they are able to show Price of Anarchy style bounds on the social cost, but do not prove convergence results. Klei... |

60 | Bounding the inefficiency of equilibria in nonatomic congestion games
- Roughgarden, Tardos
- 2002
Citation Context ... of Anarchy results for the congestion game to the regret-minimizing players in the original game. In our second result in this section, we give an argument paralleling that of Roughgarden and Tardos =-=[33]-=- that directly relates the costs of regret-minimizing users to the cost of the social optimum. Theorem 6.1 If f is an ε-Nash equilibrium flow for a nonatomic congestion game Γ, then C(f on Γ) ≤ ρ ( C(... |

43 | Regret minimization and the price of total anarchy
- Blum, Hajiaghayi, et al.
Citation Context ...nly assumption about the player strategies is that they are all no-regret. Subsequent work Since the initial publication of these results, a number of publications have built on our work. Blum et al. =-=[6]-=- and Roughgarden [31] explore the outcomes of regret-minimizing behavior in a variety of classes of games; they are able to show Price of Anarchy style bounds on the social cost, but do not prove conv... |

43 | Fast convergence to Wardrop equilibria by adaptive sampling methods
- Fischer, Räcke, et al.
Citation Context ...tely stable configuration. In more recent work, they study the convergence of a class of routing policies under a specific model of stale information [16]. Most recently, Fischer, Raecke, and Vöcking =-=[14]-=- give a distributed procedure with especially good convergence properties. The key difference between that work and ours is that those results consider specific adaptive strategies designed to quickly... |

34 | Efficient algorithms for on-line optimization
- Kalai, Vempala
Citation Context ...ly, a number of algorithms have been developed for achieving such guarantees efficiently in many settings where the number of choices N is exponential in the natural description-length of the problem =-=[21, 30, 31]-=-. One specific setting where these efficient algorithms apply is online routing. Given a graph G = (V,E) and two distinguished nodes vstart and vend, the game for an individual player is defined as fo... |

33 | Fast convergence to nearly optimal solutions in potential games
- Awerbuch, Azar, et al.
- 2008
Citation Context ...Nash equilibria. Even-Dar et al. [12] demonstrate convergence of general regret-minimizing algorithms to Nash equilibria in a general class of games they call “socially concave” games. Awerbuch et al. =-=[1]-=- show that a certain type of best response dynamics converges quickly to approximate Nash equilibria in congestion games. General no regret dynamics are much more complex than the dynamics they study,... |

31 | Distributed Selfish Load Balancing
- Berenbrink, Friedetzky, et al.
- 2007
Citation Context ...ered a model where many users are allowed to move concurrently, and derived a logarithmic convergence rate for users following a centrally-moderated greedy algorithm. Most recently, Berenbrink et al. =-=[5]-=- showed weaker convergence results for a specific distributed protocol. To summarize, previous work studied the convergence time to pure Nash equilibria in situations with a centralized mechanism or s... |

27 | Adaptive Routing with Stale Information
- Fischer, Vöcking
- 2005
Citation Context ...about convergence of this dynamics to an approximately stable configuration. In more recent work, they study the convergence of a class of routing policies under a specific model of stale information =-=[16]-=-. Most recently, Fischer, Raecke, and Vöcking [14] give a distributed procedure with especially good convergence properties. The key difference between that work and ours is that those results conside... |

26 | On the evolution of selfish routing
- Fischer, Vöcking
Citation Context ...considers convergence in potential games (which include routing games), and shows that a very broad class of evolutionary dynamics is guaranteed to converge to Nash equilibrium. Fischer and Vöcking =-=[15]-=- consider a specific adaptive dynamics (a particular functional form in which flow might naturally change over time) in the context of selfish routing and prove results about convergence of this dynam... |

26 | Correlated Equilibrium and Potential Games
- Neyman
- 1997
Citation Context ... latency substantially greater than the best path given the flow (and we give a specific example of how this can happen when edge-latencies have unbounded slope in §2.4). In addition, although Neyman =-=[30]-=- does show that the only correlated equilibrium in atomic congestion games is the unique Nash equilibrium, there is no known efficient implementation for internal regret minimization for routing probl... |

25 | Fast convergence of selfish rerouting
- Even-Dar, Mansour
- 2005
Citation Context ...can select a random delay over continuous time. This implies that only one user tries to reroute at each specific time; therefore the setting was similar to that mentioned above. Even-Dar and Mansour =-=[11]-=- considered a model where many users are allowed to move concurrently, and derived a logarithmic convergence rate for users following a centrally-moderated greedy algorithm. Most recently, Berenbrink ... |

22 | Bounds for the convergence rate of randomized local search in multiplayer games, uniform resource sharing game
- Goldberg
- 2004
Citation Context ...e limitation that only one player is allowed to move in each time step; the convergence times derived depended on the appropriate potential functions of the exact model [28, 10]. The work of Goldberg =-=[20]-=- studied a randomized model in which each user can select a random delay over continuous time. This implies that only one user tries to reroute at each specific time; therefore the setting was similar... |

20 | On the severity of Braess’s paradox: Designing networks for selfish users is hard - Roughgarden |

18 | On-line load balancing, in Online Algorithms: The State of the Art
- Azar
- 1998
Citation Context ...e of any latency function. We also show that price-of-anarchy results may be applied to these approximate equilibria, and also consider the finite-size (non-infinitesimal) load-balancing model of Azar =-=[3]-=-. Our nonatomic results also apply to a more general class of games known as congestion games. ∗A preliminary version of these results appeared in the Proceedings of the 25th Annual ACM Symposium on P... |

15 | Multiplicative updates outperform generic no-regret learning in congestion games - Kleinberg, Piliouras, et al. - 2009 |

15 | Generic uniqueness of equilibrium in large crowding games
- Milchtaich
- 2000
Citation Context ...e players: Definition 2.4 Define the cost C(f) of a flow f to be $C(f) = \sum_{e \in E} \ell_e(f_e)\, f_e$. In addition, for any nonatomic congestion game, there is a unique equilibrium cost: Proposition 2.5 (Milchtaich =-=[29]-=-, generalization of Beckman et al. [4]) Distinct equilibria for a nonatomic congestion game have equal social cost. 2.3 No-Regret Algorithms Definition 2.6 Consider a series of flows $f^1, f^2, \ldots, f^T$... |

9 | Tight bounds for worst-case equilibria - Czumaj, Vöcking - 2002 |

9 | On the convergence of regret minimization dynamics in concave games
- Even-Dar, Mansour, et al.
Citation Context ...loying a particular class of regret-minimization algorithms and show that in many cases, the additional assumptions on the player algorithms allow convergence to pure Nash equilibria. Even-Dar et al. =-=[12]-=- demonstrate convergence of general regret-minimizing algorithms to Nash equilibria in a general class of games they call “socially concave” games. Awerbuch et al. [1] show that a certain type of best ... |

7 | Calibrated learning and correlated equilibrium, Games and Economic Behavior
- Foster, Vohra
- 1997
Citation Context .... 1.2 Regret and Correlated equilibria It is known that certain algorithms such as that of Hart and Mas-Colell [22], as well as any algorithms satisfying the stronger property of “no internal regret” =-=[17]-=-, have the property that the empirical distribution of play approaches a correlated equilibrium. On the positive side, such results are extremely general, apply to nearly any game including routing, a... |

7 | Bounding the inefficiency of equilibria in nonatomic congestion games
- Roughgarden, Tardos
Citation Context ...curred by regret-minimizing players in a congestion game to the cost of the social optimum. We approach this problem in two ways: First, we give an argument paralleling that of Roughgarden and Tardos =-=[35]-=- that directly relates the costs of regret-minimizing users to the cost of the social optimum. In our second result in this section, we show that any average-ε-Nash equilibrium in a congestion game is... |

6 | Theoretical guarantees for algorithms in multi-agent settings
- Zinkevich
- 2004
Citation Context ...hieving performance substantially worse than any Nash equilibrium for all players. Indeed simple examples are known where standard algorithms will have this property with arbitrarily high probability =-=[38]-=-. 1.2 Regret and Correlated equilibria It is known that certain algorithms such as that of Hart and Mas-Colell [22], as well as any algorithms satisfying the stronger property of “no internal regret” ... |

5 | Tight Bounds For Worst Case Equilibria - Czumaj, Vöcking - 2002 |

4 | Calibrated learning and correlated equilibrium
- Foster, Vohra
- 1997
Citation Context .... 1.2 Regret and correlated equilibria It is known that certain algorithms such as that of Hart and Mas-Colell [24], as well as any algorithms satisfying the stronger property of “no internal regret” =-=[19]-=-, have the property that the empirical distribution of play approaches the set of correlated equilibria. On the positive side, such results are extremely general, apply to nearly any game including ro... |

3 | On the performance of approximate equilibria in congestion games - Christodoulou, Koutsoupias, et al. - 2009 |

3 | Selfish traffic allocation for server farms
- Czumaj, Krysta, Vöcking
Citation Context ...o problems of routing in the Wardrop model such as that described above, where the cost of an edge is a function of the amount of traffic using that edge, and the individual players are infinitesimal =-=[10, 11, 27, 34, 15]-=-. Such work implicitly assumes that selfish individual behavior results in Nash equilibria. Our Contribution We consider th... |

3 | Path kernels and multiplicative
- Takimoto, Warmuth
Citation Context ...n developed for achieving such guarantees in a computationally efficient manner in many settings where the number of possible actions N is exponential in the natural description-length of the problem =-=[25, 38, 39]-=-. One specific setting where efficient regret-minimizing algorithms can be applied is online routing. Given a graph G = (V,E) and two distinguished nodes vstart and vend, the game for an individual pl... |