## Playing games with approximation algorithms (2007)


### Download Links

- [www.cs.cmu.edu]
- [research.microsoft.com]
- [www.cs.cornell.edu]
- [users.cms.caltech.edu]
- DBLP

### Other Repositories/Bibliography

Venue: In Proceedings of the 39th Annual ACM Symposium on Theory of Computing

Citations: 21 (2 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Kakade07playinggames,
  author    = {Sham M. Kakade and Adam Tauman Kalai and Katrina Ligett},
  title     = {Playing games with approximation algorithms},
  booktitle = {Proceedings of the 39th Annual ACM Symposium on Theory of Computing},
  year      = {2007},
  pages     = {546--555},
  publisher = {ACM Press}
}
```


### Abstract

In an online linear optimization problem, on each period t, an online algorithm chooses s_t ∈ S from a fixed (possibly infinite) set S of feasible decisions. Nature (who may be adversarial) chooses a weight vector w_t ∈ R^n, and the algorithm incurs cost c(s_t, w_t), where c is a fixed cost function that is linear in the weight vector. In the full-information setting, the vector w_t is then revealed to the algorithm; in the bandit setting, only the cost experienced, c(s_t, w_t), is revealed. The goal of the online algorithm is to perform nearly as well as the best fixed s ∈ S in hindsight. Many repeated decision-making problems with weights fit naturally into this framework, such as online shortest-path, online TSP, online clustering, and online weighted set cover. Previously, it was shown how to convert any efficient exact offline optimization algorithm for such a problem into an efficient online algorithm in both the full-information and the bandit settings, with average cost nearly as good as that of the best fixed s ∈ S in hindsight. However, in the case where the offline algorithm is an approximation algorithm with ratio α > 1, the previous approach only worked for special types of approximation algorithms. We show how to convert any offline approximation algorithm for a linear optimization problem into a corresponding online approximation algorithm, with a polynomial blowup in runtime. If the offline algorithm has an α-approximation guarantee, then the expected cost of the online algorithm on any sequence is not much larger than α times that of the best s ∈ S, where the best is chosen with the benefit of hindsight. Our main innovation is combining Zinkevich's algorithm for convex optimization with a geometric transformation that can be applied to any approximation algorithm. Standard techniques generalize the above result to the bandit setting, except that a "Barycentric Spanner" for the problem is also (provably) necessary as input.

Our algorithm can also be viewed as a method for playing large repeated games, where one can only compute approximate best-responses, rather than exact best-responses.
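The full-information protocol described in the abstract can be made concrete with a toy sketch. Everything here is illustrative: a tiny finite decision set S with inner-product costs, and a naive follow-the-leader rule, whereas the paper allows any (possibly infinite) S and any cost linear in the weight vector:

```python
# Toy sketch of the full-information online linear optimization loop.
S = [(1, 0), (0, 1), (1, 1)]           # hypothetical finite decision set

def cost(s, w):                         # linear cost c(s, w) = s . w
    return sum(si * wi for si, wi in zip(s, w))

def play(weights, pick):
    """One run of the loop: commit to s_t, then nature reveals w_t."""
    history, total = [], 0.0
    for w_t in weights:
        s_t = pick(history)             # decision made before seeing w_t
        total += cost(s_t, w_t)
        history.append(w_t)             # full information: w_t is revealed
    best = min(sum(cost(s, w) for w in weights) for s in S)
    return total, best                  # regret = total - best

def follow_leader(history):             # best fixed decision so far
    if not history:
        return S[0]
    return min(S, key=lambda s: sum(cost(s, w) for w in history))

ws = [(1.0, 0.0)] * 5 + [(0.0, 1.0)] * 5
total, best = play(ws, follow_leader)   # total = 6.0, best = 5.0
```

On this sequence the best fixed decision in hindsight costs 5.0 while follow-the-leader pays 6.0, illustrating the regret being bounded.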

### Citations

973 | Improved Approximation Algorithms for Maximum Cut and Satisfiability Problems Using Semidefinite Programming
- Goemans, Williamson
- 1995
Citation Context: ...input, the solution they find differs from the optimal solution by a factor of at most α in every coordinate. They observe that a number of algorithms, such as the Goemans-Williamson max-cut algorithm [11], have this property. Balcan and Blum [3] observe that the previous approach applies to another type of approximation algorithm: one that uses an optimal decision for another linear optimization probl...

299 | Some aspects of the sequential design of experiments
- ROBBINS
Citation Context: ...version, the player is then informed of w_t, while in the bandit version she is only informed of the value c(s_t, w_t). (The name bandit refers to the similarity to the classic multi-armed bandit problem [15].) The player's goal is to achieve low average cost. In particular, we compare her cost with that of the best fixed decision: she would like her average cost to approach that of the best single point...

197 | Online convex programming and generalized infinitesimal gradient ascent
- Zinkevich
- 2003
Citation Context: ...the average performance of the best static decision in hindsight. Our new approach is inspired by Zinkevich's algorithm for the problem of minimizing convex functions over a convex feasible set S ⊆ R^n [16]. However, the application is not direct and requires a geometric transformation that can be applied to any approximation algorithm. Example 1 (Online metric TSP). Every day, a delivery company serves...
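Zinkevich's algorithm referenced here is projected online gradient descent. A minimal sketch, using the unit ball as a stand-in for the general convex feasible set S and an illustrative step size:

```python
import math

def project_unit_ball(x):
    # Euclidean projection onto {x : ||x|| <= 1}, standing in for the
    # projection onto a general convex feasible set S.
    norm = math.sqrt(sum(xi * xi for xi in x))
    return x if norm <= 1.0 else [xi / norm for xi in x]

def ogd(gradients, eta=0.1):
    """Zinkevich-style update: x_{t+1} = Proj_S(x_t - eta * g_t).
    For linear costs c(x, w) = x . w, the gradient g_t is just the
    revealed weight vector w_t."""
    x = [0.0, 0.0]
    points = []
    for g in gradients:
        points.append(x)
        x = project_unit_ball([xi - eta * gi for xi, gi in zip(x, g)])
    return points

pts = ogd([(1.0, 0.0)] * 3)  # repeatedly steps away from the cost direction
```

Each iterate stays feasible because the projection is applied after every gradient step; the paper's contribution is making a step like this work when only an α-approximate oracle for S is available.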

139 | Efficient algorithms for online decision problems
- Kalai, Vempala
Citation Context: ..., even if this action could be chosen after observing the opponent's play. Kalai and Vempala showed that Hannan's approach can be used to efficiently solve online linear optimization problems as well [13]. Hannan's algorithm relied on the ability to find best responses to an opponent's play history. Informally speaking, Kalai and Vempala replaced this best-reply computation with an efficient black-box...
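The follow-the-perturbed-leader idea behind Hannan's approach, as used by Kalai and Vempala, can be sketched as follows. The finite decision set and uniform noise scale are illustrative; the black box is the exact best-response (offline) optimizer:

```python
import random

S = [(1, 0), (0, 1), (1, 1)]            # hypothetical finite decision set

def best_response(weights):
    # The exact offline optimizer used as a black box.
    return min(S, key=lambda s: sum(si * wi for si, wi in zip(s, weights)))

def fpl(history, epsilon=1.0, rng=random):
    """Follow the perturbed leader: add random noise to the cumulative
    weight vector, then best-respond to the perturbed totals."""
    n = len(S[0])
    cumulative = [sum(w[i] for w in history) for i in range(n)]
    noise = [rng.uniform(0.0, 1.0 / epsilon) for _ in range(n)]
    return best_response([c + z for c, z in zip(cumulative, noise)])

random.seed(0)
s = fpl([(1.0, 0.0), (1.0, 0.0)])       # first coordinate is costly so far
```

The perturbation prevents an adversary from exploiting a deterministic leader; the paper's point is that replacing `best_response` with an α-approximation breaks this scheme in general.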

111 | Approximation to Bayes risk in repeated play, Contributions to the Theory
- Hannan
- 1957
Citation Context: ...one can only compute approximate best-responses, rather than best-responses. 1. Introduction. In the 1950's, Hannan gave an algorithm for playing repeated two-player games against an arbitrary opponent [12]. His was one of the earliest algorithms with the no-regret property: against any opponent, his algorithm achieved expected performance asymptotically near that of the best single action, where the be...

80 | Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches
- Awerbuch, Kleinberg
- 2004
Citation Context: ...t a priori bounds on W and R by a simple change of basis so that RW = O(n). It is possible to do this from the set W alone. In particular, one can compute a 2-barycentric spanner (BS) e_1,...,e_n for W [2] and perform a change of basis so that Φ(e_1),...,Φ(e_n) is the standard basis (as we describe in greater detail in §4). By the definition of a 2-BS, this implies that W ⊆ [−2,2]^n and hence W = 2√n i...

76 | Online convex optimization in the bandit setting: gradient descent without a gradient
- Flaxman, Kalai, et al.
- 2005
Citation Context: ...4(α + 2)^2 T. 4. Bandit algorithm. We now describe how to extend Algorithm 3.1 to the partial-information model, where the only feedback we receive is the cost we incur at each period. Flaxman et al. [10] also use a gradient-descent-style algorithm for online optimization in the bandit setting, but the details of their approach differ significantly from ours. The algorithm we describe here requires ac...

65 | Approximation algorithms and online mechanisms for item pricing
- Balcan, Blum
Citation Context: ...problem nearly as efficiently as one can solve the offline problem. (They used the offline optimizer as a black box.) However, in many cases of interest, such as online combinatorial auction problems [3], even the offline problem is NP-hard. Hannan's "follow-the-perturbed-leader" approach can also be applied to some special types of approximation algorithms, but fails to work directly in general. Fin...

62 | Online geometric optimization in the bandit setting against an adaptive adversary
- McMahan, Blum
Citation Context: ...tly achieve this property using the previous approach. 1.2.2. Bandit results. Previous work in the bandit setting constructs an "exploration basis" to allow the algorithm to discover better decisions [2, 14, 7]. In particular, Awerbuch and Kleinberg [2] introduce a so-called Barycentric Spanner (BS) as their exploration basis and show how to construct one from an optimization oracle A : R^n → S. However, in...

50 | Competing in the dark: An efficient algorithm for bandit linear optimization
- Abernethy, Hazan, et al.
- 2008
Citation Context: ...online weighted set cover, the vendors are fixed sets P_1,...,P_n ⊆ [m]. Each period, we choose a legal cover s_t ⊆ [n], that is, ⋃_{i∈s_t} P_i = [m]. There is an unknown sequence of cost vectors w_1, w_2, ... ∈ [0,1]^n, indicating the quarterly vendor costs. Each quarter, our total cost c(s_t, w_t) is the sum of the costs of the vendors we chose for that quarter. In the full-information setting, at the end of the q...
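The weighted set cover cost in this context is easy to make concrete. All sets and numbers below are made up for illustration:

```python
# Toy instance of the online weighted set cover cost: fixed vendor sets
# P_1..P_n over a ground set [m]; a legal cover s_t is a set of vendor
# indices whose union is [m]; its cost is the sum of the chosen vendors'
# quarterly costs.
P = [{1, 2}, {2, 3}, {1, 3}]           # hypothetical vendor sets, m = 3
ground = {1, 2, 3}

def is_cover(s):
    return set().union(*(P[i] for i in s)) == ground

def cover_cost(s, w):                  # c(s_t, w_t) = sum of chosen costs
    return sum(w[i] for i in s)

s_t = {0, 1}                           # vendors P_1 and P_2 cover {1, 2, 3}
w_t = [0.5, 0.25, 0.9]                 # that quarter's vendor cost vector
quarterly = cover_cost(s_t, w_t)       # 0.75
```

Note the cost is linear in w_t for a fixed cover, which is exactly what places this problem in the paper's online linear optimization framework.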

45 | Robbing the bandit: less regret in online geometric optimization against an adaptive adversary
- Dani, Hayes
- 2006
Citation Context: ...that using Hannan's approach [12], one can guarantee O(T^{-1/2}) regret for any linear optimization problem, in the full-information version, as the number of periods T increases. It was later shown [2, 14, 7] how to convert exact algorithms to achieve O(T^{-1/3}) regret in the more difficult bandit setting. This prior work was actually a reduction showing that one can solve the online problem nearly as eff...

32 | The price of bandit information for online optimization
- Dani, Hayes, et al.
- 2007
Citation Context: ...linear optimization problem (without additional input) to a bandit algorithm guaranteeing low α-regret. We note that the above regret is sub-optimal in terms of the T dependence. Furthermore, recent work [8, 4, 1] presents algorithms for online linear optimization that achieve the optimal √T regret even in the bandit setting (these results either do not explicitly consider the computational issues or assume a...

32 | A simple polynomial-time rescaling algorithm for solving linear programs
- Dunagan, Vempala
- 2004
Citation Context: ...δ/λ and ‖x‖ ≤ ½√(δ/λ). Then Approx-Proj(z, s, x) terminates after at most ‖x − z‖²/(δλ) iterations. Proof. The analysis is reminiscent of that of the perceptron algorithm (see, e.g., Dunagan and Vempala [9]). Let H = ½√(δ/λ). To bound the number of recursive calls to Approx-Proj, it suffices to show that the non-negative quantity ‖x − z‖² decreases by a...


8 | Design is as easy as optimization
- Chakrabarty, Mehta, et al.
- 2006
Citation Context: ...metarounding [5]. It would be interesting to extend their approach, based on the ellipsoid algorithm, to our problem and potentially achieve a more efficient algorithm. Related but simpler issues arise in [6]. Fig. 3.2. An approximation algorithm run on vector w ∈ W always returns a point s ∈ S such that the set αK is contained in the halfspace tangent to Φ(s...

5 | Randomized metarounding, Random Struct
- Carr, Vempala
- 2002
Citation Context: ...can accept input w. By the definition of α-approximation, we have w · ... Note that representing a given feasible point as a convex combination of feasible points is similar to randomized metarounding [5]. It would be interesting to extend their approach, based on the ellipsoid algorithm, to our problem and potentially achieve a more efficient algorithm. Related but simpler issues arise in [6]...

2 | High probability regret bounds for bandit online optimization
- Bartlett, Dani, et al.
- 2008
Citation Context: ...linear optimization problem (without additional input) to a bandit algorithm guaranteeing low α-regret. We note that the above regret is sub-optimal in terms of the T dependence. Furthermore, recent work [8, 4, 1] presents algorithms for online linear optimization that achieve the optimal √T regret even in the bandit setting (these results either do not explicitly consider the computational issues or assume a...