## An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information

Citations: | 2 - 0 self |

### Citations

678 |
Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games
- Selten
- 1975
Citation Context ...rooted in some node h) of the original game. Unfortunately, sub-games are not particularly useful in imperfect-information EFGs; hence, here the refinements include strategic-form perfect equilibrium (Selten, 1975), sequential equilibrium (Kreps & Wilson, 1982), or quasi-perfect equilibrium (van Damme, 1984; Miltersen & Sørensen, 2010). The first refinement avoids using weakly dominated strategies in equilibri... |

376 |
Decomposition principle for linear programs
- Dantzig, Wolfe
- 1960
Citation Context ...heory literature as oracle algorithms (McMahan, Gordon, & Blum, 2003). Oracle algorithms are related to the methods of constraint/column generation used for solving large-scale optimization problems (Dantzig & Wolfe, 1960; Barnhart, Johnson, Nemhauser, Savelsbergh, & Vance, 1998) and exploit two characteristics commonly found in games. First, in many cases finding a solution to a game only requires using a small fract... |

265 |
Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft
- Selten
- 1965
Citation Context ...se drawbacks, a number of refinements of NE have been introduced imposing further restrictions with the intention of describing more sensible strategies. Examples include subgame-perfect equilibrium (Selten, 1965) used in perfect-information EFGs. The subgame-perfect equilibrium forces the strategy profile to be a Nash equilibrium in each sub-game (i.e., in each sub-tree rooted in some node h) of the original... |

203 | Stability and Perfection of Nash Equilibria - Damme - 1987 |

116 | Efficient computation of equilibria for extensive two-person games - Koller, Megiddo, et al. - 1996 |

93 | The complexity of two-person zero-sum games in extensive form
- Koller, Megiddo
- 1992
Citation Context ...racle algorithms to EFGs, primarily using pure and mixed strategies in EFGs. The first work that exploited this iterative principle is the predecessor of the sequence-form linear-program formulation (Koller & Megiddo, 1992). In this algorithm, the authors use a representation similar to the sequence form only for a single player, while the strategies for the opponent are iteratively added as constraints into the linear... |

79 | Deployed ARMOR protection: the application of a game theoretic model for security at the Los Angeles International Airport
- Pita, Jain, et al.
- 2008
Citation Context ...ances. For example, several decision support systems have recently been deployed in homeland security domains to recommend policies based on game-theoretic models for placing checkpoints at airports (Pita, Jain, Western, Portway, Tambe, Ordonez, Kraus, & Parachuri, 2008), scheduling Federal Air Marshals (Tsai, Rathi, Kiekintveld, Ordóñez, & Tambe, 2009), and patrolling ports (Shieh, An, Yang, Tambe, Baldwin, Direnzo, Meyer, Baldwin, Maule, & Meyer, 2012). The capa... |

77 |
Security and Game Theory: Algorithms
- Tambe
- 2011
Citation Context ...orts (Shieh, An, Yang, Tambe, Baldwin, Direnzo, Meyer, Baldwin, Maule, & Meyer, 2012). The capabilities of these systems are based on a large amount of research in fast algorithms for security games (Tambe, 2011). Another notable example is the algorithmic progress that has led to game-theoretic Poker agents that are competitive with highly skilled human opponents (e.g., see Zinkevich, Bowling, & Burch, 2007... |

75 | IRIS: a Tool for Strategic Security Allocation in Transportation Networks - Tsai, Rathi, et al. - 2009 |

63 | Efficient computation of behavior strategies - Stengel - 1996 |

57 | PROTECT: A Deployed Game Theoretic System to Protect the Ports of the United States
- Shieh, An, et al.
- 2012
Citation Context ...oints at airports (Pita, Jain, Western, Portway, Tambe, Ordonez, Kraus, & Parachuri, 2008), scheduling Federal Air Marshals (Tsai, Rathi, Kiekintveld, Ordóñez, & Tambe, 2009), and patrolling ports (Shieh, An, Yang, Tambe, Baldwin, Direnzo, Meyer, Baldwin, Maule, & Meyer, 2012). The capabilities of these systems are based on a large amount of research in fast algorithms for security games (Tambe, 2011). Another notable example is the algorithmic progress that has led to ga... |

54 | Planning in the presence of cost functions controlled by an adversary
- McMahan, Gordon, et al.
- 2003
Citation Context ... same compact representation, but we improve the solution methods by adopting the algorithmic framework based on decompositions known in the computational game theory literature as oracle algorithms (McMahan, Gordon, & Blum, 2003). Oracle algorithms are related to the methods of constraint/column generation used for solving large-scale optimization problems (Dantzig & Wolfe, 1960; Barnhart, Johnson, Nemhauser, Savelsbergh, & ... |

48 | A Relation between Perfect Equilibria in Extensive Form - Damme - 1984 |

41 | A Double Oracle Algorithm for Zero-Sum Security Games on Graphs
- Jain, Korzhyk, et al.
- 2011
Citation Context ...main-specific double-oracle methods has been demonstrated on a variety of different domains inspired by pursuit-evasion games (Halvorson, Conitzer, & Parr, 2009) and security games played on a graph (Jain, Korzhyk, Vanek, Conitzer, Tambe, & Pechoucek, 2011; Letchford & Vorobeychik, 2013; Jain, Conitzer, & Tambe, 2013). Only a few works have tried to apply the iterative framework of oracle algorithms to EFGs, primarily using pure and mixed strategies in... |

38 | Multi-Step Multi-Sensor Hider-Seeker Games
- Halvorson, Conitzer, et al.
Citation Context ... to solve much larger instances of this game. Similar success with the domain-specific double-oracle methods has been demonstrated on a variety of different domains inspired by pursuit-evasion games (Halvorson, Conitzer, & Parr, 2009) and security games played on a graph (Jain, Korzhyk, Vanek, Conitzer, Tambe, & Pechoucek, 2011; Letchford & Vorobeychik, 2013; Jain, Conitzer, & Tambe, 2013). Only a few works have tried to apply th... |

38 | Smoothing techniques for computing Nash equilibria of sequential games - Hoda, Gilpin, et al. |

34 | Monte Carlo sampling for regret minimization in extensive games
- Lanctot, Waugh, et al.
- 2009
Citation Context ...ng is to use an approximation method. The best known approximative algorithms include counterfactual regret minimization (CFR, Zinkevich et al., 2008), improved versions of CFR with sampling methods (Lanctot, Waugh, Zinkevich, & Bowling, 2009; Gibson, Lanctot, Burch, Szafron, & Bowling, 2012); Nesterov’s Excessive Gap Technique (EGT, Hoda, Gilpin, Peña, & Sandholm, 2010); and variants of ... |

28 | The computational intelligence of MoGo revealed in Taiwan’s computer Go tournaments
- Lee, Wang, et al.
- 2009
Citation Context ...pace of all sequences. Monte Carlo Tree Search (MCTS) is another family of methods that has shown promise for solving very large games, in particular perfect information board games such as Go (e.g., Lee et al., 2009). While the CFR and EGT algorithms are guaranteed to find an ε-Nash equilibrium, convergence to an equilibrium solution has not been formally shown for any of the variants of MCTS in imperfect-inform... |

26 | The state of solving large incomplete-information games, and application to poker
- Sandholm
- 2010
Citation Context ... Another notable example is the algorithmic progress that has led to game-theoretic Poker agents that are competitive with highly skilled human opponents (e.g., see Zinkevich, Bowling, & Burch, 2007; Sandholm, 2010). We focus on developing new algorithms for an important general class of games that includes security games and Poker, as well as many other familiar games. More precisely, we study two-player zero-... |

25 | Finding mixed strategies with small supports in extensive form games
- Koller, Megiddo
- 1996
Citation Context ...in many cases finding a solution to a game only requires using a small fraction of the possible strategies, so it is not necessary to enumerate all of the strategies to find a solution (Wilson, 1972; Koller & Megiddo, 1996). Second, finding a best response to a specific opponent strategy in a game is computationally much less expensive than solving for an equilibrium. In addition, best response algorithms can often mak... |

21 |
Computing equilibria of two-person games from the extensive form
- Wilson
- 1972
Citation Context ...games. First, in many cases finding a solution to a game only requires using a small fraction of the possible strategies, so it is not necessary to enumerate all of the strategies to find a solution (Wilson, 1972; Koller & Megiddo, 1996). Second, finding a best response to a specific opponent strategy in a game is computationally much less expensive than solving for an equilibrium. In addition, best response ... |

19 | Accelerating best response calculation in large extensive games
- Johanson, Waugh, et al.
- 2011
Citation Context ...n the game of Phantom Tic-Tac-Toe. Another example is that fast best-response algorithms that operate on the public tree (i.e., a compact representation of games with publicly observable actions; see Johanson, Bowling, Waugh, & Zinkevich, 2011) can be exploited for games like poker. Finally, our formal analysis identifies the key properties that these domain-specific implementations need to satisfy to guarantee the convergence to the corre... |

14 | No-regret learning in extensive-form games with imperfect recall
- Lanctot, Gibson, et al.
Citation Context ...ation set; hence, the strategy as a whole converges to a Nash equilibrium. The main benefits of this approach include simplicity and robustness, as it can be adapted for more generic games (e.g., see Lanctot, Gibson, Burch, Zinkevich, & Bowling, 2012, where CFR is applied on games with imperfect recall). However, the algorithm operates on the complete game tree and therefore requires convergence in all information sets, which can be very slow for... |

14 | Computing a quasi-perfect equilibrium of a two-player game. Economic Theory 42(1):175–192
- Miltersen, Sørensen
- 2010
Citation Context ...tinformation EFGs; hence, here the refinements include strategic-form perfect equilibrium (Selten, 1975), sequential equilibrium (Kreps & Wilson, 1982), or quasi-perfect equilibrium (van Damme, 1984; Miltersen & Sørensen, 2010). The first refinement avoids using weakly dominated strategies in equilibrium strategies for two-player games (van Damme, 1991, p. 29) and it is also known as the undominated equilibrium. Sequential... |

11 | Comparing UCT versus CFR in simultaneous games
- Shafiei, Sturtevant, et al.
- 2009
Citation Context ...-information games. On the contrary, the most common version of MCTS based on the Upper Confidence Bounds (UCB) selection function can converge to incorrect solutions even in simultaneous-move games (Shafiei, Sturtevant, & Schaeffer, 2009) that are the simplest class of imperfect-information EFGs. MCTS algorithms therefore do not (in general) guarantee finding an (approximate) optimal solution in imperfect-information games. One excep... |

9 | A fast bundle-based anytime algorithm for poker and other convex games
- McMahan, Gordon
- 2007
Citation Context ...-form representation for both players and it also incrementally expands the strategy space for both players. More recent work has been done by McMahan in his thesis (McMahan, 2006) and followup work (McMahan & Gordon, 2007). In these works the authors investigated an extension of the double-oracle algorithm for normal-form games to the extensive-form case. Their double-oracle algorithm for EFGs operates very similarly ... |

9 | Computing approximate Nash equilibria and robust best-responses using sampling - Ponsen, Jong, et al. |

8 | Security scheduling for real-world networks
- Jain, Conitzer, et al.
- 2013
Citation Context ...ns inspired by pursuit-evasion games (Halvorson, Conitzer, & Parr, 2009) and security games played on a graph (Jain, Korzhyk, Vanek, Conitzer, Tambe, & Pechoucek, 2011; Letchford & Vorobeychik, 2013; Jain, Conitzer, & Tambe, 2013). Only a few works have tried to apply the iterative framework of oracle algorithms to EFGs, primarily using pure and mixed strategies in EFGs. The first work that exploited this iterative principle ... |

8 | Fast algorithms for finding proper strategies in game trees
- Miltersen, Sørensen
Citation Context ...hat compute these refinements – for example it is used for computing undominated equilibrium (e.g., see Ganzfried & Sandholm, 2013; Cermak, Bosansky, & Lisy, 2014) and normal-form proper equilibrium (Miltersen & Sørensen, 2008). 3.3 Sequence-Form Linear Program Extensive-form games with perfect recall can be compactly represented using the sequence form (Koller et al., 1996; von Stengel, 1996). A sequence σi is an ordered ... |

7 | Double-oracle algorithm for computing an exact Nash equilibrium in zero-sum extensive-form games
- Bosanský, Kiekintveld, et al.
- 2013
Citation Context ...r were published at the European Conference on Artificial Intelligence (ECAI) (Bosansky, Kiekintveld, Lisy, & Pechoucek, 2012) and the conference on Autonomous Agents and Multi Agent Systems (AAMAS) (Bosansky, Kiekintveld, Lisy, Cermak, & Pechoucek, 2013). The major additions to this full version include (1) a novel, more detailed description of all parts of the algorithm, (2) introduction and analysis of different policies for the player selection i... |

7 |
Generalized sampling and variance in counterfactual regret minimization
- Gibson, Lanctot, et al.
- 2012
Citation Context ...st known approximative algorithms include counterfactual regret minimization (CFR, Zinkevich et al., 2008), improved versions of CFR with sampling methods (Lanctot, Waugh, Zinkevich, & Bowling, 2009; Gibson, Lanctot, Burch, Szafron, & Bowling, 2012); Nesterov’s Excessive Gap Technique (EGT, Hoda, Gilpin, Peña, & Sandholm, 2010); and variants of Monte Carlo Tree Search (MCTS) algorithms applied ... |

4 | Improving performance in imperfect-information games with large state and action spaces by solving endgames
- Ganzfried, Sandholm
- 2013
Citation Context ... NE and calculating the value of the game is often a starting point for many of the algorithms that compute these refinements – for example it is used for computing undominated equilibrium (e.g., see Ganzfried & Sandholm, 2013; Cermak, Bosansky, & Lisy, 2014) and normal-form proper equilibrium (Miltersen & Sørensen, 2008). 3.3 Sequence-Form Linear Program Extensive-form games with perfect recall can be compactly represente... |

4 | Robust Planning in Domains with Stochastic Outcomes
- McMahan
- 2006
Citation Context ...ce our algorithm uses the sequence-form representation for both players and it also incrementally expands the strategy space for both players. More recent work has been done by McMahan in his thesis (McMahan, 2006) and followup work (McMahan & Gordon, 2007). In these works the authors investigated an extension of the double-oracle algorithm for normal-form games to the extensive-form case. Their double-oracle ... |

3 |
Iterative Algorithm for Solving Two-player Zero-sum Extensive-form Games with Imperfect Information
- Bosansky, Kiekintveld, et al.
- 2012
Citation Context ...e faster and to identify the optimal way of expanding the restricted game. Acknowledgements Earlier versions of this paper were published at the European Conference on Artificial Intelligence (ECAI) (Bosansky, Kiekintveld, Lisy, & Pechoucek, 2012) and the conference on Autonomous Agents and Multi Agent Systems (AAMAS) (Bosansky, Kiekintveld, Lisy, Cermak, & Pechoucek, 2013). The major additions to this full version include (1) a novel, more d... |

3 |
Convergence of Monte Carlo tree search in simultaneous move games
- Lisy, Kovarik, et al.
Citation Context ...uarantee finding an (approximate) optimal solution in imperfect-information games. One exception is the recent proof of convergence of MCTS with certain selection methods for simultaneous-move games (Lisy, Kovarik, Lanctot, & Bosansky, 2013). Still, using MCTS is sometimes a reasonable choice since it can produce good strategies in practice (Ponsen et al., 2011). Contrary to the existing approximative approaches, our algorithm aims to f... |

2 | Practical Performance of Refinements of Nash Equilibria in Extensive-Form Zero-Sum Games
- Čermák, Bošanský, et al.
Citation Context ...e of the game is often a starting point for many of the algorithms that compute these refinements – for example it is used for computing undominated equilibrium (e.g., see Ganzfried & Sandholm, 2013; Cermak, Bosansky, & Lisy, 2014) and normal-form proper equilibrium (Miltersen & Sørensen, 2008). 3.3 Sequence-Form Linear Program Extensive-form games with perfect recall can be compactly represented using the sequence form (Kolle... |

2 | Monte Carlo Sampling and Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive Form Games
- Lanctot
- 2013
Citation Context ...t algorithms for finding both exact and approximate solutions: linear programming using the sequence form, and Counterfactual Regret Minimization (CFR, Zinkevich, Johanson, Bowling, & Piccione, 2008; Lanctot, 2013). The experimental results confirm that our algorithm requires only a fraction of all possible sequences to solve a game in practice and significantly reduces memory requirements when solving large g... |

1 | An Exact Double-Oracle Algorithm for Zero-Sum EFGs with Imperfect Information Letchford - J, Vorobeychik - 2013 |