Lossy Stochastic Game Abstraction with Bounds
, 2012
Cited by 11 (5 self)
Abstraction followed by equilibrium finding has emerged as the leading approach to solving games. Lossless abstraction typically yields games that are still too large to solve, so lossy abstraction is needed. Unfortunately, prior lossy game abstraction algorithms have no guarantees on solution quality. We developed a framework that enables the design of lossy game abstraction algorithms with guarantees on solution quality. It simultaneously handles state and action abstraction. We define a measure of reward approximation error and transition probability error achieved by state and action abstraction in stochastic games such that the regret of the equilibrium found in the abstract game when implemented in the original, unabstracted game is upper-bounded by a function of those measures. We then develop the first lossy game abstraction algorithms with bounds on solution quality. Both of them work level-by-level up from the end of the game. One of the algorithms is greedy and the other is an integer linear program. We also prove that the abstraction problem is NP-complete (even with just action abstraction, 2 agents, and a 1-step game), but point out that this does not mean that the game abstraction problems that occur in practice cannot be solved quickly.
Using Sliding Windows to Generate Action Abstractions in Extensive-Form Games
Cited by 10 (1 self)
In extensive-form games with a large number of actions, careful abstraction of the action space is critically important to performance. In this paper we extend previous work on action abstraction using no-limit poker games as our test domains. We show that in such games it is no longer necessary to choose, a priori, one specific range of possible bet sizes. We introduce an algorithm that adjusts the range of bet sizes considered for each bet individually in an iterative fashion. This flexibility results in a substantially improved game value in no-limit Leduc poker. When applied to no-limit Texas Hold’em our algorithm produces an action abstraction that is about one third the size of a state-of-the-art hand-crafted action abstraction, yet has a better overall game value.
Regret transfer and parameter optimization
 In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI
Cited by 9 (3 self)
Regret matching is a widely-used algorithm for learning how to act. We begin by proving that regrets on actions in one setting (game) can be transferred to warm start the regrets for solving a different setting with the same structure but different payoffs that can be written as a function of parameters. We prove how this can be done by carefully discounting the prior regrets. This provides, to our knowledge, the first principled warm-starting method for no-regret learning. It also extends to warm-starting the widely-adopted counterfactual regret minimization (CFR) algorithm for large incomplete-information games; we show this experimentally as well. We then study optimizing a parameter vector for a player in a two-player zero-sum game (e.g., optimizing bet sizes to use in poker). We propose a custom gradient descent algorithm that provably finds a locally optimal parameter vector while leveraging our warm-start theory to significantly save regret-matching iterations at each step. It optimizes the parameter vector while simultaneously finding an equilibrium. We present experiments in no-limit Leduc Hold’em and no-limit Texas Hold’em to optimize bet sizing. This amounts to the first action abstraction algorithm (an algorithm for selecting a small number of discrete actions to use from a continuum of actions, a key preprocessing step for solving large games using current equilibrium-finding algorithms) with convergence guarantees for extensive-form games.
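For readers unfamiliar with the base algorithm, the following is a minimal sketch of plain regret matching in self-play on a two-player zero-sum matrix game. It does not implement the paper's regret transfer, discounting, or gradient-descent step; the function names and the example payoff matrix are illustrative assumptions, not taken from the paper.

```python
def regret_matching_strategy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret yet: fall back to uniform play.
        return [1.0 / len(cum_regret)] * len(cum_regret)
    return [p / total for p in positive]


def solve_matrix_game(payoff, iterations=20000):
    """Self-play regret matching; payoff[i][j] is the row player's payoff.

    Returns the average strategies of the row and column players, which
    approach an equilibrium of the zero-sum game as iterations grow.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        row = regret_matching_strategy(row_regret)
        col = regret_matching_strategy(col_regret)
        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * col[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_ev = sum(row[i] * row_values[i] for i in range(n_rows))
        col_ev = sum(col[j] * col_values[j] for j in range(n_cols))
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_ev
            row_sum[i] += row[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_ev
            col_sum[j] += col[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])
```

A warm start in the sense of the abstract would correspond to initializing `row_regret` and `col_regret` from a previously solved game instead of from zeros; how those prior regrets must be discounted is the paper's contribution and is not reproduced here.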
Action Translation in Extensive-Form Games with Large Action Spaces: Axioms, Paradoxes, and the Pseudo-Harmonic Mapping
Cited by 8 (7 self)
When solving extensive-form games with large action spaces, typically significant abstraction is needed to make the problem manageable from a modeling or computational perspective. When this occurs, a procedure is needed to interpret actions of the opponent that fall outside of our abstraction (by mapping them to actions in our abstraction). This is called an action translation mapping. Prior action translation mappings have been based on heuristics without theoretical justification. We show that the prior mappings are highly exploitable and that most of them violate certain natural desiderata. We present a new mapping that satisfies these desiderata and has significantly lower exploitability than the prior mappings. Furthermore, we observe that the cost of this worst-case performance benefit (low exploitability) is not high in practice; our mapping performs competitively with the prior mappings against no-limit Texas Hold’em agents submitted to the 2012 Annual Computer Poker Competition. We also observe several paradoxes that can arise when performing action abstraction and translation; for example, we show that it is possible to improve performance by including suboptimal actions in our abstraction and excluding optimal actions.
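The abstract does not state the mapping itself. The sketch below shows what a randomized action translation mapping looks like, assuming the pseudo-harmonic form f(x) = ((B - x)(1 + A)) / ((B - A)(1 + x)), where A and B are the neighboring abstract bet sizes and x is the observed bet, all expressed as fractions of the pot. Both the formula and the function names are assumptions made for illustration and should be checked against the paper.

```python
import random


def pseudo_harmonic_prob(x, a, b):
    """Probability of mapping an observed bet x down to the smaller abstract bet a.

    ASSUMPTION: uses f(x) = ((b - x)(1 + a)) / ((b - a)(1 + x)), with
    a <= x <= b and all sizes given as fractions of the pot.
    """
    if not (0.0 <= a <= x <= b) or a == b:
        raise ValueError("need 0 <= a <= x <= b with a < b")
    return ((b - x) * (1.0 + a)) / ((b - a) * (1.0 + x))


def translate_bet(x, a, b, rng=random):
    """Randomly map x to a or b according to the probability above."""
    return a if rng.random() < pseudo_harmonic_prob(x, a, b) else b
```

Under this assumed form, a bet equal to an abstract size is always mapped to itself, and the probability of mapping down falls continuously from 1 at x = a to 0 at x = b.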
Extensive-Form Game Abstraction With Bounds
Cited by 6 (4 self)
Abstraction has emerged as a key component in solving extensive-form games of incomplete information. However, lossless abstractions are typically too large to solve, so lossy abstraction is needed. All prior lossy abstraction algorithms for extensive-form games either 1) had no bounds on solution quality or 2) depended on specific equilibrium computation approaches, limited forms of abstraction, and only decreased the number of information sets rather than nodes in the game tree. We introduce a theoretical framework that can be used to give bounds on solution quality for any perfect-recall extensive-form game. The framework uses a new notion for mapping abstract strategies to the original game, and it leverages a new equilibrium refinement for analysis. Using this framework, we develop the first general lossy extensive-form game abstraction method with bounds. Experiments show that it finds a lossless abstraction when one is available and lossy abstractions when smaller abstractions are desired. While our framework can be used for lossy abstraction, it is also a powerful tool for lossless abstraction if we set the bound to zero. Prior abstraction algorithms typically operate level by level in the game tree. We introduce the extensive-form game tree isomorphism and action subset selection problems, both important problems for computing abstractions on a level-by-level basis. We show that the former is graph isomorphism complete, and the latter NP-complete. We also prove that level-by-level abstraction can be too myopic and thus fail to find even obvious lossless abstractions.
Safe Opponent Exploitation
Cited by 5 (2 self)
We consider the problem of playing a finitely-repeated two-player zero-sum game safely, that is, guaranteeing at least the value of the game per period in expectation regardless of the strategy used by the opponent. Playing a stage-game equilibrium strategy at each time step clearly guarantees safety, and prior work has conjectured that it is impossible to simultaneously deviate from a stage-game equilibrium (in hope of exploiting a suboptimal opponent) and to guarantee safety. We show that such profitable deviations are indeed possible, specifically in games where certain types of ‘gift’ strategies exist, which we define formally. We show that the set of strategies constituting such gifts can be strictly larger than the set of iteratively weakly-dominated strategies; this disproves another recent conjecture which states that all non-iteratively-weakly-dominated strategies are best responses to each equilibrium strategy of the other player. We present a full characterization of safe strategies, and develop efficient algorithms for exploiting suboptimal opponents while guaranteeing safety. We also provide analogous results for sequential perfect and imperfect-information games, and present safe exploitation algorithms and full characterizations of safe strategies for those settings as well. We present experimental results in Kuhn poker, a canonical test problem for game-theoretic algorithms. Our experiments show that 1) aggressive safe exploitation strategies significantly outperform adjusting the exploitation within equilibrium strategies and 2) all the safe exploitation strategies significantly outperform a (non-safe) best response strategy against strong dynamic opponents.
Abstraction for solving large incomplete-information games
 In AAAI Conference on Artificial Intelligence (AAAI). Senior Member Track
Cited by 1 (1 self)
Most real-world games and many recreational games are games of incomplete information. Over the last dozen years, abstraction has emerged as a key enabler for solving large incomplete-information games. First, the game is abstracted to generate a smaller, abstract game that is strategically similar to the original game. Second, an approximate equilibrium is computed in the abstract game. Third, the strategy from the abstract game is mapped back to the original game. In this paper, I will review key developments in the field. I present reasons for abstracting games, and point out the issue of abstraction pathology. I then review the practical algorithms for information abstraction and action abstraction. I then cover recent theoretical breakthroughs that beget bounds on the quality of the strategy from the abstract game, when measured in the original game. I then discuss how to reverse map the opponent’s action into the abstraction if the opponent makes a move that is not in the abstraction. Finally, I discuss other topics of current and future research.
Faster First-Order Methods for Extensive-Form Game Solving
We study the problem of computing a Nash equilibrium in large-scale two-player zero-sum extensive-form games. While this problem can be solved in polynomial time, first-order or regret-based methods are usually preferred for large games. Regret-based methods have largely been favored in practice, in spite of their theoretically inferior convergence rates. In this paper we investigate the acceleration of first-order methods both theoretically and experimentally. An important component of many first-order methods is a distance-generating function. Motivated by this, we investigate a specific distance-generating function, namely the dilated entropy function, over treeplexes, which are convex polytopes that encompass the strategy spaces of perfect-recall extensive-form games. We develop significantly stronger bounds on the associated strong convexity parameter. In terms of extensive-form game solving, this improves the convergence rate of several first-order methods by a factor of $O\left(\frac{\#\text{information sets} \cdot \text{depth} \cdot M}{2^{\text{depth}}}\right)$, where $M$ is the maximum value of the $\ell_1$ norm over the treeplex encoding the strategy spaces. Experimentally, we investigate the performance of three first-order methods (the excessive gap technique, mirror prox, and stochastic mirror prox) and compare their performance to the regret-based algorithms. In order to instantiate stochastic mirror prox, we develop a class of gradient sampling schemes for game trees. Equipped with our distance-generating function and sampling scheme, we find that mirror prox and the excessive gap technique outperform the prior regret-based methods for finding medium accuracy solutions.
Simultaneous Abstraction and Equilibrium Finding in Games
A key challenge in solving extensive-form games is dealing with large, or even infinite, action spaces. In games of imperfect information, the leading approach is to find a Nash equilibrium in a smaller abstract version of the game that includes only a few actions at each decision point, and then map the solution back to the original game. However, it is difficult to know which actions should be included in the abstraction without first solving the game, and it is infeasible to solve the game without first abstracting it. We introduce a method that combines abstraction with equilibrium finding by enabling actions to be added to the abstraction at run time. This allows an agent to begin learning with a coarse abstraction, and then to strategically insert actions at points that the strategy computed in the current abstraction deems important. The algorithm can quickly add actions to the abstraction while provably not having to restart the equilibrium finding. It enables anytime convergence to a Nash equilibrium of the full game even in infinite games. Experiments show it can outperform fixed abstractions at every stage of the run: early on it improves as quickly as equilibrium finding in coarse abstractions, and later it converges to a better solution than does equilibrium finding in fine-grained abstractions.
Discretization of Continuous Action Spaces in
Extensive-form games are a powerful tool for modeling a large range of multiagent scenarios. However, most solution algorithms require discrete, finite games. In contrast, many real-world domains require modeling with continuous action spaces. This is usually handled by heuristically discretizing the continuous action space without solution quality bounds. In this paper we address this issue. Leveraging recent results on abstraction solution quality, we develop the first framework for providing bounds on solution quality for discretization of continuous action spaces in extensive-form games. For games where the error is Lipschitz-continuous in the distance of a continuous point to its nearest discrete point, we show that a uniform discretization of the space is optimal. When the error is monotonically increasing in distance to nearest discrete point, we develop an integer program for finding the optimal discretization when the error is described by piecewise linear functions. This result can further be used to approximate optimal solutions to general monotonic error functions. Finally we discuss how our theory applies to several practical problems for which no solution quality bounds could be derived before.
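The intuition behind the uniform-discretization result can be illustrated in one dimension. The sketch below assumes a single continuous action in an interval [low, high] and an error bound proportional to the distance from a point to its nearest discrete point; under that assumption, minimizing the worst-case error reduces to minimizing the worst-case distance. The function names and the placement of points at cell midpoints are illustrative choices, not the paper's construction.

```python
def uniform_discretization(low, high, k):
    """Place k points at the midpoints of k equal-width cells of [low, high]."""
    width = (high - low) / k
    return [low + (i + 0.5) * width for i in range(k)]


def worst_case_distance(points, low, high):
    """Largest distance from any point in [low, high] to its nearest discrete point."""
    pts = sorted(points)
    # Candidates: the two interval ends, and the midpoint of each gap between neighbors.
    gaps = [pts[0] - low, high - pts[-1]]
    gaps += [(q - p) / 2.0 for p, q in zip(pts, pts[1:])]
    return max(gaps)


def lipschitz_error_bound(points, low, high, lipschitz_constant):
    """Worst-case error when the error is at most lipschitz_constant * distance."""
    return lipschitz_constant * worst_case_distance(points, low, high)
```

With k points on an interval of length (high - low), the midpoint placement yields a worst-case distance of (high - low) / (2k); a grid that also spends points on the interval endpoints does strictly worse for the same k.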