Results 1–10 of 19
Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold’em poker
In AAAI’07, 2007
Abstract

Cited by 47 (14 self)
We present a new abstraction algorithm for sequential imperfect information games. While most prior abstraction algorithms employ a myopic expected-value computation as a similarity metric, our algorithm considers a higher-dimensional space consisting of histograms over abstracted classes of states from later stages of the game. This enables our bottom-up abstraction algorithm to automatically take into account potential: a hand can become relatively better (or worse) over time, and the strength of different hands can get resolved earlier or later in the game. We further improve the abstraction quality by making multiple passes over the abstraction, enabling the algorithm to narrow the scope of analysis to information that is relevant given abstraction decisions made for earlier parts of the game. We also present a custom indexing scheme based on suit isomorphisms that enables one to work on significantly larger models than before. We apply the techniques to heads-up limit Texas Hold’em poker. Whereas all prior game-theory-based work for Texas Hold’em poker used generic off-the-shelf linear program solvers for the equilibrium analysis of the abstracted game, we make use of a recently developed algorithm based on the excessive gap technique from convex optimization. This paper is, to our knowledge, the first to abstract and game-theoretically analyze all four betting rounds in one run (rather than splitting the game into phases). The resulting player, GS3, beats BluffBot, GS2, Hyperborean, MonashBPP, Sparbot, Teddy, and Vexbot, each with statistical significance. To our knowledge, those competitors are the best prior programs for the game.
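The histogram-based similarity idea can be illustrated with a small sketch (an illustration of the general technique, not the paper's algorithm; all names and parameters here are hypothetical): hands whose distributions over next-round abstract classes are close are merged into the same bucket, here via a plain k-means over histograms with L2 distance.

```python
import random

def l2(h1, h2):
    """Euclidean distance between two equal-length histograms."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2)) ** 0.5

def kmeans_buckets(histograms, k, iters=20, seed=0):
    """Group hands whose future-strength histograms are similar.

    `histograms` maps a hand id to its histogram over next-round
    abstract classes; returns {hand_id: bucket_index}.
    """
    rng = random.Random(seed)
    hands = list(histograms)
    centers = [histograms[h] for h in rng.sample(hands, k)]
    assign = {}
    for _ in range(iters):
        # Assignment step: each hand joins its nearest center.
        assign = {h: min(range(k), key=lambda c: l2(histograms[h], centers[c]))
                  for h in hands}
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [histograms[h] for h in hands if assign[h] == c]
            if members:
                dim = len(members[0])
                centers[c] = [sum(m[i] for m in members) / len(members)
                              for i in range(dim)]
    return assign
```

The potential-aware point is in the input, not the clustering: because the features are whole histograms over future abstract classes rather than a single expected value, two hands with equal current strength but different upside end up in different buckets.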
Using Counterfactual Regret Minimization to Create Competitive Multiplayer Poker Agents
, 2010
Abstract

Cited by 18 (5 self)
Games are used to evaluate and advance Multiagent and Artificial Intelligence techniques. Most of these games are deterministic with perfect information (e.g. Chess and Checkers). A deterministic game has no chance element, and in a perfect information game all information is visible to all players. However, many real-world scenarios with competing agents are stochastic (non-deterministic) with imperfect information. For two-player zero-sum perfect recall games, a recent technique called Counterfactual Regret Minimization (CFR) computes strategies that are provably convergent to an ε-Nash equilibrium. A Nash equilibrium strategy is useful in two-player games since it maximizes its utility against a worst-case opponent. For multiplayer (three or more player) games, however, we lose all theoretical guarantees for CFR. Nevertheless, we believe that CFR-generated ...
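The strategy-update rule at the core of CFR, regret matching, is compact enough to sketch (a generic textbook form, not the authors' implementation): each action is played in proportion to its positive cumulative counterfactual regret.

```python
def regret_matching(cum_regret):
    """Current strategy from cumulative counterfactual regrets.

    Each action's probability is proportional to its positive
    cumulative regret; if no action has positive regret, play
    uniformly at random.
    """
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cum_regret)
    return [1.0 / n] * n
```

In full CFR this rule is applied independently at every information set, and the *average* strategy over all iterations, not the final one, is what converges to the ε-Nash equilibrium.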
Computing approximate Nash equilibria and robust best-responses using sampling
In J. Artif. Intell. Res. (JAIR)
Abstract

Cited by 9 (3 self)
This article discusses two contributions to decision-making in complex partially observable stochastic games. First, we apply two state-of-the-art search techniques that use Monte-Carlo sampling to the task of approximating a Nash equilibrium (NE) in such games, namely Monte-Carlo Tree Search (MCTS) and Monte-Carlo Counterfactual Regret Minimization (MCCFR). MCTS has been proven to approximate a NE in perfect-information games. We show that the algorithm quickly finds a reasonably strong strategy (but not a NE) in a complex imperfect information game, i.e. Poker. MCCFR, on the other hand, has theoretical NE convergence guarantees in such a game. We apply MCCFR for the first time in Poker. Based on our experiments, we may conclude that MCTS is a valid approach if one wants to learn reasonably strong strategies fast, whereas MCCFR is the better choice if the quality of the strategy is most important. Our second contribution relates to the observation that a NE is not a best response against players that are not playing a NE. We present Monte-Carlo Restricted Nash Response (MCRNR), a sample-based algorithm for the computation of restricted Nash strategies. These are robust best-response strategies that (1) exploit non-NE opponents more than playing a NE and (2) are not (overly) exploitable by other strategies. We combine the advantages of two state-of-the-art algorithms, i.e. MCCFR and Restricted Nash Response (RNR). MCRNR samples only relevant parts of the game tree. We show that MCRNR learns quicker than standard RNR in smaller games. We also show in Poker that MCRNR learns robust best-response strategies fast, and that these strategies exploit opponents more than playing a NE does.
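As a minimal illustration of the MCTS side of this comparison, the standard UCB1 child-selection rule used in UCT can be sketched as follows (a generic textbook form, not the article's exact variant):

```python
import math

def ucb1_select(parent_visits, children, c=1.4142):
    """Pick the index of the child maximizing the UCB1 score.

    `children` is a list of (total_value, visit_count) pairs for
    one node; unvisited children are expanded before any scoring.
    """
    best, best_score = None, float("-inf")
    for i, (value, visits) in enumerate(children):
        if visits == 0:
            return i  # always try unvisited children first
        # Exploitation term (mean value) plus exploration bonus.
        score = value / visits + c * math.sqrt(math.log(parent_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best
```

The exploration constant `c` trades off revisiting strong children against sampling rarely tried ones; in imperfect-information games like Poker this selection runs over a sampled determinization or an information-set tree rather than the true game state.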
Algorithms for abstracting and solving imperfect information games
, 2007
Abstract

Cited by 5 (1 self)
Game theory is the mathematical study of rational behavior in strategic environments. In many settings, most notably two-person zero-sum games, game theory provides particularly strong and appealing solution concepts. Furthermore, these solutions are efficiently computable in the complexity-theory sense. However, in most interesting potential applications in artificial intelligence, the solutions are difficult to compute using current techniques, due primarily to the extremely large state spaces of the environments. In this thesis, we propose new algorithms for tackling these computational difficulties. In one stream of research, we introduce automated abstraction algorithms for sequential games of imperfect information. These algorithms take as input a description of a game and produce a description of a strategically similar, but smaller, game as output. We present algorithms that are lossless (i.e., equilibrium-preserving), as well as algorithms that are lossy, but which can yield much smaller games while still retaining the most important features of the original game. In a second stream of research, we develop specialized optimization algorithms for finding ε-equilibria in sequential games of imperfect information. The algorithms are based on recent advances in non-smooth convex optimization (namely the excessive gap technique) and provide significant improvements ...
MCRNR: Fast Computing of Restricted Nash Responses by Means of Sampling
Abstract

Cited by 3 (0 self)
This paper presents a sample-based algorithm for the computation of restricted Nash strategies in complex extensive-form games. Recent work indicates that regret-minimization algorithms using selective sampling, such as Monte-Carlo Counterfactual Regret Minimization (MCCFR), converge faster to Nash equilibrium (NE) strategies than their non-sampled counterparts, which perform a full tree traversal. In this paper, we show that MCCFR is also able to establish NE strategies in the complex domain of Poker. Although such strategies are defensive (i.e. safe to play), they are oblivious to opponent mistakes. We can thus achieve better performance by using (an estimation of) opponent strategies. The Restricted Nash Response (RNR) algorithm was proposed to learn robust counter-strategies given such knowledge. It solves a modified game, wherein it is assumed that opponents play according to a fixed strategy with a certain probability, or according to a regret-minimizing strategy otherwise. We improve the rate of convergence of the RNR algorithm using sampling. Our new algorithm, MCRNR, samples only relevant parts of the game tree. It is therefore able to converge faster to robust best-response strategies than RNR. We evaluate our algorithm on a variety of imperfect information games that are small enough to solve yet large enough to be strategically interesting, as well as on a large game, Texas Hold’em Poker.
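The modified game's opponent model admits a one-line sketch: with probability p the opponent follows the fixed model, otherwise the regret-minimizing strategy. (This per-action convex combination is an illustrative simplification; in RNR the mixing is realized by a chance node in the game tree, not by blending action probabilities directly.)

```python
def restricted_strategy(fixed, learned, p):
    """Blend a fixed opponent model with a regret-minimizing strategy.

    `fixed` and `learned` are action-probability vectors over the
    same action set; `p` is the weight placed on the fixed model.
    """
    return [p * f + (1.0 - p) * l for f, l in zip(fixed, learned)]
```

Sweeping p from 0 to 1 trades off safety against exploitation: p = 0 recovers the defensive NE-seeking strategy, while p = 1 computes a full (and potentially exploitable) best response to the model.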
ALGORITHMS FOR EVOLVING NO-LIMIT TEXAS HOLD’EM POKER PLAYING AGENTS
Abstract
Computers have difficulty learning how to play Texas Hold’em Poker. The game contains a high degree of stochasticity, hidden information, and opponents that are deliberately trying to misrepresent their current state. Poker has a much larger game space than classic parlour games such as Chess and Backgammon. Evolutionary methods have been shown to find relatively good results in large state spaces, and neural networks have been shown to be able to find solutions to nonlinear search problems. In this paper, we present several algorithms for teaching agents how to play No-Limit Texas Hold’em Poker using a hybrid method known as evolving neural networks. Furthermore, we adapt heuristics such as halls of fame and co-evolution to be able to handle populations of Poker agents, which can sometimes contain several hundred opponents, instead of a single opponent. Our agents were evaluated against several benchmark agents. Experimental results show that the overall best performance was obtained by an agent evolved from a single population (i.e., with no co-evolution) using a large hall of fame. These results demonstrate the effectiveness of our algorithms in creating competitive No-Limit Texas Hold’em Poker agents.
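A hall-of-fame evolutionary loop of the kind described can be sketched generically (a minimal illustration under assumed details; the function names, mutation scheme, and parameters are all hypothetical, not the paper's setup):

```python
import random

def evolve(fitness_vs, pop_size=20, generations=30, n_weights=8, seed=1):
    """Evolve weight vectors against a growing hall of fame.

    `fitness_vs(agent, hall)` scores an agent against the current
    hall-of-fame members. Each generation keeps the champion
    (elitism), archives a copy of it, and refills the population
    with mutated offspring of the champion.
    """
    rng = random.Random(seed)
    pop = [[rng.gauss(0, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    hall = [pop[0][:]]  # seed the hall of fame with an arbitrary agent
    for _ in range(generations):
        champion = max(pop, key=lambda a: fitness_vs(a, hall))
        hall.append(champion[:])  # archive this generation's best
        pop = [champion] + [
            [w + rng.gauss(0, 0.1) for w in champion]  # mutated offspring
            for _ in range(pop_size - 1)
        ]
    return champion, hall
```

The hall of fame is what keeps evaluation stable: scoring each candidate against an archive of past champions, rather than only the current generation, discourages the cyclic forgetting that plain self-play co-evolution is prone to.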
Monte-Carlo Tree Search in Poker using Expected Reward Distributions
Abstract
Poker-playing computer bots can be divided into two categories. There are the game-theoretic bots, which play according to a strategy that gives rise to a Nash equilibrium. These bots are impossible to beat, but they are also not able to exploit non-optimalities in their opponents. The other type of bot is the exploiting bot, which employs game tree search and opponent modeling techniques to discover and exploit weaknesses of ...
Depth, balancing, and limits of the Elo model
Abstract
Abstract—Much work has been devoted to the computational complexity of games. However, such measures are not necessarily relevant for estimating their complexity in human terms. Therefore, human-centered measures have been proposed, e.g. the depth. This paper discusses the depth of various games and extends it to a continuous measure. We provide new depth results and present tools (given-first-move, pie rule, size extension) for increasing it. We also use these measures for analyzing games and opening moves in Y, NoGo, and Killall-Go, and the effect of pie rules.
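One common operationalization of depth, sketched here under the classical logistic Elo model (a hedged illustration; the paper's continuous extension and exact definition may differ), counts how many players fit into a game's Elo span such that each beats the next with some fixed probability:

```python
import math

def elo_gap(p_win):
    """Elo difference at which the stronger player wins with
    probability p_win, under the logistic Elo model
    p = 1 / (1 + 10 ** (-d / 400))."""
    return 400.0 * math.log10(p_win / (1.0 - p_win))

def depth(elo_range, p_win=0.6):
    """Number of distinguishable skill classes in an Elo span:
    players stacked so each beats the next with probability p_win."""
    return elo_range / elo_gap(p_win)
```

Making `depth` a ratio of real numbers, rather than an integer count of levels, is exactly the kind of continuous extension the abstract alludes to: small rule changes then move the measure smoothly instead of in unit jumps.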
CAN TEST DRIVEN DEVELOPMENT IMPROVE POKER ROBOT PERFORMANCE?
Athabasca University, 2008
Abstract
Is it possible to create a poker playing robot that can beat a professional human player? Researchers are attempting to answer this question with a yes. The game of poker is extremely popular, and it has caught the attention of the computer science community. We believe that some practical approaches to software development can potentially improve the performance of poker robots. Poker robot development can benefit from the application of Test Driven Development and performance testing. This essay investigates the current research that is available for poker robot development and Test Driven Development. To help support the potential benefits of combining testing with poker robot development, a simple poker robot is modified with Test Driven Development and performance testing techniques and tools. This programming exercise helps expand the research results by applying some of the theory to a hands-on situation. The results of this exercise provide evidence to support the use of Test Driven Development in improving poker robot performance. Poker robot developers may find Test Driven Development to be a very useful approach to designing their software. However, Test Driven Development is only one of many potential tools that can be used.
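A test-first step of the kind the essay describes might look like this in Python (a hypothetical example; the essay's actual robot and test suite are not shown, and `hand_category` is an illustrative stand-in for a real evaluator): the test encoding the hand-ranking requirement is written first, then a minimal classifier is implemented to make it pass.

```python
def hand_category(ranks):
    """Classify five card ranks by rank multiplicity only
    (a deliberately minimal evaluator: no flushes or straights)."""
    counts = sorted((ranks.count(r) for r in set(ranks)), reverse=True)
    if counts[0] == 3:
        return "three of a kind"
    if counts[0] == 2:
        return "two pair" if counts[1] == 2 else "pair"
    return "high card"

def test_pair_beats_high_card():
    """Requirement written before the implementation existed."""
    order = ["high card", "pair", "two pair", "three of a kind"]
    pair = hand_category([9, 9, 4, 7, 2])
    high = hand_category([13, 9, 4, 7, 2])
    assert order.index(pair) > order.index(high)
```

Under TDD the next failing test (say, for flushes) would drive the next extension of `hand_category`, and a performance test over many random deals would guard the evaluator's speed the same way.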