Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs
In Proc. of Int. Joint Conference on Autonomous Agents and Multi Agent Systems, 2004
Cited by 73 (1 self)
Partially observable decentralized decision making in robot teams is fundamentally different from decision making in fully observable problems. Team members cannot simply apply single-agent solution techniques in parallel. Instead, we must turn to game-theoretic frameworks to correctly model the problem. While partially observable stochastic games (POSGs) provide a solution model for decentralized robot teams, this model quickly becomes intractable. We propose an algorithm that approximates POSGs as a series of smaller, related Bayesian games, using heuristics such as QMDP to provide the future discounted value of actions. This algorithm trades off limited lookahead in uncertainty for computational feasibility, and results in policies that are locally optimal with respect to the selected heuristic. Empirical results are provided for both a simple problem for which the full POSG can also be constructed, as well as more complex, robot-inspired, problems.
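The QMDP heuristic named in this abstract can be illustrated with a minimal single-agent sketch. It assumes the usual textbook formulation: solve the fully observable MDP for Q(s, a), then score each action under a belief b as the belief-weighted average of Q(s, a). The toy model, the function names, and the parameter values below are hypothetical, and the sketch does not reproduce the paper's Bayesian-game construction.

```python
def value_iteration(transitions, rewards, gamma, iterations=200):
    """Compute Q(s, a) for the fully observable MDP.

    transitions[s][a] is a list of (next_state, probability) pairs and
    rewards[s][a] is the immediate reward (both hypothetical toy inputs).
    """
    n_states = len(rewards)
    n_actions = len(rewards[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(iterations):
        v = [max(row) for row in q]  # greedy state values from the current Q
        q = [
            [
                rewards[s][a] + gamma * sum(p * v[s2] for s2, p in transitions[s][a])
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
    return q


def qmdp_values(belief, q):
    """QMDP score of each action: sum over states of b(s) * Q(s, a)."""
    n_actions = len(q[0])
    return [sum(b * q[s][a] for s, b in enumerate(belief)) for a in range(n_actions)]


# Toy model: two absorbing states; action i pays 1 only in state i.
rewards = [[1.0, 0.0], [0.0, 1.0]]
transitions = [[[(0, 1.0)], [(0, 1.0)]], [[(1, 1.0)], [(1, 1.0)]]]
q = value_iteration(transitions, rewards, gamma=0.5)

certain = qmdp_values([1.0, 0.0], q)    # agent knows it is in state 0
uncertain = qmdp_values([0.5, 0.5], q)  # agent is maximally unsure
```

Because the score assumes the state becomes fully observable after one step, it never values information gathering. This is consistent with the abstract's remark that the algorithm trades limited lookahead in uncertainty for computational feasibility.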
A competitive Texas Hold’em poker player via automated abstraction and real-time equilibrium computation
In Proceedings of the National Conference on Artificial Intelligence (AAAI), 2006
Cited by 45 (15 self)
We present a game theory-based heads-up Texas Hold’em poker player, GS1. To overcome the computational obstacles stemming from Texas Hold’em’s gigantic game tree, the player employs our automated abstraction techniques to reduce the complexity of the strategy computations. Texas Hold’em consists of four betting rounds. Our player solves a large linear program (offline) to compute strategies for the abstracted first and second rounds. After the second betting round, our player updates the probability of each possible hand based on the observed betting actions in the first two rounds as well as the revealed cards. Using these updated probabilities, our player computes in real-time an equilibrium approximation for the last two abstracted rounds. We demonstrate that our player, which incorporates very little poker-specific knowledge, is competitive with leading poker-playing programs which incorporate extensive domain knowledge, as well as with advanced human players.
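The probability update mentioned in this abstract can be read as an application of Bayes' rule. The sketch below is an illustration under that assumption, not GS1's actual code. The hand categories, the prior, and the action likelihoods are made-up numbers, and the abstract does not say where such likelihoods come from.

```python
def update_hand_beliefs(prior, action_likelihood):
    """Bayes' rule: posterior(h) is proportional to prior(h) * P(observed actions | h).

    prior and action_likelihood are dicts keyed by hand category (hypothetical).
    """
    unnormalized = {hand: prior[hand] * action_likelihood[hand] for hand in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed actions have zero probability under every hand")
    return {hand: weight / total for hand, weight in unnormalized.items()}


# Hypothetical numbers: belief about the opponent's hand before any betting,
# and the chance that each kind of hand would have raised as observed.
prior = {"strong": 0.2, "medium": 0.3, "weak": 0.5}
raise_likelihood = {"strong": 0.9, "medium": 0.5, "weak": 0.1}

posterior = update_hand_beliefs(prior, raise_likelihood)
```

With these numbers, an observed raise shifts probability toward the strong category and away from the weak one. The posterior is the kind of input a later equilibrium computation would start from.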
Gradient-based algorithms for finding Nash equilibria in extensive form games
In Proceedings of the Eighteenth International Conference on Game Theory, 2007
Cited by 32 (13 self)
We present a computational approach to the saddle-point formulation for the Nash equilibria of two-person, zero-sum sequential games of imperfect information. The algorithm is a first-order gradient method based on modern smoothing techniques for nonsmooth convex optimization. The algorithm requires O(1/ɛ) iterations to compute an ɛ-equilibrium, and the work per iteration is extremely low. These features enable us to find approximate Nash equilibria for sequential games with a tree representation of about 10^10 nodes. This is three orders of magnitude larger than what previous algorithms can handle. We present two heuristic improvements to the basic algorithm and demonstrate their efficacy on a range of real-world games. Furthermore, we demonstrate how the algorithm can be customized to a specific class of problems with enormous memory savings.
Finding equilibria in large sequential games of imperfect information
In ACM Conference on Electronic Commerce, 2006
Smoothing techniques for computing Nash equilibria of sequential games
Cited by 25 (7 self)
We develop first-order smoothing techniques for saddle-point problems that arise in the Nash equilibria computation of sequential games. The crux of our work is a construction of suitable prox-functions for a certain class of polytopes that encode the sequential nature of the games. An implementation based on our smoothing techniques computes approximate Nash equilibria for games that are four orders of magnitude larger than what conventional computational approaches can handle.
Better automated abstraction techniques for imperfect information games, with application to Texas Hold’em poker
In International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2007
Cited by 24 (8 self)
We present new approximation methods for computing game-theoretic strategies for sequential games of imperfect information. At a high level, we contribute two new ideas. First, we introduce a new state-space abstraction algorithm. In each round of the game, there is a limit to the number of strategically different situations that an equilibrium-finding algorithm can handle. Given this constraint, we use clustering to discover similar positions, and we compute the abstraction via an integer program that minimizes the expected error at each stage of the game. Second, we present a method for computing the leaf payoffs for a truncated version of the game by simulating the actions in the remaining portion of the game. This allows the equilibrium-finding algorithm to take into account the entire game tree while having to explicitly solve only a truncated version. Experiments show that each of our two new techniques improves performance dramatically in Texas Hold’em poker. The techniques lead to a drastic improvement over prior approaches for automatically generating agents, and our agent plays competitively even against the best agents overall.
Lossless abstraction of imperfect information games
Journal of the ACM, 2007
Cited by 21 (9 self)
Finding an equilibrium of an extensive form game of imperfect information is a fundamental problem in computational game theory, but current techniques do not scale to large games. To address this, we introduce the ordered game isomorphism and the related ordered game isomorphic abstraction transformation. For a multiplayer sequential game of imperfect information with observable actions and an ordered signal space, we prove that any Nash equilibrium in an abstracted smaller game, obtained by one or more applications of the transformation, can be easily converted into a Nash equilibrium in the original game. We present an algorithm, GameShrink, for abstracting the game using our isomorphism exhaustively. Its complexity is Õ(n^2), where n is the number of nodes in a structure we call the signal tree. It is no larger than the game tree, and on nontrivial games it is drastically smaller, so GameShrink has time and space complexity sublinear in the size of the game tree. Using GameShrink, we find an equilibrium to a poker game with 3.1 billion nodes, over four orders of magnitude more than in the largest poker game solved previously. To address even larger games, we introduce approximation methods that do not preserve equilibrium, but nevertheless yield (ex post) provably close-to-optimal strategies.
Expectation-Based Versus Potential-Aware Automated Abstraction in Imperfect Information Games: An Experimental Comparison Using Poker
2008
Cited by 10 (4 self)
Automated abstraction algorithms for sequential imperfect information games have recently emerged as a key component in developing competitive game theory-based agents. The existing literature has not investigated the relative performance of different abstraction algorithms. Instead, agents whose construction has used automated abstraction have only been compared under confounding effects: different granularities of abstraction and equilibrium-finding algorithms that yield different accuracies when solving the abstracted game. This paper provides the first systematic evaluation of abstraction algorithms. Two families of algorithms have been proposed. The distinguishing feature is the measure used to evaluate the strategic similarity between game states. One algorithm uses the probability of winning as the similarity measure. The other uses a potential-aware similarity measure based on probability distributions over future states. We conduct experiments on Rhode Island Hold’em poker. We compare the algorithms against each other, against optimal play, and against each agent’s nemesis. We also compare them based on the resulting game’s value. Interestingly, for very coarse abstractions the expectation-based algorithm is better, but for moderately coarse and fine abstractions the potential-aware approach is superior. Furthermore, agents constructed using the expectation-based approach are highly exploitable beyond what their performance against the game’s optimal strategy would suggest.
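The difference between the two similarity measures can be shown with a small sketch. It assumes each hand is summarized by a probability distribution over a few future buckets, and that the potential-aware measure is a Euclidean distance between those distributions. The abstract only says the measure is based on distributions over future states, so the bucket layout, the distance metric, and all numbers here are assumptions.

```python
import math


def expected_win_probability(future_distribution, bucket_win_probability):
    """Collapse a distribution over future buckets into one expected win probability."""
    return sum(p * w for p, w in zip(future_distribution, bucket_win_probability))


def expectation_based_distance(hand_a, hand_b, bucket_win_probability):
    """Compare two hands by their expected win probabilities only."""
    return abs(
        expected_win_probability(hand_a, bucket_win_probability)
        - expected_win_probability(hand_b, bucket_win_probability)
    )


def potential_aware_distance(hand_a, hand_b):
    """Compare two hands by their full distributions (Euclidean distance, an assumption)."""
    return math.sqrt(sum((pa - pb) ** 2 for pa, pb in zip(hand_a, hand_b)))


# Three hypothetical future buckets with win probabilities 0, 0.5 and 1.
bucket_win_probability = [0.0, 0.5, 1.0]
made_hand = [0.0, 1.0, 0.0]     # always ends up mediocre
drawing_hand = [0.5, 0.0, 0.5]  # ends up either very weak or very strong

d_expectation = expectation_based_distance(made_hand, drawing_hand, bucket_win_probability)
d_potential = potential_aware_distance(made_hand, drawing_hand)
```

Both hands have an expected win probability of 0.5, so the expectation-based measure treats them as identical. The distribution-based measure keeps them apart because they develop very differently.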
A near-optimal strategy for a heads-up no-limit Texas Hold’em poker tournament
In International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2007
Cited by 9 (1 self)
We analyze a heads-up no-limit Texas Hold’em poker tournament with a fixed small blind of 300 chips, a fixed big blind of 600 chips and a total amount of 8000 chips on the table (until recently, these parameters defined the heads-up endgame of sit-n-go tournaments on the popular PartyPoker.com online poker site). Due to the size of this game, a computation of an optimal (i.e. minimax) strategy for the game is completely infeasible. However, combining an algorithm due to Koller, Megiddo and von Stengel with concepts of Everett and suggestions of Sklansky, we compute an optimal jam/fold strategy, i.e. a strategy that would be optimal if any bet made by the player playing by the strategy (but not bets of his opponent) had to be his entire stack. Our computations establish that the computed strategy is near-optimal for the unrestricted tournament (i.e., with post-flop play being allowed) in the rigorous sense that a player playing by the computed strategy will win the tournament with a probability within 1.4 percentage points of the probability that an optimal strategy (allowing post-flop play) would give.