Bandit based Monte-Carlo Planning
- In: ECML-06. Number 4212 in LNCS, 2006
- Cited by 111 (4 self)
For large state-space Markovian Decision Problems, Monte-Carlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algorithm is shown to be consistent and finite sample bounds are derived on the estimation error due to sampling. Experimental results show that in several domains, UCT is significantly more efficient than its alternatives.
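The abstract above does not spell out how bandit ideas guide the search. As a rough illustration only, the sketch below shows the standard UCB1 selection rule that bandit-based tree search typically applies at each node. The function name `ucb1_select`, its arguments, and the default exploration constant are assumptions made for this example and are not taken from the paper.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[i]: number of times action i has been tried at this node
    values[i]: mean return observed so far for action i
    c: exploration constant (assumed default, not from the abstract)
    """
    # Try every action at least once before applying the formula.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total = sum(counts)
    # Score = empirical mean + exploration bonus that shrinks as an
    # action is sampled more often.
    scores = [
        values[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With equal visit counts the rule picks the action with the higher mean; a rarely tried action can still win through its larger exploration bonus.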
Approximating Game-Theoretic Optimal Strategies for Full-scale Poker
- In International Joint Conference on Artificial Intelligence, 2003
- Cited by 104 (16 self)
The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed. Several abstraction techniques are combined to represent the game of 2-player Texas Hold'em, having size O(10^18), using closely related models each having size .
World-Championship-Caliber Scrabble
- Artificial Intelligence, 2002
- Cited by 38 (0 self)
Computer Scrabble programs have achieved a level of performance that exceeds that of the strongest human players. MAVEN was the first program to demonstrate this against human opposition. Scrabble is a game of imperfect information with a large branching factor. The techniques successfully applied in two-player games such as chess do not work here. MAVEN combines a selective move generator, simulations of likely game scenarios, and the B* algorithm to produce a world-championship-caliber Scrabble-playing program.
Gradient-based algorithms for finding Nash equilibria in extensive form games
- In Proceedings of the Eighteenth International Conference on Game Theory, 2007
- Cited by 27 (11 self)
We present a computational approach to the saddle-point formulation for the Nash equilibria of two-person, zero-sum sequential games of imperfect information. The algorithm is a first-order gradient method based on modern smoothing techniques for non-smooth convex optimization. The algorithm requires O(1/ɛ) iterations to compute an ɛ-equilibrium, and the work per iteration is extremely low. These features enable us to find approximate Nash equilibria for sequential games with a tree representation of about 10^10 nodes. This is three orders of magnitude larger than what previous algorithms can handle. We present two heuristic improvements to the basic algorithm and demonstrate their efficacy on a range of real-world games. Furthermore, we demonstrate how the algorithm can be customized to a specific class of problems with enormous memory savings.
Potential-aware automated abstraction of sequential games, and holistic equilibrium analysis of Texas Hold’em poker
- In AAAI’07, 2007
- Cited by 27 (9 self)
We present a new abstraction algorithm for sequential imperfect information games. While most prior abstraction algorithms employ a myopic expected-value computation as a similarity metric, our algorithm considers a higher-dimensional space consisting of histograms over abstracted classes of states from later stages of the game. This enables our bottom-up abstraction algorithm to automatically take into account potential: a hand can become relatively better (or worse) over time and the strength of different hands can get resolved earlier or later in the game. We further improve the abstraction quality by making multiple passes over the abstraction, enabling the algorithm to narrow the scope of analysis to information that is relevant given abstraction decisions made for earlier parts of the game. We also present a custom indexing scheme based on suit isomorphisms that enables one to work on significantly larger models than before. We apply the techniques to heads-up limit Texas Hold’em poker. Whereas all prior game theory-based work for Texas Hold’em poker used generic off-the-shelf linear program solvers for the equilibrium analysis of the abstracted game, we make use of a recently developed algorithm based on the excessive gap technique from convex optimization. This paper is, to our knowledge, the first to abstract and game-theoretically analyze all four betting rounds in one run (rather than splitting the game into phases). The resulting player, GS3, beats BluffBot, GS2, Hyperborean, Monash-BPP, Sparbot, Teddy, and Vexbot, each with statistical significance. To our knowledge, those competitors are the best prior programs for the game.
Finding equilibria in large sequential games of imperfect information
- In ACM Conference on Electronic Commerce, 2006
Bayes’ bluff: Opponent modelling in poker
- In Proceedings of the 21st Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2005
- Cited by 19 (1 self)
Poker is a challenging problem for artificial intelligence, with non-deterministic dynamics, partial observability, and the added difficulty of unknown adversaries. Modelling all of the uncertainties in this domain is not an easy task. In this paper we present a Bayesian probabilistic model for a broad class of poker games, separating the uncertainty in the game dynamics from the uncertainty of the opponent’s strategy. We then describe approaches to two key subproblems: (i) inferring a posterior over opponent strategies given a prior distribution and observations of their play, and (ii) playing an appropriate response to that distribution. We demonstrate the overall approach on a reduced version of poker using Dirichlet priors and then on the full game of Texas hold’em using a more informed prior. We demonstrate methods for playing effective responses to the opponent, based on the posterior.
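To make subproblem (i) above concrete, the sketch below shows the textbook Dirichlet-multinomial update for a single decision point: prior pseudo-counts are added to observed action counts, and the posterior mean gives the estimated action probabilities. This is a deliberately simplified illustration, not the model from the paper; the function name `dirichlet_posterior_mean` and the fold/call/raise example are assumptions.

```python
def dirichlet_posterior_mean(prior_counts, observed_counts):
    """Posterior mean of action probabilities under a Dirichlet prior.

    prior_counts: Dirichlet pseudo-counts, one per action (assumed prior)
    observed_counts: how often the opponent took each action
    """
    # Conjugacy: the posterior is Dirichlet(prior + observations).
    posterior = [a + n for a, n in zip(prior_counts, observed_counts)]
    total = sum(posterior)
    # The mean of a Dirichlet is each parameter divided by their sum.
    return [p / total for p in posterior]


# Hypothetical opponent at one decision point: actions (fold, call, raise),
# uniform prior, observed 0 folds, 3 calls, 6 raises.
estimate = dirichlet_posterior_mean([1, 1, 1], [0, 3, 6])
```

A response policy for subproblem (ii) would then be chosen against `estimate` or against samples from the posterior; that step is outside this sketch.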
Better automated abstraction techniques for imperfect information games, with application to Texas Hold’em poker
- In International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2007
- Cited by 17 (7 self)
We present new approximation methods for computing game-theoretic strategies for sequential games of imperfect information. At a high level, we contribute two new ideas. First, we introduce a new state-space abstraction algorithm. In each round of the game, there is a limit to the number of strategically different situations that an equilibrium-finding algorithm can handle. Given this constraint, we use clustering to discover similar positions, and we compute the abstraction via an integer program that minimizes the expected error at each stage of the game. Second, we present a method for computing the leaf payoffs for a truncated version of the game by simulating the actions in the remaining portion of the game. This allows the equilibrium-finding algorithm to take into account the entire game tree while having to explicitly solve only a truncated version. Experiments show that each of our two new techniques improves performance dramatically in Texas Hold’em poker. The techniques lead to a drastic improvement over prior approaches for automatically generating agents, and our agent plays competitively even against the best agents overall.
Associating Domain-Dependent Knowledge and Monte Carlo Approaches within a Go Program
- In: Joint Conference on Information Sciences, 2003
- Cited by 16 (4 self)
This paper underlines the association of two computer Go approaches, a domain-dependent knowledge approach and Monte Carlo. First, the strengths and weaknesses of the two existing approaches are related.
Lossless abstraction of imperfect information games
- Journal of the ACM, 2007
- Cited by 14 (7 self)
Finding an equilibrium of an extensive form game of imperfect information is a fundamental problem in computational game theory, but current techniques do not scale to large games. To address this, we introduce the ordered game isomorphism and the related ordered game isomorphic abstraction transformation. For a multi-player sequential game of imperfect information with observable actions and an ordered signal space, we prove that any Nash equilibrium in an abstracted smaller game, obtained by one or more applications of the transformation, can be easily converted into a Nash equilibrium in the original game. We present an algorithm, GameShrink, for abstracting the game using our isomorphism exhaustively. Its complexity is Õ(n^2), where n is the number of nodes in a structure we call the signal tree. It is no larger than the game tree, and on nontrivial games it is drastically smaller, so GameShrink has time and space complexity sublinear in the size of the game tree. Using GameShrink, we find an equilibrium to a poker game with 3.1 billion nodes, over four orders of magnitude more than in the largest poker game solved previously. To address even larger games, we introduce approximation methods that do not preserve equilibrium, but nevertheless yield (ex post) provably close-to-optimal strategies.

