Bandit based Monte-Carlo Planning
- In: ECML-06. Number 4212 in LNCS, 2006
Cited by 111 (4 self)
For large state-space Markovian Decision Problems, Monte-Carlo planning is one of the few viable approaches to finding near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algorithm is shown to be consistent, and finite-sample bounds are derived on the estimation error due to sampling. Experimental results show that in several domains, UCT is significantly more efficient than its alternatives.
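The abstract above does not state the selection rule, so the following is only a minimal sketch of the kind of bandit rule it alludes to, assuming the standard UCB1 formula (empirical mean plus an exploration bonus). The function name `ucb1_select`, its arguments, and the default constant `c` are illustrative assumptions, not taken from the paper.

```python
import math


def ucb1_select(counts, means, c=math.sqrt(2)):
    """Pick an action index by UCB1 (assumed rule, not quoted from the paper).

    counts[a] is how often action a was tried, means[a] its average reward.
    """
    # Try every action once before the formula is applicable.
    for action, n in enumerate(counts):
        if n == 0:
            return action
    total = sum(counts)
    # Score = empirical mean + exploration bonus that shrinks with more visits.
    scores = [
        mean + c * math.sqrt(math.log(total) / n)
        for mean, n in zip(means, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

In a tree search, a rule of this kind would be applied at each visited node, treating the node's children as the arms of a bandit; rarely tried children receive a large bonus and are therefore revisited.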
Efficient selectivity and backup operators in Monte-Carlo tree search
- In: Proceedings Computers and Games 2006, 2006
Cited by 66 (2 self)
Monte-Carlo evaluation consists of estimating a position by averaging the outcome of several random continuations, and can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework for combining tree search with Monte-Carlo evaluation that does not separate a min-max phase from a Monte-Carlo phase. Instead of backing up the min-max value close to the root and the average value at some depth, a more general backup operator is defined that progressively changes from averaging to min-max as the number of simulations grows. This approach provides fine-grained control of tree growth, at the level of individual simulations, and allows efficient selectivity methods. The algorithm was implemented in a 9 × 9 Go-playing program, Crazy Stone, that won the 10th KGS computer-Go tournament.
Associating Domain-Dependent Knowledge and Monte Carlo Approaches within a Go Program
- In: Joint Conference on Information Sciences, 2003
Cited by 16 (4 self)
This paper underlines the association of two computer go approaches, a domain-dependent knowledge approach and Monte Carlo. First, the strengths and weaknesses of the two existing approaches are described.
Bayesian generation and integration of K-nearest-neighbor patterns for 19x19 go
- IEEE 2005 Symposium on Computational Intelligence in Games, 2005
Cited by 14 (4 self)
This paper describes the generation and use of a pattern database for 19x19 go with the K-nearest-neighbor representation. Patterns are generated by browsing recorded games of professional players, and their matching and playing probabilities are estimated at the same time. The database is then integrated into an existing go program, INDIGO, either as an opening book or as an enrichment of the pre-existing hand-crafted databases used by INDIGO's move generator. The improvement brought about by this pattern database is estimated at 15 points on average, which is significant by go standards.
Monte-Carlo Go Reinforcement Learning Experiments
- In IEEE 2006 Symposium on Computational Intelligence in Games, 2006
"... UFR de mathématiques et d’informatique ..."
Associating shallow and selective global tree search with monte carlo for 9x9 go
- In Proceedings of the 4th Computer and Games Conference (CG04), 2004
Cited by 10 (2 self)
This paper explores the association of shallow and selective global tree search with Monte Carlo in 9x9 go. This exploration is based on Olga and Indigo, two experimental Monte Carlo programs. We provide a min-max algorithm that iteratively deepens the tree until one move at the root is proved to be superior to the others. At each iteration, random games are started at leaf nodes to compute mean values. The progressive pruning rule and the min-max rule are applied to non-terminal nodes. We set up experiments demonstrating the relevance of this approach. Indigo used this algorithm at the 8th Computer Olympiad held in Graz.
Combining tactical search and monte-carlo in the game of go
- In: CIG’05, 2005
Cited by 6 (1 self)
We present a way to integrate search and Monte-Carlo methods in the game of Go. Our program uses search to find the status of tactical goals, builds groups, selects interesting goals, and computes statistics on the realization of tactical goals during the random games. The mean score of the random games where a selected tactical goal has been reached and the mean score of the random games where it has failed are computed. They are used to evaluate the selected goals. Experimental results attest that combining search and Monte-Carlo significantly improves the playing level.
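The goal evaluation described in the abstract above (one mean score over the random games in which a goal was reached, another over the games in which it failed) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `goal_means` and the `(goal_reached, score)` input format are hypothetical and not taken from the paper.

```python
def goal_means(games):
    """Return (mean score when the goal was reached, mean score when it failed).

    `games` is an iterable of (goal_reached: bool, score: float) pairs, one per
    random game. Each mean is None when no game falls into that group.
    """
    games = list(games)  # allow generators; the data is traversed twice
    reached = [score for ok, score in games if ok]
    failed = [score for ok, score in games if not ok]

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(reached), mean(failed)
```

For example, `goal_means([(True, 10.0), (True, 6.0), (False, -4.0)])` returns `(8.0, -4.0)`; a large gap between the two means suggests that reaching the goal matters for the outcome of the game.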
Monte Carlo Planning in RTS Games
- IEEE Symposium on Computational Intelligence and Games (CIG), 2005
Cited by 6 (0 self)
Monte Carlo simulations have been successfully used in classic turn-based games such as backgammon, bridge, poker, and Scrabble. In this paper, we apply the ideas to the problem of planning in games with imperfect information, stochasticity, and simultaneous moves. The domain we consider is real-time strategy games. We present a framework, MCPlan, for Monte Carlo planning, identify its performance parameters, and analyze the results of an implementation in a capture-the-flag game.
Monte-Carlo Tree Search in Production Management Problems
Cited by 4 (2 self)
Classical search algorithms rely on the existence of a sufficiently powerful evaluation function for non-terminal states. In many task domains, the development of such an evaluation function requires substantial effort and domain knowledge, or is not even possible. As an alternative, Monte-Carlo evaluation has been successfully applied in such task domains in recent years. In this paper, we apply a search algorithm based on Monte-Carlo evaluation, Monte-Carlo Tree Search, in the task domain of production management problems. These can be defined as single-agent problems which consist of selecting a sequence of actions with side effects, leading to high quantities of one or more goal products. They are challenging and can be constructed with highly variable difficulty. Earlier research yielded an offline learning algorithm that leads to good solutions, but requires a long time to run. We show that Monte-Carlo Tree Search leads to a solution in a shorter period of time than this algorithm, with improved solutions for large problems. Our findings can be generalized to other task domains.
Move pruning techniques for Monte-Carlo Go
- In Advances in Computer Games 11, 2005
Cited by 2 (1 self)
Progressive Pruning (PP) is used in the Monte-Carlo go-playing program Indigo. For each candidate move, PP launches random games starting with this move. PP gathers statistics on moves, and it prunes moves statistically inferior to the best one [5]. This paper presents two new pruning techniques: Miai Pruning (MP) and Set Pruning (SP). In MP the second move of the random games is selected at random among the set of candidate moves. SP consists of gathering statistics about two sets of moves, GOOD and BAD, and it prunes the latter when statistically inferior to the former. Both enhancements clearly speed up the process on 9 × 9 boards, and MP slightly improves the playing level. Scaling up MP to 19 × 19 boards results in a 30% speed-up and a four-point improvement on average.

