Results 1-4 of 4
Bandit based Monte-Carlo Planning
In: ECML-06. Number 4212 in LNCS, 2006
Cited by 217 (6 self)
For large state-space Markovian Decision Problems, Monte-Carlo planning is one of the few viable approaches to finding near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs the algorithm is shown to be consistent, and finite sample bounds are derived on the estimation error due to sampling. Experimental results show that in several domains, UCT is significantly more efficient than its alternatives.
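As a rough illustration of the "bandit ideas" the abstract refers to, the sketch below shows a UCB1-style selection rule of the kind used to choose which action to sample next at a node. The function name `ucb1_select`, its arguments, and the exploration constant `c` are assumptions made for this example; the abstract itself does not give the exact form or constants used in UCT.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Pick the index of the action to sample next (UCB1-style rule).

    counts[i] -- how many times action i has been sampled so far
    values[i] -- mean reward observed for action i
    c         -- exploration constant (an assumption, not from the abstract)
    """
    # Sample every untried action once before comparing bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i

    total = sum(counts)
    # Upper confidence bound: empirical mean plus an exploration bonus
    # that shrinks as an action is sampled more often.
    scores = [
        values[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)
```

With equal sample counts the rule picks the action with the higher mean; a rarely sampled action can still win through its larger exploration bonus.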
Fast Reachability Analysis for Uncertain SSPs
2005
Cited by 2 (1 self)
Stochastic Shortest Path problems (SSPs) can be handled efficiently by the Real-Time Dynamic Programming algorithm (RTDP). Yet RTDP requires that a goal state is always reachable, which can be checked easily for a certain SSP but requires a more complex algorithm for an uncertain SSP, i.e. one where only a possible interval is known for each transition probability. This paper gives a simplified description of these two processes and demonstrates how the time-consuming uncertain analysis can be sped up dramatically. The main improvement still needed is to turn to a symbolic analysis in order to avoid a complete state-space enumeration.
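For the simpler of the two cases (a certain SSP, where all transition probabilities are known), one plausible reading of the reachability check is a backward graph search from the goal states. The sketch below is only that reading, not the paper's algorithm: the function name `states_reaching_goal` and the nested-dictionary model format are assumptions for illustration, and the interval-based (uncertain) case is not covered.

```python
from collections import deque


def states_reaching_goal(transitions, goals):
    """Return the states from which some goal is reachable.

    transitions -- dict: state -> action -> next_state -> probability
                   (hypothetical model format chosen for this sketch)
    goals       -- iterable of goal states
    """
    # Build predecessor sets, keeping only positive-probability edges.
    preds = {}
    for state, actions in transitions.items():
        for dist in actions.values():
            for nxt, prob in dist.items():
                if prob > 0:
                    preds.setdefault(nxt, set()).add(state)

    # Breadth-first search backwards from the goal states.
    reached = set(goals)
    queue = deque(reached)
    while queue:
        current = queue.popleft()
        for pred in preds.get(current, ()):
            if pred not in reached:
                reached.add(pred)
                queue.append(pred)
    return reached
```

Any state missing from the returned set can never reach a goal, so under this reading the reachability requirement would fail for a model containing such a state.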
Received (Day Month Year) Revised (Day Month Year)
Stochastic Shortest Path problems (SSPs) can be handled efficiently by the Real-Time Dynamic Programming algorithm (RTDP). Yet RTDP requires that a goal state is always reachable. This article presents an algorithm that checks for goal reachability, especially in the complex case of an uncertain SSP where only a possible interval is known for each transition probability. This gives an analysis method for determining whether SSP algorithms such as RTDP are applicable, even if the exact model is not known. As this algorithm is time-consuming, we also present a simple process that often speeds it up dramatically. Yet the main improvement still needed is to turn to a symbolic analysis in order to avoid a complete state-space enumeration.
Fast Reachability Analysis for Uncertain SSPs
Olivier Buffet
Stochastic Shortest Path problems (SSPs) can be handled efficiently by the Real-Time Dynamic Programming algorithm (RTDP). Yet RTDP requires that a goal state is always reachable, which can be checked easily for a certain SSP but requires a more complex algorithm for an uncertain SSP, i.e. one where only a possible interval is known for each transition probability. This paper gives a simplified description of these two processes and demonstrates how the time-consuming uncertain analysis can be sped up dramatically. The main improvement still needed is to turn to a symbolic analysis in order to avoid a complete state-space enumeration.