Results 1 - 3 of 3
Comparing linguistic features for modeling learning in computer dialogue tutoring
- In Proceedings of the AIED Conference, 2007
Cited by 6 (5 self)
Abstract. We compare the relative utility of different automatically computable
Estimating the reliability of MDP policies: a confidence interval approach
- In NAACL, 2007
Cited by 2 (1 self)
Past approaches for using reinforcement learning to derive dialog control policies have assumed that there was enough collected data to derive a reliable policy. In this paper we present a methodology for numerically constructing confidence intervals for the expected cumulative reward for a learned policy. These intervals are used to (1) better assess the reliability of the expected cumulative reward, and (2) perform a refined comparison between policies derived from different Markov Decision Process (MDP) models. We applied this methodology to a prior experiment where the goal was to select the best features to include in the MDP state space. Our results show that while some of the policies developed in the prior work exhibited very large confidence intervals, the policy developed from the best feature set had a much smaller confidence interval and thus showed very high reliability.

