Comparing the utility of state features in spoken dialogue using reinforcement learning (2006)

by J Tetreault, D Litman
Venue: In NAACL

Results 1 - 3 of 3

Reinforcement Learning-based Feature Selection For Developing Pedagogically Effective Tutorial Dialogue Tactics

by Min Chi
Cited by 7 (0 self)
Abstract not found

Comparing linguistic features for modeling learning in computer dialogue tutoring

by Kate Forbes-Riley, Diane Litman, Amruta Purandare, Mihai Rotaru, Joel Tetreault - In Proceedings of the AIED Conference, 2007
Cited by 6 (5 self)
Abstract. We compare the relative utility of different automatically computable ...

Estimating the reliability of MDP policies: a confidence interval approach

by Joel R. Tetreault - In NAACL, 2007
Cited by 2 (1 self)
Past approaches for using reinforcement learning to derive dialog control policies have assumed that there was enough collected data to derive a reliable policy. In this paper we present a methodology for numerically constructing confidence intervals for the expected cumulative reward for a learned policy. These intervals are used to (1) better assess the reliability of the expected cumulative reward, and (2) perform a refined comparison between policies derived from different Markov Decision Process (MDP) models. We applied this methodology to a prior experiment where the goal was to select the best features to include in the MDP state space. Our results show that while some of the policies developed in the prior work exhibited very large confidence intervals, the policy developed from the best feature set had a much smaller confidence interval and thus showed very high reliability.