Partially observable Markov decision processes with continuous observations for dialogue management
- Computer Speech and Language
, 2005
Cited by 79 (24 self)
This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
Effects of the User Model on Simulation-Based Learning of Dialogue Strategies
- In Proc. of ASRU
, 2005
Cited by 22 (6 self)
Over the past decade, a variety of user models have been proposed for user simulation-based reinforcement-learning of dialogue strategies. However, the strategies learned with these models are rarely evaluated in actual user trials and it remains unclear how the choice of user model affects the quality of the learned strategy. In particular, the degree to which strategies learned with a user model generalise to real user populations has not been investigated. This paper presents a series of experiments that qualitatively and quantitatively examine the effect of the user model on the learned strategy. Our results show that the performance and characteristics of the strategy are in fact highly dependent on the user model. Furthermore, a policy trained with a poor user model may appear to perform well when tested with the same model, but fail when tested with a more sophisticated user model. This raises significant doubts about the current practice of learning and evaluating strategies with the same user model. The paper further investigates a new technique for testing and comparing strategies directly on real human-machine dialogues, thereby avoiding any evaluation bias introduced by the user model.
User simulation for spoken dialogue systems: Learning and evaluation
- in Interspeech/ICSLP
, 2006
Cited by 19 (8 self)
We propose “advanced” n-grams as a new technique for simulating user behaviour in spoken dialogue systems, and we compare it with two methods used in our prior work, i.e. linear feature combination and “normal” n-grams. All methods operate on the intention level and can incorporate speech recognition and understanding errors. In the linear feature combination model, user actions (lists of 〈speech act, task〉 pairs) are selected based on features of the current dialogue state, which encodes the whole history of the dialogue. The user simulation based on “normal” n-grams treats a dialogue as a sequence of lists of 〈speech act, task〉 pairs. Here the length of the history considered is restricted by the order of the n-gram. The “advanced” n-grams are a variation of the normal n-grams, where user actions are conditioned not only on speech acts and tasks but also on the current status of the tasks, i.e. whether
The Hidden Information State model: A practical framework for
, 2009
Cited by 16 (7 self)
Computer Speech and Language xxx (2009) xxx–xxx, www.elsevier.com/locate/csl
Machine learning for spoken dialogue systems
- In Proceedings of the European Conference on Speech Communication and Technologies (Interspeech’07), Anvers
, 2007
Cited by 15 (6 self)
During the last decade, research in the field of Spoken Dialogue Systems (SDS) has experienced increasing growth. However, the design and optimization of SDS is not only about combining speech and language processing systems such as Automatic Speech Recognition (ASR), parsers, Natural Language Generation (NLG), and Text-to-Speech (TTS) synthesis systems. It also requires the development of dialogue strategies that take into account at least the performance of these subsystems (and others), the nature of the task (e.g. form filling, tutoring, robot control, or database search/browsing), and the user’s behaviour (e.g. cooperativeness, expertise). Due to the great variability of these factors, reuse of previous hand-crafted designs is also very difficult. For these reasons, statistical machine learning (ML) methods applied to automatic SDS optimization have been a leading research area for the last few years. In this paper, we provide a short review of the field and of recent advances.
Comparing User Simulation Models for Dialog Strategy Learning
- In Proc. of NAACL-HLT
, 2007
Cited by 10 (3 self)
This paper explores what kind of user simulation model is suitable for developing a training corpus for using Markov Decision Processes (MDPs) to automatically learn dialog strategies. Our results suggest that with sparse training data, a model that aims to randomly explore more dialog state spaces with certain constraints actually performs as well as or better than a more complex model that simulates realistic user behaviors in a statistical way.
Comparing real-real, simulated-simulated, and simulated-real spoken dialogue corpora
- In Proc. AAAI Workshop on Statistical and Empirical Approaches for SDS
, 2006
Cited by 8 (2 self)
User simulation is used to generate large corpora for using reinforcement learning to automatically learn the best policy for spoken dialogue systems. Although this approach is becoming increasingly popular, the differences between simulated and real corpora are not well studied. We build two simulation models to interact with an intelligent tutoring system. Both models are trained on two different real corpora separately. We use several evaluation measures proposed in previous research to compare between our two simulated corpora, between the original two real corpora, and between the simulated and real corpora. We next examine the differentiating power of these measures. Our results show that although these simple statistical measures can distinguish real corpora from simulated ones, these measures cannot help us to draw a conclusion on the “reality” of the simulated corpora since even two real corpora can be very different when evaluated on the same measures.
Comparing Spoken Dialog Corpora Collected with Recruited Subjects versus Real Users
Cited by 7 (3 self)
Empirical spoken dialog research often involves the collection and analysis of a dialog corpus. However, it is not well understood whether and how a corpus of dialogs collected using recruited subjects differs from a corpus of dialogs obtained from real users. In this paper we use Let’s Go Lab, a platform for experimenting with a deployed spoken dialog bus information system, to address this question. Our first corpus is collected by recruiting subjects to call Let’s Go in a standard laboratory setting, while our second corpus consists of calls from real users calling Let’s Go during its operating hours. We quantitatively characterize the two collected corpora using previously proposed measures from the spoken dialog literature, then discuss the statistically significant similarities and differences between the two corpora with respect to these measures. For example, we find that recruited subjects talk more and speak faster, while real users ask for more help and more frequently interrupt the system. In contrast, we find no difference with respect to dialog structure.
Knowledge Consistent User Simulations for Dialog Systems
- In Proc. of Interspeech
, 2007
Cited by 6 (2 self)
We propose a novel model to simulate user knowledge consistency in tutoring dialogs, where no clear user goal can be defined. We also propose a new evaluation measure of knowledge consistency based on learning curves. We compare our new simulation model to real users as well as to a previously used simulation model. We show that the new model performs similarly to the real students and to the previous model when evaluated on high-level dialog features. The new model outperforms the previous model when measured on knowledge consistency. Index Terms: spoken dialog, user simulation, evaluation measures, knowledge consistency

