Partially observable Markov decision processes with continuous observations for dialogue management
- Computer Speech and Language
, 2005
Cited by 79 (24 self)
This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
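To make the idea of an observation with a discrete and a continuous component concrete, here is a minimal sketch of a single belief update. It is not the model from the paper: the state names, the error rate `p_correct`, the assumption that the user's goal does not change between turns, and the linear confidence densities in `confidence_density` are all hypothetical choices made for illustration.

```python
def confidence_density(c, correct):
    """Hypothetical density of a confidence score c in [0, 1].

    Assumption: correct recognitions tend to score high (density 2c),
    incorrect ones tend to score low (density 2(1 - c)).
    """
    return 2.0 * c if correct else 2.0 * (1.0 - c)


def belief_update(belief, observed, confidence, p_correct=0.8):
    """One Bayesian belief update over a static hidden user goal.

    belief:     dict mapping each hidden state to its prior probability
    observed:   the discrete part of the observation (recognized state)
    confidence: the continuous part of the observation, in [0, 1]
    """
    states = list(belief)
    # Recognition errors are spread evenly over the other states (assumption).
    p_wrong = (1.0 - p_correct) / (len(states) - 1)
    unnormalized = {}
    for s in states:
        correct = (s == observed)
        p_discrete = p_correct if correct else p_wrong
        likelihood = p_discrete * confidence_density(confidence, correct)
        unnormalized[s] = belief[s] * likelihood
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: w / total for s, w in unnormalized.items()}


prior = {"save": 0.5, "delete": 0.5}           # hypothetical user goals
high = belief_update(prior, "save", 0.9)       # confident recognition
low = belief_update(prior, "save", 0.5)        # uninformative confidence
```

Under these assumptions a high confidence score moves the belief further toward the recognized state than a middling one, which is the information a planner can exploit when deciding whether to confirm or act.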
Using bigrams to identify relationships between student certainness states and tutor responses in a spoken dialogue corpus
- In SIGDial
, 2005
Cited by 11 (1 self)
We use n-gram techniques to identify dependencies between student affective states of certainty and subsequent tutor dialogue acts, in an annotated corpus of human-human spoken tutoring dialogues. We first represent our dialogues as bigrams of annotated student and tutor turns. We next use χ² analysis to identify dependent bigrams. Our results show dependencies between many student states and subsequent tutor dialogue acts. We then analyze the dependent bigrams and suggest ways that our current computer tutor can be enhanced to adapt its dialogue act generation based on these dependencies.
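A χ² test of this kind can be sketched on a 2x2 contingency table built from (student state, tutor act) bigrams. The labels and counts below are invented for illustration and do not come from the corpus; the function computes the standard Pearson statistic, and 3.841 is the usual critical value for one degree of freedom at p = 0.05.

```python
def chi_square_2x2(bigrams, state, act):
    """Pearson chi-square statistic for one (state, act) bigram.

    bigrams: list of (student_state, tutor_act) pairs
    The 2x2 table crosses "student state is `state`" with
    "tutor act is `act`".
    """
    n = len(bigrams)
    a = sum(1 for s, t in bigrams if s == state and t == act)
    b = sum(1 for s, t in bigrams if s == state and t != act)
    c = sum(1 for s, t in bigrams if s != state and t == act)
    d = n - a - b - c
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("a row or column of the table is empty")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if state and act were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical annotated bigrams (labels and counts are made up).
toy_bigrams = (
    [("uncertain", "hint")] * 30
    + [("uncertain", "feedback")] * 10
    + [("certain", "hint")] * 10
    + [("certain", "feedback")] * 30
)
statistic = chi_square_2x2(toy_bigrams, "uncertain", "hint")
is_dependent = statistic > 3.841  # df = 1, p = 0.05
```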
Using particle filters to track dialogue state
- in Proc IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)
, 2007
Cited by 9 (3 self)
The benefit of tracking a probability distribution over multiple dialogue states has been demonstrated in the literature. However, the dialogue state in past work has been limited to a small number of variables, and growing the number of variables in the dialogue state prevents the probability distribution from being updated in real time. This paper shows how the number of variables composing the dialogue state can be increased while maintaining response times suitable for a spoken dialogue system. Rather than performing exact inference using the joint distribution over all variables, a particle filter is employed to compute an approximate update. Dialogue states (particles) are sampled, weighted by their agreement with the speech recognition results, and marginalized to produce a new distribution over each variable. Results on a spoken dialogue system for troubleshooting show that a relatively small number of particles are required to achieve performance close to an exact update, enabling the dialogue system to run in real time. Index Terms: dialogue modelling, dialogue management, spoken dialogue systems, particle filter, Monte Carlo
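The sample, weight, and marginalize steps can be sketched as follows. This is an illustrative toy and not the system described above: the variable names, the independent sampling of each variable from its current marginal, the symmetric recognition-error model with rate `p_correct`, and the requirement that every variable has at least two values are all assumptions.

```python
import random


def particle_update(marginals, asr_result, n_particles=500,
                    p_correct=0.7, rng=None):
    """Approximate update of per-variable marginals with particles.

    marginals:  dict variable -> dict value -> probability
    asr_result: dict variable -> value heard by the recognizer
                (only the variables mentioned in this turn)
    """
    rng = rng or random.Random(0)
    variables = list(marginals)

    # 1. Sample: draw each particle (a full dialogue state) by sampling
    #    every variable from its current marginal (assumption).
    particles = []
    for _ in range(n_particles):
        particle = {}
        for v in variables:
            values = list(marginals[v])
            probs = [marginals[v][x] for x in values]
            particle[v] = rng.choices(values, weights=probs)[0]
        particles.append(particle)

    # 2. Weight: score each particle by its agreement with the ASR result.
    weights = []
    for particle in particles:
        w = 1.0
        for v, heard in asr_result.items():
            k = len(marginals[v])  # assumed to be at least 2
            w *= p_correct if particle[v] == heard else (1.0 - p_correct) / (k - 1)
        weights.append(w)

    # 3. Marginalize: accumulate normalized weights per variable value.
    total = sum(weights)
    updated = {v: {x: 0.0 for x in marginals[v]} for v in variables}
    for particle, w in zip(particles, weights):
        for v in variables:
            updated[v][particle[v]] += w / total
    return updated


# Hypothetical troubleshooting state with two variables.
prior = {
    "product": {"router": 0.5, "modem": 0.5},
    "problem": {"no_power": 1 / 3, "no_link": 1 / 3, "slow": 1 / 3},
}
posterior = particle_update(prior, {"product": "modem"})
```

The cost of this update grows with the number of particles rather than with the size of the joint state space, which is why the approach stays fast as variables are added.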
Incorporating discourse features into confidence scoring of intention recognition results in spoken dialogue systems
- Speech Comm
, 2006
Cited by 4 (1 self)
This paper proposes a method for the confidence scoring of intention recognition results in spoken dialogue systems. To achieve tasks, a spoken dialogue system has to recognize user intentions. However, because of speech recognition errors and ambiguity in user utterances, it sometimes has difficulty recognizing them correctly. Confidence scoring allows errors to be detected in intention recognition results and has proved useful for dialogue management. Conventional methods use the features obtained from speech recognition results for single utterances for confidence scoring. However, this may be insufficient since the intention recognition result is a result of discourse processing. We propose incorporating discourse features for a more accurate confidence scoring of intention recognition results. Experimental results show that incorporating discourse features significantly improves the confidence scoring.
H.G.: Contextual constraints based on dialogue models in database search task for spoken dialogue systems
- In: Proc. Interspeech
, 2005
Cited by 4 (0 self)
This paper describes the incorporation of contextual information into spoken dialogue systems in the database search task. Appropriate dialogue modeling is required to manage automatic speech recognition (ASR) errors using dialogue-level information. We define two dialogue models: a model for dialogue flow and a model of structured dialogue history. The model for dialogue flow assumes dialogues in the database search task consist of only two modes. In the structured dialogue history model, query conditions are maintained as a tree structure, taking into consideration the order in which they were input. The constraints derived from these models are integrated using decision tree learning, so that the system can determine the dialogue act of the utterance and whether each content word should be accepted or rejected, even when the utterance contains ASR errors. The experimental results showed that our method could interpret content words better than a conventional one without the contextual information. Furthermore, our method was shown to be domain-independent because it achieved equivalent accuracy in another domain without any further training.
Exploiting the ASR N-Best by tracking multiple dialog state hypotheses
- in Proc. of Interspeech
, 2008
Cited by 4 (0 self)
When the top ASR hypothesis is incorrect, often the correct hypothesis is listed as an alternative in the ASR N-Best list. Whereas traditional spoken dialog systems have struggled to exploit this information, this paper argues that a dialog model that tracks a distribution over multiple dialog states can improve dialog accuracy by making use of the entire N-Best list. The key element of the approach is a generative model of the N-Best list given the user’s true hidden action. An evaluation on real dialog data verifies that dialog accuracy rates are improved by making use of the entire N-Best list. Index Terms: dialogue modelling, dialogue management, spoken dialogue systems, confidence score, N-Best list
Error Awareness and Recovery in Conversational Spoken Language Interfaces
, 2007
Cited by 2 (0 self)
Keywords: spoken dialog systems, conversational spoken language interfaces, error detection, error recovery strategies, error recovery policies, dialog management, RavenClaw, implicitly-supervised
One of the most important and persistent problems in the development of conversational spoken language interfaces is their lack of robustness when confronted with understanding-errors. Most of these errors stem from limitations in current speech recognition technology, and, as a result, appear across all domains and interaction types. There are two approaches towards increased robustness: prevent the errors from happening, or recover from them through conversation, by interacting with the users. In this dissertation we have engaged in a research program centered on the second approach. We argue that three capabilities are needed in order to seamlessly and efficiently recover from errors: (1) systems must be able to detect the errors, preferably as soon as they happen, (2) systems must be equipped with a rich repertoire of error recovery strategies that can be used to set the conversation back on track, and (3) systems must know how to choose optimally between different recovery
Towards Measuring Scalability in Natural Language Understanding Tasks
In this paper we present a discussion of existing metrics for evaluating the performance of individual natural language understanding systems and components as well as the commonly employed metrics for measuring the specific task difficulties. We extend and generalize the common majority class baseline metric and introduce a general entropy-based metric for measuring the task difficulty of arbitrary language understanding tasks. Finally, we show an empirical study evaluating this metric followed by a discussion of its role in measuring the scalability of language understanding systems and components.
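As a point of reference, the two standard quantities this abstract builds on can be sketched in a few lines: the majority class baseline (accuracy of always predicting the most frequent label) and the Shannon entropy of the label distribution. The generalized metric proposed in the paper is not specified here, so the code shows only these textbook definitions on made-up label lists.

```python
import math
from collections import Counter


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Two hypothetical tasks with the same majority baseline.
task_a = ["x", "x", "y", "y"]
task_b = ["x", "x", "y", "z"]

baseline_a, entropy_a = majority_baseline(task_a), label_entropy(task_a)
baseline_b, entropy_b = majority_baseline(task_b), label_entropy(task_b)
```

Both toy tasks have a majority baseline of 0.5, yet the second has higher entropy because its remaining labels are spread over more classes. This is one reason an entropy-based measure can separate tasks that the baseline alone cannot.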
Belief-Based Nonlinear Rescoring in Thai Speech Understanding
This paper proposes an approach to improve speech understanding based on rescoring of N-best semantic hypotheses. In rescoring, probabilities produced by an understanding component are combined with additional probabilities derived from system beliefs. While a normal rescoring approach is to multiply or linearly interpolate with belief probabilities, this paper shows that probabilities from various sources are better combined using a nonlinear estimator. Using the proposed model together with a dialogue-state dependent semantic model shows a significant improvement when applied to a Thai interactive hotel reservation agent (TIRA), the first spoken dialogue system in the Thai language.
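The two conventional combinations mentioned above, multiplication and linear interpolation, can be sketched as follows. The nonlinear estimator proposed in the paper is not described in this abstract and is not reproduced here; the hypothesis names, probabilities, and the interpolation weight `lam` are hypothetical.

```python
def rescore(nbest, belief, mode="multiply", lam=0.5):
    """Rescore N-best hypotheses with belief probabilities.

    nbest:  list of (hypothesis, understanding_probability) pairs
    belief: dict hypothesis -> probability derived from system beliefs
    mode:   "multiply" or "interpolate"
    lam:    interpolation weight on the understanding probability
    Returns the hypotheses sorted by combined score, best first.
    """
    scored = []
    for hypothesis, p_understanding in nbest:
        p_belief = belief.get(hypothesis, 0.0)
        if mode == "multiply":
            score = p_understanding * p_belief
        elif mode == "interpolate":
            score = lam * p_understanding + (1.0 - lam) * p_belief
        else:
            raise ValueError("mode must be 'multiply' or 'interpolate'")
        scored.append((hypothesis, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical hotel-reservation hypotheses.
nbest = [("book_room", 0.5), ("cancel_room", 0.4)]
belief = {"book_room": 0.2, "cancel_room": 0.7}

by_product = rescore(nbest, belief, mode="multiply")
by_interpolation = rescore(nbest, belief, mode="interpolate", lam=0.9)
```

With these made-up numbers the two schemes rank the hypotheses differently, which illustrates why the choice of combination function matters.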
Stochastic Discourse Modeling in Spoken Dialogue Systems Using Semantic Dependency Graphs
This investigation proposes an approach to modeling the discourse of spoken dialogue using semantic dependency graphs. By characterizing the discourse as a sequence of speech acts, discourse modeling becomes the identification of the speech act sequence. A statistical approach is adopted to model the relations between words in the user’s utterance using the semantic dependency graphs. The dependency relation between the headword and other words in a sentence is detected using the semantic dependency grammar. In order to evaluate the proposed method, a dialogue system for medical service is developed. Experimental results show that the rates for speech act detection and task completion are 95.6% and 85.24%, respectively, and the average number of turns of each dialogue is 8.3. Compared with the Bayes’ classifier and the Partial-Pattern Tree based approaches, we obtain 14.9% and 12.47% improvements in accuracy for speech act identification, respectively.

