Partially observable Markov decision processes with continuous observations for dialogue management
Computer Speech and Language, 2005
Cited by 79 (24 self)
This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
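The belief state monitoring mentioned in this abstract can be illustrated with a minimal sketch of a standard POMDP belief update in which the observation has a discrete part (the recognized hypothesis) and a continuous part (a confidence score). Everything concrete here is an assumption made for illustration and is not taken from the paper: the two user goals, the hypothetical names `update_belief` and `conf_pdf`, the recognition probabilities, and the Gaussian confidence densities.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def conf_pdf(conf, correct):
    """Assumed confidence model: correct recognitions tend to score higher."""
    return gaussian_pdf(conf, 0.8, 0.2) if correct else gaussian_pdf(conf, 0.4, 0.2)

def update_belief(belief, transition, p_hyp, hyp, conf):
    """One belief update: predict through the transition model, then weight
    each state by the likelihood of the (hypothesis, confidence) observation."""
    unnormalized = {}
    for s_next in belief:
        predicted = sum(belief[s] * transition[s][s_next] for s in belief)
        likelihood = p_hyp[s_next][hyp] * conf_pdf(conf, hyp == s_next)
        unnormalized[s_next] = predicted * likelihood
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}

# Toy problem (hypothetical): the user's goal is either "yes" or "no" and
# does not change between turns; the recognizer is right 80% of the time.
states = ["yes", "no"]
prior = {"yes": 0.5, "no": 0.5}
transition = {s: {t: 1.0 if s == t else 0.0 for t in states} for s in states}
p_hyp = {s: {h: 0.8 if s == h else 0.2 for h in states} for s in states}

high = update_belief(prior, transition, p_hyp, hyp="yes", conf=0.9)
low = update_belief(prior, transition, p_hyp, hyp="yes", conf=0.3)
print(high, low)
```

Under these assumed densities, the same recognized word moves the belief much further when its confidence score is high than when it is low, which is the kind of information a discrete-only observation model would discard.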
Modern natural language interfaces to databases: Composing statistical parsing with semantic tractability
In Proceedings of the Twentieth International Conference on Computational Linguistics (COLING-04), 2004
Cited by 24 (1 self)
Natural Language Interfaces to Databases (NLIs) can benefit from the advances in statistical parsing over the last fifteen years or so. However, statistical parsers require training on a massive, labeled corpus, and manually creating such a corpus for each database is prohibitively expensive. To address this quandary, this paper reports on the PRECISE NLI, which uses a statistical parser as a “plug in”. The paper shows how a strong semantic model coupled with “light re-training” enables PRECISE to overcome parser errors, and correctly map from parsed questions to the corresponding SQL queries. We discuss the issues in using statistical parsers to build database-independent NLIs, and report on experimental results with the benchmark ATIS data set where PRECISE achieves 94% accuracy.
Beyond ASR 1-best: Using word confusion networks in spoken language understanding
2006
Cited by 6 (0 self)
We are interested in the problem of robust understanding from noisy spontaneous speech input. With the advances in automated speech recognition (ASR), there has been increasing interest in spoken language understanding (SLU). A challenge in large vocabulary spoken language understanding is robustness to ASR errors. State-of-the-art spoken language understanding relies on the best ASR hypotheses (ASR 1-best). In this paper, we propose methods for a tighter integration of ASR and SLU using word confusion networks (WCNs). WCNs obtained from ASR word graphs (lattices) provide a compact representation of multiple aligned ASR hypotheses along with word confidence scores, without compromising recognition accuracy. We present our work on exploiting WCNs instead of simply using ASR one-best hypotheses. In this work, we focus on the tasks of named entity detection and extraction and call classification in a spoken dialog system, although the idea is more general and applicable to other spoken language processing tasks. For named entity detection, we have improved the F-measure by 6–10% absolute by using both word lattices and WCNs. The processing of WCNs was 25 times faster than lattices, which is very important for real-life applications. For call classification, we have shown between 5% and 10% relative reduction in error rate using WCNs compared to ASR 1-best output.
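To make the data structure concrete, here is a minimal sketch of a word confusion network as a sequence of bins, each mapping competing words to posterior confidence scores. The example network, the gazetteer, the threshold, and the function names `one_best` and `find_entities` are all hypothetical; the lookup shown is a simple stand-in, not the detection method of the paper.

```python
# Each bin holds the competing words for one aligned position, with posteriors.
# The empty string stands for an epsilon (no word) arc.
wcn = [
    {"call": 0.9, "fall": 0.1},
    {"boston": 0.55, "austin": 0.45},
    {"please": 0.7, "": 0.3},
]

def one_best(network):
    """Pick the highest-posterior word in each bin, skipping epsilon arcs."""
    words = [max(bin_, key=bin_.get) for bin_ in network]
    return [w for w in words if w]

def find_entities(network, gazetteer, threshold):
    """Return (position, word, posterior) for every gazetteer word whose
    posterior reaches the threshold, including words outside the 1-best path."""
    hits = []
    for position, bin_ in enumerate(network):
        for word, posterior in bin_.items():
            if word in gazetteer and posterior >= threshold:
                hits.append((position, word, posterior))
    return hits

best = one_best(wcn)
hits = find_entities(wcn, gazetteer={"boston", "austin"}, threshold=0.3)
print(best)
print(hits)
```

In this toy network the 1-best path contains only "boston", while the bin-level lookup also surfaces "austin" as a competing candidate with its confidence score, so a downstream component can still consider it.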
A probabilistic approach to the interpretation of spoken utterances
Faculty of Information Technology, Monash, 2008
Cited by 6 (5 self)
In this paper we describe Scusi?, the speech interpretation component of a spoken dialogue module designed for an autonomous robotic agent. Scusi? postulates and maintains multiple interpretations of the spoken discourse, and employs a probabilistic formalism to assess and rank hypotheses regarding the meaning of spoken utterances. These constituents in combination enable Scusi? to cope gracefully with ambiguity and speech recognition errors. The results of our evaluation are encouraging, yielding good interpretation performance for utterances of different types and lengths.
Backoff Model Training using Partially Observed Data: Application to Dialog Act Tagging
2005
Cited by 5 (0 self)
Dialog act (DA) tags are useful for many applications in natural language processing and automatic speech recognition. In this work, we introduce hidden backoff models (HBMs) where a large generalized backoff model is trained, using an embedded expectation-maximization (EM) procedure, on data that is partially observed. We use HBMs as word models conditioned on both DAs and (hidden) DA segments. Experimental results on the ICSI meeting recorder dialog act corpus show that our procedure can strictly increase likelihood on training data and can effectively reduce errors on test data. In the best case, test error can be reduced by 6.1% relative to our baseline, an improvement on previously reported models that also use prosody. We also compare with our own prosody-based model, and show that our HBM is competitive even without the use of prosody. We have not yet succeeded, however, in combining the benefits of both prosody and the HBM.
Robust Interpretation in Dialogue by Combining Confidence Scores with Contextual Features
In Proceedings of the 9th International Conference on Spoken Language Processing (Interspeech/ICSLP), 2006
Cited by 4 (1 self)
We present an approach to dialogue management and interpretation that evaluates and selects amongst candidate dialogue moves based on features at multiple levels. Multiple interpretation methods can be combined, multiple speech recognition and parsing hypotheses tested, and multiple candidate dialogue moves considered to choose the highest scoring hypothesis overall. We integrate hypotheses generated from shallow slot-filling methods and from relatively deep parsing, using pragmatic information. We show that this gives more robust performance than using either approach alone, allowing n-best list reordering to correct errors in speech recognition or parsing. Index Terms: dialogue management, robust interpretation
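The n-best list reordering described in this abstract can be sketched as scoring each hypothesis with a weighted combination of a recognizer confidence score and contextual features, then sorting. This is only an illustrative linear scorer: the feature names, weights, example hypotheses, and the `rerank` helper are hypothetical, and the abstract does not specify how the actual system combines its features.

```python
def score(hypothesis, weights):
    """Weighted sum of the hypothesis features; missing features count as 0."""
    features = hypothesis["features"]
    return sum(w * features.get(name, 0.0) for name, w in weights.items())

def rerank(hypotheses, weights):
    """Return the hypotheses sorted from highest to lowest combined score."""
    return sorted(hypotheses, key=lambda h: score(h, weights), reverse=True)

# Hypothetical 2-best list: the recognizer slightly prefers the wrong string,
# but only the second hypothesis parses and fits the dialogue context.
n_best = [
    {"text": "I want to fry to Boston",
     "features": {"asr_confidence": 0.61, "parses": 0.0, "fits_context": 0.0}},
    {"text": "I want to fly to Boston",
     "features": {"asr_confidence": 0.57, "parses": 1.0, "fits_context": 1.0}},
]
weights = {"asr_confidence": 1.0, "parses": 0.5, "fits_context": 0.5}

reranked = rerank(n_best, weights)
print(reranked[0]["text"])
```

With these assumed weights, the second hypothesis overtakes the recognizer's first choice because the contextual features outweigh the small difference in confidence.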
CHAT to your destination
In Proceedings of 8th SIGdial Workshop on Discourse and Dialogue, 2007
Cited by 4 (3 self)
In the past few years, we have been developing a robust, wide-coverage, and cognitive load-sensitive spoken dialog interface, CHAT (Conversational Helper for Automotive Tasks). New progress has been made to address issues related to dynamic and attention-demanding environments, such as driving. Specifically, we try to address imperfect input and imperfect memory issues through robust understanding, knowledge-based interpretation, flexible dialog management, sensible information communication, and user-adaptive responses. In addition to the MP3 player and restaurant finder applications reported in previous publications, a third domain, navigation, has been developed, where one has to deal with dynamic information, domain switch, and error recovery. Evaluation in the new domain has shown a good degree of success, including high task completion rate, dialog efficiency, and improved user experience.
Towards a probabilistic, multi-layered spoken language interpretation system
In Proceedings of the Fourth IJCAI Workshop on Knowledge and Reasoning in Practical Dialogue Systems, 2005
Cited by 3 (3 self)
We present a preliminary report of a probabilistic spoken-language interpretation mechanism that is part of a dialogue system for an office assistant robot. We offer a probabilistic formulation for the generation of candidate interpretations and the selection of the interpretation with the highest posterior probability. This formulation is implemented in a multi-layered interpretation process that integrates spoken and sensory input, and takes into account alternatives derived from a user’s utterance and expectations obtained from the context. Our preliminary results are encouraging.
Considering multiple options when interpreting spoken utterances
In Proceedings of the Fifth IJCAI Workshop on Knowledge and Reasoning in Practical Dialogue Systems, 2007
Cited by 2 (2 self)
We describe Scusi?, a spoken language interpretation mechanism designed to be part of a robot-mounted dialogue system. Scusi?’s interpretation process maps spoken utterances to text, which in turn is parsed and then converted to conceptual graphs. In order to support robust and flexible performance of the dialogue module, Scusi? maintains multiple options at each stage of the interpretation process, and uses maximum posterior probability to rank the (partial) interpretations produced at each stage. The time and space requirements of maintaining multiple options are handled by means of an anytime search algorithm. Our evaluation focuses on the impact of the speech recognizer and the search algorithm on Scusi?’s performance.
Evaluation of Content Presentation Strategies for an In-car Spoken Dialogue System
Cited by 2 (1 self)
In this paper we present a framework for managing information presentation in spoken dialogue systems. We describe a content optimization module that makes use of ontological relationships in information-seeking dialogues in order to organize knowledge base items and perform adjustments such as relaxing or tightening user constraints. We present the results of an experimental evaluation comparing two response strategies: (a) one that uses the content optimization module to offer suggestions and (b) one that gives no suggestions. The results indicate that giving such suggestions is preferred when a user query matches either no items or many items in the knowledge base, and may also lead to more efficient dialogues. Index Terms: spoken dialogue systems, content management

