Galaxy-II: A Reference Architecture For Conversational System Development
- in Proc. ICSLP
, 1998
Cited by 108 (14 self)
GALAXY is a client-server architecture for accessing on-line information using spoken dialogue that we introduced at ICSLP94. It has served as the testbed for developing human language technologies for our group for several years. Recently, we have initiated a significant redesign of the GALAXY architecture to make it easier for many researchers to develop their own applications, using either exclusively their own servers or intermixing them with servers developed by others. This redesign was done in part due to the fact that GALAXY has been designated as the first reference architecture for the new DARPA Communicator Program. The purpose of this paper is to document the changes to GALAXY that led to this first reference architecture, which makes use of a scripting language for flow control to provide flexible interaction among the servers, and a set of libraries to support rapid prototyping of new servers. We describe the new reference architecture in some detail, and report on the cu...
Partially observable Markov decision processes with continuous observations for dialogue management
- Computer Speech and Language
, 2005
Cited by 79 (24 self)
This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
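As a rough illustration of belief state monitoring with a confidence score, the sketch below performs one Bayesian belief update over a hidden user goal. It is not the model from the paper: the static-goal assumption, the error rate `p_err`, and the triangular confidence densities (2c when the recognizer is right, 2(1 - c) when it is wrong) are all hypothetical choices made only to keep the example small.

```python
def belief_update(belief, observed_goal, confidence, p_err=0.3):
    """One belief update for a hidden user goal that does not change.

    belief        -- dict mapping each candidate goal to its probability
    observed_goal -- goal reported by the recognizer (discrete observation)
    confidence    -- recognizer confidence in [0, 1] (continuous observation)
    p_err         -- assumed probability that the recognizer reports a wrong goal
    """
    n = len(belief)
    if n < 2:
        return dict(belief)
    unnormalized = {}
    for goal, prior in belief.items():
        if goal == observed_goal:
            # Recognizer would be correct: assumed confidence density is 2c.
            likelihood = (1.0 - p_err) * 2.0 * confidence
        else:
            # Recognizer would be wrong: assumed confidence density is 2(1 - c).
            likelihood = (p_err / (n - 1)) * 2.0 * (1.0 - confidence)
        unnormalized[goal] = prior * likelihood
    total = sum(unnormalized.values())
    if total == 0.0:
        # Observation has zero likelihood under every goal; keep the old belief.
        return dict(belief)
    return {goal: value / total for goal, value in unnormalized.items()}


if __name__ == "__main__":
    prior = {"boston": 0.5, "austin": 0.5}
    print(belief_update(prior, "boston", confidence=0.9))
    print(belief_update(prior, "boston", confidence=0.2))
```

Under these assumed densities, a high-confidence observation concentrates the belief on the reported goal, while a low-confidence observation of the same goal shifts belief away from it. A plain MDP state that only records the recognized value would treat the two cases identically.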
Conversational Interfaces: Advances and Challenges
, 2000
Cited by 61 (4 self)
The last decade has witnessed the emergence of a new breed of human computer interfaces that combines several human language technologies to enable information access and transactional processing using spoken dialogue. In this paper, I discuss my view on the research issues involved in the development of such interfaces, describe the recent work done in this area at the MIT Laboratory for Computer Science, and outline some of the unmet research challenges, including the need to work in real domains, spoken language generation, and portability across domains and languages.
Modeling Out-Of-Vocabulary Words For Robust Speech Recognition
, 2000
Cited by 43 (5 self)
This thesis concerns the problem of unknown or out-of-vocabulary (OOV) words in continuous speech recognition. Most of today's state-of-the-art speech recognition systems can recognize only words that belong to some predefined finite word vocabulary. When encountering an OOV word, a speech recognizer erroneously substitutes the OOV word with a similarly sounding word from its vocabulary. Furthermore, a recognition error due to an OOV word tends to spread errors into neighboring words, dramatically degrading overall recognition performance.
Jupiter: A Telephone-Based Conversational Interface for Weather Information
- IEEE Trans. on Speech and Audio Processing
, 2000
Cited by 32 (3 self)
In early 1997, our group initiated a project to develop JUPITER, a conversational interface that allows users to obtain worldwide weather forecast information over the telephone using spoken dialogue. It has served as the primary research platform for our group on many issues related to human language technology, including telephone-based speech recognition, robust language understanding, language generation, dialogue modelling, and multilingual interfaces. Over a two-year period since coming on line in May 1997, JUPITER has received, via a toll-free number in North America, over 30,000 calls (totalling over 180,000 utterances), mostly from naive users. The purpose of this paper is to describe our development effort in terms of the underlying human language technologies as well as other system related issues such as utterance rejection and content harvesting. We will also present some evaluation results on the system and its components.
Challenges For Spoken Dialogue Systems
- In Proceedings of 1999 IEEE ASRU Workshop
, 1999
Cited by 24 (0 self)
The past decade has seen the development of a large number of spoken dialogue systems around the world, both as research prototypes and commercial applications. These systems allow users to interact with a machine to retrieve information, conduct transactions, or perform other problem-solving tasks. In this paper we discuss some of the design issues which confront developers of spoken dialogue systems, provide some examples of research being undertaken in this area, and describe some of the ongoing challenges facing current spoken language technology.
A Boosting Approach for Confidence Scoring
, 2001
Cited by 14 (0 self)
In this paper we present the application of a boosting classification algorithm to confidence scoring. We derive feature vectors from speech recognition lattices and feed them into a boosting classifier. This classifier combines hundreds of very simple 'weak learners' and derives classification rules that can reduce the confidence error rate by up to 34%. We compare our results to those obtained using two other standard classification techniques, Support Vector Machines (SVMs) and Classification and Regression Trees (CART), and show significant improvements. Furthermore, the nature of the boosting algorithm allows us to combine the best single classifier and improve its performance.
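To make the idea of combining many simple weak learners concrete, here is a minimal sketch of discrete AdaBoost with single-feature threshold stumps. The abstract does not say which boosting variant or weak learner the paper uses, so both are assumptions here, as are the function names and the toy feature vectors; real inputs would be features derived from recognition lattices.

```python
import math


def train_adaboost(X, y, rounds=10):
    """Train discrete AdaBoost with threshold stumps.

    X -- list of feature vectors (lists of floats)
    y -- list of labels, +1 (hypothesis correct) or -1 (hypothesis wrong)
    Returns a list of (alpha, feature_index, threshold, polarity) tuples.
    """
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({x[f] for x in X}):
                for polarity in (1, -1):
                    preds = [polarity if x[f] >= thr else -polarity for x in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, polarity, preds)
        err, f, thr, polarity, preds = best
        # Clamp the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1.0 - 1e-10)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, f, thr, polarity))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * t * p)
                   for w, p, t in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Return +1 or -1 from the weighted vote of all stumps."""
    score = sum(alpha * (pol if x[f] >= thr else -pol)
                for alpha, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Hypothetical one-dimensional confidence features.
    X = [[0.1], [0.2], [0.8], [0.9]]
    y = [-1, -1, 1, 1]
    model = train_adaboost(X, y, rounds=3)
    print([predict(model, x) for x in X])
```

Each stump alone is a weak classifier; the weighted vote is what forms the final accept/reject decision for a recognition hypothesis.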
Semantic Confidence Measurement for Spoken Dialogue Systems
- IEEE Trans. on SAP
, 2005
Cited by 5 (0 self)
This paper proposes two methods to incorporate semantic information into word and concept level confidence measurement. The first method uses tag and extension probabilities obtained from a statistical classer and parser. The second method uses a maximum entropy based semantic structured language model to assign probabilities to each word. Incorporation of semantic features into a lattice posterior probability based confidence measure provides significant improvements compared to posterior probability when used together in an air travel reservation task. At 5% False Alarm (FA) rate relative improvements of 28% and 61% in Correct Acceptance (CA) rate are achieved for word level and concept level confidence measurements, respectively.
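The reported figures combine two standard quantities: the Correct Acceptance rate at a fixed False Alarm rate, and a relative improvement over a baseline. The sketch below shows one common way to compute both; the function names, the threshold search, and the toy scores are assumptions for illustration and are not taken from the paper.

```python
def correct_acceptance_at_fa(scores, labels, max_fa=0.05):
    """Best Correct Acceptance rate whose False Alarm rate is at most max_fa.

    scores -- confidence score per hypothesis (higher means more confident)
    labels -- True if the hypothesis is actually correct, else False
    An item is accepted when its score is >= the threshold.
    """
    correct = [s for s, ok in zip(scores, labels) if ok]
    incorrect = [s for s, ok in zip(scores, labels) if not ok]
    if not correct or not incorrect:
        raise ValueError("need at least one correct and one incorrect item")
    best_ca = 0.0  # a threshold above every score accepts nothing
    for threshold in sorted(set(scores)):
        # False alarm: an incorrect item that is accepted.
        fa = sum(s >= threshold for s in incorrect) / len(incorrect)
        if fa <= max_fa:
            ca = sum(s >= threshold for s in correct) / len(correct)
            best_ca = max(best_ca, ca)
    return best_ca


def relative_improvement(baseline, improved):
    """Relative gain of improved over baseline, e.g. 0.28 for 28%."""
    return (improved - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical scores: four correct and four incorrect hypotheses.
    scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    labels = [True, True, True, True, False, False, False, False]
    print(correct_acceptance_at_fa(scores, labels, max_fa=0.25))
    print(relative_improvement(0.50, 0.64))
```

Fixing the FA rate first makes CA rates from different confidence measures directly comparable, which is why the improvements are quoted at a single operating point.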
Using Knowledge-Based Scores for Identifying Best Speech Recognition Hypothesis
- In Proc. of ISCA Tutorial and Research Workshop on Error Handling in Spoken Dialogue Systems. Chateau-d’Oex-Vaud
, 2003
Cited by 3 (1 self)
The paper presents the evaluation of a knowledge-based scoring method applied to the problem of identifying the best speech recognition hypothesis (SRH) in a functioning multimodal dialogue system. The competing SRHs are evaluated in terms of their semantic coherence using the high-level domain knowledge encoded in the ontology. We conducted an annotation experiment and showed that humans can reliably select the best SRH in a given N-best list (agreement 95.35%). The knowledge-based method identifies correctly 88.07% of the best SRHs (given the baseline 63.91%), which is also an improvement over the automatic speech recognizer (ASR) (83.88% accuracy).
The Use Of Dynamic Reliability Scoring In Speech Recognition
- In Proceedings of the Sixth International Conference on Spoken Language Processing (ICSLP2000)
, 2000
Cited by 2 (1 self)
Typically, along a recognizer's search path, some acoustic units are modeled more reliably than others, due to differences in their acoustic-phonetic features and many other factors. This paper presents a dynamic reliability scoring scheme which can help adjust the partial path scores while the recognizer searches through the composed lexical and acoustic-phonetic network. The reliability models are trained on the acoustic scores of the correct arc and its immediate competing arcs extending the current partial path. During recognition, if, according to the trained reliability models, an arc can be more easily distinguished from the competing alternatives, that arc is more likely to be in the right path, and the partial path score can be adjusted accordingly on the fly to have a more accurate path hypothesis. We have applied this reliability scoring mechanism in two weather related domains, JUPITER [6] (for English) and PANDA (a predecessor of MUXING [5] for Mandarin Chinese). We get 9....

