The Thoughtful Elephant: Strategies for Spoken Dialog Systems
- IEEE Transactions on Speech and Audio Processing, 2000
Cited by 19 (0 self)
In this paper we present technology used in spoken dialog systems for a wide range of applications, including tasks from the travel domain, automatic switchboards, and large-scale directory assistance. The overall goal in developing spoken dialog systems is to allow for a natural and flexible dialog flow similar to human-human interaction. This poses the challenging task of recognizing and interpreting user input when the user may choose from an unrestricted vocabulary and an infinite set of possible formulations. We therefore put emphasis on strategies that make the system more robust while still maintaining a high level of naturalness and flexibility. In view of this paradigm, we found that two fundamental principles characterize many of the proposed methods: 1) to consider available sources of information as early as possible, and 2) to keep alternative hypotheses and delay the decision for a single option as long as possible. We describe
A voice-controlled automatic telephone switchboard and directory information system
- Speech Communication, 1997
Cited by 17 (5 self)
The Philips automatic telephone switchboard and directory information system PADIS provides a natural-language user interface to a telephone directory database. Using speech recognition and language understanding technologies, the system offers phone numbers, fax numbers, email addresses, and room numbers as well as direct call completion to a desired party. In this paper, we present the underlying probabilistic framework, the system architecture, and the individual modules for speech recognition, language understanding, dialogue control, and speech output. In addition, we report results on performance and user behaviour obtained from a field test in our research lab with a 600-entry database. We derive a new maximum-a-posteriori decision rule which incorporates database knowledge and dialogue history as constraints in speech recognition and language understanding. It has improved speech understanding accuracy by 19% (in terms of concept error rate) and reduced attribute substitution errors (e.g., recognition of a wrong name) by 38%. The decision rule is implemented in a multi-stage approach as a combination of state-of-the-art speech recognition, partial parsing with an attributed stochastic context-free grammar, and an N-best algorithm, which is also described in this paper. The system conducts a flexible mixed-initiative dialogue rather than using a rigid form-filling scheme, and incorporates database knowledge to optimize the dialogue flow.
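The general idea of letting database knowledge constrain the choice among N-best hypotheses can be illustrated with a toy rescoring step. This is only a minimal sketch, not the decision rule derived in the paper: the function name `rescore_nbest`, the (name, log score) input format, and the fixed `miss_penalty` are all assumptions made for illustration.

```python
def rescore_nbest(nbest, directory, miss_penalty=-5.0):
    """Return the (name, log_score) pair with the best combined score.

    nbest:        list of (name, log_score) pairs, a hypothetical recognizer output
    directory:    set of names that exist in the database
    miss_penalty: assumed log-domain penalty for names absent from the database
    """
    def combined(item):
        name, log_score = item
        # Database knowledge acts as a crude prior: names that are not in
        # the directory are penalized rather than ruled out entirely.
        prior = 0.0 if name in directory else miss_penalty
        return log_score + prior

    return max(nbest, key=combined)


# Hypothetical N-best list: the acoustically best name is not in the directory.
nbest = [("Myer", -1.0), ("Meyer", -1.5), ("Maier", -2.5)]
directory = {"Meyer", "Maier"}
best = rescore_nbest(nbest, directory)
print(best[0])
```

Here the top-ranked hypothesis is overridden because it does not match any database entry, while the decision among the remaining candidates is still left to the recognizer scores.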
Ontology-based Contextual Coherence Scoring
- In Proceedings of the Fourth SIGdial Workshop on Discourse and Dialogue, 2003
Cited by 4 (2 self)
In this paper we present a contextual extension to ONTOSCORE, a system for scoring sets of concepts on the basis of an ontology. We apply the contextually enhanced system to the task of scoring alternative speech recognition hypotheses (SRH) in terms of their semantic coherence. We conducted
What's in a Word Graph: Evaluation and Enhancement of Word Lattices
- In Proc. of Eurospeech, 1997
Cited by 2 (1 self)
During the last few years, word graphs have been gaining increasing interest within the speech community as the primary interface between speech recognizers and language processing modules. Both development and evaluation of graph-producing speech decoders require generally accepted measures of word graph quality. While the notion of recognition accuracy can easily be extended to word graphs, a meaningful measure of word graph size has not yet surfaced. We argue that the number of derivation steps a theoretical parser would need to process all unique sub-paths in a graph could provide a measure that is both application-oriented enough to be meaningful and general enough to allow a useful comparison of word recognizers across different applications. This paper discusses various measures that are used, or could be used, to measure word graph quality. Using real-life data (word graphs evaluated in the 1996 Verbmobil acoustic evaluation), it is demonstrated how different measures can affec...
Making relative sense: From word-graphs to semantic frames
- In Proceedings of the HLT/NAACL Workshop on Scalable Natural Language Understanding, 2004
Cited by 1 (0 self)
Scaling up from controlled single-domain spoken dialogue systems towards conversational, multi-domain and multimodal dialogue systems poses new challenges for the reliable processing of less restricted user utterances. In this paper we explore the feasibility of employing a general-purpose ontology for various tasks involved in processing the user’s utterances.
Semantic Coherence Scoring Using an Ontology
- In Proceedings of the HLT-NAACL Conference, 2003
In this paper we present ONTOSCORE, a system for scoring sets of concepts on the basis of an ontology. We apply our system to the task of scoring alternative speech recognition hypotheses (SRH) in terms of their semantic coherence. We conducted an annotation experiment and showed that human annotators can reliably differentiate between semantically coherent and incoherent speech recognition hypotheses.

