Gemini: A Natural Language System For Spoken-Language Understanding
- In Proceedings of the Thirty-First Annual Meeting of the Association for Computational Linguistics
, 1993
Cited by 128 (34 self)
This paper describes the details of the system, and includes relevant measurements of size, efficiency, and performance of each of its components
A Stochastic Model of Human-Machine Interaction for learning dialog Strategies
- IEEE Transactions on Speech and Audio Processing
, 2000
Cited by 122 (3 self)
In this paper, we propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) are available within the MDP framework, based on reinforcement learning. For an effective use of the available training data we propose a combination of supervised and reinforcement learning: supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user’s behavior. A reinforcement learning algorithm is then used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups. Index Terms: Dialog systems, Markov decision process, reinforcement learning, sequential decision process, speech, spoken
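The idea of learning a dialog strategy against a simulated user can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-slot state space, the `ask_slot` and `close` actions, the 0.8 answer probability, the cost values, and the use of tabular Q-learning as the reinforcement learning algorithm.

```python
import random

# Toy dialog MDP; all names and numbers are illustrative assumptions.
# State: number of slots filled so far (0..N_SLOTS).
N_SLOTS = 2
ACTIONS = ["ask_slot", "close"]


def simulated_user(state, action, rng):
    """Return (next_state, cost, done) for one system action."""
    if action == "ask_slot":
        # Each question costs one turn; the user fills a slot with prob. 0.8.
        if state < N_SLOTS and rng.random() < 0.8:
            return state + 1, 1.0, False
        return state, 1.0, False
    # "close" ends the dialog and pays a penalty for every missing slot.
    return state, 10.0 * (N_SLOTS - state), True


def q_learning(episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Estimate expected cost-to-go Q(state, action) by tabular Q-learning."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 50:
            # Epsilon-greedy: explore sometimes, otherwise pick the cheapest action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = min(ACTIONS, key=lambda a: q[(state, a)])
            nxt, cost, done = simulated_user(state, action, rng)
            future = 0.0 if done else min(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate toward the observed cost plus the best future cost.
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            state, steps = nxt, steps + 1
    return q


def greedy_policy(q):
    """Pick the minimum-cost action in every state."""
    return {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

With these toy costs, the learned strategy should keep asking until both slots are filled and only then close the dialog, since closing early is far more expensive than a few extra turns.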
Evaluating Natural Language Processing Systems
, 1993
Cited by 104 (0 self)
This report presents a detailed analysis and review of NLP evaluation, in principle and in practice. Part 1 examines evaluation concepts and establishes a framework for NLP system evaluation. This makes use of experience in the related area of information retrieval and the analysis also refers to evaluation in speech processing. Part 2 surveys significant evaluation work done so far, for instance in machine translation, and discusses the particular problems of generic system evaluation. The conclusion is that evaluation strategies and techniques for NLP need much more development, in particular to take proper account of the influence of system tasks and settings. Part 3 develops a general approach to NLP evaluation, aimed at methodologically-sound strategies for test and evaluation motivated by comprehensive performance factor identification. The analysis throughout the report is supported by extensive illustrative examples. This work was carried out under the UK Science and Engineeri...
Preliminaries to a Theory of Speech Disfluencies
, 1994
Cited by 97 (7 self)
This thesis examines disfluencies (e.g., "um", repeated words, and a variety of forms of self-repair) in the spontaneous speech of adult normal speakers of American English. Despite their prevalence, disfluencies have traditionally been viewed as irregular events and have received little attention. The goal of the thesis is to provide evidence that, on the contrary, disfluencies show remarkably regular trends in a number of dimensions. These regularities have consequences for models of human language production; they can also be exploited to improve performance in speech applications. The method includes analysis of over 5000 hand-annotated disfluencies from a database (250,000 words) containing three different styles of spontaneous speech: task-oriented human-computer dialog, task-oriented human-human dialog, and human-human conversation on a prescribed topic. The approach is theory-neutral and strongly data-driven. The annotations correspond to observable characteristics ("features") ...
Integrating Multiple Knowledge Sources For Detection And Correction Of Repairs In Human-Computer Dialog
, 1992
Cited by 84 (12 self)
We have analyzed 607 sentences of spontaneous human-computer speech data containing repairs, drawn from a total corpus of 10,718 sentences. We present here criteria and techniques for automatically detecting the presence of a repair, its location, and making the appropriate correction. The criteria involve integration of knowledge from several sources: pattern matching, syntactic and semantic analysis, and acoustics.
A Corpus-based study of repair cues in spontaneous speech
Cited by 70 (1 self)
In this paper, acoustic and prosodic cues to such repairs are identified, based on an analysis of a corpus taken from the ARPA Air Travel Information System database, and methods are proposed for exploiting these cues for repair detection, especially the task of modeling word fragments, and repair correction. The relative contributions of these speech-based cues, as well as other text-based repair cues, are examined in a statistical model of repair site detection that achieves a precision rate of 91% and recall of 86% on a prosodically labeled corpus of repair utterances. (This paper appears in the Journal of the Acoustical Society of America, 95 (3), March 1994, pp. 1603-1616.) PACS numbers: 43.72Ja, 43.70.B, 43.70.Bk, 43.70.Fq
Speech repairs, intonational phrases and discourse markers: modeling speakers’ utterances in spoken dialogue
- Computational Linguistics
, 1999
Cited by 61 (9 self)
Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker’s intended utterances: both segmenting a speaker’s turn into utterances and determining the intended words in each utterance. Even assuming perfect word recognition, the latter problem is complicated by the occurrence of speech repairs, which occur where speakers go back and change (or repeat) something they just said. The words that are replaced or repeated are no longer part of the intended utterance, and so need to be identified. Segmenting turns and resolving repairs are strongly intertwined with a third task: identifying discourse markers. Because of the interactions, and interactions with POS tagging and speech recognition, we need to address these tasks together and early on in the processing stream. This paper presents a statistical language model in which we redefine the speech recognition problem so that it includes the identification of POS tags, discourse markers, speech repairs and intonational phrases. By solving these simultaneously, we obtain better results on each task than addressing them separately. Our model is able to identify 72% of turn-internal intonational boundaries with a precision of 71%, 97% of discourse markers with 96% precision, and detect and correct 66% of repairs with 74% precision.
The CommandTalk Spoken Dialogue System
, 1999
Cited by 46 (14 self)
This paper describes extensions to CommandTalk to support spoken dialogue. While we make no theoretical claims about the nature and structure of dialogue, we are influenced by the theoretical work of (Grosz and Sidner, 1986) and will use terminology from that tradition when appropriate. We also follow (Chu-Carroll and Brown, 1997) in distinguishing task initiative and dialogue initiative
On-Line Cursive Handwriting Recognition Using Speech Recognition Methods
, 1994
Cited by 35 (5 self)
A hidden Markov model (HMM) based continuous speech recognition system is applied to on-line cursive handwriting recognition. The base system is unmodified except for using handwriting feature vectors instead of speech. Due to inherent properties of HMMs, segmentation of the handwritten script sentences is unnecessary. A 1.1% word error rate is achieved for a 3050 word lexicon, 52 character, writer-dependent task and 3%-5% word error rates are obtained for six different writers in a 25,595 word lexicon, 86 character, writer-dependent task. Similarities and differences between the continuous speech and on-line cursive handwriting recognition tasks are explored; the handwriting database collected over the past year is described; and specific implementation details of the handwriting system are discussed. 1. INTRODUCTION Traditionally, the first step in handwriting recognition is the segmentation of words into component characters [1]. However, in modern continuous speech recognition ef...
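The abstract reports word error rates without saying how they were scored. A common convention, assumed here, is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and example strings below are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("show me the flights", "show me flights"))
```

Under this convention, a 1.1% word error rate corresponds to roughly one word-level edit per 90 reference words.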
A Speech-First Model For Repair Detection And Correction
- In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics
, 1993
Cited by 31 (1 self)
Interpreting fully natural speech is an important goal for spoken language understanding systems. However, while corpus studies have shown that about 10% of spontaneous utterances contain self-corrections, or REPAIRS, little is known about the extent to which cues in the speech signal may facilitate repair processing. We identify several cues based on acoustic and prosodic analysis of repairs in a corpus of spontaneous speech, and propose methods for exploiting these cues to detect and correct repairs. We test our acoustic-prosodic cues with other lexical cues to repair identification and find that precision rates of 89-93% and recall of 78-83% can be achieved, depending upon the cues employed, from a prosodically labeled corpus.
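Several abstracts in this listing report precision and recall for repair detection. As a reminder of how the two figures relate, here is a small sketch using the standard definitions. The function name and the utterance IDs are made up for illustration and do not come from any of the papers.

```python
def precision_recall(predicted, actual):
    """Precision = correct detections / all detections;
    recall = correct detections / all true repairs."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical utterance IDs: the detector flags four utterances,
    # three of which are among the five that really contain a repair.
    flagged = {1, 2, 3, 4}
    true_repairs = {2, 3, 4, 5, 6}
    print(precision_recall(flagged, true_repairs))
```

A detector tuned to flag fewer sites usually gains precision and loses recall, which may explain why the abstract reports ranges that depend on the cues employed.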

