Automatic linguistic segmentation of conversational speech
Proc. ICSLP, 1996
Cited by 74 (19 self)
As speech recognition moves toward more unconstrained domains such as conversational speech, we encounter a need to be able to segment (or resegment) waveforms and recognizer output into linguistically meaningful units, such as sentences. Toward this end, we present a simple automatic segmenter of transcripts based on N-gram language modeling. We also study the relevance of several word-level features for segmentation performance. Using only word-level information, we achieve 85% recall and 70% precision on linguistic boundary detection.
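To make the recall and precision figures in this abstract concrete, here is a minimal sketch of how boundary detection is commonly scored: hypothesized boundary positions are compared against reference positions. The function name `boundary_precision_recall` and the toy positions are illustrative assumptions and are not taken from the paper.

```python
def boundary_precision_recall(reference, hypothesis):
    """Return (precision, recall) for hypothesized boundary positions.

    Both arguments are iterables of word indices after which a boundary
    is placed. Hypothetical helper, not the paper's own scoring tool.
    """
    ref = set(reference)
    hyp = set(hypothesis)
    correct = len(ref & hyp)  # boundaries found in both sets
    precision = correct / len(hyp) if hyp else 0.0
    recall = correct / len(ref) if ref else 0.0
    return precision, recall


# Toy data (assumed): 4 reference boundaries, 5 hypothesized, 3 in common.
precision, recall = boundary_precision_recall([3, 7, 12, 20], [3, 7, 9, 12, 15])
```

With these toy positions, precision is 3/5 and recall is 3/4. A segmenter that proposes more boundaries tends to raise recall at the cost of precision, which is the trade-off the reported figures reflect.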
Speech repairs, intonational phrases and discourse markers: modeling speakers’ utterances in spoken dialogue
Computational Linguistics, 1999
Cited by 61 (9 self)
Interactive spoken dialogue provides many new challenges for natural language understanding systems. One of the most critical challenges is simply determining the speaker’s intended utterances: both segmenting a speaker’s turn into utterances and determining the intended words in each utterance. Even assuming perfect word recognition, the latter problem is complicated by the occurrence of speech repairs, which occur where speakers go back and change (or repeat) something they just said. The words that are replaced or repeated are no longer part of the intended utterance, and so need to be identified. Segmenting turns and resolving repairs are strongly intertwined with a third task: identifying discourse markers. Because of the interactions, and interactions with POS tagging and speech recognition, we need to address these tasks together and early on in the processing stream. This paper presents a statistical language model in which we redefine the speech recognition problem so that it includes the identification of POS tags, discourse markers, speech repairs and intonational phrases. By solving these simultaneously, we obtain better results on each task than addressing them separately. Our model is able to identify 72% of turn-internal intonational boundaries with a precision of 71%, 97% of discourse markers with 96% precision, and detect and correct 66% of repairs with 74% precision.
Speech Repairs, Intonational Boundaries and Discourse Markers: Modeling Speakers
Department of Computer Science, University of Rochester, 1997
Cited by 24 (8 self)
Peter Heeman was born October 22, 1963, and much to his dismay his parents had already moved away from Toronto. Instead he was born in London, Ontario, where he grew up on a strawberry farm. He attended the University of Waterloo where he received a Bachelors of Mathematics with a joint degree in Pure Mathematics and Computer Science in the spring of 1987. After working two years for a software engineering company, which supposedly used artificial intelligence techniques to automate COBOL and CICS programming, Peter was ready for a change. What better way to wipe the slate clear than by going to graduate school at the University of Toronto, but not without first spending the summer in Europe. After spending two months in countries where he couldn’t speak the language, Peter became fascinated by language, and so decided to give computational linguistics a try.
Modeling Linguistic Segment And Turn Boundaries For N-Best Rescoring Of Spontaneous Speech
Proc. EUROSPEECH, 1997
Cited by 12 (2 self)
Language modeling, especially for spontaneous speech, often suffers from a mismatch of utterance segmentations between training and test conditions. In particular, training often uses linguistically-based segments, whereas testing occurs on acoustically determined segments, resulting in degraded performance. We present an N-best rescoring algorithm that removes the effect of segmentation mismatch. Furthermore, we show that explicit language modeling of hidden linguistic segment boundaries is improved by including turn-boundary events in the model.
1. THE SEGMENTATION PROBLEM IN LANGUAGE MODELING
One of the problems encountered in speech recognition on continuous, spontaneous speech is the segmentation of long waveforms. Because current recognizers prefer short waveform segments for best performance and to limit computational resources, conversation-length waveforms are typically pre-segmented using simple acoustic criteria, such as locations of long pauses and turn switches. This crea...
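For readers unfamiliar with N-best rescoring in general, the sketch below shows the generic idea only: each recognizer hypothesis carries an acoustic log score, a language model supplies a second log score, and the hypothesis with the best weighted sum is selected. It does not reproduce the paper's algorithm for removing segmentation mismatch, whose details are not given in this abstract. The names `rescore_nbest` and `toy_lm`, the weight, and all scores are assumptions made for illustration.

```python
def rescore_nbest(nbest, lm_score, lm_weight=1.0):
    """Pick the hypothesis with the highest combined log score.

    nbest: list of (words, acoustic_log_score) pairs.
    lm_score: callable mapping a list of words to a log score.
    lm_weight: assumed scaling factor for the language model score.
    """
    return max(nbest, key=lambda hyp: hyp[1] + lm_weight * lm_score(hyp[0]))


def toy_lm(words):
    """Stand-in language model (assumed): penalize words it does not know."""
    known = {"i", "see", "it"}
    return sum(0.0 if w in known else -2.0 for w in words)


# Two toy hypotheses; the first has the better acoustic score.
nbest = [(["i", "sea", "it"], -10.0), (["i", "see", "it"], -10.5)]
best_words, best_acoustic = rescore_nbest(nbest, toy_lm)
```

Here the language model penalty on the unknown word "sea" lowers the first hypothesis to a combined score of -12.0, so the second hypothesis, at -10.5, is selected even though its acoustic score is worse.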
Word predictability after hesitations: A corpus-based study
In Proceedings of the International Conference on Spoken Language Processing, 1996
Cited by 12 (2 self)
We ask whether lexical hesitations in spontaneous speech tend to precede words that are difficult to predict. We define predictability in terms of both transition probability and entropy, in the context of an N-gram language model. Results show that transition probability is significantly lower at hesitation transitions, and that this is attributable to both the following word and the word history. In addition, results suggest that fluent transitions in sentences with a hesitation elsewhere are significantly more likely than transitions in fluent sentences to contain out-of-vocabulary words and novel word combinations. Such findings could be used to improve statistical language modeling for spontaneous-speech applications.
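The two predictability measures named in this abstract can be illustrated with a small bigram model. The sketch below uses unsmoothed maximum-likelihood estimates on a toy corpus; the paper's actual N-gram order, smoothing, and data are not stated here, so the function names, the corpus, and the use of "uh" as a hesitation token are all assumptions.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count bigrams: maps each history word to a Counter of next words."""
    counts = defaultdict(Counter)
    for words in sentences:
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def transition_probability(counts, prev, nxt):
    """Maximum-likelihood P(nxt | prev); 0.0 if the history was never seen."""
    total = sum(counts[prev].values()) if prev in counts else 0
    return counts[prev][nxt] / total if total else 0.0


def entropy(counts, prev):
    """Entropy in bits of the next-word distribution following `prev`."""
    if prev not in counts:
        return 0.0
    total = sum(counts[prev].values())
    return -sum((c / total) * math.log2(c / total) for c in counts[prev].values())


# Toy corpus (assumed); "uh" stands in for a lexical hesitation.
corpus = [
    ["i", "want", "uh", "pizza"],
    ["i", "want", "uh", "coffee"],
    ["i", "want", "pizza"],
]
model = train_bigram(corpus)
p_after_hesitation = transition_probability(model, "uh", "pizza")
h_after_hesitation = entropy(model, "uh")
```

In this toy corpus the word after "uh" is split evenly between two continuations, so the transition probability is 0.5 and the entropy is 1 bit, while the word after "i" is fully predictable. A low transition probability says the word that actually occurred was unlikely; a high entropy says the history left many continuations open.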
Modeling Speech Repairs And Intonational Phrasing To Improve Speech Recognition
In Automatic Speech Recognition and Understanding Workshop, 1999
Cited by 7 (3 self)
The spontaneous speech events of speech repairs and intonational phrasing cause disruptions in the local context, and this disruption prevents traditional language models from being able to properly predict the words in the vicinity of these events. The solution is to use a language model that can account for these spontaneous speech events. In this paper, we use such a model to rescore word graphs. This gives a small but significant decrease in the word error rate of 1.2%, in addition to an improvement of 4.4% from modeling the syntactic role of the words. Furthermore, as modeling of spontaneous speech events improves, word recognition results should also improve.
1. INTRODUCTION
To enable spoken dialogue systems to advance towards more collaborative interaction, systems need to handle language as it is actually spoken. People not only utter a string of words, but they group them into intonational phrases and make repairs to what they are saying. Consider the following speaker's tur...
Topic Identification In Natural Language Dialogues Using Neural Networks
In Proceedings of the 3rd SIGdial Workshop on Discourse and Dialogue, 2002
Cited by 2 (1 self)
In human-computer interaction systems using natural language, the recognition of the topic from the user's utterances is an important task. We examine two different perspectives on the problem of topic analysis needed for carrying out a successful dialogue. First, we apply self-organized document maps for modeling the broader subject of discourse based on the occurrence of content words in the dialogue context.
Sentence Boundary Detection On Broadcast News
Sentence boundary detection is an important task that augments automatic speech recognition word output with syntactic structure to allow for ease in reading and facilitate natural language processing tasks. This paper's approach recovers sentence boundaries based on a combination of prosodic and word-based features.
Curriculum Vitae
1997
Peter Heeman was born October 22, 1963, and much to his dismay his parents had already moved away from Toronto. Instead he was born in London, Ontario, where he grew up on a strawberry farm. He attended the University of Waterloo where he received a Bachelors of Mathematics with a joint degree in Pure Mathematics and Computer Science in the spring of 1987. After working two years for a software engineering company, which supposedly used artificial intelligence techniques to automate COBOL and CICS programming, Peter was ready for a change. What better way to wipe the slate clear than by going to graduate school at the University of Toronto, but not without first spending the summer in Europe. After spending two months in countries where he couldn’t speak the language, Peter became fascinated by language, and so decided to give computational linguistics a try.

