Information Structure in Discourse: Towards an Integrated Formal Theory of Pragmatics
1998
Cited by 61 (2 self)
For many linguists interested in pragmatics, including the Prague School theorists,
The Audio Notebook - Paper and Pen Interaction with Structured Speech
2001
Cited by 59 (2 self)
This paper addresses the problem that a listener experiences when attempting to capture information presented during a lecture, meeting, or interview. Listeners must divide their attention between the talker and their notetaking activity. We propose a new device -- the Audio Notebook -- for taking notes and interacting with a speech recording. The Audio Notebook is a combination of a digital audio recorder and paper notebook, all in one device. Audio recordings are structured using two techniques: user structuring based on notetaking activity, and acoustic structuring based on a talker's changes in pitch, pausing, and energy. A field study showed that the interaction techniques enabled a range of usage styles, from detailed review to high speed skimming. The study motivated the addition of phrase detection and topic suggestions to improve access to the audio recordings. Through these audio interaction techniques, the Audio Notebook defines a new approach for navigation in the audio domain.
Structure and intonation
Language, 1991
Cited by 58 (10 self)
Cue phrase classification using machine learning
Journal of Artificial Intelligence Research, 1996
Cited by 47 (2 self)
Cue phrases may be used in a discourse sense to explicitly signal discourse structure, but also in a sentential sense to convey semantic rather than structural information. Correctly classifying cue phrases as discourse or sentential is critical in natural language processing systems that exploit discourse structure, e.g., for performing tasks such as anaphora resolution and plan recognition. This paper explores the use of machine learning for classifying cue phrases as discourse or sentential. Two machine learning programs (cgrendel and C4.5) are used to induce classification models from sets of pre-classified cue phrases and their features in text and speech. Machine learning is shown to be an effective technique for not only automating the generation of classification models, but also for improving upon previous results. When compared to manually derived classification models already in the literature, the learned models often perform with higher accuracy and contain new linguistic insights into the data. In addition, the ability to automatically construct classification models makes it easier to comparatively analyze the utility of alternative feature representations of the data. Finally, the ease of retraining makes the learning approach more scalable and flexible than manual methods.
Assigning Intonational Features in Synthesized Spoken Directions
In Proceedings of the 26th Annual Meeting of the Association for Computational Linguistics, 1988
Cited by 43 (3 self)
Speakers convey much of the information hearers use to interpret discourse by varying prosodic features such as PHRASING, PITCH ACCENT placement, TUNE, and PITCH RANGE. The ability to emulate such variation is crucial to effective (synthetic) speech generation. While text-to-speech synthesis must rely primarily upon structural information to determine appropriate intonational features, speech synthesized from an abstract representation of the message to be conveyed may employ much richer sources. The implementation of an intonation assignment component for Direction Assistance, a program which generates spoken directions, provides a first approximation of how recent models of discourse structure can be used to control intonational variation in ways that build upon recent research in intonational meaning. The implementation further suggests ways in which these discourse models might be augmented to permit the assignment of appropriate intonational features.
Generating Contextually Appropriate Intonation
In Proceedings of the 6th Conference of the European Chapter of the Association for Computational Linguistics, 1993
Cited by 38 (12 self)
One source of unnaturalness in the output of text-to-speech systems stems from the involvement of algorithmically generated default intonation contours, applied under minimal control from syntax and semantics.
Generating Expression in Synthesized Speech
1990
Cited by 34 (1 self)
The document examines the proposal that affect can be reproduced in synthesized speech by imitating the effects of emotion in human speech. A program, the Affect Editor, was constructed to systematically vary the influence of the speech correlates of emo...
Discourse Structure in Spoken Language: Studies on Speech Corpora
In Working Notes of the AAAI Spring Symposium on Empirical Methods in Discourse Interpretation and Generation, 1995
Cited by 32 (9 self)
A better understanding of the intonational characteristics of spoken discourse may lead to new empirical techniques for identifying discourse structure from speech, as well as new algorithms for enhancing the naturalness of synthetic speech. This paper summarizes results of pilot studies that demonstrate reliable correlations of discourse and speech properties, and reports findings on a new corpus of direction-giving monologues, collected in both spontaneous and read speaking styles. Preliminary analyses of the direction-giving corpus show that the availability of speech significantly affects the reliability of discourse segmentation for a set of trained discourse labelers. Introduction This paper reports on ongoing corpus-based research on the intonational characteristics of spoken discourse in American English. The scientific goal of this research is to lay the foundations for a bootstrapping process, in which empirical evidence from spoken language informs us of strengths and weak...
Generating F0 contours from ToBI labels using linear regression
In ICSLP 96, 1996
Cited by 31 (1 self)
This paper describes a method for generating F0 contours from ToBI labelled utterances. The method uses linear regression to predict F0 target values for the start, mid-vowel and end of every syllable, using features representing the ToBI labels, stress and syllable position. Contours generated by this method for an English database have a correlation of 0.62 and 34.8 Hz RMS error when compared with originals from test data. These results are significant improvements on a previous rule driven method (0.40 and 44.7), and the new method contours are preferred by human listeners. The technique has also been successfully applied to Japanese ToBI with similar improvements.
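The two figures quoted in this abstract, correlation and RMS error between generated and original F0 contours, can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the paper; the sketch only shows how such scores are conventionally computed over paired F0 samples in Hz.

```python
import math


def rms_error(predicted, reference):
    """Root-mean-square difference between two equal-length F0 sequences (Hz)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_correlation(predicted, reference):
    """Pearson correlation coefficient between two equal-length F0 sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    covariance = sum((p - mean_p) * (r - mean_r)
                     for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_p * var_r)


# Hypothetical F0 targets (Hz) for a handful of syllable positions.
generated = [100.0, 110.0, 120.0, 115.0]
original = [103.0, 114.0, 118.0, 111.0]
error_hz = rms_error(generated, original)
correlation = pearson_correlation(generated, original)
```

A lower RMS error and a higher correlation both indicate a generated contour closer to the original, which is the sense in which the abstract compares the regression method with the earlier rule-driven one.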
A Speech-First Model For Repair Detection And Correction
In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993
Cited by 31 (1 self)
Interpreting fully natural speech is an important goal for spoken language understanding systems. However, while corpus studies have shown that about 10% of spontaneous utterances contain self-corrections, or REPAIRS, little is known about the extent to which cues in the speech signal may facilitate repair processing. We identify several cues based on acoustic and prosodic analysis of repairs in a corpus of spontaneous speech, and propose methods for exploiting these cues to detect and correct repairs. We test our acoustic-prosodic cues with other lexical cues to repair identification and find that precision rates of 89-93% and recall of 78-83% can be achieved, depending upon the cues employed, from a prosodically labeled corpus.
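The precision and recall figures reported in this abstract follow the usual definitions, which can be sketched as below. The function name and the utterance identifiers are hypothetical illustrations, not data or code from the paper; the sketch assumes repairs are scored per utterance.

```python
def precision_recall(detected, actual):
    """Return (precision, recall) for detected repairs against labeled repairs.

    Both arguments are collections of identifiers (here, utterance ids)
    marking where a repair was hypothesized or annotated.
    """
    detected = set(detected)
    actual = set(actual)
    true_positives = len(detected & actual)
    # Precision: share of detections that are real repairs.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of real repairs that were detected.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: 4 utterances flagged, 5 actually contain repairs.
flagged = {1, 2, 3, 4}
labeled = {2, 3, 4, 5, 6}
precision, recall = precision_recall(flagged, labeled)
```

In this made-up example three of the four flagged utterances are correct and three of the five labeled repairs are found, so precision is 0.75 and recall is 0.6.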

