Prosodic Cues to Discourse Segment Boundaries in Human-Computer Dialogue
In Proc. of SIGdial, 2004
Cited by 3 (0 self)
Theories of discourse structure hypothesize a hierarchical structure of discourse segments, typically tree-structured. While substantial work has been done on identifying and automatically recognizing the textual and prosodic correlates of discourse structure in monologue, comparable cues for dialogue or multiparty conversation, and in particular human-computer dialogue, remain relatively less studied. In this paper, we explore prosodic cues to discourse segmentation in human-computer dialogue. Using data drawn from 60 hours of interactions with a voice-only conversational spoken language system, we identify pitch and intensity features that signal segment boundaries. Specifically, based on 473 pairs of segment-final and segment-initiating utterances, we find significant increases for segment-initial utterances in maximum pitch, average pitch, and average intensity, while segment-final utterances show significantly lower minimum pitch. These results suggest that even in the artificial environment of human-computer dialogue, prosodic cues robustly signal discourse segment structure, comparably to the contrastive uses of pitch and amplitude identified in natural monologues.
Statistical Models for Text Segmentation
Machine Learning, 1999
Cited by 1 (0 self)
This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based on a technique that incrementally builds an exponential model to extract features that are correlated with the presence of boundaries in labeled training text. The models use two classes of features: topicality features that use adaptive language models in a novel way to detect broad changes of topic, and cue-word features that detect occurrences of specific words, which may be domain-specific, that tend to be used near segment boundaries. Assessment of our approach on quantitative and qualitative grounds demonstrates its effectiveness in two very different domains, Wall Street Journal news articles and television broadcast news story transcripts. Quantitative results on these domains are presented using a new probabilistically motivated error metric, which combines precision and recall in a natural and flexible way. This metric is used to make a quantitative ...
Pronominal anaphora in Basque: annotation of a real corpus
This paper describes the process followed in the annotation of pronominal anaphora in the Eus3LB corpus of Basque. Our aim is to use this annotation as the basis for later computational treatment of our language. We present the linguistic analysis carried out, the criteria defined for the tagging and some relevant linguistic conclusions about the features of the antecedents needed to link them correctly to their anaphoric elements.
Applications of Lexical Cohesion in the Topic Detection and Tracking Domain
2004
This thesis investigates the appropriateness of using lexical cohesion analysis to improve the performance of Information Retrieval (IR) and Natural Language Processing (NLP) applications that deal with documents in the news domain. This thesis reports on the performance of some challenging, real-world applications of lexical cohesion analysis with respect to the performance of bag-of-words approaches to these problems. In particular, we attempt to enhance New Event Detection and News Story Segmentation performance: two tasks currently being investigated by the Topic Detection and Tracking (TDT) initiative, a research programme dedicated to the intelligent organisation of broadcast news and newswire data streams.

