A Formal Framework for Linguistic Annotation
Speech Communication, 2000
Cited by 97 (18 self)
`Linguistic annotation' covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, `named entity' identification, co-reference annotation, and so on. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focused on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of existing annotation formats and demonstrate a common conceptual core, the annotation graph. This provides a formal framework for constructing, mai...
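The abstract names the annotation graph as the common core but does not define it here. The sketch below is one possible reading, assuming a graph of labelled arcs between nodes that may carry time offsets. All class names, field names and the toy data are illustrative assumptions, not the paper's formalism.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Node:
    """A point in the signal; the time offset is optional."""
    ident: str
    time: Optional[float] = None  # seconds into the recording, if known


@dataclass(frozen=True)
class Arc:
    """A labelled span between two nodes."""
    start: Node
    end: Node
    layer: str   # hypothetical layer name, e.g. "word" or "dialogue_act"
    label: str


@dataclass
class AnnotationGraph:
    arcs: list = field(default_factory=list)

    def add(self, start, end, layer, label):
        self.arcs.append(Arc(start, end, layer, label))

    def labels(self, layer):
        """Return the labels on one layer, ordered by start time.

        Assumes every start node on that layer has a time offset.
        """
        chosen = [a for a in self.arcs if a.layer == layer]
        chosen.sort(key=lambda a: a.start.time)
        return [a.label for a in chosen]


def build_example():
    """Two word arcs and one overlapping dialogue-act arc (toy data)."""
    n0, n1, n2 = Node("n0", 0.00), Node("n1", 0.35), Node("n2", 0.80)
    g = AnnotationGraph()
    g.add(n1, n2, "word", "there")
    g.add(n0, n1, "word", "hello")
    g.add(n0, n2, "dialogue_act", "greeting")
    return g
```

Keeping every layer in one arc list lets annotations of different kinds share the same nodes, which is the property a file-format-independent model needs.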
Intonation and Dialogue Context as Constraints for Speech Recognition
Language and Speech, 1998
Cited by 29 (4 self)
This paper describes a way of using intonation and dialogue context to improve the performance of an automatic speech recognition (ASR) system. Our experiments were run on the DCIEM Maptask corpus, a corpus of spontaneous task-oriented dialogue speech. This corpus has been tagged according to a dialogue analysis scheme that assigns each utterance to one of 12 "move types", such as "acknowledge", "query-yes/no" or "instruct". Most ASR systems use a bigram language model to constrain the possible sequences of words that might be recognised. Here we use a separate bigram language model for each move type. We show that when the "correct" move-specific language model is used for each utterance in the test set, the word error rate of the recogniser drops. Of course
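The idea of one bigram model per move type can be sketched as follows. This is a minimal illustration, assuming unsmoothed maximum-likelihood estimates and a tiny invented data set. The function names, the `<s>`/`</s>` padding symbols and the toy utterances are assumptions, not the authors' setup.

```python
from collections import Counter, defaultdict


def train_move_bigrams(utterances):
    """utterances: iterable of (move_type, list_of_words) pairs.

    Returns {move_type: Counter of (previous_word, word) pairs}.
    """
    models = defaultdict(Counter)
    for move, words in utterances:
        padded = ["<s>"] + list(words) + ["</s>"]
        models[move].update(zip(padded, padded[1:]))
    return models


def bigram_prob(models, move, prev, word):
    """Unsmoothed P(word | prev) under the model for `move`."""
    counts = models[move]
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts[(prev, word)] / total if total else 0.0


# Invented toy data; move names are taken from the abstract.
TOY_DATA = [
    ("acknowledge", ["okay"]),
    ("acknowledge", ["right"]),
    ("instruct", ["go", "left"]),
    ("instruct", ["go", "down"]),
]
```

A word sequence that is likely under the "instruct" model can have zero probability under the "acknowledge" model, which is how the move type constrains recognition. A real system would add smoothing.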
A Model for Habitable and Efficient Dialogue Management for Natural Language Interaction
1997
Cited by 27 (12 self)
Natural language interfaces require dialogue models that allow for robust, habitable and efficient interaction. This paper presents such a model for dialogue management for natural language interfaces. The model is based on empirical studies of human-computer interaction in various simple service applications. It is shown that for applications belonging to this class the dialogue can be handled using fairly simple means. The interaction can be modeled in a dialogue grammar with information on the functional role of an utterance as conveyed in the linguistic structure. Focusing is handled using dialogue objects recorded in a dialogue tree representing the constituents of the dialogue. The dialogue objects in the dialogue tree can be accessed by the various modules for interpretation, generation and background system access. Focused entities are modeled in entities pertaining to objects or sets of objects and related domain concept information; properties of the domain objects. A simple copying principle, where a new dialogue object's focal parameters are instantiated with information from the preceding dialogue object, accounts for most context-dependent utterances. The action to be carried out by the interface is determined on the basis of how the objects and related properties are specified. This in turn depends on information presented in the user utterance, context information from the dialogue tree, and information in the domain model. The use of dialogue objects facilitates customization to the sublanguage utilized in a specific application. The framework has successfully been applied to various background systems and interaction modalities. In the paper results are presented from the customization of the dialogue manager to three typed interaction applications are prese...
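The copying principle described above can be illustrated with a small sketch, assuming a dialogue object holds just two focal parameters, one for the focused objects and one for the focused properties. The class, its field names and the example values are hypothetical and stand in for whatever representation the paper actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DialogueObject:
    """Focal parameters of one dialogue move (field names are hypothetical)."""
    objects: Optional[str] = None      # focused object or set of objects
    properties: Optional[str] = None   # focused domain properties


def copy_focus(new, previous):
    """Fill focal parameters that the new utterance left unspecified
    with the values from the preceding dialogue object."""
    return DialogueObject(
        objects=new.objects if new.objects is not None else previous.objects,
        properties=(new.properties if new.properties is not None
                    else previous.properties),
    )
```

For example, if the preceding object has `objects="medium-sized cars"` and `properties="price"`, a follow-up that only specifies `properties="top speed"` inherits the same set of objects.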
Dialogue Acts, Synchronising Units and Anaphora Resolution
Journal of Semantics, 2000
Cited by 27 (0 self)
In this paper, we present the results of a corpus analysis, and a model of anaphora resolution in spontaneous spoken dialogues in the form of an algorithm. The main finding of our corpus analysis is that less than half the pronouns and demonstratives have NP antecedents in the preceding text. 22% have sentential antecedents and the remainder have no identifiable linguistic antecedents. As part of the corpus analysis we present the results of inter-annotator agreement tests. These were carried out for marking anaphor types and their antecedents, and for segmenting the dialogues into dialogue acts. The results of the inter-annotator agreement tests indicate that our classification method is reliable and that the annotated dialogues can be used as a standard against which to measure the performance of the resolution algorithm. The algorithm, based on Strube (1998), is capable of classifying pronouns and demonstratives, and co-indexing anaphors with NP and sentential antecedent...
Exploring Human Error Handling Strategies: Implications for Spoken Dialogue Systems
2003
Cited by 20 (3 self)
In this study, the user experience and the consequences of different error handling strategies for spoken dialogue are examined. A modification of the Wizard of Oz method is used, where a speech recogniser is included in the setting. This makes it possible to study how humans handle speech recognition errors before a dialogue system is actually built. The results show that wizards tend not to signal non-understanding when they face speech recognition problems, but instead ask task-related questions to confirm the wizard's hypothesis about the situation, rather than what has been said. This strategy leads to better understanding of subsequent utterances, whereas signalling non-understanding leads to decreased user experience of task success.
The Predictive Power of Game Structure in Dialogue Act Recognition: Experimental Results Using Maximum Entropy Estimation
In Proceedings of ICSLP-98, 1998
Cited by 18 (2 self)
Recognizing the dialogue act(s) performed by means of an utterance involves combining top-down expectations about the next likely `move' in a dialogue with bottom-up information extracted from the speech signal. We compare two ways of generating expectations: one which makes the expectations depend only on the previous act, and one which also takes into account the fact that individual dialogue acts play a role as part of larger conversational structures (`games'). Our results indicate that exploiting game structure does lead to improved expectations.
1. INTRODUCTION
Recognizing the dialogue act(s) performed by means of an utterance involves combining top-down expectations about the next likely `move' in a dialogue with bottom-up information extracted from the speech signal. The best current models of dialogue act recognition achieve an accuracy of about 70% on transcribed words and of 65% on recognized words (Stolcke et al., 1998; Reithinger and Klesen, 1997). We are trying to impro...
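The two ways of generating expectations can be contrasted with a small sketch. It uses plain relative-frequency counts rather than the maximum entropy estimation named in the title, so it only illustrates the choice of conditioning context. The game names, the "reply" act and the toy sequence are invented for illustration.

```python
from collections import Counter, defaultdict


def estimate(sequence, use_game):
    """sequence: list of (game, act) pairs in dialogue order.

    Returns {context: Counter of next acts}. The context is the previous
    act alone, or the pair (game of the previous act, previous act).
    """
    table = defaultdict(Counter)
    for (prev_game, prev_act), (_, act) in zip(sequence, sequence[1:]):
        context = (prev_game, prev_act) if use_game else prev_act
        table[context][act] += 1
    return table


def prob(table, context, act):
    """Relative-frequency estimate of P(act | context)."""
    total = sum(table[context].values())
    return table[context][act] / total if total else 0.0


# Invented toy dialogue: (hypothetical game label, dialogue act).
TOY = [
    ("instructing", "instruct"), ("instructing", "acknowledge"),
    ("querying", "query-yes/no"), ("querying", "reply"),
    ("querying", "acknowledge"), ("instructing", "instruct"),
    ("instructing", "acknowledge"), ("instructing", "instruct"),
]
```

In this toy data an "acknowledge" is followed by "instruct" two times out of three overall, but always when the acknowledgement closes a querying game, so the game-aware context gives a sharper expectation there.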
Robust interpretation in the HIGGINS spoken dialogue system
2004
Cited by 18 (16 self)
This paper describes PICKERING, the semantic interpreter developed in the HIGGINS project -- a research project on error handling in spoken dialogue systems. In the project, the initial efforts are centred on the input side of the system. The semantic interpreter combines a rich set of robustness techniques with the production of deep semantic structures. It allows insertions and non-agreement inside phrases, and combines partial results to return a limited list of semantically distinct solutions. A preliminary evaluation shows that the interpreter performs well under error conditions, and that the built-in robustness techniques contribute to this performance.
Replicability of Transaction and Action Coding in the Map Task Corpus
AAAI Spring Symposium: Empirical Methods in Discourse Interpretation and Generation, 1995
Cited by 17 (0 self)
Task-oriented dialogues can normally be divided into subdialogues, each of which reflects collaboration on a particular substep of the task, and which we call `transactions'. We have devised a way of identifying transactions and their associated actions for HCRC Map Task dialogues, and we have tested the replicability of our coding scheme using naive subjects.
Introduction
Much work on dialogue has concentrated on dialogues arising from collaborative tasks. These dialogues are both easier to analyse than free-form conversations and more relevant to practical applications of human-computer dialogue. At least from the participants' points of view, the most important issues are how to break the task into executable subtasks and what actions to perform and when. Comparatively little work has been done which relates participants' actions to what happens in the dialogue. During task-oriented dialogue, the participants form collaborative plans to reach a joint goal, transferring information b...
The Computational Processing of Intonational Prominence: A Functional Prosody Perspective
1997
Cited by 16 (2 self)
Intonational prominence, or accent, is a fundamental prosodic feature that is said to contribute to discourse meaning. This thesis outlines a new, computational theory of the discourse interpretation of prominence, from a FUNCTIONAL PROSODY perspective. Functional prosody makes the following two important assumptions: first, there is an aspect of prominence interpretation that centrally concerns discourse processes, namely the discourse focusing nature of prominence; and second, the role of prominence in language processing in general, and discourse processing in particular, is not essentially separate from the processing of other grammatical, nonprosodic information. This thesis develops a computational theory of prominence interpretation by explaining how prominence serves as an inference cue in discourse processing. Prominence signals changes in the attentional status of entities in a discourse model, while nonprominence signals that the realized entities are already in discourse fo...

