Using Context to Improve Emotion Detection in Spoken Dialog Systems
In Proceedings of Interspeech, 2005
Cited by 13 (4 self)
Most research that explores the emotional state of users of spoken dialog systems does not fully utilize the contextual nature that the dialog structure provides. This paper reports results of machine learning experiments designed to automatically classify the emotional state of user turns using a corpus of 5,690 dialogs collected with the "How May I Help You" spoken dialog system. We show that augmenting standard lexical and prosodic features with contextual features that exploit the structure of spoken dialog and track user state increases classification accuracy by 2.6%.
Early Error Detection on Word Level
2004
Cited by 8 (2 self)
In this paper two studies are presented in which the detection of speech recognition errors on the word level was examined.
Automatic Evaluation: Using a DATE Dialogue Act Tagger for User Satisfaction and Task Completion Prediction
In LREC, 2002
Cited by 8 (1 self)
The objective of the DARPA Communicator project is to support rapid, cost-effective development of multi-modal speech-enabled dialogue systems with advanced conversational capabilities. During the course of the Communicator program, we have been involved in developing methods for measuring progress towards the program goals and assessing advances in the component technologies required to achieve such goals. Our goal has been to develop a lightweight evaluation paradigm for heterogeneous systems. In this paper, we utilize the Communicator evaluation corpus from 2001 and build on previous work applying the PARADISE evaluation framework to establish a baseline for fully automatic system evaluation. We train a regression tree to predict User Satisfaction using a random 80% of the dialogues for training. The metrics (features) we use for prediction are a fully automatic Task Success Measure, Efficiency Measures, and System Dialogue Act Behaviors extracted from the dialogue logfiles using the DATE (Dialogue Act Tagging for Evaluation) tagging scheme. The learned tree with the DATE metrics has a correlation of 0.614 (R² of 0.376) with the actual user satisfaction values for the held-out test set, while the learned tree without the DATE metrics has a correlation of 0.595 (R² of 0.35).
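The two figures reported for each tree are consistent with a correlation and its square (0.614 squared is roughly 0.377). A minimal Python sketch of that relationship follows. The data values are hypothetical and not taken from the Communicator corpus, and `pearson_r` is an illustrative helper rather than part of the authors' tooling.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical held-out data: actual vs. predicted user satisfaction.
actual = [3.0, 4.0, 2.0, 5.0, 4.0]
predicted = [2.5, 4.5, 3.0, 4.0, 3.5]

r = pearson_r(actual, predicted)
r_squared = r ** 2  # share of variance explained by a linear fit
```

Comparing `r_squared` for a model with and without a feature group is one simple way to read the DATE versus no-DATE comparison in the abstract.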
Learning Decision Models in Spoken Dialogue Systems via User Simulation
Cited by 4 (0 self)
This paper describes a set of experiments designed to explore the utility of simulated dialogues and automatic rule induction in spoken dialogue systems. The experiments were conducted within a flight domain task, where the user supplies source, destination, and date to the system. The system was configured to support explicitly about 500 large cities; any other cities could only be recovered through a spell-mode subdialogue. Two specific problems were identified: the conflict problem, and the compliance problem. A RIPPER-based rule induction algorithm was applied to data from user simulation runs, and the resulting system was compared against a manually developed baseline system. The learned rules performed significantly better than the manual ones for a number of different measures of success, for both simulations and real user dialogues.
Prosody and Speaker State: Paralinguistics, Pragmatics, and Proficiency
2007
Cited by 3 (0 self)
Prosody (suprasegmental characteristics of speech such as pitch, rhythm, and loudness) is a rich source of information in spoken language and can tell a listener much about the internal state of a speaker. This thesis explores the role of prosody in conveying three very different types of speaker state: paralinguistic state, in particular emotion; pragmatic state, in particular questioning; and the state of spoken language proficiency of non-native English speakers. Paralinguistics. Intonational features describing pitch contour shape were found to discriminate emotion in terms of positive and negative affect. A procedure is described for clustering groups of listeners according to perceptual emotion ratings that foster further understanding of the relationship between acoustic-prosodic cues and emotion perception. Pragmatics. Student questions in a corpus of one-on-one tutorial dialogs were found to be signaled primarily by phrase-final rising intonation, an important cue used in conjunction with lexico-pragmatic cues to differentiate the high rate of observed declarative questions from proper declaratives. The automatic classification of question form and
Automatic analysis of medical dialogue in the home hemodialysis domain: structure induction and summarization
Journal of Biomedical Informatics, 2006
Cited by 2 (0 self)
Spoken medical dialogue is a valuable source of information for patients and caregivers. This work presents a first step towards automatic analysis and summarization of spoken medical dialogue. We first abstract a dialogue into a sequence of semantic categories using linguistic and contextual features integrated in a supervised machine-learning framework. Our model has a classification accuracy of 73%, compared to 33% achieved by a majority baseline (p<0.01). We then describe and implement a summarizer that utilizes this automatically induced structure. Our evaluation results indicate that automatically generated summaries exhibit high resemblance to summaries written by humans. In addition, task-based evaluation shows that physicians can reasonably answer questions related to patient care by looking at the automatically generated summaries alone, in contrast to the physicians' performance when they were given summaries from a naïve summarizer (p<0.05). This work demonstrates the feasibility of automatically structuring and summarizing spoken medical dialogue.
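A majority baseline always predicts the most frequent category seen in training, so its accuracy is the share of test items that carry that category. The sketch below illustrates the computation in Python. The category names and label lists are hypothetical, and `majority_baseline_accuracy` is an illustrative helper, not the authors' code.

```python
from collections import Counter

def majority_baseline_accuracy(train_labels, test_labels):
    """Return the majority training label and its accuracy on the test labels."""
    majority = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for label in test_labels if label == majority)
    return majority, hits / len(test_labels)

# Hypothetical utterance categories, for illustration only.
train = ["clinical", "clinical", "technical", "backchannel"]
test = ["clinical", "technical", "clinical", "backchannel"]

majority_label, baseline_accuracy = majority_baseline_accuracy(train, test)
```

A learned classifier is only informative to the extent that it beats this number on the same test set.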
Modeling User Satisfaction Transitions in Dialogues from Overall Ratings
Cited by 1 (0 self)
This paper proposes a novel approach for predicting user satisfaction transitions during a dialogue only from the ratings given to entire dialogues, with the aim of reducing the cost of creating reference ratings for utterances/dialogue-acts that have been necessary in conventional approaches. In our approach, we first train hidden Markov models (HMMs) of dialogue-act sequences associated with each overall rating. Then, we combine such rating-related HMMs into a single HMM to decode a sequence of dialogue-acts into state sequences representing to which overall rating each dialogue-act is most related, which leads to our rating predictions. Experimental results in two dialogue domains show that our approach can make reasonable predictions; it significantly outperforms a baseline and nears the upper bound of a supervised approach in some evaluation criteria. We also show that introducing states that represent dialogue-act sequences that occur commonly in all ratings into an HMM significantly improves prediction accuracy.
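The decoding step described above can be illustrated with a standard Viterbi pass over a combined HMM. The Python sketch below is a simplification under stated assumptions: each rating is collapsed to a single state ("low" and "high"), and every probability and dialogue-act name is hypothetical. It is not the authors' model, which trains a separate multi-state HMM per rating before combining them.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely state sequence for the observations (log space)."""
    # best[t][s] = (log-probability of the best path ending in s at time t, backpointer)
    best = [{s: (log_start[s] + log_emit[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            score, source = max((prev[p][0] + log_trans[p][s], p) for p in states)
            column[s] = (score + log_emit[s][obs], source)
        best.append(column)
    # Follow backpointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {key: math.log(value) for key, value in table.items()}

# Hypothetical combined HMM: one state per overall rating.
states = ["low", "high"]
log_start = logs({"low": 0.5, "high": 0.5})
log_trans = {"low": logs({"low": 0.8, "high": 0.2}),
             "high": logs({"low": 0.2, "high": 0.8})}
log_emit = {"low": logs({"apology": 0.7, "confirm": 0.3}),
            "high": logs({"apology": 0.2, "confirm": 0.8})}

acts = ["confirm", "confirm", "apology", "apology"]
path = viterbi(acts, states, log_start, log_trans, log_emit)
# path == ["high", "high", "low", "low"]
```

Each decoded state names the rating its dialogue-act is most associated with, so the path reads as a satisfaction trajectory over the dialogue.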
Which System Differences Matter? Using ℓ1/ℓ2 Regularization to Compare Dialogue Systems
We investigate how to jointly explain the performance and behavioral differences of two spoken dialogue systems. The Join Evaluation and Differences Identification (JEDI) finds differences between systems relevant to performance by formulating the problem as a multi-task feature selection question. JEDI provides evidence on the usefulness of a recent method, ℓ1/ℓp-regularized regression (Obozinski et al., 2007). We evaluate against manually annotated success criteria from real users interacting with five different spoken user interfaces that give bus schedule information.
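For p = 2, the mixed norm in the title sums, over features, the ℓ2 norm of each feature's weights across tasks. Penalizing it pushes whole rows of the weight matrix to zero, so a feature tends to be kept or dropped for all tasks together. The Python sketch below computes only the penalty term. The weight matrix is hypothetical, and how JEDI defines its tasks and features is not specified in the abstract.

```python
import math

def l1_l2_norm(weights):
    """Sum over features (rows) of the l2 norm of that feature's weights across tasks."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in weights)

# Hypothetical weights: 3 features (rows) x 2 tasks (columns).
weights = [
    [3.0, 4.0],  # used by both tasks, row norm 5
    [0.0, 0.0],  # dropped for both tasks, row norm 0
    [1.0, 0.0],  # used by one task, row norm 1
]

penalty = l1_l2_norm(weights)  # 5 + 0 + 1 = 6
```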

