Results 1 - 10 of 24
Designing the User Interface for Multimodal Speech and Pen-based Gesture Applications: State-of-the-Art Systems and Future Research Directions, 2000
Cited by 102 (14 self)
The growing interest in multimodal interface design is inspired in large part by the goals of supporting more transparent, flexible, efficient, and powerfully expressive means of human-computer interaction than in the past. Multimodal interfaces are expected to support a wider range of diverse applications, to be usable by a broader spectrum of the average population, and to function more reliably under realistic and challenging usage conditions. In this paper, we summarize the emerging architectural approaches for interpreting speech and pen-based gestural input in a robust manner, including early and late fusion approaches, and the new hybrid symbolic/statistical approach. We also describe a diverse collection of state-of-the-art multimodal systems that process users' spoken and gestural input. These applications range from map-based and virtual reality systems for engaging in simulations and training, to field medic systems for mobile use in noisy environments, to web-based transactions and standard text-editing applications that will reshape daily computing and have a significant commercial impact. To realize successful multimodal systems of the future, many key research challenges remain to be addressed. Among these challenges are the development of cognitive theories to guide multimodal system design, and the development of effective natural language processing, dialogue processing, and error handling techniques. In addition, new multimodal systems will be needed that can function more robustly and adaptively, and with support for collaborative multi-person use. Before this new class of systems can proliferate, toolkits also will be needed to promote software development for both simulated and functioning systems.
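The late fusion idea named in this abstract can be illustrated with a minimal sketch. It is not the architecture of any system surveyed in the paper: the function name `late_fusion`, the compatibility rule, and the recognizer scores are all assumptions made up for illustration. Each modality is recognized on its own, and the two n-best lists are combined only afterwards.

```python
from itertools import product


def late_fusion(speech_nbest, gesture_nbest, compatible):
    """Rank joint interpretations built from two independently recognized n-best lists.

    Each n-best list holds (label, probability) pairs. `compatible` is a
    hypothetical predicate that says whether a spoken command and a gesture
    can be combined into one interpretation.
    """
    joint = []
    for (s_label, s_prob), (g_label, g_prob) in product(speech_nbest, gesture_nbest):
        if compatible(s_label, g_label):
            # Assumption: the recognizers are treated as independent, so scores multiply.
            joint.append(((s_label, g_label), s_prob * g_prob))
    return sorted(joint, key=lambda item: item[1], reverse=True)


def needs_location(speech_label, gesture_label):
    # Toy rule: only commands ending in "here" pair with a pointing gesture.
    return speech_label.endswith("here") == (gesture_label == "point")


# Made-up recognizer outputs, for illustration only.
speech = [("zoom out", 0.6), ("zoom in here", 0.4)]
gesture = [("point", 0.7), ("circle", 0.3)]

ranked = late_fusion(speech, gesture, needs_location)
```

In this toy run the top joint interpretation pairs "zoom in here" with the pointing gesture, even though "zoom out" was the speech recognizer's first choice. Combining the modalities after recognition lets a confident gesture overturn a speech error.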
The Automated Design of Believable Dialogues for Animated Presentation Teams - Embodied Conversational Agents, 2000
Cited by 91 (13 self)
In this paper, we investigate a new style for presenting information. We introduce the notion of presentation teams which, rather than addressing the user directly, convey information in the style of performances to be observed by the user. The paper is organized as follows. First, we report on our experience with two single animated presentation agents and explain how to evaluate their success. After that, we move to presentation teams and discuss their potential benefits for presentation tasks. In section 2, we describe the basic steps of our approach to the automated generation of performances with multiple characters. This approach has been applied to two different ... In: J. Cassell, S. Prevost, J. Sullivan, and E. Churchill: Embodied Conversational ...
Interactive Pedagogical Drama, 2000
Cited by 85 (14 self)
This paper describes an agent-based approach to realizing interactive pedagogical drama. Characters choose their actions autonomously, while director and cinematographer agents manage the action and its presentation in order to maintain story structure, achieve pedagogical goals, and present the dynamic story so as to achieve the best dramatic effect. Artistic standards must be maintained while permitting substantial variability in story scenario. To achieve these objectives, scripted dialog is deconstructed into elements that are portrayed by agents with emotion models. Learners influence how the drama unfolds by controlling the intentions of one or more characters, who then behave in accordance with those intentions. Interactions between characters create opportunities to move the story in pedagogically useful directions, which the automated director exploits. This approach is realized in the multimedia title Carmen's Bright IDEAS, an interactive health intervention designed to impro...
Tears and Fears: Modeling emotions and emotional behaviors in synthetic agents - In Proceedings of the 5th International Conference on Autonomous Agents, 2001
Cited by 66 (5 self)
Emotions play a critical role in creating engaging and believable characters to populate virtual worlds. Our goal is to create general computational models to support characters that act in virtual environments, make decisions, but whose behavior also suggests an underlying emotional current. In service of this goal, we integrate two complementary approaches to emotional modeling into a single unified system. Gratch's Émile system focuses on the problem of emotional appraisal: how emotions arise from an evaluation of how environmental events relate to an agent's plans and goals. Marsella et al.'s IPD system focuses more on the impact of emotions on behavior, including the impact on the physical expressions of emotional state through suitable choice of gestures and body language. This integrated model is layered atop Steve, a pedagogical agent architecture, and exercised within the context of the Mission Rehearsal Exercise, a prototype system designed to teach decision-making skills in highly evocative situations.
Embodied contextual agent in information delivering application - In First International Joint Conference on Autonomous Agents & Multi-Agent Systems (AAMAS), 2002
"... Application ..."
Eye communication in a conversational 3D synthetic agent, 2000
Cited by 24 (6 self)
Our goal is to create an “intelligent” 3D agent able to send complex, ‘natural’ messages to users and, in the future, to converse with them. We look at the relationship between the agent’s communicative intentions and the way that these intentions are expressed in verbal and nonverbal messages. In this paper, we concentrate on the study and generation of coordinated linguistic and gaze communicative acts. In this view, we analyse gaze signals according to their functional meaning rather than their physical actions. We propose a formalism where a communicative act is represented by two elements: a meaning (which corresponds to a set of goals and beliefs that the agent intends to transmit to the interlocutor) and a signal, which is the nonverbal expression of that meaning. We also outline a methodology to generate messages that coordinate verbal with nonverbal signals.
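The two-element formalism described in this abstract can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: the class name `CommunicativeAct`, the lookup helper `signals_for`, and every lexicon entry are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommunicativeAct:
    """A (meaning, signal) pair, following the two-element formalism in the abstract."""
    meaning: frozenset  # goals and beliefs the agent intends to transmit
    signal: str         # nonverbal expression of that meaning


# Hypothetical lexicon entries, invented for illustration; not taken from the paper.
GAZE_LEXICON = (
    CommunicativeAct(frozenset({"goal: give the turn"}), "look at interlocutor"),
    CommunicativeAct(frozenset({"belief: I am thinking"}), "look up"),
    CommunicativeAct(frozenset({"goal: keep the turn"}), "look away"),
)


def signals_for(meaning_item, lexicon=GAZE_LEXICON):
    """Look up gaze signals by functional meaning rather than by physical action."""
    return [act.signal for act in lexicon if meaning_item in act.meaning]


thinking_signals = signals_for("belief: I am thinking")
```

Keying the lookup on meaning mirrors the abstract's choice to classify gaze by what it communicates, so a generator can start from an intention and retrieve a matching signal.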
Automatic Generation of Non-Verbal Facial Expressions from Speech - In Proc. Computer Graphics International 2002, 2002
Cited by 17 (5 self)
Speech-synchronized facial animation that controls only the movement of the mouth is typically perceived as wooden and unnatural. We propose a method to generate additional facial expressions such as movement of the head, the eyes, and the eyebrows fully automatically from the input speech signal. This is achieved by extracting prosodic parameters such as pitch flow and power spectrum from the speech signal and using them to control facial animation parameters in accordance with results from paralinguistic research.
Paired Speech and Gesture Generation in Embodied Conversational Agents, 2000
Cited by 11 (0 self)
Using face-to-face conversation as an interface metaphor, an embodied conversational agent is likely to be easier to use and learn than traditional graphical user interfaces. To make a believable agent that to some extent has the same social and conversational skills as humans do, the embodied conversational agent system must be able to deal with user input from different communication modalities such as speech and gesture, as well as generate appropriate behaviors for those communication modalities. In this thesis, I address the problem of paired speech and gesture generation in embodied conversational agents. I propose a real-time generation framework that is capable of generating a comprehensive description of communicative actions, including speech, gesture, and intonation, in the real-estate domain. The generation of speech, gesture, and intonation is based on the same underlying representation of real-estate properties, discourse information structure, intentional and attentional structures, and a mechanism to update the common ground between the user and the agent. Algorithms have been implemented to analyze the discourse information structure, contrast, and surprising semantic features, which together decide the intonation contour of the speech utterances and where gestures occur. I also investigate through a correlational study the role of communicative goals in determining the distribution of semantic features across speech and gesture modalities.
Multimodal Model Integration for Sentence Unit Detection, 2004
Cited by 10 (3 self)
In this paper, we adopt a direct modeling approach to utilize conversational gesture cues in detecting sentence boundaries, called SUs, in videotaped conversations. We treat the detection of SUs as a classification task such that for each inter-word boundary, the classifier decides whether there is an SU boundary or not. In addition to gesture cues, we also utilize prosody and lexical knowledge sources. In this first investigation, we find that gesture features complement the prosodic and lexical knowledge sources for this task. By using all of the knowledge sources, the model is able to achieve the lowest overall SU detection error rate.
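The per-boundary classification framing in this abstract can be sketched as follows. The abstract does not specify the classifier, so the logistic model here is a stand-in, and the feature names, weights, bias, and threshold are all hypothetical values chosen for illustration rather than anything reported in the paper.

```python
import math

# Hypothetical, hand-picked weights; a real system would learn them from data.
WEIGHTS = {
    "pause_sec": 2.0,             # prosodic cue: silence after the word
    "next_word_is_starter": 1.5,  # lexical cue: next word often starts a sentence
    "gesture_hold": 1.0,          # gesture cue: hands at rest at the boundary
}
BIAS = -2.0


def su_probability(features):
    """Combine all knowledge sources into one boundary probability (logistic model)."""
    score = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def detect_sus(boundaries, threshold=0.5):
    """Make one SU / no-SU decision per inter-word boundary."""
    return [su_probability(features) >= threshold for features in boundaries]


# Two made-up inter-word boundaries, for illustration only.
boundaries = [
    {"pause_sec": 0.05, "next_word_is_starter": 0, "gesture_hold": 0},
    {"pause_sec": 0.60, "next_word_is_starter": 1, "gesture_hold": 1},
]
decisions = detect_sus(boundaries)
```

With these invented numbers only the second boundary is labeled an SU, because its prosodic, lexical, and gesture cues all point the same way.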
Modeling the Interplay of Emotions and Plans in Multi-Agent Simulations
Cited by 7 (1 self)
The goal of this research is to create general computational models of the interplay between affect, cognition, and behavior. These models are being designed to support characters that act in virtual environments, make decisions, but whose behavior also suggests an underlying emotional current. We attempt to capture both the cognitive and behavioral aspects of emotion, restricted to the role that emotions play in the performance of concrete physical tasks. We address how emotions arise from an evaluation of the relationship between environmental events and an agent's plans and goals, as well as the impact of emotions on behavior, in particular the impact on the physical expressions of emotional state through suitable choice of gestures and body language. The approach is illustrated within a virtual reality training environment.

