Results 1 - 8 of 8
A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions
2009
Cited by 69 (17 self)
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
How to distinguish posed from spontaneous smiles using geometric features
Proc. ICMI, 2007
Cited by 24 (12 self)
Automatic distinction between posed and spontaneous expressions is an unsolved problem. Previous studies in the cognitive sciences indicated that the automatic separation of posed from spontaneous expressions is possible using the face modality. However, little is known about the information carried by head and shoulder motion. In this work, we propose to (i) distinguish between posed and spontaneous smiles by fusing head, face, and shoulder modalities, (ii) investigate which modalities carry important information and how the modalities relate to each other, and (iii) determine to what extent the temporal dynamics of these signals contribute to solving the problem. A cylindrical head tracker is used to track head motion, and two particle filtering techniques are used to track facial and shoulder motion. Classification is performed by kernel methods combined with ensemble learning techniques. We investigated two aspects of multimodal fusion: the level of abstraction (i.e., early, mid-level, and late fusion) and the fusion rule used (i.e., sum, product, and weight criteria). Experimental results from 100 videos displaying posed smiles and 102 videos displaying spontaneous smiles are presented. The best results were obtained with late fusion of all modalities, with 94.0% of the videos classified correctly. Categories and Subject Descriptors: I.2.10 [Vision and scene understanding]: Video analysis.
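The abstract names late fusion with sum, product, and weight criteria but does not give the formulas. The sketch below shows one common reading of those rules, applied to per-modality class posteriors. The function name `late_fusion`, the weights, and all scores are hypothetical and are not taken from the paper.

```python
from math import prod


def late_fusion(posteriors, rule="sum", weights=None):
    """Combine per-modality class posteriors into one decision.

    posteriors: list of dicts mapping class label -> probability,
                one dict per modality (e.g. head, face, shoulder).
    rule:       "sum", "product", or "weight" (assumed definitions).
    weights:    one non-negative weight per modality, for "weight" only.
    """
    classes = list(posteriors[0].keys())
    if rule == "sum":
        scores = {c: sum(p[c] for p in posteriors) for c in classes}
    elif rule == "product":
        scores = {c: prod(p[c] for p in posteriors) for c in classes}
    elif rule == "weight":
        if weights is None or len(weights) != len(posteriors):
            raise ValueError("need one weight per modality")
        scores = {
            c: sum(w * p[c] for w, p in zip(weights, posteriors))
            for c in classes
        }
    else:
        raise ValueError(f"unknown fusion rule: {rule}")

    # Renormalise so the fused scores sum to one (when possible).
    total = sum(scores.values())
    if total > 0:
        scores = {c: s / total for c, s in scores.items()}
    label = max(scores, key=scores.get)
    return label, scores


# Made-up posteriors from three hypothetical single-modality classifiers.
head = {"posed": 0.60, "spontaneous": 0.40}
face = {"posed": 0.30, "spontaneous": 0.70}
shoulder = {"posed": 0.55, "spontaneous": 0.45}

sum_label, _ = late_fusion([head, face, shoulder], rule="sum")
product_label, _ = late_fusion([head, face, shoulder], rule="product")
weight_label, _ = late_fusion(
    [head, face, shoulder], rule="weight", weights=[3.0, 1.0, 1.0]
)
```

With these illustrative numbers the sum and product rules agree, while a heavy weight on the head modality flips the decision, which is why the choice of fusion rule is worth evaluating separately from the level of abstraction.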
Gaze-X: Adaptive affective multimodal interface for single-user office scenarios
Proc. ACM Int’l Conf. Multimodal Interfaces, 2006
Cited by 14 (6 self)
This paper describes an intelligent system that we developed to support affective multimodal human-computer interaction (AMM-HCI) where the user’s actions and emotions are modeled and then used to adapt the HCI and support the user in his or her activity. The proposed system, which we named Gaze-X, is based on sensing and interpretation of the human part of the computer’s context, known as W5+ (who, where, what, when, why, how). It integrates a number of natural human communicative modalities including speech, eye gaze direction, face and facial expression, and a number of standard HCI modalities like keystrokes, mouse movements, and active software identification, which, in turn, are fed into processes that provide decision making and adapt the HCI to support the user in his or her activity according to his or her preferences. To attain a system that can be educated, that can improve its knowledge and decision making through experience, we use case-based reasoning as the inference engine of Gaze-X. The utilized case base is a dynamic, incrementally self-organizing event-content-addressable memory that allows fact retrieval and evaluation of encountered events based upon the user preferences and the generalizations formed from prior input. To support concepts of concurrency, modularity/scalability, persistency, and mobility, Gaze-X has been built as an agent-based system where different agents are responsible for different parts of the processing. A usability study conducted in an office scenario with a number of users indicates that Gaze-X is perceived as effective, easy to use, useful, and affectively qualitative.
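As a rough illustration of case-based reasoning over context descriptions, the sketch below stores past cases, retrieves the most similar one for a new context, and retains new cases as they occur. It is not the Gaze-X case base: the class names, the context keys, the actions, and the simple overlap similarity are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    context: dict  # hypothetical context slots, e.g. who/what/emotion
    action: str    # hypothetical interface adaptation chosen for it


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    @staticmethod
    def similarity(a, b):
        """Fraction of context slots on which two contexts agree."""
        keys = set(a) | set(b)
        if not keys:
            return 0.0
        matches = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
        return matches / len(keys)

    def retrieve(self, context):
        """Return the stored case closest to the context, with its score."""
        if not self.cases:
            return None, 0.0
        best = max(self.cases, key=lambda c: self.similarity(c.context, context))
        return best, self.similarity(best.context, context)

    def retain(self, context, action):
        """Store a new case so later retrievals can reuse it."""
        self.cases.append(Case(dict(context), action))


base = CaseBase()
base.retain({"who": "alice", "what": "writing", "emotion": "frustrated"}, "offer_help")
base.retain({"who": "alice", "what": "browsing", "emotion": "neutral"}, "do_nothing")

query = {"who": "alice", "what": "writing", "emotion": "frustrated", "when": "morning"}
best_case, score = base.retrieve(query)
```

A real system would need a richer similarity measure and some way to revise retrieved actions from user feedback; this sketch only shows the retrieve and retain steps.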
From the Lab to the Real World: Affect Recognition Using Multiple Cues and Modalities
Cited by 6 (2 self)
Human affect sensing can be obtained from a broad range of behavioral cues and signals that are available via visual, acoustic, and tactual expressions or presentations of emotions. Affective states can thus be recognized from visible/external signals such as gestures (e.g., facial expressions, body gestures, head movements, etc.), and speech (e.g., parameters such
Systems University of Technology,
Automatic distinction between posed and spontaneous expressions is an unsolved problem. Previous studies in the cognitive sciences indicated that the automatic separation of posed from spontaneous expressions is possible using the face modality. However, little is known about the information carried by head and shoulder motion. In this work, we propose to (i) distinguish between posed and spontaneous smiles by fusing head, face, and shoulder modalities, (ii) investigate which modalities carry important information and how the modalities relate to each other, and (iii) determine to what extent the temporal dynamics of these signals contribute to solving the problem. A cylindrical head tracker is used to track head motion, and two particle filtering techniques are used to track facial and shoulder motion. Classification is performed by kernel methods combined with ensemble learning techniques. We investigated two aspects of multimodal fusion: the level of abstraction (i.e., early, mid-level, and late fusion) and the fusion rule used (i.e., sum, product, and weight criteria). Experimental results from 100 videos displaying posed smiles and 102 videos displaying spontaneous smiles are presented. The best results were obtained with late fusion of all modalities, with 94.0% of the videos classified correctly.
Virtual agent multimodal mimicry of humans
DOI 10.1007/s10579-007-9057-1, 2008
This work is about multimodal and expressive synthesis on virtual agents, based on the analysis of actions performed by human users. As input we consider the image sequence of the recorded human behavior. Computer vision and image processing techniques are incorporated in order to detect the cues needed for expressivity feature extraction. The multimodality of the approach lies in the fact that both facial and gestural aspects of the user’s behavior are analyzed and processed. The mimicry consists of perception, interpretation, planning, and animation of the expressions shown by the human, resulting not in an exact duplicate but rather in an expressive model of the user’s original behavior.
12 Emotion Modelling and Facial Affect Recognition in Human-Computer and Human-Robot Interaction
As research has revealed the deep role that emotion and emotion expression play in human social interaction, researchers in human-computer interaction have proposed that more effective human-computer interfaces can be realized if the interface models the user’s emotion as well as expresses emotions. Affective computing was defined by Rosalind Picard
Classification Accuracy of Neural Networks with PCA in Emotion Recognition
This paper presents the classification accuracy of a neural network with principal component analysis (PCA) for feature selection in emotion recognition using facial expressions. Dimensionality reduction of a feature set is a common preprocessing step in pattern recognition and classification applications. PCA is one of the popular methods used, and can be shown to be optimal under different optimality criteria. Experimental results, in which we achieved a recognition rate of approximately 85% when testing six emotions on a benchmark image data set, show that neural networks with PCA are effective in emotion recognition using facial expressions. Keywords: emotion recognition, feature selection, neural network, PCA. 2000 MSC: 68T45, 97P20, 97R40.
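To make the dimensionality reduction step concrete, the sketch below computes a PCA projection in plain Python using power iteration with deflation on the covariance matrix. It covers only the PCA stage; the neural network that would consume the reduced features is omitted. The function names and the toy data are assumptions for illustration and do not come from the paper.

```python
import math
import random


def center(data):
    """Subtract the per-feature mean from every sample."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: dominant eigenvector and eigenvalue."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in the matrix
            break
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * mv[i] for i in range(d))
    return v, eigenvalue


def pca(data, k):
    """Project data onto its first k principal components."""
    centered = center(data)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v, lam = top_eigenvector(cov)
        components.append(v)
        # Deflation: remove the component just found from the matrix.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    projected = [
        [sum(r[j] * c[j] for j in range(d)) for c in components]
        for r in centered
    ]
    return projected, components


# Toy data lying on the line y = 2x, so one component captures everything.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
reduced, comps = pca(samples, k=1)
```

In practice a numerical library would be used for the eigendecomposition; the pure-Python version is only meant to show what the projection computes before the reduced features are passed to a classifier.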

