A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions
, 2009
Cited by 69 (17 self)
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
Human Computing and Machine Understanding of Human Behavior: A Survey
- SURVEY, PROC. ACM INT’L CONF. MULTIMODAL INTERFACES
, 2006
Cited by 54 (25 self)
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next generation computing, which we will call human computing, should be about anticipatory user interfaces that should be human-centered, built for humans based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, how far we are from enabling computers to understand human behavior.
Smart Sensor Integration: A Framework for Multimodal Emotion Recognition in Real-Time
Cited by 13 (6 self)
Affect sensing by machines has been argued to be an essential part of next-generation human-computer interaction (HCI). To this end, a large number of studies have been conducted in recent years, which report automatic recognition of emotion as a difficult but feasible task. However, most effort has been put towards offline analysis, whereas to date only a few applications exist that are able to react to a user’s emotion in real time. In response to this deficit we introduce a framework we call Smart Sensor Integration (SSI), which considerably jump-starts the development of multimodal online emotion recognition (OER) systems. In particular, SSI supports the pattern recognition pipeline by offering tailored tools for data segmentation, feature extraction, and pattern recognition, as well as tools to apply them offline (training phase) and online (real-time recognition). Furthermore, it has been designed to handle input from various input modalities and to suit the fusion of multimodal information.
Human-Centred Intelligent Human-Computer Interaction (HCI²): . . .
, 2008
Cited by 9 (6 self)
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. To realise this prediction, next-generation computing should develop anticipatory user interfaces that are human-centred, built for humans and based on naturally occurring multimodal human communication. These interfaces should transcend the traditional keyboard and mouse and have the capacity to understand and emulate human communicative intentions as expressed through behavioural cues, such as affective and social signals. This article discusses how far we are from the goal of human-centred computing and Human-Centred Intelligent Human-Computer Interaction (HCI²) that can understand and respond to multimodal human communication.
Audio-visual spontaneous emotion recognition
Cited by 2 (0 self)
Automatic multimodal recognition of spontaneous emotional expressions is a largely unexplored and challenging problem. In this paper, we explore audio-visual emotion recognition in a realistic human conversation setting, the Adult Attachment Interview (AAI). Based on the assumption that facial expression and vocal expression reflect the same coarse affective states, positive and negative emotion sequences are labeled according to the Facial Action Coding System. Facial texture in the visual channel and prosody in the audio channel are integrated in the framework of an Adaboost multi-stream hidden Markov model (AdaMHMM), in which the Adaboost learning scheme is used to build component HMM fusion. Our approach is evaluated in AAI spontaneous emotion recognition experiments.
Keywords: Multimodal Human-Computer Interaction, Affective computing, affect recognition, emotion recognition.
Affective Multimodal Mirror: Sensing and Eliciting Laughter
Cited by 2 (0 self)
In this paper, we present a multimodal affective mirror that senses and elicits laughter. Currently, the mirror contains a vocal and a facial affect-sensing module, a component that fuses the output of these two modules to achieve a user-state assessment, a user state transition model, and a component to present audiovisual affective feedback that should keep or bring the user in the intended state. Interaction with this intelligent interface involves a full cyclic process of sensing, interpreting, reacting, sensing (of the reaction effects), interpreting … The intention of the mirror is to evoke positive emotions, to make people laugh and to increase the laughter. The first user-experience tests showed that users show cooperative behavior, resulting in mutual user-mirror action-reaction cycles. Most users enjoyed the interaction with the mirror and were immersed in an excellent user experience.
Categories and Subject Descriptors: H.1.2 [Models and Principles]: User/machine Systems – Human factors
PAD-based Multimodal Affective Fusion
Cited by 2 (0 self)
The study of multimodality is comparatively less developed for Affective interfaces than for their traditional counterparts. However, one condition for the successful development of Affective interface technologies is the development of frameworks for real-time multimodal fusion. In this paper, we describe an approach to multimodal affective fusion, which relies on a dimensional model, Pleasure-Arousal-Dominance (PAD), to support the fusion of affective modalities, each input modality being represented as a PAD vector. We describe how this model supports both affective content fusion and temporal fusion within a unified approach. We report results from early user studies which confirm the existence of a correlation between measured affective input and user temperament scores.
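Note: the abstract above does not give the fusion rule itself. As a rough illustration of what fusing modalities represented as PAD vectors could look like, the sketch below combines inputs by a weighted average in which each weight is a confidence value that decays with the age of the input. The `PadInput` type, the `confidence` field, the `half_life` parameter, and the exponential decay are all assumptions made for this example, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class PadInput:
    """One modality's estimate as a PAD vector (hypothetical structure)."""
    pleasure: float
    arousal: float
    dominance: float
    confidence: float  # assumed per-input weight in [0, 1]
    timestamp: float   # time of the observation, in seconds


def fuse_pad(inputs, now, half_life=2.0):
    """Fuse PAD vectors by a confidence- and recency-weighted average.

    Each input's weight halves every `half_life` seconds (an assumed
    decay rule), so older observations contribute less to the result.
    """
    acc = [0.0, 0.0, 0.0]
    total = 0.0
    for x in inputs:
        age = max(0.0, now - x.timestamp)
        weight = x.confidence * 0.5 ** (age / half_life)
        acc[0] += weight * x.pleasure
        acc[1] += weight * x.arousal
        acc[2] += weight * x.dominance
        total += weight
    if total == 0.0:
        # No usable evidence: fall back to the neutral PAD origin.
        return (0.0, 0.0, 0.0)
    return tuple(component / total for component in acc)
```

With this rule a single weighted sum handles both content fusion (across modalities) and temporal fusion (across time), which is one possible reading of the "unified approach" the abstract mentions.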
Machine Understanding of Human Behavior
, 2006
Cited by 1 (1 self)
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. If this prediction is to come true, then next generation computing, which we will call human computing, should be about anticipatory user interfaces that should be human-centered, built for humans based on human models. They should transcend the traditional keyboard and mouse to include natural, human-like interactive functions including understanding and emulating certain human behaviors such as affective and social signaling. This article discusses a number of components of human behavior, how they might be integrated into computers, and how far we are from realizing the front end of human computing, that is, how far we are from enabling computers to understand human behavior.
Speech Emotion Analysis: Exploring the Role of Context
- TO APPEAR IN THE “IEEE TRANSACTIONS ON MULTIMEDIA, 2010”
, 2010
Cited by 1 (1 self)
Automated analysis of human affective behavior has attracted increasing attention in recent years. With the research shift toward spontaneous behavior, many challenges have come to the surface, ranging from database collection strategies to the use of new feature sets (e.g. lexical cues apart from prosodic features). Use of contextual information, however, is rarely addressed in the field of affect expression recognition. Yet it is evident that affect recognition by humans is largely influenced by context information. Our contribution in this paper is threefold. First, we introduce a novel set of features based on cepstrum analysis of pitch and intensity contours. We evaluate the usefulness of these features on two different databases: the Berlin Database of emotional speech (EMO-DB) and a locally collected audiovisual database in car settings (CVRRCar-AVDB). The overall recognition accuracy achieved for seven emotions in the EMO-DB database is over 84%, and over 87% for three emotion classes in CVRRCar-AVDB. This is based on 10-fold stratified cross validation. Second, we introduce the collection of a new audiovisual database in an automobile setting (CVRRCar-AVDB). In this current study, we only use the audio channel of the database. Third, we systematically analyze the effects of different contexts on two different databases. We present context analysis of subject and text based on speaker/text dependent/independent analysis on the EMO-DB database. Furthermore, we perform context analysis based on gender information on EMO-DB and CVRRCar-AVDB. The results based on these analyses are promising.
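Note: the accuracies quoted above are reported under 10-fold stratified cross validation. As a minimal sketch of what "stratified" means here, the function below splits sample indices into folds so that each emotion class is spread as evenly as possible across folds. The function name and the round-robin assignment are illustrative assumptions; the abstract does not describe how the authors built their folds.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, keeping class proportions similar.

    Indices of each class are dealt out round-robin; the starting fold
    continues from where the previous class stopped, which also keeps
    the overall fold sizes balanced.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        for i, index in enumerate(indices):
            folds[(offset + i) % k].append(index)
        offset += len(indices)
    return folds
```

In each of the k rounds, one fold would serve as the test set and the remaining folds as training data; the reported accuracy would then be an aggregate over the rounds.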
Guest Editorial Special Issue on Human Computing
We have entered an era of enhanced digital connectivity. Computers and the Internet have become so embedded in the daily fabric of people’s lives that people simply cannot live without them. We use this technology to work, to communicate, to shop, to seek out new information, and to entertain ourselves. In other words, computers are becoming full social actors that need to interact with people as seamlessly as possible. The key to development of computers as such social actors is to design human–computer interaction (HCI) that is human centered, built for humans based on human behavior models [1], [2]. In other words, HCI designs should focus on the human portion of the HCI context rather than on the computer portion, as was the case in classic HCI designs such as direct manipulation and delegation. They should transcend the traditional keyboard and mouse to include natural

