Recent Developments in Social Signal Processing
Cited by 2 (2 self)
Abstract—Social signal processing has the ambitious goal of bridging the social intelligence gap between computers and humans. Nowadays, computers are not only the new interaction partners of humans, but also a privileged interaction medium for social exchange between humans. Consequently, enhancing machine abilities to interpret and reproduce social signals is a crucial requirement for improving computer-mediated communication and interaction. Furthermore, automated analysis of such signals creates a host of new applications and improvements to existing applications. The study of social signals benefits a wide range of domains, including human-computer interaction, interaction design, entertainment technology, ambient intelligence, healthcare, and psychology. This paper briefly introduces the field and surveys its latest developments. Index Terms—Behavioral science, human computer interaction, emotion recognition
Towards View-Invariant Expression Analysis Using Analytic Shape Manifolds
Abstract — Facial expression analysis is one of the important components for effective human-computer interaction. However, to develop robust and generalizable models for expression analysis, one needs to break the dependence of the models on the choice of the coordinate frame of the camera; that is, expression models should generalize across facial poses. To do this systematically, one needs to understand the space of observed images subject to projective transformations. However, since the projective shape-space is cumbersome to work with, we address this problem by deriving models for expressions on the affine shape-space as an approximation to the projective shape-space, using a Riemannian interpretation of the deformations that facial expressions cause on different parts of the face. We use landmark configurations to represent facial deformations and exploit the fact that the affine shape-space can be studied using the Grassmann manifold. This representation enables us to run various expression analysis and recognition algorithms without requiring normalization as a preprocessing step. We extend some of the available approaches for expression analysis to the Grassmann manifold and experimentally show promising results, paving the way for a more general theory of view-invariant expression analysis.
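The link between the affine shape-space and the Grassmann manifold can be illustrated with a small sketch. It assumes a face is given as an n x 2 matrix of landmark coordinates; the function names and the random toy landmarks are hypothetical and not taken from the paper. The column span of the centered landmark matrix does not change under an invertible affine map, so the orthogonal projector onto that span gives a representation that needs no pose normalization under the affine approximation.

```python
import numpy as np

def affine_shape_projector(landmarks):
    """Map an n x 2 landmark matrix to the n x n projector onto its centered column span."""
    X = np.asarray(landmarks, dtype=float)
    Xc = X - X.mean(axis=0)      # centering removes translation
    Q, _ = np.linalg.qr(Xc)      # reduced QR: Q is n x 2 with orthonormal columns
    return Q @ Q.T               # projector identifies a point on the Grassmann manifold

def projection_distance(P1, P2):
    """Chordal (projection) distance between two subspaces given as projectors."""
    return np.linalg.norm(P1 - P2) / np.sqrt(2)

rng = np.random.default_rng(0)
face = rng.normal(size=(10, 2))             # toy landmark configuration (assumed data)
A = np.array([[1.5, 0.4], [-0.3, 0.8]])     # invertible linear part of an affine map
t = np.array([3.0, -2.0])                   # translation
warped = face @ A.T + t                     # same shape seen under a different affine view
other = rng.normal(size=(10, 2))            # an unrelated configuration

d_same = projection_distance(affine_shape_projector(face), affine_shape_projector(warped))
d_diff = projection_distance(affine_shape_projector(face), affine_shape_projector(other))
print(d_same, d_diff)
```

In this toy setting, d_same is zero up to floating-point error while d_diff is not, which is the property that lets distance-based classifiers operate directly on the projectors.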
Facial Expression Analysis
Abstract. The face is one of the most powerful channels of nonverbal communication. Facial expression provides cues about emotion, intention, alertness, pain, and personality; regulates interpersonal behavior; and communicates psychiatric and biomedical status, among other functions. Within the past 15 years, there has been increasing interest in automated facial expression analysis within the computer vision and machine learning communities. This chapter reviews fundamental approaches to facial measurement by behavioral scientists and current efforts in automated facial expression recognition. We consider challenges, review databases available to the research community, approaches to feature detection, tracking, and representation, and both supervised and unsupervised learning.
Gayler, and David Hawking. Similarity-Aware Indexing for
, 2009
or send email to: Technical-DOT-Reports-AT-cs-DOT-anu.edu.au A list of technical reports, including some abstracts and copies of some full reports may be found at:
Interpreting Hand-Over-Face Gestures
Abstract. People often hold their hands near their faces as a gesture in natural conversation, which can interfere with affective inference from facial expressions. However, these gestures are valuable as an additional channel for multi-modal inference. We analyse hand-over-face gestures in a corpus of naturalistic labelled expressions and propose the use of those gestures as a novel affect cue for automatic inference of cognitive mental states. We define three hand cues for encoding hand-over-face gestures, namely hand shape, hand action and facial region occluded, serving as a first step in automating the interpretation process.
3D Corpus of Spontaneous Complex Mental
Abstract. Hand-over-face gestures, a subset of emotional body language, are overlooked by automatic affect inference systems. We propose the use of hand-over-face gestures as a novel affect cue for automatic inference of cognitive mental states. Moreover, affect recognition systems rely on the existence of publicly available datasets; often, the approach is only as good as the data. We present the collection and annotation methodology of a 3D multimodal corpus of 108 audio/video segments of natural complex mental states. The corpus includes spontaneous facial expressions and hand gestures labelled using crowd-sourcing and is publicly available.
Registration Invariant Representations for Expression Detection
Active appearance model (AAM) representations have been used to great effect recently in the accurate detection of expression events (e.g., action units, pain, broad expressions, etc.). The motivation for their use, and rationale for their success, lies in their ability to: (i) provide dense (i.e., 60-70 points on the face) registration accuracy on par with a human labeler, and (ii) decompose the registered face image into separate appearance and shape representations. Unfortunately, this human-like registration performance is limited to registration algorithms that are specifically tuned to the illumination, camera and subject being tracked (i.e., “subject dependent” algorithms). As a result, it is rare to see AAM representations employed in the far more useful “subject independent” situations (i.e., where illumination, camera and subject are unknown), due to the increased geometric noise inherent in the estimated registration. In this paper we argue that “AAM like” expression detection results can be obtained in the presence of noisy dense registration by employing registration invariant representations (e.g., Gabor magnitudes and HOG features). We demonstrate that good expression detection performance can still be achieved under the types of geometric noise often encountered with the more geometrically noisy state-of-the-art generic algorithms (e.g., Bayesian Tangent Shape Models (BTSM), Constrained Local Models (CLM), etc.). We show these results on the extended Cohn-Kanade (CK+) database over all facial action units.
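A toy sketch can show why pooled gradient-orientation features tolerate registration error. It is a simplified, single-cell, HOG-like histogram and not the feature pipeline used in the paper; the function name, the bin count, the circular finite differences and the random test image are all assumptions made for illustration. Because the histogram discards where inside the cell each gradient occurred, a shift of the image leaves it unchanged, while the raw pixels change substantially.

```python
import numpy as np

def orientation_histogram(image, bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations over one cell."""
    img = np.asarray(image, dtype=float)
    # Circular central differences (an assumption that keeps the toy example exact).
    gx = np.roll(img, -1, axis=1) - np.roll(img, 1, axis=1)
    gy = np.roll(img, -1, axis=0) - np.roll(img, 1, axis=0)
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned orientation in [0, pi)
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, np.pi), weights=magnitude)
    return hist / (np.linalg.norm(hist) + 1e-12)

rng = np.random.default_rng(1)
img = rng.normal(size=(32, 32))        # stand-in for a face patch (assumed data)
shifted = np.roll(img, 3, axis=1)      # simulated 3-pixel registration error

h_ref = orientation_histogram(img)
h_shifted = orientation_histogram(shifted)
pixel_change = np.linalg.norm(img - shifted) / np.linalg.norm(img)
hist_change = np.linalg.norm(h_ref - h_shifted)
print(pixel_change, hist_change)
```

The exact invariance here comes from using one cell and a circular shift. Practical HOG descriptors use a grid of cells, so the tolerance is only approximate and holds for shifts that are small relative to the cell size.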
Learning Discriminative Fisher Kernels
Fisher kernels provide a commonly used vectorial representation of structured objects. The paper presents a technique that exploits label information to improve the object representation of Fisher kernels by employing ideas from metric learning. In particular, the new technique trains a generative model in such a way that the distance between the log-likelihood gradients induced by two objects with the same label is as small as possible, and the distance between the gradients induced by two objects with different labels is as large as possible. We illustrate the strong performance of classifiers trained on the resulting object representations on problems in handwriting recognition, speech recognition, facial expression analysis, and bio-informatics.
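The log-likelihood gradients that the abstract refers to can be made concrete with a minimal sketch. It assumes the simplest possible generative model, a univariate Gaussian with parameters mu and sigma, and treats a variable-length sequence of numbers as the structured object; the function names and sample sequences are hypothetical. It shows only the plain Fisher score, without Fisher-information normalization and without the label-driven training the paper proposes.

```python
import numpy as np

def gaussian_log_likelihood(x, mu, sigma):
    """Log-likelihood of i.i.d. samples x under N(mu, sigma^2)."""
    x = np.asarray(x, dtype=float)
    return np.sum(-0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2)

def fisher_score(x, mu, sigma):
    """Gradient of the log-likelihood with respect to (mu, sigma)."""
    x = np.asarray(x, dtype=float)
    d_mu = np.sum((x - mu) / sigma ** 2)
    d_sigma = np.sum(((x - mu) ** 2 - sigma ** 2) / sigma ** 3)
    return np.array([d_mu, d_sigma])

# Two objects of different lengths map to vectors of the same dimension.
seq_a = [0.1, 0.4, -0.2]
seq_b = [2.0, 2.5, 1.8, 2.2]
score_a = fisher_score(seq_a, mu=0.0, sigma=1.0)
score_b = fisher_score(seq_b, mu=0.0, sigma=1.0)

# Distance between gradients: the quantity a discriminative variant would shape using labels.
distance = np.linalg.norm(score_a - score_b)
print(score_a, score_b, distance)
```

Any vector-based classifier can then be trained on the scores; the generative model's parameters determine how well those vectors separate the classes.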
Segment-based SVMs for . . .
, 2011
Enabling computers to understand human and animal behavior has the potential to revolutionize many areas that benefit society such as clinical diagnosis, human-computer interaction, and social robotics. Critical to the understanding of human and animal behavior, and any temporally-varying phenomenon in general, is the capability to segment, classify, and cluster time series data. This thesis proposes segment-based Support Vector Machines (Seg-SVMs), a framework for supervised, weakly-supervised, and unsupervised time series analysis. Seg-SVMs outperform state-of-the-art approaches by combining three powerful ideas: energy-based structure prediction, bag-of-words representation, and maximum-margin learning. Energy-based structure prediction provides a principled mechanism for concurrent top-down recognition and bottom-up temporal localization. Bag-of-words representation provides segment-based features that tolerate misalignment errors and are computationally efficient. Maximum-margin learning, such as SVM and Structured Output SVM, has a convex learning formulation; it produces classifiers that are discriminative and less prone to over-fitting. In this thesis, we show how Seg-SVMs outperform state-of-the-art approaches for segmenting, classifying, and clustering human and animal behavior in video and accelerometer data of varying complexity. We illustrate these benefits in the problems of facial event detection, sequence labeling of human actions, and temporal clustering of animal behavior. In addition, the Seg-SVMs framework naturally provides solutions to two novel problems: early detection of human actions and weakly-supervised discovery of discriminative events.
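One reason a bag-of-words segment representation can make temporal localization cheap is sketched here. The sketch assumes that each frame has already been quantized to a codebook word index and that a segment is scored by a linear function of its unnormalized word histogram; under those assumptions the segment score is the sum of per-frame word weights, so the best segment is a maximum-sum subarray. The function name, weights and frame sequence are hypothetical, and this is an illustration rather than the thesis's implementation.

```python
def best_segment(frame_words, word_weights):
    """Return (start, end, score) of the highest-scoring non-empty segment; end is exclusive.

    frame_words: non-empty list of codebook indices, one per frame.
    word_weights: learned linear weight for each codebook word (assumed given).
    """
    best = (0, 1, word_weights[frame_words[0]])
    cur_start, cur_sum = 0, 0.0
    for t, word in enumerate(frame_words):
        if cur_sum <= 0:
            # A non-positive running score never helps; restart the segment here.
            cur_start, cur_sum = t, 0.0
        cur_sum += word_weights[word]
        if cur_sum > best[2]:
            best = (cur_start, t + 1, cur_sum)
    return best

word_weights = [-1.0, 2.0, 0.5]             # word 0 is background, words 1 and 2 support the event
frame_words = [0, 0, 1, 2, 1, 0, 0, 2]      # quantized frames of a toy time series
start, end, score = best_segment(frame_words, word_weights)
print(start, end, score)
```

The search runs in time linear in the number of frames, compared with the quadratic number of candidate segments an exhaustive scan would score.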
Crowdsourcing Facial Responses to Online
Abstract—We present results validating a novel framework for collecting and analyzing facial responses to media content over the Internet. This system allowed 3,268 trackable face videos to be collected and analyzed in under two months. We characterize the data and present analysis of the smile responses of viewers to three commercials. We compare statistics from this corpus to those from the Cohn-Kanade+ (CK+) and MMI databases and show that distributions of position, scale, pose, movement and luminance of the facial region are significantly different from those represented in these traditionally used datasets. Next we analyze the intensity and dynamics of smile responses, and show that there are significantly different facial responses from subgroups who report liking the commercials compared to those that report not liking the commercials. Similarly, we unveil significant differences between groups who were previously familiar with a commercial and those that were not and propose a link to virality. Finally, we present relationships between head movement and facial behavior that were observed within the data. The framework, data collected and analysis demonstrate an ecologically valid method for unobtrusive evaluation of facial responses to media content that is robust to challenging real-world conditions and requires no explicit recruitment or compensation of participants.

