Results 1 - 10
of
31
Recognizing Facial Expression: Machine Learning and Application to Spontaneous Behavior
- COMPUTER VISION AND PATTERN RECOGNITION
, 2005
"... We present a systematic comparison of machine learning methods applied to the problem of fully automatic recognition of facial expressions. We report results on a series of experiments comparing recognition engines, including AdaBoost, support vector machines, linear discriminant analysis. We also e ..."
Abstract
-
Cited by 45 (2 self)
- Add to MetaCart
We present a systematic comparison of machine learning methods applied to the problem of fully automatic recognition of facial expressions. We report results on a series of experiments comparing recognition engines, including AdaBoost, support vector machines, linear discriminant analysis. We also explored feature selection techniques, including the use of AdaBoost for feature selection prior to classification by SVM or LDA. Best results were obtained by selecting a subset of Gabor filters using AdaBoost followed by classification with Support Vector Machines. The system operates in real-time, and obtained 93 % correct generalization to novel subjects for a 7-way forced choice on the Cohn-Kanade expression dataset. The outputs of the classifiers change smoothly as a function of time and thus can be used to measure facial expression dynamics. We applied the system to to fully automated recognition of facial actions (FACS). The present system classifies 17 action units, whether they occur singly or in combination with other actions, with a mean accuracy of 94.8%. We present preliminary results for applying this system to spontaneous facial expressions.
A 3d facial expression database for facial behavior research
- Proc. IEEE Int’l Conf. Face and Gesture Recognition
, 2006
"... Traditionally, human facial expressions have been studied using either 2D static images or 2D video sequences. The 2D-based analysis is incapable of handing large pose variations. Although 3D modeling techniques have been extensively used for 3D face recognition and 3D face animation, barely any res ..."
Abstract
-
Cited by 29 (4 self)
- Add to MetaCart
Traditionally, human facial expressions have been studied using either 2D static images or 2D video sequences. The 2D-based analysis is incapable of handing large pose variations. Although 3D modeling techniques have been extensively used for 3D face recognition and 3D face animation, barely any research on 3D facial expression recognition using 3D range data has been reported. A primary factor for preventing such research is the lack of a publicly available 3D facial expression database. In this paper, we present a newly developed 3D facial expression database, which includes both prototypical 3D facial expression shapes and 2D facial textures of 2,500 models from 100 subjects. This is the first attempt at making a 3D facial expression database available for the research community, with the ultimate goal of fostering the research on affective computing and increasing the general understanding of facial behavior and the fine 3D structure inherent in human facial expressions. The new database can be a valuable resource for algorithm assessment, comparison and evaluation. 1.
Machine Learning Methods for Fully Automatic Recognition of Facial Expressions and Facial Actions
- Proc. IEEE Int’l Conf. Systems, Man and Cybernetics
, 2004
"... We present a systematic comparison of machine learning methods applied to the problem of fully automatic recognition of facial expressions. We explored recognition of facial actions from the Facial Action Coding System (FACS), as well as recognition of full facial expressions. Each videoframe is fir ..."
Abstract
-
Cited by 19 (1 self)
- Add to MetaCart
We present a systematic comparison of machine learning methods applied to the problem of fully automatic recognition of facial expressions. We explored recognition of facial actions from the Facial Action Coding System (FACS), as well as recognition of full facial expressions. Each videoframe is first scanned in real-time to detect approximately upright-frontal faces. The faces found are scaled into image patches of equal size, convolved with a bank of Gabor energy filters, and then passed to a recognition engine that codes facial expressions into 7 dimensions in real time: neutral, anger, disgust, fear, joy, sadness, surprise. We report results on a series of experiments comparing recognition engines, including AdaBoost, support vector machines, linear discriminant analysis, as well as feature selection techniques. Best results were obtained by selecting a subset of Gabor filters using AdaBoost and then training Support Vector Machines on the outputs of the filters selected by AdaBoost. The generalization performance to new subjects for recognition of full facial expressions in a 7-way forced choice was 93% correct, the best performance reported so far on the DFAT-504 dataset. We also applied the system to fully automated facial action coding. The present system classifies 18 action units, whether they occur singly or in combination with other actions. The system obtained a mean agreement rate of 94.5% on a FACS-coded dataset of posed expressions (DFAT-504). The outputs of the classifiers change smoothly as a function of time and thus can be used to measure facial expression dynamics.
Automatic Facial Expression Recognition Using Facial Animation Parameters And Multi-Stream Hmms
- Information Forensics and Security, IEEE Transactions on Volume 1, Issue 1, March 2006 Page(s):3
, 2005
"... Abstract—The performance of an automatic facial expression recognition system can be significantly improved by modeling the reliability of different streams of facial expression information utilizing multistream hidden Markov models (HMMs). In this paper, we present an automatic multistream HMM faci ..."
Abstract
-
Cited by 12 (1 self)
- Add to MetaCart
Abstract—The performance of an automatic facial expression recognition system can be significantly improved by modeling the reliability of different streams of facial expression information utilizing multistream hidden Markov models (HMMs). In this paper, we present an automatic multistream HMM facial expression recognition system and analyze its performance. The proposed system utilizes facial animation parameters (FAPs), supported by the MPEG-4 standard, as features for facial expression classification. Specifically, the FAPs describing the movement of the outer-lip contours and eyebrows are used as observations. Experiments are first performed employing single-stream HMMs under several different scenarios, utilizing outer-lip and eyebrow FAPs individually and jointly. A multistream HMM approach is proposed for introducing facial expression and FAP group dependent stream reliability weights. The stream weights are determined based on the facial expression recognition results obtained when FAP streams are utilized individually. The proposed multistream HMM facial expression system, which utilizes stream reliability weights, achieves relative reduction of the facial expression recognition error of 44 % compared to the single-stream HMM system. Index Terms—Facial expression recognition, multistream HMMs, facial animation parameters. I.
Generalization of a vision-based computational model of mind-reading
- 1 st Intl. Conf. on Affective Computing and Intelligent Interaction, LNCS 3784
, 2005
"... Abstract. This paper describes a vision-based computational model of mind-reading that infers complex mental states from head and facial expressions in real-time. The generalization ability of the system is evaluated on videos that were posed by lay people in a relatively uncontrolled recording envi ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
Abstract. This paper describes a vision-based computational model of mind-reading that infers complex mental states from head and facial expressions in real-time. The generalization ability of the system is evaluated on videos that were posed by lay people in a relatively uncontrolled recording environment for six mental states—agreeing, concentrating, disagreeing, interested, thinking and unsure. The results show that the system’s accuracy is comparable to that of humans on the same corpus. 1
Spontaneous vs. Posed Facial Behavior: Automatic Analysis of Brow Actions
- Proc. ACM Int'l Conf. Multimodal Interfaces
, 2006
"... Past research on automatic facial expression analysis has focused mostly on the recognition of prototypic expressions of discrete emotions rather than on the analysis of dynamic changes over time, although the importance of temporal dynamics of facial expressions for interpretation of the observed f ..."
Abstract
-
Cited by 11 (6 self)
- Add to MetaCart
Past research on automatic facial expression analysis has focused mostly on the recognition of prototypic expressions of discrete emotions rather than on the analysis of dynamic changes over time, although the importance of temporal dynamics of facial expressions for interpretation of the observed facial behavior has been acknowledged for over 20 years. For instance, it has been shown that the temporal dynamics of spontaneous and volitional smiles are fundamentally different from each other. In this work, we argue that the same holds for the temporal dynamics of brow actions and show that velocity, duration, and order of occurrence of brow actions are highly relevant parameters for distinguishing posed from spontaneous brow actions. The proposed system for discrimination between volitional and spontaneous brow actions is based on automatic detection of Action Units (AUs) and their temporal segments (onset, apex, offset) produced by movements of the eyebrows. For each temporal segment of an activated AU, we compute a number of mid-level feature parameters including the maximal intensity, duration, and order of occurrence. We use Gentle Boost to select the most important of these parameters. The selected parameters are used further to train Relevance Vector Machines to determine per temporal segment of an activated AU whether the action was displayed spontaneously or volitionally. Finally, a probabilistic decision function determines the class (spontaneous or posed) for the entire brow action. When tested on 189 samples taken from three different sets of spontaneous and volitional facial data, we attain a 90.7 % correct recognition rate. Categories and Subject Descriptors I.2.10 [Vision and Scene Understanding]: motion, modeling and recovery of physical attributes
The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression
"... In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limit ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) While AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed, 2) The lack of a common performance metric against which to evaluate new algorithms, and 3) Standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22 % and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition to this, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier using a leaveone-out subject cross-validation for both AU and emotion detection for the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks will be made available July 2010. 1.
Towards Practical Smile Detection
"... Machine learning approaches have produced some of the highest reported performances for facial expression recognition. However, to date, nearly all automatic facial expression recognition research has focused on optimizing performance on a few databases that were collected under controlled lighting ..."
Abstract
-
Cited by 10 (3 self)
- Add to MetaCart
Machine learning approaches have produced some of the highest reported performances for facial expression recognition. However, to date, nearly all automatic facial expression recognition research has focused on optimizing performance on a few databases that were collected under controlled lighting conditions on a relatively small number of subjects. This paper explores whether current machine learning methods can be used to develop an expression recognition system that operates reliably in more realistic conditions. We explore the necessary characteristics of the training dataset, image registration, feature representation, and machine learning algorithms. A new database, GENKI, is presented which contains pictures, photographed by the subjects themselves, from thousands of different people in many different real-world imaging conditions. Results suggest that human-level expression recognition accuracy in reallife illumination conditions is achievable with machine learning technology. However, the datasets currently used in the automatic expression recognition literature to evaluate progress may be overly constrained and could potentially lead research into locally optimal algorithmic solutions.
Human-Centred Intelligent Human-Computer Interaction (HCI²): . . .
, 2008
"... A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. To realise this prediction, next-generation computing should develop anticipatory user interfaces that are hum ..."
Abstract
-
Cited by 9 (6 self)
- Add to MetaCart
A widely accepted prediction is that computing will move to the background, weaving itself into the fabric of our everyday living spaces and projecting the human user into the foreground. To realise this prediction, next-generation computing should develop anticipatory user interfaces that are human-centred, built for humans and based on naturally occurring multimodal human communication. These interfaces should transcend the traditional keyboard and mouse and have the capacity to understand and emulate human communicative intentions as expressed through behavioural cues, such as affective and social signals. This article discusses how far we are to the goal of human-centred computing and Human-Centred Intelligent Human-Computer Interaction (HCI²) that can understand and respond to multimodal human communication.
Action Unit Detection with Segment-based SVMs
"... Automatic facial action unit (AU) detection from video is a long-standing problem in computer vision. Two main approaches have been pursued: (1) static modeling—typically posed as a discriminative classification problem in which each video frame is evaluated independently; (2) temporal modeling—fram ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
Automatic facial action unit (AU) detection from video is a long-standing problem in computer vision. Two main approaches have been pursued: (1) static modeling—typically posed as a discriminative classification problem in which each video frame is evaluated independently; (2) temporal modeling—frames are segmented into sequences and typically modeled with a variant of dynamic Bayesian networks. We propose a segment-based approach, kSeg-SVM, that incorporates benefits of both approaches and avoids their limitations. kSeg-SVM is a temporal extension of the spatial bag-of-words. kSeg-SVM is trained within a structured output SVM framework that formulates AU detection as a problem of detecting temporal events in a time series of visual features. Each segment is modeled by a variant of the BoW representation with soft assignment of the words based on similarity. Our framework has several benefits for AU detection: (1) both dependencies between features and the length of action units are modeled; (2) all possible segments of the video may be used for training; and (3) no assumptions are required about the underlying structure of the action unit events (e.g., i.i.d.). Our algorithm finds the best k-or-fewer segments that maximize the SVM score. Experimental results suggest that the proposed method outperforms state-of-the-art static methods for AU detection. 1.

