Results 1 - 10 of 153
The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression
"... In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limit ..."
Abstract
-
Cited by 122 (7 self)
- Add to MetaCart
(Show Context)
In 2000, the Cohn-Kanade (CK) database was released for the purpose of promoting research into automatically detecting individual facial expressions. Since then, the CK database has become one of the most widely used test-beds for algorithm development and evaluation. During this period, three limitations have become apparent: 1) while AU codes are well validated, emotion labels are not, as they refer to what was requested rather than what was actually performed; 2) the lack of a common performance metric against which to evaluate new algorithms; and 3) standard protocols for common databases have not emerged. As a consequence, the CK database has been used for both AU and emotion detection (even though labels for the latter have not been validated), comparison with benchmark algorithms is missing, and the use of random subsets of the original database makes meta-analyses difficult. To address these and other concerns, we present the Extended Cohn-Kanade (CK+) database. The number of sequences is increased by 22% and the number of subjects by 27%. The target expression for each sequence is fully FACS coded and emotion labels have been revised and validated. In addition, non-posed sequences for several types of smiles and their associated metadata have been added. We present baseline results using Active Appearance Models (AAMs) and a linear support vector machine (SVM) classifier with leave-one-out subject cross-validation for both AU and emotion detection on the posed data. The emotion and AU labels, along with the extended image data and tracked landmarks, will be made available July 2010.
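The baseline protocol named in this abstract (a linear SVM evaluated with leave-one-subject-out cross-validation) differs from ordinary k-fold splitting in that no subject ever appears in both training and test data. A minimal sketch, assuming features have already been extracted; all arrays below are random placeholders standing in for AAM-derived features and labels, not CK+ data:

```python
# A hypothetical sketch of leave-one-subject-out evaluation with a linear SVM.
# X, y, and subjects are random placeholders, not CK+ features or labels.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 68))            # placeholder per-sequence features
y = rng.integers(0, 7, size=200)          # placeholder emotion labels (7 classes)
subjects = rng.integers(0, 20, size=200)  # placeholder subject IDs

correct = 0
# Each fold holds out every sequence of exactly one subject, so the
# classifier is never tested on a person it has seen during training.
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = LinearSVC().fit(X[train_idx], y[train_idx])
    correct += (clf.predict(X[test_idx]) == y[test_idx]).sum()

print(f"leave-one-subject-out accuracy: {correct / len(y):.3f}")
```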
Bridging the Gap Between Social Animal and Unsocial Machine: A Survey of Social Signal Processing
- IEEE Transactions on Affective Computing
"... Social Signal Processing is the research domain aimed at bridging the social intelligence gap between humans and machines. This article is the first survey of the domain that jointly considers its three major aspects, namely modeling, analysis and synthesis of social behaviour. Modeling investigate ..."
Abstract
-
Cited by 35 (7 self)
- Add to MetaCart
Social Signal Processing is the research domain aimed at bridging the social intelligence gap between humans and machines. This article is the first survey of the domain that jointly considers its three major aspects, namely modeling, analysis and synthesis of social behaviour. Modeling investigates laws and principles underlying social interaction, analysis explores approaches for automatic understanding of social exchanges recorded with different sensors, and synthesis studies techniques for the generation of social behaviour via various forms of embodiment. For each of the above aspects, the paper includes an extensive survey of the literature, points to the most important publicly available resources, and outlines the most fundamental challenges ahead.
Canal9: A database of political debates for analysis of social interactions
- In Proc. of the 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII 2009), 2009
"... Automatic analysis of social interactions attracts major attention in the computing community, but relatively few benchmarks are available to researchers active in the domain. This paper presents a new, publicly available, corpus of political debates including not only raw data, but a rich set of so ..."
Abstract
-
Cited by 32 (3 self)
- Add to MetaCart
(Show Context)
Automatic analysis of social interactions attracts major attention in the computing community, but relatively few benchmarks are available to researchers active in the domain. This paper presents a new, publicly available corpus of political debates including not only raw data but also a rich set of socially relevant annotations such as turn-taking (who speaks when and how much), agreement and disagreement between participants, and the role played by each person involved in a debate. The collection includes 70 debates for a total of 43 hours and 10 minutes of material.
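The turn-taking annotation described here ("who speaks when and how much") reduces to simple aggregation over turn segments. A minimal sketch with invented (speaker, start, end) tuples; this is not the corpus's actual annotation format:

```python
# A hypothetical sketch of aggregating turn-taking annotations into
# per-speaker speaking time; the tuples below are invented, not Canal9's
# actual annotation format.
from collections import defaultdict

turns = [                     # (speaker, start_s, end_s)
    ("moderator", 0.0, 12.5),
    ("guest_a", 12.5, 40.0),
    ("guest_b", 40.0, 55.2),
    ("guest_a", 55.2, 70.0),
]

talk_time = defaultdict(float)
for speaker, start, end in turns:
    talk_time[speaker] += end - start     # "how much" each person speaks

total = sum(talk_time.values())
for speaker, t in sorted(talk_time.items(), key=lambda kv: -kv[1]):
    print(f"{speaker}: {t:.1f}s ({100 * t / total:.0f}% of speaking time)")
```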
Spotting Agreement and Disagreement: A Survey of Nonverbal Audiovisual Cues and Tools
"... While detecting and interpreting temporal patterns of non–verbal behavioral cues in a given context is a natural and often unconscious process for humans, it remains a rather difficult task for computer systems. Nevertheless, it is an important one to achieve if the goal is to realise a naturalistic ..."
Abstract
-
Cited by 22 (8 self)
- Add to MetaCart
(Show Context)
While detecting and interpreting temporal patterns of nonverbal behavioral cues in a given context is a natural and often unconscious process for humans, it remains a rather difficult task for computer systems. Nevertheless, it is an important one to achieve if the goal is to realise naturalistic communication between humans and machines. Machines that are able to sense social attitudes like agreement and disagreement and respond to them in a meaningful way are likely to be welcomed by users due to the more natural, efficient and human-centered interaction they are bound to experience. This paper surveys the nonverbal cues that could be present during displays of agreement and disagreement and lists a number of tools that could be useful in detecting them, as well as a few publicly available databases that could be used to train these tools for the analysis of spontaneous, audiovisual instances of agreement and disagreement.
Automatic Role Recognition in Multiparty Recordings: Using Social Affiliation Networks for Feature Extraction
"... Abstract—Automatic analysis of social interactions attracts increasing attention in the multimedia community. This paper considers one of the most important aspects of the problem, namely the roles played by individuals interacting in different settings. In particular, this work proposes an automati ..."
Abstract
-
Cited by 18 (6 self)
- Add to MetaCart
(Show Context)
Automatic analysis of social interactions attracts increasing attention in the multimedia community. This paper considers one of the most important aspects of the problem, namely the roles played by individuals interacting in different settings. In particular, this work proposes an automatic approach for the recognition of roles in both production environments (e.g., news and talk shows) and spontaneous situations (e.g., meetings). The experiments are performed over roughly 90 hours of material (one of the largest databases used for role recognition in the literature) and show that recognition effectiveness depends on how much the roles influence the behavior of people. Furthermore, this work proposes the first approach for modeling mutual dependencies between roles and assesses its effect on role recognition performance.
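The term "social affiliation network" refers to a bipartite structure linking actors to events. One plausible construction, sketched below, treats uniform time segments as events and represents each speaker by the binary vector of segments in which they talk; the segment length and the turns are invented for illustration, and this is not necessarily the paper's exact pipeline:

```python
# A hypothetical sketch of an affiliation-network style representation:
# actors (speakers) are linked to events (uniform time segments) they
# speak in. Segment length and turns are invented for illustration.
import numpy as np

turns = [("anchor", 0.0, 10.0), ("guest", 10.0, 25.0), ("anchor", 25.0, 30.0)]
seg_len, duration = 5.0, 30.0
n_segs = int(duration / seg_len)
speakers = sorted({s for s, _, _ in turns})

# Rows are actors, columns are events; a 1 means the speaker talks
# at some point during that segment.
affiliation = np.zeros((len(speakers), n_segs), dtype=int)
for speaker, start, end in turns:
    row = speakers.index(speaker)
    for col in range(n_segs):
        seg_start, seg_end = col * seg_len, (col + 1) * seg_len
        if start < seg_end and end > seg_start:  # turn overlaps this segment
            affiliation[row, col] = 1

print(speakers)
print(affiliation)  # each row is one speaker's affiliation vector
```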
Automatic personality perception: Prediction of trait attribution based on prosodic features
- IEEE Transactions on Affective Computing, 2012
"... Abstract—Whenever we listen to a voice for the first time, we attribute personality traits to the speaker. The process takes place in a few seconds and it is spontaneous and unaware. While the process is not necessarily accurate (attributed traits do not necessarily correspond to the actual traits o ..."
Abstract
-
Cited by 16 (4 self)
- Add to MetaCart
(Show Context)
Whenever we listen to a voice for the first time, we attribute personality traits to the speaker. The process takes place in a few seconds and is spontaneous and unconscious. While it is not necessarily accurate (attributed traits do not necessarily correspond to the actual traits of the speaker), it still significantly influences our behavior toward others, especially when it comes to social interaction. This paper proposes an approach for the automatic prediction of the traits that listeners attribute to a speaker they have never heard before. The experiments are performed over a corpus of 640 speech clips (322 identities in total) annotated in terms of personality traits by 11 assessors. The results show that it is possible to predict with high accuracy (more than 70 percent, depending on the particular trait) whether a person is perceived to be in the upper or lower part of the scales corresponding to each of the Big Five, the personality dimensions known to capture most individual differences.
Index Terms: Personality traits, prosody, Big Five, social signal processing, automatic personality perception
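The prediction task is effectively one binary classifier per trait: given prosodic statistics of a clip, decide whether the speaker is perceived in the upper or lower half of each Big Five scale. A minimal sketch with placeholder features and labels; the feature set and model here are illustrative stand-ins, not the authors' actual pipeline:

```python
# A hypothetical sketch: one binary classifier per Big Five trait over
# prosodic statistics. Features and labels are random placeholders; the
# authors' actual prosodic feature set is not reproduced here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_clips = 640
# e.g., per-clip statistics of pitch, energy, and speaking rate (placeholders)
X = rng.normal(size=(n_clips, 6))

traits = ["openness", "conscientiousness", "extraversion",
          "agreeableness", "neuroticism"]
for trait in traits:
    y = rng.integers(0, 2, size=n_clips)  # placeholder upper/lower-half labels
    clf = LogisticRegression().fit(X, y)
    print(f"{trait}: training accuracy {clf.score(X, y):.2f}")
```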
"Two people walk into a bar": Dynamic multi-party social interaction with a robot agent
- In Proc. of the 14th ACM International Conference on Multimodal Interaction (ICMI), 2012
"... ABSTRACT We introduce a humanoid robot bartender that is capable of dealing with multiple customers in a dynamic, multi-party social setting. The robot system incorporates state-of-the-art components for computer vision, linguistic processing, state management, highlevel reasoning, and robot contro ..."
Abstract
-
Cited by 16 (13 self)
- Add to MetaCart
(Show Context)
We introduce a humanoid robot bartender that is capable of dealing with multiple customers in a dynamic, multi-party social setting. The robot system incorporates state-of-the-art components for computer vision, linguistic processing, state management, high-level reasoning, and robot control. In a user evaluation, 31 participants interacted with the bartender in a range of social situations. Most customers successfully obtained a drink from the bartender in all scenarios, and the factors with the greatest impact on subjective satisfaction were task success and dialogue efficiency.
Implicit Human-Centered Tagging
- IEEE Signal Processing Magazine, 2009
"... aim of facilitating fast and accurate data retrieval based on these tags. In contrast to this process, also referred to as Explicit Tagging, Implicit Human-Centered Tagging (IHCT) refers to exploiting the information on user’s nonverbal reactions (e.g., facial expressions like smiles or head gesture ..."
Abstract
-
Cited by 15 (8 self)
- Add to MetaCart
(Show Context)
Users attach textual tags to multimedia data with the aim of facilitating fast and accurate data retrieval based on these tags. In contrast to this process, also referred to as Explicit Tagging, Implicit Human-Centered Tagging (IHCT) refers to exploiting information on a user's nonverbal reactions (e.g., facial expressions like smiles, or head gestures like shakes) to the multimedia data he or she interacts with, in order to assign new tags or improve the existing tags associated with the target data. Thus, implicit tagging allows a data item to be tagged each time a user interacts with it, based on the user's reactions to the data (e.g., laughter when seeing a funny video), in contrast to the explicit tagging paradigm, in which a data item gets tagged only if a user is requested (or chooses) to associate tags with it. As nonverbal reactions to observed multimedia are displayed naturally and spontaneously, no purposeful explicit action (effort) is required from the user; hence, the resulting tagging process is said to be "implicit" and "human-centered" (in contrast to being dictated by a computer and being "computer-centered"). Tags obtained through IHCT are expected to be more robust than tags associated with the data explicitly, at least in terms of generality and statistical reliability. To wit, a number of human behaviors are universally displayed and perceived (e.g., basic emotions like happiness, disgust and fear), and these could be associated with IHCT tags such as "funny" and "horror", which would make sense to everybody.
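The mechanism described above, in which a detected reaction strengthens a tag on the item being consumed, can be summarized in a few lines. A minimal sketch; the reaction-to-tag mapping, the function names, and the data structures are all invented for illustration:

```python
# A hypothetical sketch of implicit tagging: a detected nonverbal reaction
# strengthens the corresponding tag on the item being consumed. The
# reaction-to-tag mapping and data structures are invented for illustration.
from collections import defaultdict

REACTION_TO_TAG = {"laughter": "funny", "fear": "horror", "disgust": "gross"}
tags = defaultdict(lambda: defaultdict(int))  # item_id -> tag -> weight

def record_reaction(item_id: str, reaction: str) -> None:
    """Increment the weight of the tag implied by an observed reaction."""
    tag = REACTION_TO_TAG.get(reaction)
    if tag is not None:
        tags[item_id][tag] += 1

record_reaction("video_42", "laughter")   # one viewer laughs while watching
record_reaction("video_42", "laughter")   # another viewer laughs too
print(dict(tags["video_42"]))             # {'funny': 2}
```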