A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions
2009
Cited by 69 (17 self)
Automated analysis of human affective behavior has attracted increasing attention from researchers in psychology, computer science, linguistics, neuroscience, and related disciplines. However, the existing methods typically handle only deliberately displayed and exaggerated expressions of prototypical emotions, despite the fact that deliberate behavior differs in visual appearance, audio profile, and timing from spontaneously occurring behavior. To address this problem, efforts to develop algorithms that can process naturally occurring human affective behavior have recently emerged. Moreover, an increasing number of efforts are reported toward multimodal fusion for human affect analysis, including audiovisual fusion, linguistic and paralinguistic fusion, and multicue visual fusion based on facial expressions, head movements, and body gestures. This paper introduces and surveys these recent advances. We first discuss human emotion perception from a psychological perspective. Next, we examine available approaches for solving the problem of machine understanding of human affective behavior and discuss important issues like the collection and availability of training and test data. We finally outline some of the scientific and engineering challenges to advancing human affect sensing technology.
Gaze-X: Adaptive affective multimodal interface for single-user office scenarios
Proc. ACM Int’l Conf. Multimodal Interfaces, 2006
Cited by 14 (6 self)
This paper describes an intelligent system that we developed to support affective multimodal human-computer interaction (AMM-HCI) where the user’s actions and emotions are modeled and then used to adapt the HCI and support the user in his or her activity. The proposed system, which we named Gaze-X, is based on sensing and interpretation of the human part of the computer’s context, known as W5+ (who, where, what, when, why, how). It integrates a number of natural human communicative modalities including speech, eye gaze direction, face and facial expression, and a number of standard HCI modalities like keystrokes, mouse movements, and active software identification, which, in turn, are fed into processes that provide decision making and adapt the HCI to support the user in his or her activity according to his or her preferences. To attain a system that can be educated, that can improve its knowledge and decision making through experience, we use case-based reasoning as the inference engine of Gaze-X. The utilized case base is a dynamic, incrementally self-organizing event-content-addressable memory that allows fact retrieval and evaluation of encountered events based upon the user preferences and the generalizations formed from prior input. To support concepts of concurrency, modularity/scalability, persistency, and mobility, Gaze-X has been built as an agent-based system where different agents are responsible for different parts of the processing. A usability study conducted in an office scenario with a number of users indicates that Gaze-X is perceived as effective, easy to use, useful, and affectively qualitative.
Audio-visual Information Fusion In Human Computer Interfaces and Intelligent Environments: A Survey
Cited by 6 (5 self)
Microphones and cameras have been extensively used to observe and detect human activity and to facilitate natural modes of interaction between humans and intelligent systems. The human brain processes the audio and video modalities, extracting complementary and robust information from them. Intelligent systems with audio-visual sensors should be capable of achieving similar goals. The audio-visual information fusion strategy is a key component in designing such systems. In this paper we exclusively survey the fusion techniques used in various audio-visual information fusion tasks. The fusion strategy tends to depend mainly on the model, probabilistic or otherwise, that a particular task uses to turn sensory information into higher-level semantic information. The models themselves are task oriented. We describe the fusion strategies and the corresponding models used in audio-visual tasks such as speech recognition, tracking, biometrics, affective state recognition, and meeting scene analysis. We also review the challenges, the existing solutions, and the unresolved or partially resolved issues in these fields. Specifically, we discuss established and upcoming work in hierarchical fusion strategies and crossmodal learning techniques, identifying these as critical areas of research in the future development of intelligent systems.
Smart Environments for Collaborative Design, Implementation, and Interpretation of Scientific Experiments
Cited by 5 (3 self)
Ambient intelligence promises to enable humans to smoothly interact with their environment, mediated by computer technology. In the literature on ambient intelligence, empirical scientists are not often mentioned. Yet they form an interesting target group for this technology. In this position paper, we describe a project aimed at realising an ambient intelligence environment for face-to-face meetings of researchers with different academic backgrounds involved in molecular biology “omics” experiments. In particular, microarray experiments are a focus of attention because these experiments require multidisciplinary collaboration for their design, analysis, and interpretation. Such an environment is characterised by a high degree of complexity that has to be mitigated by ambient intelligence technology. By experimenting in a real-life setting, we will learn more about life scientists as a user group.
Affective Feedback: An Investigation into the Role of Emotions in the Information Seeking Process
Cited by 4 (0 self)
User feedback is considered to be a critical element in the information seeking process, especially in relation to relevance assessment. Current feedback techniques determine content relevance with respect to the cognitive and situational levels of interaction that occur between the user and the retrieval system. However, apart from real-life problems and information objects, users interact with intentions, motivations and feelings, which can be seen as critical aspects of cognition and decision-making. The study presented in this paper serves as a starting point for exploring the role of emotions in the information seeking process. Results show that emotions not only interweave with different physiological, psychological and cognitive processes, but also form distinctive patterns that depend on the specific task and the specific user.
Multimodal fusion for multimedia analysis: a survey
2010
Cited by 4 (1 self)
This survey aims to provide multimedia researchers with a state-of-the-art overview of fusion strategies, which are used for combining multiple modalities in order to accomplish various multimedia analysis tasks. The existing literature on multimodal fusion research is presented through several classifications based on the fusion methodology and the level of fusion (feature, decision, and hybrid). The fusion methods are described in terms of their basic concept, advantages, weaknesses, and usage in various analysis tasks as reported in the literature. Moreover, several distinctive issues that influence a multimodal fusion process, such as the use of correlation and independence, confidence level, contextual information, synchronization between different modalities, and optimal modality selection, are also highlighted. Finally, we present the open issues for further research in the area of multimodal fusion.
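As a minimal illustration of two of the fusion levels named in this abstract, the sketch below contrasts feature-level (early) fusion, which concatenates per-modality feature vectors before any classification, with decision-level (late) fusion, which combines per-modality class scores. The function names, the equal weights, and the toy scores are hypothetical and are not taken from the survey.

```python
from typing import Dict, List, Sequence


def feature_level_fusion(features: Sequence[Sequence[float]]) -> List[float]:
    """Early fusion: concatenate per-modality feature vectors into one vector."""
    fused: List[float] = []
    for vec in features:
        fused.extend(vec)
    return fused


def decision_level_fusion(
    scores: Sequence[Dict[str, float]], weights: Sequence[float]
) -> str:
    """Late fusion: weighted sum of per-modality class scores, then pick the best label."""
    if len(scores) != len(weights):
        raise ValueError("one weight per modality is required")
    combined: Dict[str, float] = {}
    for modality_scores, w in zip(scores, weights):
        for label, s in modality_scores.items():
            combined[label] = combined.get(label, 0.0) + w * s
    return max(combined, key=combined.get)


# Toy inputs (assumed values, for illustration only).
audio_features = [0.2, 0.7]
video_features = [0.9, 0.1, 0.4]
fused_vector = feature_level_fusion([audio_features, video_features])

audio_scores = {"happy": 0.6, "sad": 0.4}
video_scores = {"happy": 0.3, "sad": 0.7}
decision = decision_level_fusion([audio_scores, video_scores], weights=[0.5, 0.5])
```

A hybrid scheme would, under the same assumptions, feed both the fused vector and the per-modality scores into a further decision stage.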
Gesture Recognition Based On Elastic Deformation Energies
Cited by 3 (1 self)
We present a gesture recognition method based on deformable shapes and curvature templates. Gestures are modeled using a spline representation that is enhanced with elastic properties: a gesture trajectory as a whole or any of its parts may stretch or bend. We regard such an approach as well-suited for dealing with the inherent variability of human gesture execution. The results of our gesture classifier are demonstrated with a video-based acquisition approach. Key words: gesture recognition, elastic matching, deformation energies
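To make the idea of stretching and bending concrete, here is a rough sketch that scores how far an observed trajectory deviates from a template. It is not the paper's method: it assumes polylines with the same number of points instead of splines, and the energy terms (squared differences in segment length and turning angle) and the stiffness constants `k_stretch` and `k_bend` are hypothetical choices.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def segment_lengths(points: Sequence[Point]) -> List[float]:
    """Length of each consecutive segment of the polyline."""
    return [math.dist(a, b) for a, b in zip(points, points[1:])]


def turning_angles(points: Sequence[Point]) -> List[float]:
    """Signed change of heading at each interior point, wrapped to [-pi, pi]."""
    angles: List[float] = []
    for a, b, c in zip(points, points[1:], points[2:]):
        h1 = math.atan2(b[1] - a[1], b[0] - a[0])
        h2 = math.atan2(c[1] - b[1], c[0] - b[0])
        d = h2 - h1
        angles.append(math.atan2(math.sin(d), math.cos(d)))
    return angles


def deformation_energy(
    template: Sequence[Point],
    observed: Sequence[Point],
    k_stretch: float = 1.0,
    k_bend: float = 1.0,
) -> float:
    """Assumed energy: weighted sum of squared length and angle differences."""
    if len(template) != len(observed):
        raise ValueError("trajectories must have the same number of points")
    stretch = sum(
        (lo - lt) ** 2
        for lt, lo in zip(segment_lengths(template), segment_lengths(observed))
    )
    bend = sum(
        (ao - at) ** 2
        for at, ao in zip(turning_angles(template), turning_angles(observed))
    )
    return k_stretch * stretch + k_bend * bend


line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
stretched = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0)]
bent = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
```

Under these assumptions, a classifier could assign an observed gesture to the template with the lowest energy; lowering `k_stretch` or `k_bend` makes the corresponding kind of deformation cheaper.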
Multimodal Interfaces: A Survey of Principles, Models and Frameworks
Cited by 2 (2 self)
The grand challenge of multimodal interface creation is to build reliable processing systems able to analyze and understand multiple communication means in real time. This opens a number of associated issues covered by this chapter, such as fusion of heterogeneous data types, architectures for real-time processing, dialog management, machine learning for multimodal interaction, modeling languages, and frameworks. This chapter does not intend to cover exhaustively all the issues related to the creation of multimodal interfaces, and some hot topics, such as error handling, have been left aside. The chapter starts with the features and advantages associated with multimodal interaction, with a focus on particular findings and guidelines, as well as cognitive foundations underlying multimodal interaction. The chapter then focuses on the driving theoretical principles, time-sensitive software architectures, and multimodal fusion and fission issues. Modeling of multimodal interaction as well as tools allowing rapid creation of multimodal interfaces are then presented. The chapter concludes with an outline of the current state of multimodal interaction research in Switzerland, and also summarizes the major future challenges in the field.
Sit Straight (and tell me what I did today): A Human Posture Alarm and Activity Summarization System
In ACM Workshop on Continuous Archival and Retrieval of Personal Experiences (CARPE 05), in conjunction with ACM MM 2005, 2005
Cited by 2 (1 self)
In this paper we present a novel system for monitoring a computer user’s posture and activities in front of the computer (e.g., reading, speaking on the phone, etc.) for self-reporting. In our system, a camera and a microphone are placed in front of a computer work area (e.g., on top of the computer screen). The system monitors the computer user’s postures and summarizes his or her activities. The system gives the user real-time feedback on the quality of his or her current posture, triggers alarms if the posture is not good, and generates summaries of postures and activities over a specified period of time (e.g., hours, days, months, etc.). All elements of the system are highly customizable: the user decides what “good” postures are, what alarms are triggered, if any, and what activity and posture summaries are generated. We present novel algorithms for posture measurement (using geometric features of the user’s silhouette), and activity classification (using machine learning). Finally, we present experiments that show the feasibility of our approach, and discuss privacy issues and applications of the techniques presented (health monitoring, productivity analysis, and others).
iMediaTV: Open and Interactive Access for Live Performances and Installation Art
4th International Conference on Information Law (ICIL2011), 2011
Cited by 2 (2 self)
Internet-based interactive TV is an emerging field affected by advances in various research areas including communication, interactivity, network efficiency, content management and aesthetics. Despite constantly falling costs in the area of broadcast infrastructure development, this new medium has yet to claim its market position and recognition. The large market share of existing non-interactive technologies may be identified as the principal factor for non-adoption of new broadcasting technologies, followed by various quality-of-service issues and the absence of a widely accepted standard for interactive broadcasting, which prevents the development of devices that support interaction in an out-of-the-box user experience. On the experimental forefront, educational institutions are exploring the capabilities of high-bandwidth networks and experimental interactive content, setting the standards for the development of new digital services. Particular types of content such as interactive installation art, games and multimedia presentations that require synchronised content communication are aided by the development of custom-built interactive broadcasting infrastructures offering alternative methods of content deployment, presentation and interaction. In this work we are mainly concerned with the development strategy of an interactive TV studio, the integration of existing technologies under common environments, user-related usability issues and aesthetics, while open source software solutions are employed to reduce cost. The reduction of development, production and broadcasting costs offers new opportunities that enable user groups to clearly expand and offer a high-quality media experience on a global scale.

