Automatic interpretation and coding of face images using flexible models
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1997
Cited by 150 (9 self)
Abstract—Face images are difficult to interpret because they are highly variable. Sources of variability include individual appearance, 3D pose, facial expression, and lighting. We describe a compact parametrized model of facial appearance which takes into account all these sources of variability. The model represents both shape and gray-level appearance, and is created by performing a statistical analysis over a training set of face images. A robust multiresolution search algorithm is used to fit the model to faces in new images. This allows the main facial features to be located, and a set of shape, and gray-level appearance parameters to be recovered. A good approximation to a given face can be reconstructed using less than 100 of these parameters. This representation can be used for tasks such as image coding, person identification, 3D pose recovery, gender recognition, and expression recognition. Experimental results are presented for a database of 690 face images obtained under widely varying conditions of 3D pose, lighting, and facial expression. The system performs well on all the tasks listed above.
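As a rough illustration of the idea that a face can be approximated from a small number of model parameters, the sketch below projects a vector onto a few orthonormal modes and rebuilds it from the resulting parameters. The function names, the toy mean, and the modes are hypothetical; the paper's actual model combines shape and gray-level statistics learned from training images and is not reproduced here.

```python
def project(x, mean, modes):
    """Return parameters b_i = p_i . (x - mean), assuming orthonormal modes p_i."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(p * d for p, d in zip(mode, diff)) for mode in modes]


def reconstruct(params, mean, modes):
    """Rebuild an approximation x ~ mean + sum_i b_i * p_i from the parameters."""
    out = list(mean)
    for b, mode in zip(params, modes):
        for j, p in enumerate(mode):
            out[j] += b * p
    return out


# Toy data (hypothetical): a 3-element "face" described by only 2 modes.
mean = [1, 1, 1]
modes = [[1, 0, 0], [0, 1, 0]]
x = [3, 0, 5]

params = project(x, mean, modes)
approx = reconstruct(params, mean, modes)
```

Because only two modes are kept, the third component of `approx` falls back to the mean, which is the kind of approximation error a truncated model accepts in exchange for compactness.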
A Unified Approach To Coding and Interpreting Face Images
- In ICCV
, 1995
Cited by 73 (6 self)
Face images are difficult to interpret because they are highly variable. Sources of variability include individual appearance, 3D pose, facial expression and lighting. We describe a compact parametrised model of facial appearance which takes into account all these sources of variability. The model represents both shape and grey-level appearance and is created by performing a statistical analysis over a training set of face images. A robust multi-resolution search algorithm is used to fit the model to faces in new images. This allows the main facial features to be located and a set of shape and grey-level appearance parameters to be recovered. A good approximation to a given face can be reconstructed using less than 100 of these parameters. This representation can be used for tasks such as image coding, person identification, pose recovery, gender recognition and expression recognition. The system performs well on all the tasks listed above.
Hallucinating Faces
- Fourth International Conference on Automatic Face and Gesture Recognition
, 1999
Cited by 39 (5 self)
In most surveillance scenarios there is a large distance between the camera and the objects of interest in the scene. Surveillance cameras are also usually set up with wide fields of view in order to image as much of the scene as possible. The end result is that the objects in the scene normally appear very small in surveillance imagery. It is generally possible to detect and track the objects in the scene, however, for tasks such as automatic face recognition and license plate reading, resolution enhancement techniques are often needed. Although numerous...
Gaze Estimation using Morphable Models
- In International Conference on Automatic Face- and Gesture- Recognition
, 1998
Cited by 10 (0 self)
This paper presents preliminary work on a novel technique for gaze estimation from a single image. The goal is to provide rough estimates of where a person is looking at a monitor. Many applications for human-computer interaction are possible for such a technique. Our approach uses the morphable model framework of Jones and Poggio [4] to model a region around the eyes of a person. After matching this model to an input image of a person's eye region, the resulting model parameters are sent to a neural network which approximates the screen coordinates being viewed. The system is user independent and can handle changes in both head orientation and iris location. Currently the system is not real-time. Future work will focus on improving the speed of the system.
1 Introduction
Gaze, the orientation of the eyes when viewing a particular point in the world, is an important visual cue in communication. Where a person looks often indicates their interest and intent, and may provide subtle yet...
Audio-Visual Intent-To-Speak Detection For Human-Computer Interaction
, 2000
Cited by 7 (2 self)
This paper introduces a practical system that aims to detect a user's intent to speak to a computer, by considering both audio and visual cues. The whole system is designed to intuitively turn on the microphone for speech recognition without needing to click on a mouse, thus improving the human-like communication between users and computers. The first step is to detect a frontal face through a simple desktop video camera image, by using some well-known image processing techniques for face and facial feature detection on one image. The second step is an audio-visual speech event detection that combines both visual and audio indications of speech. In this paper, we consider visual measures of speech activity as well as audio energy to determine if the previously detected user is actually speaking or not.
1. INTRODUCTION
Speech recognition systems have opened the way towards intuitive and natural Human-Computer Interaction (HCI). Today, people can control their computer by using their voice:...
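The combination of audio energy with a visual measure of speech activity can be pictured with a very small sketch. Everything below is an assumption made for illustration: the mean-square energy measure, the `mouth_activity` score in the range 0 to 1, the two thresholds, and the simple AND rule are hypothetical and are not taken from the paper.

```python
def frame_energy(samples):
    """Mean squared amplitude of one audio frame (samples assumed in [-1, 1])."""
    return sum(s * s for s in samples) / len(samples)


def is_speaking(audio_frame, mouth_activity,
                energy_threshold=0.01, visual_threshold=0.5):
    """Hypothetical fusion rule: require both an audio cue and a visual cue."""
    audio_cue = frame_energy(audio_frame) > energy_threshold
    visual_cue = mouth_activity > visual_threshold
    return audio_cue and visual_cue


loud_and_moving = is_speaking([0.2, -0.2, 0.2, -0.2], mouth_activity=0.8)
silent_but_moving = is_speaking([0.0, 0.0, 0.0, 0.0], mouth_activity=0.9)
loud_but_still = is_speaking([0.2, -0.2, 0.2, -0.2], mouth_activity=0.1)
```

Requiring both cues is one plausible design choice: background noise alone or lip movement alone does not trigger the microphone.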
Head orientation and gaze detection from a single image
- In Proceedings of the International Conference on Computer Vision Theory and Applications
, 2006
Cited by 5 (0 self)
Abstract: Head orientation is an important part of many advanced human-machine interaction systems. We present a single image based head pose computation algorithm. It is deduced from anthropometric data. This approach allows us to use a single camera and requires no cooperation from the user. Using a single image avoids the complexities associated with a multi-camera system. Evaluation tests show that our approach is accurate, fast and can be used in a variety of contexts. Application to gaze detection, with a working system, is also demonstrated.
Recognizing Visual Focus of Attention from Head Pose in Natural Meetings
Cited by 5 (1 self)
We address the problem of recognizing the visual focus of attention (VFOA) of meeting participants based on their head pose. To this end, the head pose observations are modeled using a Gaussian Mixture Model (GMM) or a Hidden Markov Model (HMM) whose hidden states correspond to the VFOA. The novelties of this work are threefold. First, contrary to previous studies on the topic, in our set-up, the potential VFOA of a person is not restricted to other participants only, but includes environmental targets (a table and a projection screen), which increases the complexity of the task, with more VFOA targets spread in the pan as well as tilt gaze space. Second, we propose a geometric model to set the GMM or HMM parameters by exploiting results from cognitive science on saccadic eye motion, which allows the prediction of the head pose given a gaze target. Third, an unsupervised parameter adaptation step (not using any labeled data) is proposed which accounts for the specific gazing behaviour of each participant. Another contribution of the paper is the development of a significant publicly available corpus of 8 meetings which are on average 10 minutes in length featuring 4 persons, with head pose and VFOA annotation. Using this corpus, we analyze the above methods by evaluating, through objective performance measures, the recognition of the VFOA from head pose information obtained either using a magnetic sensor device or a vision based tracking system. The results clearly show that in such complex but realistic situations, the VFOA recognition performance is highly dependent on how well the visual targets are separated for a given meeting participant. In addition, the results show that the use of a geometric model with unsupervised adaptation achieves better results than the use of training data to set the HMM parameters.
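The GMM variant described above can be pictured as one Gaussian per VFOA target over the head pose (pan, tilt), with recognition picking the most likely target. The sketch below assumes equal priors and axis-aligned Gaussians, and the target names, means, and standard deviations are made-up values for illustration; the paper's geometric model and adaptation step are not reproduced.

```python
import math


def gaussian_log_likelihood(pose, mean, std):
    """Log-density of an axis-aligned 2D Gaussian at pose = (pan, tilt), in degrees."""
    total = 0.0
    for value, mu, sigma in zip(pose, mean, std):
        total += -0.5 * ((value - mu) / sigma) ** 2
        total -= math.log(sigma * math.sqrt(2.0 * math.pi))
    return total


def recognize_vfoa(pose, targets):
    """Return the target whose Gaussian gives the head pose the highest likelihood."""
    return max(targets,
               key=lambda name: gaussian_log_likelihood(pose, *targets[name]))


# Hypothetical targets: name -> ((mean pan, mean tilt), (std pan, std tilt)).
targets = {
    "table": ((0.0, -40.0), (10.0, 10.0)),
    "screen": ((50.0, 0.0), (10.0, 10.0)),
    "person_left": ((-45.0, 0.0), (10.0, 10.0)),
}

looking_right = recognize_vfoa((45.0, -5.0), targets)
looking_left = recognize_vfoa((-30.0, -5.0), targets)
looking_down = recognize_vfoa((5.0, -35.0), targets)
```

The toy targets also show why separation matters: if two means sit close together relative to their spread, their likelihoods become nearly equal and the decision is unreliable.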
Recovering facial pose with the EM algorithm
- Pattern Recognition
, 2002
Cited by 3 (0 self)
This paper describes how 3D facial pose may be estimated by fitting a template to 2D feature locations. The fitting process is realised as projecting the control points of a 3D template onto the 2D feature locations under orthographic projection. The parameters of the orthographic projection are iteratively estimated using the EM algorithm. The method is evaluated on both contrived data with known ground-truth together with some more naturalistic imagery. These experiments reveal that under favourable conditions the algorithm can estimate facial pitch to within 3°
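To make the forward step concrete, the sketch below rotates the control points of a 3D template and projects them orthographically onto the image plane by dropping the depth coordinate. It covers only the projection, not the EM estimation of its parameters. The axis conventions (yaw about the y-axis, pitch about the x-axis), the `scale` parameter, and the function names are assumptions made for illustration.

```python
import math


def rotation_matrix(yaw, pitch):
    """Rotation R = R_pitch * R_yaw, with yaw about y and pitch about x (radians)."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    r_yaw = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    r_pitch = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    return [[sum(r_pitch[i][k] * r_yaw[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


def project_orthographic(points_3d, yaw, pitch, scale=1.0):
    """Rotate 3D points, then keep only the scaled x and y coordinates."""
    r = rotation_matrix(yaw, pitch)
    projected = []
    for x, y, z in points_3d:
        u = scale * (r[0][0] * x + r[0][1] * y + r[0][2] * z)
        v = scale * (r[1][0] * x + r[1][1] * y + r[1][2] * z)
        projected.append((u, v))
    return projected


# A single hypothetical control point on the z-axis (for example, a nose tip).
nose = [(0.0, 0.0, 1.0)]
frontal = project_orthographic(nose, yaw=0.0, pitch=0.0)
turned = project_orthographic(nose, yaw=math.pi / 2, pitch=0.0)
```

A fitting procedure would compare such projected points with the detected 2D feature locations and adjust the pose parameters to reduce the mismatch.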
3D head pose estimation from multiple distant views
A method for human head pose estimation in multicamera environments is proposed. The method computes the textured visual hull of the subject and unfolds the texture of the head on a hypothetical sphere around it, whose parameterization is iteratively rotated so that the face eventually occurs on its equator. This gives rise to a spherical image, in which face detection is simplified, because exactly one frontal face is guaranteed to appear in it. In this image, the face center yields two components of pose (yaw, pitch), while the third (roll) is retrieved from the orientation of the major symmetry axis of the face. Face detection applied on the original images reduces the required iterations and anchors tracking drift. The method is demonstrated and evaluated in several data sets, including ones with known ground truth. Experimental results show that the proposed method is accurate and robust to distant imaging, despite the low-resolution appearance of subjects.
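How a face center in a spherical image can yield yaw and pitch depends on the sphere's parameterization, which the abstract does not specify. The sketch below assumes a plain equirectangular layout, with columns spanning 360 degrees of yaw and rows spanning 180 degrees of pitch; the function name and the sign conventions are hypothetical.

```python
def face_center_to_yaw_pitch(col, row, width, height):
    """Map a pixel in an assumed equirectangular image to (yaw, pitch) in degrees.

    Columns run from yaw -180 (left edge) to +180 (right edge);
    rows run from pitch +90 (top edge) to -90 (bottom edge).
    """
    yaw = (col / width) * 360.0 - 180.0
    pitch = 90.0 - (row / height) * 180.0
    return yaw, pitch


# A face centered in a 360 x 180 spherical image sits on the equator at yaw 0.
centered = face_center_to_yaw_pitch(180, 90, width=360, height=180)
offset = face_center_to_yaw_pitch(270, 45, width=360, height=180)
```

Roll is not covered here; according to the abstract it comes from the orientation of the face's major symmetry axis.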

