Face Recognition: A Literature Survey
2000
Cited by 570 (19 self)
... This paper provides an up-to-date critical survey of still- and video-based face recognition research. There are two underlying motivations for us to write this survey paper: the first is to provide an up-to-date review of the existing literature, and the second is to offer some insights into the studies of machine recognition of faces. To provide a comprehensive survey, we not only categorize existing recognition techniques but also present detailed descriptions of representative methods within each category. In addition,
Face Detection In Color Images
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
Cited by 165 (5 self)
Human face detection is often the first step in applications such as video surveillance, human computer interface, face recognition, and image database management. We propose a face detection algorithm for color images in the presence of varying lighting conditions as well as complex backgrounds. Our method detects skin regions over the entire image, and then generates face candidates based on the spatial arrangement of these skin patches. The algorithm constructs eye, mouth, and boundary maps for verifying each face candidate. Experimental results demonstrate successful detection over a wide variety of facial variations in color, position, scale, rotation, pose, and expression from several photo collections.
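The first stage described above, finding skin regions across the whole image, can be sketched as a per-pixel chrominance test. This is only a minimal illustration: the RGB to YCbCr conversion is the standard full-range BT.601 (JFIF) one, but the fixed Cb/Cr box thresholds are commonly quoted defaults assumed here for the example, not the skin model or lighting compensation used in the paper. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr (BT.601 / JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, cb_range=(77, 127), cr_range=(133, 173)):
    """Return True if the pixel's chrominance falls inside the assumed skin box.

    The default ranges are illustrative assumptions, not values from the paper.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(image):
    """Map an image (rows of (r, g, b) tuples) to a boolean skin mask."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

A later stage would group the `True` pixels into connected patches and verify each candidate with eye, mouth, and boundary cues, which this sketch does not attempt.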
On affine invariant clustering and automatic cast listing in movies
In Proc. ECCV, 2002
Cited by 57 (13 self)
We develop a distance metric for clustering and classification algorithms which is invariant to affine transformations and includes priors on the transformation parameters. Such clustering requirements are generic to a number of problems in computer vision. We extend existing techniques for affine-invariant clustering, and show that the new distance metric outperforms existing approximations to affine invariant distance computation, particularly under large transformations. In addition, we incorporate prior probabilities on the transformation parameters. This further regularizes the solution, mitigating a rare but serious tendency of the existing solutions to diverge. For the particular special case of corresponding point sets we demonstrate that the affine invariant measure we introduced may be obtained in closed form. As an application of these ideas we demonstrate that the faces of the principal cast of a feature film can be generated automatically using clustering with appropriate invariance. This is a very demanding test as it involves detecting and clustering over tens of thousands of images with the variances including changes in viewpoint, lighting, scale and expression.
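To give a feel for why corresponding point sets admit a closed-form answer, the sketch below computes the residual left after the best least-squares affine alignment of one 2-D point set onto another. It is an assumption-laden illustration rather than the measure from the paper: it omits the prior on the transformation parameters, it is not symmetric in its two arguments, and the helper names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n matrix and b an n x m matrix, both as lists of lists.
    """
    n = len(a)
    m = [row[:] + rhs[:] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set: need 3 non-collinear points")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def affine_residual(src, dst):
    """Sum of squared errors after the best affine map of src onto dst.

    src and dst are equal-length lists of corresponding (x, y) points.
    The map dst ~ A @ src + t is found from the normal equations.
    """
    rows = [(x, y, 1.0) for x, y in src]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [[sum(r[i] * d[k] for r, d in zip(rows, dst)) for k in range(2)]
           for i in range(3)]
    params = solve_linear(xtx, xty)  # 3 x 2: two rows of A (transposed) plus t
    total = 0.0
    for r, d in zip(rows, dst):
        for k in range(2):
            predicted = sum(r[i] * params[i][k] for i in range(3))
            total += (predicted - d[k]) ** 2
    return total
```

Two point sets related by an exact affine map give a residual of zero (up to rounding), so the value can serve as a crude affine-invariant dissimilarity between corresponding landmarks.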
Head Pose Estimation in Computer Vision: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008
Cited by 40 (6 self)
The capacity to estimate the head pose of another person is a common human ability that presents a unique challenge for computer vision systems. Compared to face detection and recognition, which have been the primary foci of face-related vision research, identity-invariant head pose estimation has fewer rigorously evaluated systems or generic solutions. In this paper, we discuss the inherent difficulties in head pose estimation and present an organized survey describing the evolution of the field. Our discussion focuses on the advantages and disadvantages of each approach and spans 90 of the most innovative and characteristic papers that have been published on this topic. We compare these systems by focusing on their ability to estimate coarse and fine head pose, highlighting approaches that are well suited for unconstrained environments.
Multimodal human computer interaction: A survey
2005
Cited by 38 (2 self)
In this paper we review the major approaches to Multimodal Human Computer Interaction, giving an overview of the field from a computer vision perspective. In particular, we focus on body, gesture, gaze, and affective interaction (facial expression recognition and emotion in audio). We discuss user and task modeling, and multimodal fusion, highlighting challenges, open issues, and emerging applications for Multimodal Human Computer Interaction (MMHCI) research.
Robust hand detection
In International Conference on Automatic Face and Gesture Recognition (to appear), Seoul, Korea, 2004
Cited by 37 (6 self)
Vision-based hand gesture interfaces require fast and extremely robust hand detection. Here, we study view-specific hand posture detection with an object recognition method recently proposed by Viola and Jones. Training with this method is computationally very expensive, prohibiting the evaluation of many hand appearances for their suitability to detection. As one contribution of this paper, we present a frequency analysis-based method for instantaneous estimation of class separability, without the need for any training. We built detectors for the most promising candidates, their receiver operating characteristics confirming the estimates. Next, we found that classification accuracy increases with a more expressive feature type. As a third contribution, we show that further optimization of training parameters yields additional detection rate improvements. In summary, we present a systematic approach to building an extremely robust hand appearance detector, providing an important step towards easily deployable and reliable vision-based hand gesture interfaces.
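The Viola and Jones method mentioned above evaluates rectangle features in constant time by means of an integral image. The following sketch shows that building block only, with one simple two-rectangle feature; it does not cover the boosted cascade training, the frequency-based separability estimate, or the more expressive feature type studied in the paper. Function names are hypothetical.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[y][x] is the sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half of a rectangle (width is assumed even)."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```

Because each feature costs a handful of lookups regardless of rectangle size, a detector can afford to evaluate many features at every window position and scale.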
Independent Component Analysis of Gabor Features for Face Recognition
IEEE Transactions on Neural Networks, 2003
Cited by 36 (2 self)
We present in this paper an Independent Gabor Features (IGF) method and its application to face recognition. The novelty of the IGF method comes from (i) the derivation of independent Gabor features in the feature extraction stage, and (ii) the development of an IGF features-based Probabilistic Reasoning Model (PRM) classification method in the pattern recognition stage. In particular, the IGF method first derives a Gabor feature vector from a set of downsampled Gabor wavelet representations of face images, then reduces the dimensionality of the vector by means of Principal Component Analysis (PCA), and finally defines the independent Gabor features based on the Independent Component Analysis (ICA). The independence property of these Gabor features facilitates the application of the PRM method for classification. The rationale behind integrating the Gabor wavelets and the ICA is two-fold. On the one hand, the Gabor transformed face images exhibit strong characteristics of spatial locality, scale and orientation selectivity. These images can thus produce salient local features that are most suitable for face recognition. On the other hand, ICA would further reduce redundancy and represent independent features explicitly. These independent features are most useful for subsequent pattern discrimination and associative recall. Experiments on face recognition using the FERET and the ...
Chengjun Liu is with the Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102. E-mail: liu@cs.njit.edu.
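The first step of the pipeline described above, producing Gabor responses at several scales and orientations, relies on a bank of Gabor kernels. The sketch below builds such a bank using the common textbook parameterization (wavelength, orientation, Gaussian width, aspect ratio), which is an assumption and may differ from the wavelet family used in the paper. The default sizes and the choice `sigma = 0.5 * wavelength` are illustrative, and the PCA and ICA stages are not shown.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real part of a 2-D Gabor filter sampled on a size x size grid.

    size must be odd so that the kernel has a centre pixel.
    """
    if size % 2 == 0:
        raise ValueError("size must be odd")
    half = size // 2
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier wave runs along direction theta.
            xr = x * cos_t + y * sin_t
            yr = -x * sin_t + y * cos_t
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size=9, wavelengths=(4.0, 8.0), n_orientations=4):
    """Kernels for every combination of wavelength (scale) and orientation."""
    return [
        gabor_kernel(size, lam, k * math.pi / n_orientations, sigma=0.5 * lam)
        for lam in wavelengths
        for k in range(n_orientations)
    ]
```

Convolving a face image with each kernel, downsampling the responses, and concatenating them would give the high-dimensional feature vector that the abstract then reduces with PCA before applying ICA.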
Providing the Basis for Human-Robot-Interaction: A Multi-Modal Attention System for a Mobile Robot
2003
Cited by 33 (5 self)
In order to enable the widespread use of robots in home and office environments, systems with natural interaction capabilities have to be developed. A prerequisite for natural interaction is the robot's ability to automatically recognize when and how long a person's attention is directed towards it for communication. Because several persons can be present simultaneously in open environments, detecting the communication partner is of particular importance. In this paper we present an attention system for a mobile robot which enables the robot to shift its attention to the person of interest and to maintain attention during interaction. Our approach is based on a method for multi-modal person tracking which uses a pan-tilt camera for face recognition, two microphones for sound source localization, and a laser range finder for leg detection. Shifting of attention is realized by turning the camera toward the person who is currently speaking. Whether the speaker addresses the robot is decided from the orientation of the head. The performance of the proposed approach is demonstrated with an evaluation. In addition, qualitative results from the performance of the robot at the exhibition part of the ICVS'03 are provided.
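The decision logic described above (turn toward whoever is speaking, then use head orientation to judge whether the robot is being addressed) can be sketched as a small selection function. Everything concrete here is an assumption made for illustration: the `Person` record, the angle conventions, and both tolerance values are hypothetical and are not taken from the paper, which fuses tracking hypotheses from camera, microphones, and laser range finder.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Person:
    name: str
    bearing_deg: float   # direction of the tracked person relative to the robot
    head_yaw_deg: float  # 0 means the face points straight at the camera


def angular_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def select_addressee(
    people: Sequence[Person],
    sound_bearing_deg: float,
    bearing_tol_deg: float = 15.0,
    yaw_tol_deg: float = 20.0,
) -> Tuple[Optional[Person], bool]:
    """Pick the tracked person nearest the sound source and test their gaze.

    Returns (speaker, addresses_robot); speaker is None when nobody is
    close enough to the estimated sound direction.
    """
    candidates = [
        p for p in people
        if angular_diff(p.bearing_deg, sound_bearing_deg) <= bearing_tol_deg
    ]
    if not candidates:
        return None, False
    speaker = min(
        candidates, key=lambda p: angular_diff(p.bearing_deg, sound_bearing_deg)
    )
    return speaker, abs(speaker.head_yaw_deg) <= yaw_tol_deg
```

In a running system the returned bearing would drive the pan-tilt camera, and the boolean would gate whether the robot treats the utterance as directed at it.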
Social Signal Processing: Survey of an Emerging Domain
2008
Cited by 32 (10 self)
The ability to understand and manage social signals of a person we are communicating with is the core of social intelligence. Social intelligence is a facet of human intelligence that has been argued to be indispensable and perhaps the most important for success in life. This paper argues that next-generation computing needs to include the essence of social intelligence – the ability to recognize human social signals and social behaviours like turn taking, politeness, and disagreement – in order to become more effective and more efficient. Although each one of us understands the importance of social signals in everyday life situations, and in spite of recent advances in machine analysis of relevant behavioural cues like blinks, smiles, crossed arms, laughter, and similar, design and development of automated systems for Social Signal Processing (SSP) are rather difficult. This paper surveys the past efforts in solving these problems by a computer, it summarizes the relevant findings in social psychology, and it proposes a set of recommendations for enabling the development of the next generation of socially-aware computing.
Leveraging context to resolve identity in photo albums
In JCDL ’05: Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries, 2005
Cited by 30 (1 self)
Our system suggests likely identity labels for photographs in a personal photo collection. Instead of using face recognition techniques, the system leverages automatically available context, like the time and location where the photos were taken. Based on time and location, the system automatically computes event and location groupings of photos. As the user annotates some of the identities of people in their collection, patterns of re-occurrence and co-occurrence of different people in different locations and events emerge. The system uses these patterns to generate label suggestions for identities that were not yet annotated. These suggestions can greatly accelerate the process of manual annotation and improve the quality of retrieval from the collection. We obtained ground-truth identity annotation for four different photo albums, and used them to test our system. The system proved effective, making very accurate label suggestions, even when the number of suggestions for each photo was limited to five names, and even when only a small subset of the photos was annotated.
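The idea of suggesting labels from recurrence patterns rather than from faces can be sketched with simple counting: people already annotated in the same event are ranked ahead of people who merely appear often in the collection. The scoring rule, the `event_weight` parameter, and the data layout are assumptions chosen for illustration; the paper's own estimators, including its use of location and co-occurrence, are not reproduced here.

```python
from collections import Counter


def suggest_labels(annotated, event, k=5, event_weight=3.0):
    """Rank candidate names for an unlabeled photo taken during `event`.

    annotated is a list of (event_id, set_of_names) pairs for photos the
    user has already labeled. Names seen in the same event get a boost.
    """
    global_counts = Counter()
    event_counts = Counter()
    for photo_event, names in annotated:
        global_counts.update(names)
        if photo_event == event:
            event_counts.update(names)
    scores = {
        name: global_counts[name] + event_weight * event_counts[name]
        for name in global_counts
    }
    # Sort by descending score, breaking ties alphabetically for stability.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]
```

Limiting the output to `k` names mirrors the short suggestion list described in the abstract, where each photo receives at most five candidate names.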

