Results 1 - 10 of 75
A survey on pixel-based skin color detection techniques
- In ICCGV, 2003
Cited by 59 (2 self)
Skin color has proven to be a useful and robust cue for face detection, localization and tracking. Image content filtering, content-aware video compression and image color balancing applications can also benefit from automatic detection of skin in images. Numerous techniques for skin color modelling and recognition have been proposed over the past several years. A few papers comparing different approaches have been published [Zarit et al. 1999], [Terrillon et al. 2000], [Brand and Mason 2000]. However, a comprehensive survey on the topic is still missing. We try to fill this vacuum by reviewing the most widely used methods and techniques and collecting their numerical evaluation results.
Detecting Human Faces in Color Images
1998
Cited by 55 (1 self)
We propose a new method to detect human faces in color images. A human skin color model is built to capture the chromatic properties based on multivariate statistical analysis. Given a color image, multiscale segmentation is used to generate homogeneous regions at multiple different scales. From the coarsest to the finest scale, regions of skin color are merged until the shape is approximately elliptic. Postprocessing is performed to determine whether a merged region contains a human face and include the facial features of non-skin color such as eyes and mouth if necessary. Experimental results show that human faces in color images can be detected regardless of size, orientation and viewpoint.
1 Introduction
Face detection has many applications, including teleconferencing [2], face recognition [6], and gesture recognition [12]. The goal of face detection is to determine whether or not there is any human face in the image, and, if present, return its location and spatial extent. The t...
Model-Based Hand Tracking Using A Hierarchical Bayesian Filter
2004
Cited by 48 (2 self)
This thesis focuses on the automatic recovery of three-dimensional hand motion from one or more views. A 3D geometric hand model is constructed from truncated cones, cylinders and ellipsoids and is used to generate contours, which can be compared with edge contours and skin colour in images. The hand tracking problem is formulated as state estimation, where the model parameters define the internal state, which is to be estimated from image observations. In the first
Estimation and Prediction of Evolving Color Distributions for Skin Segmentation Under Varying Illumination
- In CVPR, 2000
Cited by 41 (4 self)
A novel approach for real-time skin segmentation in video sequences is described. The approach enables reliable skin segmentation despite wide variation in illumination during tracking. An explicit second order Markov model is used to predict evolution of the skin color (HSV) histogram over time. Histograms are dynamically updated based on feedback from the current segmentation and based on predictions of the Markov model. The evolution of the skin color distribution at each frame is parameterized by translation, scaling and rotation in color space. Consequent changes in geometric parameterization of the distribution are propagated by warping and re-sampling the histogram. The parameters of the discrete-time dynamic Markov model are estimated using Maximum Likelihood Estimation, and also evolve over time. Quantitative evaluation of the method was conducted on labeled ground-truth video sequences taken from popular movies.
1 Introduction
Locating and tracking patches of skin-colored p...
Modeling individual and group actions in meetings: a two-layer HMM framework
- In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Workshop on Event Mining in Video (CVPR-EVENT), Washington DC, 2004
Cited by 34 (11 self)
We address the problem of recognizing sequences of human interaction patterns in meetings, with the goal of structuring them in semantic terms. The investigated patterns are inherently group-based (defined by the individual activities of meeting participants, and their interplay), and multimodal (as captured by cameras and microphones). By defining a proper set of individual actions, group actions can be modeled as a two-layer process, one that models basic individual activities from low-level audio-visual features, and another one that models the interactions. We propose a two-layer Hidden Markov Model (HMM) framework that implements this concept in a principled manner, and that has advantages over previous work. First, by decomposing the problem hierarchically, learning is performed on low-dimensional observation spaces, which results in simpler models. Second, our framework is easier to interpret, as both individual and group actions have a clear meaning, and thus easier to improve. Third, different HMM models can be used in each layer, to better reflect the nature of each subproblem. Our framework is general and extensible, and we illustrate it with a set of eight group actions, using a public five-hour meeting corpus. Experiments and comparison with a single-layer HMM baseline system show its validity.
Comparison of Five Color Models in Skin Pixel Classification
- In ICCV’99 Int’l Workshop on, 1999
Cited by 33 (3 self)
Detection of skin in video is an important component of systems for detecting, recognizing, and tracking faces and hands. Different skin detection methods have used different color spaces. This paper presents a comparative evaluation of pixel classification performance of two skin detection methods in five color spaces. The skin detection methods used in this paper are color-histogram based approaches that are intended to work with a wide variety of individuals, lighting conditions, and skin tones. One is the widely used lookup table method; the other makes use of Bayesian decision theory. Two types of enhancements, based on spatial and texture analyses, are also evaluated.
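The two histogram-based classifiers named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the RGB color space, the 8-bins-per-channel quantization, the thresholds, and all function names are assumptions chosen for illustration.

```python
from collections import Counter

BINS = 8  # assumed quantization: 8 bins per channel (not taken from the paper)


def quantize(pixel, bins=BINS):
    """Map an (r, g, b) pixel with 0-255 channels to a histogram bin."""
    step = 256 // bins
    r, g, b = pixel
    return (r // step, g // step, b // step)


def build_histogram(pixels, bins=BINS):
    """Return a normalized color histogram as {bin: probability}."""
    counts = Counter(quantize(p, bins) for p in pixels)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def lookup_table_classify(pixel, skin_hist, threshold=0.05):
    """Lookup-table rule: skin if P(color | skin) exceeds a threshold."""
    return skin_hist.get(quantize(pixel), 0.0) > threshold


def bayes_classify(pixel, skin_hist, nonskin_hist, ratio=1.0, eps=1e-9):
    """Bayesian rule: skin if P(color | skin) / P(color | non-skin) > ratio.

    Written as a product so that empty non-skin bins do not divide by zero.
    """
    key = quantize(pixel)
    p_skin = skin_hist.get(key, 0.0)
    p_nonskin = nonskin_hist.get(key, 0.0)
    return p_skin > ratio * (p_nonskin + eps)


# Hypothetical training pixels, for illustration only.
skin_hist = build_histogram([(220, 170, 140), (225, 175, 150), (210, 160, 130)])
nonskin_hist = build_histogram([(20, 40, 200), (30, 200, 40), (10, 10, 10)])
```

The lookup-table rule needs only skin samples, while the Bayesian rule also models non-skin colors, so it can reject colors that are common in both classes.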
A System for Tracking and Recognizing Multiple People with Multiple Cameras
- In Proceedings of Second International Conference on Audio-Vision-based Person Authentication, 1998
Cited by 30 (4 self)
In this paper we present a robust real-time method for tracking and recognizing multiple people with multiple cameras. Our method uses both static and Pan-Tilt-Zoom (PTZ) cameras to provide visual attention. The PTZ camera system uses face recognition to register people in the scene and "lock-on" to those individuals. The static camera system provides a global view of the environment and is used to re-adjust the tracking of the system when the PTZ cameras lose their targets. The system works well even when people occlude one another. The underlying visual processes rely on color segmentation, movement tracking and shape information to locate target candidates. Color indexing and face recognition modules help register these candidates with the system.
1. Introduction
One of the goals of building an intelligent environment is to make it more aware of the user's presence so that the interface can seek out and serve the user [1]. This work addresses the ability of a system to determine th...
Pointing gesture recognition based on 3d-tracking of face, hands and head orientation
- In Workshop on Perceptive User Interfaces, 2003
Cited by 26 (6 self)
In this paper, we present a system capable of visually detecting pointing gestures and estimating the 3D pointing direction in real-time. In order to acquire input features for gesture recognition, we track the positions of a person’s face and hands on image sequences provided by a stereo camera. Hidden Markov Models (HMMs), trained on different phases of sample pointing gestures, are used to classify the 3D-trajectories in order to detect the occurrence of a gesture. When analyzing sample pointing gestures, we noticed that humans tend to look at the pointing target while performing the gesture. In order to utilize this behavior, we additionally measured head orientation by means of a magnetic sensor in a similar scenario. By using head orientation as an additional feature, we observed significant gains in both recall and precision of pointing gestures. Moreover, the percentage of correctly identified pointing targets improved significantly from 65% to 83%. For estimating the pointing direction, we comparatively used three approaches: 1) the line of sight between head and hand, 2) the forearm orientation, and 3) the head orientation.
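The first of the three direction estimates, the line of sight between head and hand, can be sketched as a ray from the head position through the hand position. The target-selection rule (smallest angle to the ray), the coordinates, and all names below are assumptions for illustration, not details taken from the paper.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def unit(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("zero-length vector has no direction")
    return tuple(x / length for x in v)


def pointing_direction(head, hand):
    """Unit vector along the head-to-hand line of sight."""
    return unit(sub(hand, head))


def angle_between(u, v):
    """Angle in radians between two vectors, clamped for numerical safety."""
    dot = sum(a * b for a, b in zip(unit(u), unit(v)))
    return math.acos(max(-1.0, min(1.0, dot)))


def select_target(head, hand, targets):
    """Pick the named target whose direction from the head is closest to the ray."""
    direction = pointing_direction(head, hand)
    return min(
        targets,
        key=lambda name: angle_between(direction, sub(targets[name], head)),
    )


# Hypothetical scene in metres: head, hand, and two candidate targets.
head = (0.0, 0.0, 1.7)
hand = (0.5, 0.0, 1.4)
targets = {"lamp": (2.0, 0.0, 0.5), "door": (0.0, 3.0, 1.0)}
chosen = select_target(head, hand, targets)
```

The forearm and head-orientation estimates would reuse `select_target` with a different origin and direction vector.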
Skin Color-Based Video Segmentation under Time-Varying Illumination
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 23 (0 self)
A novel approach for real-time skin segmentation in video sequences is described. The approach enables reliable skin segmentation despite wide variation in illumination during tracking.
Modeling Individual and Group Actions in Meetings with Layered HMMs
- IEEE Trans. on Multimedia, 2004
Cited by 21 (7 self)
We address the problem of recognizing sequences of human interaction patterns in meetings, with the goal of structuring them in semantic terms. The investigated patterns are inherently group-based (defined by the individual activities of meeting participants, and their interplay), and multimodal (as captured by cameras and microphones). By defining a proper set of individual actions, group actions can be modeled as a two-layer process, one that models basic individual activities from low-level audio-visual features, and another one that models the interactions. We propose a two-layer Hidden Markov Model (HMM) framework that implements this concept in a principled manner, and that has advantages over previous work. First, by decomposing the problem hierarchically, learning is performed on low-dimensional observation spaces, which results in simpler models. Second, our framework is easier to interpret, as both individual and group actions have a clear meaning, and thus easier to improve. Third, different HMM models can be used in each layer, to better reflect the nature of each subproblem. Our framework is general and extensible, and we illustrate it with a set of eight group actions, using a public five-hour meeting corpus. Experiments and comparison with a single-layer HMM baseline system show its validity.
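The two-layer decomposition can be sketched with two small discrete HMMs decoded by Viterbi. This is a toy illustration under stated assumptions, not the authors' system: the states, the probabilities, the two-participant setup, and the use of hard-decoded speaker counts as the second layer's observations are all invented here.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence for discrete observations (log-space Viterbi)."""
    def lg(p):
        return math.log(p) if p > 0 else float("-inf")

    score = {s: lg(start_p[s]) + lg(emit_p[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + lg(trans_p[r][s]))
            score[s] = prev[best] + lg(trans_p[best][s]) + lg(emit_p[s][o])
            ptr[s] = best
        back.append(ptr)
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Layer 1 (assumed): individual action per participant from an audio feature.
IND_STATES = ("speaking", "silent")
IND_START = {"speaking": 0.5, "silent": 0.5}
IND_TRANS = {"speaking": {"speaking": 0.8, "silent": 0.2},
             "silent": {"speaking": 0.2, "silent": 0.8}}
IND_EMIT = {"speaking": {"loud": 0.9, "quiet": 0.1},
            "silent": {"loud": 0.1, "quiet": 0.9}}

# Layer 2 (assumed): group action from the number of participants speaking.
GRP_STATES = ("monologue", "discussion")
GRP_START = {"monologue": 0.5, "discussion": 0.5}
GRP_TRANS = {"monologue": {"monologue": 0.8, "discussion": 0.2},
             "discussion": {"monologue": 0.2, "discussion": 0.8}}
GRP_EMIT = {"monologue": {0: 0.1, 1: 0.8, 2: 0.1},
            "discussion": {0: 0.1, 1: 0.3, 2: 0.6}}


def decode_group_actions(obs_a, obs_b):
    """Decode each participant first, then decode the group from their actions."""
    paths = [viterbi(o, IND_STATES, IND_START, IND_TRANS, IND_EMIT)
             for o in (obs_a, obs_b)]
    speakers = [sum(a == "speaking" for a in step) for step in zip(*paths)]
    return viterbi(speakers, GRP_STATES, GRP_START, GRP_TRANS, GRP_EMIT)


group = decode_group_actions(["loud"] * 6, ["quiet"] * 3 + ["loud"] * 3)
```

Each layer works on a small observation space (two audio symbols, then three speaker counts), which is the point of the hierarchical decomposition; a single-layer model would have to handle the joint feature space directly.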

