Mean shift: A robust approach toward feature space analysis
- In PAMI, 2002
Cited by 935 (33 self)
A general nonparametric technique is proposed for the analysis of a complex multimodal feature space and to delineate arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure, the mean shift. We prove for discrete data the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and thus its utility in detecting the modes of the density. The equivalence of the mean shift procedure to the Nadaraya–Watson estimator from kernel regression and the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms the only user-set parameter is the resolution of the analysis, and either gray level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
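The mode-seeking iteration named in this abstract can be illustrated with a minimal sketch. It assumes a one-dimensional feature space, a Gaussian kernel, and a fixed bandwidth; the function name `mean_shift_mode` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=500):
    """Follow the mean shift iteration from `start` toward a nearby density mode (1-D)."""
    x = start
    for _ in range(max_iter):
        # Gaussian kernel weight of every sample relative to the current estimate.
        weights = [math.exp(-0.5 * ((p - x) / bandwidth) ** 2) for p in points]
        # The kernel-weighted mean of the samples becomes the next estimate.
        new_x = sum(w * p for w, p in zip(weights, points)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Toy data (an assumption for illustration): two clusters around 1.0 and 5.0.
data = [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]
low_mode = mean_shift_mode(data, start=0.5, bandwidth=0.5)
high_mode = mean_shift_mode(data, start=5.6, bandwidth=0.5)
print(low_mode, high_mode)
```

Starting points on either side of the data settle on different modes, which is how the procedure can delineate clusters without fixing their number in advance. The bandwidth plays the role of the single resolution parameter mentioned in the abstract.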
Detecting faces in images: A survey
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
Cited by 437 (4 self)
Images containing faces are essential to intelligent vision-based human-computer interaction, and research efforts in face processing include face recognition, face tracking, pose estimation, and expression recognition. However, many reported methods assume that the faces in an image or an image sequence have been identified and localized. To build fully automated systems that analyze the information contained in face images, robust and efficient face detection algorithms are required. Given a single image, the goal of face detection is to identify all image regions that contain a face regardless of its three-dimensional position, orientation, and the lighting conditions. Such a problem is challenging because faces are nonrigid and have a high degree of variability in size, shape, color, and texture. Numerous techniques have been developed to detect faces in a single image, and the purpose of this paper is to categorize and evaluate these algorithms. We also discuss relevant issues such as data collection, evaluation metrics, and benchmarking. After analyzing these algorithms and identifying their limitations, we conclude with several promising directions for future research.
Stylization and Abstraction of Photographs
- 2002
Cited by 125 (6 self)
Good information design depends on clarifying the meaningful structure in an image. We describe a computational approach to stylizing and abstracting photographs that explicitly responds to this design goal. Our system transforms images into a line-drawing style using bold edges and large regions of constant color. To do this, it represents images as a hierarchical structure of parts and boundaries computed using state-of-the-art computer vision. Our system identifies the meaningful elements of this structure using a model of human perception and a record of a user's eye movements in looking at the photo; the system renders a new image using transformations that preserve and highlight these visual elements. Our method thus represents a new alternative for non-photorealistic rendering both in its visual style, in its approach to visual form, and in its techniques for interaction.
The Variable Bandwidth Mean Shift and Data-Driven Scale Selection
- In Proc. 8th Intl. Conf. on Computer Vision, 2001
Cited by 73 (9 self)
We present two solutions for the scale selection problem in computer vision. The first one is completely nonparametric and is based on the adaptive estimation of the normalized density gradient. Employing the sample point estimator, we define the Variable Bandwidth Mean Shift, prove its convergence, and show its superiority over the fixed bandwidth procedure. The second technique has a semiparametric nature and imposes a local structure on the data to extract reliable scale information. The local scale of the underlying density is taken as the bandwidth which maximizes the magnitude of the normalized mean shift vector. Both estimators provide practical tools for autonomous image and quasi real-time video analysis and several examples are shown to illustrate their effectiveness.
1 Motivation for Variable Bandwidth
The efficacy of Mean Shift analysis has been demonstrated in computer vision problems such as tracking and segmentation in [5, 6]. However, one of the limitations of the mean shift procedure as defined in these papers is that it involves the specification of a scale parameter. While results obtained appear satisfactory, when the local characteristics of the feature space differ significantly across data, it is difficult to find an optimal global bandwidth for the mean shift procedure. In this paper we address the issue of locally adapting the bandwidth. We also study an alternative approach for data-driven scale selection which imposes a local structure on the data. The proposed solutions are tested in the framework of quasi real-time video analysis. We review first the intrinsic limitations of the fixed bandwidth density estimation methods. Then, two of the most popular variable bandwidth estimators, the balloon and the sample point, are introduced and...
Gaussian Mixture Model for Human Skin Color and Its Applications in Image and Video Databases
- Proceedings of SPIE ’99 (San Jose, CA), 1999
Cited by 60 (2 self)
This paper is concerned with estimating a probability density function of human skin color using a finite Gaussian mixture model whose parameters are estimated through the EM algorithm. Hawkins' statistical test on the normality and homoscedasticity (common covariance matrix) of the estimated Gaussian mixture models is performed and McLachlan's bootstrap method is used to test the number of components in a mixture. Experimental results show that the estimated Gaussian mixture model fits skin images from a large database. Applications of the estimated density function in image and video databases are presented.
Keywords: Gaussian Mixture Model, EM algorithm, Skin Color, Skin Detection
1. INTRODUCTION
Human skin color has been used and proven to be an effective feature in many applications from human face detection to hand tracking. However, most studies use either simple thresholding or a single Gaussian distribution to characterize the properties of skin color. Although skin colors o...
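Fitting a finite Gaussian mixture with EM, as this abstract describes, can be sketched as follows. This is a simplified illustration only: it assumes a single scalar feature rather than a multivariate color vector, uses synthetic samples in place of real skin data, and the names `gaussian_pdf` and `fit_gmm_em` are hypothetical. The statistical tests mentioned in the abstract are not covered.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_em(data, means, variances, weights, n_iter=100):
    """Run EM for a 1-D Gaussian mixture, starting from the given parameters."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([v / total for v in joint])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor guards against a collapsing component
    return means, variances, weights


# Synthetic stand-in data (an assumption): two well-separated clusters.
rng = random.Random(0)
samples = [rng.gauss(0.2, 0.05) for _ in range(300)]
samples += [rng.gauss(0.6, 0.05) for _ in range(300)]
fitted_means, fitted_vars, fitted_weights = fit_gmm_em(
    samples, means=[0.1, 0.9], variances=[0.05, 0.05], weights=[0.5, 0.5]
)
print(sorted(fitted_means), fitted_weights)
```

The fitted mixture can then be evaluated at a new feature value to score how likely it is under the model, which is the kind of density estimate the abstract applies to image and video databases.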
An algorithm for data-driven bandwidth selection
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003
Cited by 58 (7 self)
The analysis of a feature space that exhibits multiscale patterns often requires kernel estimation techniques with locally adaptive bandwidths, such as the variable-bandwidth mean shift. Proper selection of the kernel bandwidth is, however, a critical step for superior space analysis and partitioning. This paper presents a mean shift-based approach for local bandwidth selection in the multimodal, multivariate case. Our method is based on a fundamental property of normal distributions regarding the bias of the normalized density gradient. We demonstrate that, within the large sample approximation, the local covariance is estimated by the matrix that maximizes the magnitude of the normalized mean shift vector. Using this property, we develop a reliable algorithm which takes into account the stability of local bandwidth estimates across scales. The validity of our theoretical results is proven in various space partitioning experiments involving the variable-bandwidth mean shift.
Index Terms: Variable-bandwidth mean shift, bandwidth selection, multiscale analysis, Jensen-Shannon divergence, feature space.
Recovering surface layout from an image
- In IJCV, 2007
Cited by 57 (17 self)
Humans have an amazing ability to instantly grasp the overall 3D structure of a scene – ground orientation, relative positions of major landmarks, etc. – even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this “surface layout” of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.
Detecting Human Faces in Color Images
- 1998
Cited by 55 (1 self)
We propose a new method to detect human faces in color images. A human skin color model is built to capture the chromatic properties based on multivariate statistical analysis. Given a color image, multiscale segmentation is used to generate homogeneous regions at multiple different scales. From the coarsest to the finest scale, regions of skin color are merged until the shape is approximately elliptic. Postprocessing is performed to determine whether a merged region contains a human face and to include the facial features of non-skin color such as eyes and mouth if necessary. Experimental results show that human faces in color images can be detected regardless of size, orientation and viewpoint.
1 Introduction
Face detection has many applications, including teleconferencing [2], face recognition [6], and gesture recognition [12]. The goal of face detection is to determine whether or not there is any human face in the image, and, if present, return its location and spatial extent. The t...
Recognizing Hand Gestures Using Motion Trajectories
- 2000
Cited by 33 (0 self)
We present an algorithm for extracting and classifying two-dimensional motion in an image sequence based on motion trajectories. First, a multiscale segmentation is performed to generate homogeneous regions in each frame. Regions between consecutive frames are then matched to obtain 2-view correspondences. Affine transformations are computed from each pair of corresponding regions to define pixel matches. Pixel matches over consecutive image pairs are concatenated to obtain pixel-level motion trajectories across the image sequence. Motion patterns are learned from the extracted trajectories using a time-delay neural network. We apply the proposed method to recognize 40 hand gestures of American Sign Language. Experimental results show that motion patterns in hand gestures can be extracted and recognized with a high recognition rate using motion trajectories.
1 Introduction
In this paper, we present an algorithm for extracting two-dimensional motion fields of objects across a video seq...
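One step in this abstract, computing an affine transformation from a pair of corresponding regions and using it to define pixel matches, can be sketched as below. The abstract does not say how the transformation is estimated, so the least-squares fit to point correspondences shown here is an assumption, and the names `solve3`, `fit_affine`, and `apply_affine` are hypothetical.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine map from src points to dst points.

    Returns ((a, b, c), (d, e, f)) with u = a*x + b*y + c and v = d*x + e*y + f.
    Needs at least three non-collinear correspondences.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atb_u = [0.0] * 3
    atb_v = [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        row = (x, y, 1.0)
        # Accumulate the normal equations A^T A p = A^T b for both output coordinates.
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            atb_u[i] += row[i] * u
            atb_v[i] += row[i] * v
    return tuple(solve3(ata, atb_u)), tuple(solve3(ata, atb_v))


def apply_affine(params, point):
    """Map one pixel coordinate through the fitted transformation."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Illustrative correspondences generated from u = 2x + 0.5y + 3, v = -x + y - 1.
src_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dst_pts = [(3.0, -1.0), (5.0, -2.0), (3.5, 0.0), (5.5, -1.0)]
params = fit_affine(src_pts, dst_pts)
matched = apply_affine(params, (2.0, 2.0))
print(params, matched)
```

Applying the fitted map to each pixel of a region in one frame gives its matched position in the next frame, and chaining those matches across frames yields a trajectory of the kind the abstract describes.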
Extracting subimages of an unknown category from a set of images
- In CVPR, 2006
Cited by 30 (6 self)
Suppose a set of images contains frequent occurrences of objects from an unknown category. This paper is aimed at simultaneously solving the following related problems: (1) unsupervised identification of photometric, geometric, and topological (mutual containment) properties of multiscale

