Results 1 - 10 of 357
On-line selection of discriminative tracking features
, 2003
Cited by 356 (5 self)
This paper presents an on-line feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. Given a set of seed features, we compute log likelihood ratios of class conditional sample densities from object and background to form a new set of candidate features tailored to the local object/background discrimination task. The two-class variance ratio is used to rank these new features according to how well they separate sample distributions of object and background pixels. This feature evaluation mechanism is embedded in a mean-shift tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented that demonstrate how this method adapts to changing appearances of both tracked object and scene background. We note susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter, and develop an additional approach that seeks to minimize the likelihood of distraction.
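The ranking step described in this abstract can be illustrated with a small sketch. It assumes each candidate feature is summarized by two normalized histograms, p for object pixels and q for background pixels, and it uses one common formulation of the two-class variance ratio (variance of the log likelihood ratio under the pooled distribution divided by the sum of its variances under each class). The function names, the clamping constant delta, and the tiny stabilizer in the denominator are illustrative assumptions, not details taken from the abstract.

```python
import math


def log_likelihood_ratio(p, q, delta=1e-3):
    """Per-bin log likelihood ratio of object (p) vs. background (q) histograms.

    delta is an assumed small constant that keeps the ratio finite for empty bins.
    """
    return [math.log(max(pi, delta) / max(qi, delta)) for pi, qi in zip(p, q)]


def variance(values, weights):
    """Variance of `values` under the discrete distribution `weights`."""
    mean = sum(w * v for w, v in zip(weights, values))
    return sum(w * v * v for w, v in zip(weights, values)) - mean * mean


def variance_ratio(p, q, delta=1e-3):
    """Between-class spread of the log likelihood ratio over within-class spread."""
    llr = log_likelihood_ratio(p, q, delta)
    pooled = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    within = variance(llr, p) + variance(llr, q)
    # The 1e-12 term is an assumption that avoids division by zero.
    return variance(llr, pooled) / (within + 1e-12)


def rank_features(histogram_pairs):
    """Return feature names sorted from most to least discriminative."""
    scored = [(variance_ratio(p, q), name) for name, (p, q) in histogram_pairs.items()]
    return [name for _, name in sorted(scored, reverse=True)]


if __name__ == "__main__":
    candidates = {
        "hue": ([0.9, 0.1], [0.1, 0.9]),   # separates object from background
        "gray": ([0.5, 0.5], [0.5, 0.5]),  # identical distributions
    }
    print(rank_features(candidates))
```

A feature whose object and background histograms coincide scores zero, while a feature that places the two classes in different bins scores higher and is ranked first.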
Mean shift blob tracking through scale space
- in Proc. CVPR
Cited by 207 (3 self)
The mean-shift algorithm is an efficient technique for tracking 2D blobs through an image. Although the scale of the mean-shift kernel is a crucial parameter, there is presently no clean mechanism for choosing this scale or updating it while tracking blobs that are changing in size. In this paper, we adapt Lindeberg’s theory of feature scale selection based on local maxima of differential scale-space filters to the problem of selecting kernel scale for mean-shift blob tracking. We show that a difference of Gaussian (DOG) mean-shift kernel enables efficient tracking of blobs through scale space. Using this kernel requires generalizing the mean-shift algorithm to handle images that contain negative sample weights.
Automatic sign language analysis: A survey and the future beyond lexical meaning
- IN IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2005
Cited by 122 (1 self)
Research in automatic analysis of sign language has largely focused on recognizing the lexical (or citation) form of sign gestures as they appear in continuous signing, and developing algorithms that scale well to large vocabularies. However, successful recognition of lexical signs is not sufficient for a full understanding of sign language communication. Nonmanual signals and grammatical processes which result in systematic variations in sign appearance are integral aspects of this communication but have received comparatively little attention in the literature. In this survey, we examine data acquisition, feature extraction and classification methods employed for the analysis of sign language gestures. These are discussed with respect to issues such as modeling transitions between signs in continuous signing, modeling inflectional processes, signer independence, and adaptation. We further examine works that attempt to analyze nonmanual signals and discuss issues related to integrating these with (hand) sign gestures. We also discuss the overall progress toward a true test of sign recognition systems: dealing with natural signing by native signers. We suggest some future directions for this research and also point to contributions it can make to other fields of research. Web-based supplemental materials (appendices) which contain several illustrative examples and videos of signing can be found at www.computer.org/publications/dlib.
Fast multiple object tracking via a hierarchical particle filter
- In The IEEE International Conference on Computer Vision (ICCV)
, 2005
Cited by 92 (4 self)
A very efficient and robust visual object tracking algorithm based on the particle filter is presented. The method characterizes the tracked objects using color and edge orientation histogram features. While the use of more features and samples can improve the robustness, the computational load required by the particle filter increases. To accelerate the algorithm while retaining robustness we adopt several enhancements in the algorithm. The first is the use of integral images [34] for efficiently computing the color features and edge orientation histograms, which allows a large number of particles and a better description of the targets. Next, the observation likelihood based on multiple features is computed in a coarse-to-fine manner, which allows the computation to quickly focus on the more promising regions. Quasi-random sampling of the particles allows the filter to achieve a higher convergence rate. The resulting tracking algorithm maintains multiple hypotheses and offers robustness against clutter or short period occlusions. Experimental results demonstrate the efficiency and effectiveness of the algorithm for single and multiple object tracking.
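To make the integral-image step concrete, the following is a minimal sketch of how a histogram over any rectangular region can be read off in constant time per bin once one integral image per bin has been built. It assumes the image has already been quantized into integer bin labels; the function names and the one-integral-image-per-bin layout are illustrative assumptions rather than details from the abstract.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1.
    """
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def bin_integrals(labels, n_bins):
    """One integral image per histogram bin, built from a bin-label image."""
    return [
        integral_image([[1 if value == b else 0 for value in row] for row in labels])
        for b in range(n_bins)
    ]


def region_histogram(integrals, top, left, bottom, right):
    """Histogram of a rectangle; cost is independent of the rectangle's size."""
    return [box_sum(ii, top, left, bottom, right) for ii in integrals]


if __name__ == "__main__":
    labels = [
        [0, 1, 1],
        [2, 1, 0],
        [0, 0, 2],
    ]
    integrals = bin_integrals(labels, n_bins=3)
    print(region_histogram(integrals, 0, 1, 2, 3))  # top-right 2x2 block
```

Because each particle's histogram costs only four lookups per bin, the per-particle cost no longer grows with the size of the candidate region, which is what makes evaluating many particles affordable.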
Motion Segmentation and Pose Recognition with Motion History Gradients
- Machine Vision and Applications
, 2000
Cited by 76 (5 self)
This paper uses a simple method for representing motion in successively layered silhouettes that directly encode system time, termed the timed Motion History Image (tMHI). This representation can be used both to (a) determine the current pose of the object and to (b) segment and measure the motions induced by the object in a video scene. These segmented regions are not “motion blobs”, but instead motion regions naturally connected to the moving parts of the object of interest. This method may be used as a very general gesture recognition “toolbox”. We use it to recognize waving and overhead clapping motions to control a music synthesis program.
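The idea of layering silhouettes stamped with system time can be sketched as a per-pixel update rule. The version below is an assumption based on the usual description of the tMHI: pixels covered by the current silhouette receive the current timestamp, and pixels whose stored timestamp is older than a chosen duration are cleared. The function name, the list-of-lists image layout, and the duration parameter are illustrative.

```python
def update_tmhi(tmhi, silhouette, timestamp, duration):
    """Return a new timed motion history image.

    tmhi:       2D list of floats, last time each pixel was part of a silhouette
    silhouette: 2D list of 0/1 values for the current frame
    timestamp:  current system time (same units as the stored values)
    duration:   assumed maximum age of motion to keep
    """
    updated = []
    for history_row, silhouette_row in zip(tmhi, silhouette):
        new_row = []
        for last_seen, is_foreground in zip(history_row, silhouette_row):
            if is_foreground:
                new_row.append(timestamp)   # stamp the newest layer
            elif last_seen < timestamp - duration:
                new_row.append(0.0)         # motion too old, forget it
            else:
                new_row.append(last_seen)   # keep the recent layer
        updated.append(new_row)
    return updated


if __name__ == "__main__":
    history = [[0.0, 1.0, 2.5]]
    current = [[1, 0, 0]]
    print(update_tmhi(history, current, timestamp=3.0, duration=1.0))
```

Because the stored values are real timestamps and not frame counts, the spatial gradient of the resulting image points along the direction of recent motion regardless of frame rate.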
Efficient mean-shift tracking via a new similarity measure
- in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’05)
, 2005
Cited by 52 (4 self)
The mean shift algorithm has achieved considerable success in object tracking due to its simplicity and robustness. It finds local minima of a similarity measure between the color histograms or kernel density estimates of the model and target image. The most typically used similarity measures are the Bhattacharyya coefficient or the Kullback-Leibler divergence. In practice, these approaches face three difficulties. First, the spatial information of the target is lost when the color histogram is employed, which precludes the application of more elaborate motion models. Second, the classical similarity measures are not very discriminative. Third, the sample-based classical similarity measures require a calculation that is quadratic in the number of samples, making real-time performance difficult. To deal with these difficulties we propose a new, simple-to-compute and more discriminative similarity measure in spatial-feature spaces. The new similarity measure allows the mean shift algorithm to track more general motion models in an integrated way. To reduce the complexity of the computation to linear order we employ the recently proposed improved fast Gauss transform. This leads to a very efficient and robust nonparametric spatial-feature tracking algorithm. The algorithm is tested on several image sequences and shown to achieve robust and reliable frame-rate tracking.
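For reference, the two classical histogram measures named in this abstract can be sketched as follows. This illustrates only the baseline measures, not the new spatial-feature measure the paper proposes; the function names and the small eps guard are assumptions made for the example.

```python
import math


def normalize(hist):
    """Scale a histogram of counts so that its bins sum to one."""
    total = float(sum(hist))
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Overlap of two normalized histograms: 1 for identical, 0 for disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q); eps is an assumed guard for empty bins."""
    return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(p, q) if a > 0)


if __name__ == "__main__":
    model = normalize([8, 1, 1])
    candidate = normalize([6, 2, 2])
    print(bhattacharyya_coefficient(model, candidate))
    print(kl_divergence(model, candidate))
```

Both measures operate on bin counts alone, so two regions with the same colors in different spatial arrangements score identically, which is the loss of spatial information the abstract points out.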
Bi-modal emotion recognition from expressive face and body gestures
- Journal of Network and Computer Applications, 2007. 3784 Instruments, Measurement, Electronics and Information Engineering
Cited by 37 (2 self)
Psychological research findings suggest that humans rely on the combined visual channels of face and body more than any other channel when they make judgments about human communicative behavior. However, most of the existing systems attempting to analyze human nonverbal behavior are mono-modal and focus only on the face. Research that aims to integrate gestures as a means of expression has only recently emerged. Accordingly, this paper presents an approach to automatic visual recognition of expressive face and upper-body gestures from video sequences suitable for use in a vision-based affective multi-modal framework. Face and body movements are captured simultaneously using two separate cameras. For each video sequence, single expressive frames from both face and body are selected manually for analysis and recognition of emotions. Firstly, individual classifiers are trained from individual modalities. Secondly, we fuse facial expression and affective body gesture information at the feature and at the decision level. In the experiments performed, emotion classification using the two modalities achieved better recognition accuracy than classification using the facial or bodily modality alone.
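The difference between fusing at the feature level and at the decision level can be shown with a short sketch. The abstract does not say which combination rules were used, so the concatenation and the weighted-sum rule below are assumptions chosen for illustration, as are the function names and the face_weight parameter.

```python
def feature_level_fusion(face_features, body_features):
    """Join the two modality feature vectors before a single classifier sees them."""
    return list(face_features) + list(body_features)


def decision_level_fusion(face_probs, body_probs, face_weight=0.5):
    """Combine per-class scores from two separately trained classifiers.

    face_probs, body_probs: dicts mapping emotion label to a score in [0, 1]
    face_weight:            assumed weight of the face classifier in the sum
    Returns the winning label and the fused scores.
    """
    fused = {
        label: face_weight * face_probs[label] + (1.0 - face_weight) * body_probs[label]
        for label in face_probs
    }
    return max(fused, key=fused.get), fused


if __name__ == "__main__":
    print(feature_level_fusion([0.1, 0.7], [0.3]))
    face = {"anger": 0.6, "fear": 0.4}
    body = {"anger": 0.2, "fear": 0.8}
    print(decision_level_fusion(face, body))
```

Feature-level fusion lets one classifier learn interactions between face and body cues, while decision-level fusion keeps the classifiers independent and only merges their outputs.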
An Active Camera System for Acquiring Multi-View Video
- in Proc. International Conference on Image Processing
, 2002
Cited by 36 (1 self)
A system is described for acquiring multi-view video of a person moving through the environment. A real-time tracking algorithm adjusts the pan, tilt, zoom and focus parameters of multiple active cameras to keep the moving person centered in each view. The output of the system is a set of synchronized, time-stamped video streams, showing the person simultaneously from several viewpoints.
Object Tracking by Asymmetric Kernel Mean Shift with Automatic Scale and Orientation Selection
- in Proc. IEEE Conference on Computer Vision and Pattern Recognition
, 2007
Cited by 35 (0 self)
Tracking objects using the mean shift method is performed by iteratively translating a kernel in the image space such that the past and current object observations are similar. The traditional mean shift method requires a symmetric kernel, such as a circle or an ellipse, and assumes constancy of the object scale and orientation during the course of tracking. In a tracking scenario, it is not uncommon to observe objects with complex shapes whose scale and orientation constantly change due to the camera and object motions. In this paper, we present an object tracking method based on the asymmetric kernel mean shift, in which the scale and orientation of the kernel adaptively change depending on the observations at each iteration. The proposed method extends the traditional mean shift tracking, which is performed in the image coordinates, by including the scale and orientation as additional dimensions, and simultaneously estimates all the unknowns in a small number of mean shift iterations. The experimental results show that the proposed method is superior to the traditional mean shift tracking in the following aspects: 1) it provides consistent object tracking throughout the video; 2) it is not affected by the scale and orientation changes of the tracked objects; 3) it is less prone to background clutter.
Tracking using dynamic programming for appearance-based sign language recognition
- In IEEE Automatic Face and Gesture Recognition
, 2006
Cited by 32 (15 self)
We present a novel tracking algorithm that uses dynamic programming to determine the path of target objects and that is able to track an arbitrary number of different objects. The traceback method used to track the targets avoids taking possibly wrong local decisions and thus reconstructs the best tracking paths using the whole observation sequence. The tracking method can be compared to the nonlinear time alignment in automatic speech recognition (ASR) and it can analogously be integrated into a hidden Markov model based recognition process. We show how the method can be applied to the tracking of hands and the face for automatic sign language recognition.
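The contrast between per-frame decisions and a traceback over the whole sequence can be sketched with a simplified one-dimensional example. The code assumes each frame supplies a local score for every candidate position and that jumps between positions incur a linear penalty; the scoring function, the penalty, and all names are illustrative assumptions, not the paper's actual formulation.

```python
def track_by_dynamic_programming(scores, jump_penalty=1.0):
    """Return the position path that maximizes total score minus jump penalties.

    scores:       scores[t][x] is the local score of position x in frame t
    jump_penalty: assumed cost per unit of displacement between frames
    """
    n_positions = len(scores[0])
    best = [list(scores[0])]   # best[t][x]: best accumulated score ending at x
    backpointers = []          # backpointers[t-1][x]: best predecessor of x

    for t in range(1, len(scores)):
        row, pointers = [], []
        for x in range(n_positions):
            prev = max(
                range(n_positions),
                key=lambda xp: best[-1][xp] - jump_penalty * abs(x - xp),
            )
            row.append(scores[t][x] + best[-1][prev] - jump_penalty * abs(x - prev))
            pointers.append(prev)
        best.append(row)
        backpointers.append(pointers)

    # Traceback: decide only after the whole sequence has been scored.
    x = max(range(n_positions), key=lambda i: best[-1][i])
    path = [x]
    for pointers in reversed(backpointers):
        x = pointers[x]
        path.append(x)
    path.reverse()
    return path


if __name__ == "__main__":
    frames = [
        [0, 5, 0, 0, 0],
        [0, 4, 0, 0, 4.5],   # a distractor briefly scores highest at position 4
        [0, 5, 0, 0, 0],
    ]
    print(track_by_dynamic_programming(frames))
```

In this example a greedy per-frame maximum would jump to the distractor in the second frame, whereas the traceback keeps the target at position 1 because the full-sequence score of that path is higher.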