Results 11 - 20 of 472
Tracking groups of people
- Computer Vision and Image Understanding, 2000
Cited by 66 (6 self)
A computer vision system for tracking multiple people in relatively unconstrained environments is described. Tracking is performed at three levels of abstraction: regions, people and groups. A novel, adaptive background subtraction method that combines color and gradient information is used to cope with shadows and unreliable color cues. People are tracked through mutual occlusions as they form groups and separate from one another. Strong use is made of color information to disambiguate occlusions and to provide qualitative estimates of depth ordering and position during occlusion. Simple interactions with objects can also be detected. The system is tested using both indoor and outdoor sequences. It is robust and should provide a useful mechanism for bootstrapping and reinitialization of tracking using more specific but less robust human models. Key Words: background subtraction, groups of people, human activity, tracking.
Gait Analysis for Recognition and Classification
- 2002
Cited by 60 (0 self)
This paper describes a representation of gait appearance for the purpose of person identification and classification. This gait representation is based on simple features such as moments extracted from orthogonal view video silhouettes of human walking motion. Despite its simplicity, the resulting feature vector contains enough information to perform well on human identification and gender classification tasks. We explore the recognition behaviors of two different methods to aggregate features over time under different recognition tasks. We demonstrate the accuracy of recognition using gait video sequences collected over different days and times and under varying lighting environments. In addition, we show results for gender classification based on our gait appearance features using a support-vector machine.
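As a rough illustration of what "moments extracted from silhouettes" can mean, the sketch below computes the area, centroid and second central moments of a binary silhouette mask. It is a generic image-moment computation, not the paper's feature set; the function name `silhouette_moments` and the returned keys are assumptions made for this example.

```python
def silhouette_moments(mask):
    """Area, centroid and second central moments of a binary silhouette.

    `mask` is a list of rows; any truthy value marks a silhouette pixel.
    (Hypothetical helper for illustration, not taken from the paper.)
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    n = len(points)
    if n == 0:
        raise ValueError("empty silhouette")
    # First-order moments give the centroid.
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second-order central moments describe the spread and orientation.
    var_x = sum((x - cx) ** 2 for x, _ in points) / n
    var_y = sum((y - cy) ** 2 for _, y in points) / n
    cov_xy = sum((x - cx) * (y - cy) for x, y in points) / n
    return {"area": n, "centroid": (cx, cy),
            "var_x": var_x, "var_y": var_y, "cov_xy": cov_xy}
```

Computing such values per frame and aggregating them over a walking sequence would be one way to build a fixed-length feature vector for a classifier.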
A framework for high-level feedback to adaptive, per-pixel, mixture-of-gaussian background models
- 2002
Cited by 59 (3 self)
Time-adaptive, per-pixel mixtures of Gaussians (TAPPMOGs) have recently become a popular choice for robust modeling and removal of complex and changing backgrounds at the pixel level. However, TAPPMOG-based methods cannot easily be made to model dynamic backgrounds with highly complex appearance, or to adapt promptly to sudden "uninteresting" scene changes such as the re-positioning of a static object or the turning on of a light, without further undermining their ability to segment foreground objects, such as people, where they occlude the background for too long. To alleviate tradeoffs such as these, and, more broadly, to allow TAPPMOG segmentation results to be tailored to the specific needs of an application, we introduce a general framework for guiding pixel-level TAPPMOG evolution with feedback from "high-level" modules. Each such module can use pixel-wise maps of positive and negative feedback to attempt to impress upon the TAPPMOG some definition of foreground that is best expressed through "higher-level" primitives such as image region properties or semantics of objects and events. By pooling the foreground error corrections of many high-level modules into a shared, pixel-level TAPPMOG model in this way, we improve the quality of the foreground segmentation and the performance of all modules that make use of it. We show an example of using this framework with a TAPPMOG method and high-level modules that all rely on dense depth data from a stereo camera.
Tracking Multiple Vehicles using Foreground, Background and Motion Models
- Image and Vision Computing, 2001
Cited by 51 (13 self)
In this paper a vehicle tracking algorithm is presented based on the combination of a per-pixel background model (an extension of work by Stauffer and Grimson [12]) and a set of single-hypothesis foreground models based on a general model of object size, position, velocity, and colour distribution. Each pixel in the scene is thus 'explained' as background, as belonging to one of the foreground objects, or as noise. Calibrated ground-plane information is used within the foreground model to strengthen the object size and velocity consistency assumptions.
Bilayer segmentation of live video
- In: IEEE Conference on Computer Vision and Pattern Recognition, 2006
Cited by 48 (3 self)
[Figure 1: (a) input sequence; (b) automatic layer separation and background substitution in three different frames. An example of automatic foreground/background segmentation in monocular image sequences. Despite the challenging foreground motion the person is accurately extracted from the sequence and then composited free of aliasing upon a different background; a useful tool in video-conferencing applications. The sequences and ground truth data used throughout this paper are available from [1].]
This paper presents an algorithm capable of real-time separation of foreground from background in monocular video sequences. Automatic segmentation of layers from colour/contrast or from motion alone is known to be error-prone. Here motion, colour and contrast cues are probabilistically fused together with spatial and temporal priors to infer layers accurately and efficiently. Central to our algorithm is the fact that pixel velocities are not needed, thus removing the need for optical flow estimation, with its tendency to error and computational expense. Instead, an efficient motion vs nonmotion classifier is trained to operate directly and jointly on intensity-change and contrast. Its output is then fused with colour information. The prior on segmentation is represented by a second order, temporal, Hidden Markov Model, together with a spatial MRF favouring coherence except where contrast is high. Finally, accurate layer segmentation and explicit occlusion detection are efficiently achieved by binary graph cut. The segmentation accuracy of the proposed algorithm is quantitatively evaluated with respect to existing groundtruth data and found to be comparable to the accuracy of a state of the art stereo segmentation algorithm. Foreground/background segmentation is demonstrated in the application of live background substitution and shown to generate convincingly good quality composite video.
Fast Lighting Independent Background Subtraction
- International Journal of Computer Vision, 1998
Cited by 43 (7 self)
This paper describes a simple method of fast background subtraction based upon disparity verification that is invariant to arbitrarily rapid run-time changes in illumination. Using two or more cameras, the method requires the off-line construction of disparity fields mapping the primary background images to each of the additional auxiliary background images. At runtime, segmentation is performed by checking color intensity values at corresponding pixels. If more than two cameras are available, more robust segmentation can be achieved and, in particular, the occlusion shadows can be generally eliminated as well. Because the method only assumes fixed background geometry, the technique allows for illumination variation at runtime. Since no disparity search is performed, the algorithm is easily implemented in real-time on conventional hardware. Keywords: background subtraction, image segmentation, stereo, disparity warp.
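The run-time check described in this abstract can be sketched in a few lines. The example below is a minimal illustration under simplifying assumptions that are not in the source: grayscale images stored as lists of rows, a purely horizontal integer disparity field, one auxiliary camera, and a hypothetical function name `segment_foreground` with an arbitrary default threshold.

```python
def segment_foreground(primary, auxiliary, disparity, threshold=10):
    """Mark pixels whose primary and auxiliary intensities disagree.

    `disparity[y][x]` is the background disparity built off-line, so each
    primary pixel (x, y) is compared with auxiliary pixel (x - d, y).
    (Illustrative sketch; names and threshold are assumptions.)
    """
    height, width = len(primary), len(primary[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Look up the stored correspondence; no disparity search here.
            xa = x - disparity[y][x]
            if not 0 <= xa < width:
                continue  # mapping falls outside the auxiliary image
            if abs(primary[y][x] - auxiliary[y][xa]) > threshold:
                mask[y][x] = 1  # surfaces disagree: not background geometry
    return mask
```

Because both cameras see any lighting change at the same moment, a background pixel still matches its auxiliary counterpart after the illumination changes. With a single auxiliary view, a background pixel whose counterpart is covered by the object is also flagged, which appears to be the "occlusion shadow" effect that the abstract says additional cameras can remove.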
Plan-view Trajectory Estimation with Dense Stereo Background Models
- in Proceedings of the International Conference on Computer Vision, 2001
Cited by 40 (12 self)
In a known environment, objects may be tracked in multiple views using a set of background models. Stereo-based models can be illumination-invariant, but often have undefined values which inevitably lead to foreground classification errors. We derive dense stereo models for object tracking using long-term, extended dynamic-range imagery, and by detecting and interpolating uniform but unoccluded planar regions. Foreground points are detected quickly in new images using pruned disparity search. We adopt a "late-segmentation" strategy, using an integrated plan-view density representation. Foreground points are segmented into object regions only when a trajectory is finally estimated, using a dynamic programming-based method. Object entry and exit are optimally determined and are not restricted to special spatial zones.
From First Contact to Close Encounters: A Developmentally Deep Perceptual System for a Humanoid Robot
- 2003
Cited by 35 (6 self)
This thesis presents a perceptual system for a humanoid robot that integrates abilities such as object localization and recognition with the deeper developmental machinery required to forge those competences out of raw physical experiences. It shows that a robotic platform can build up and maintain a system for object localization, segmentation, and recognition, starting from very little. What the robot starts with is a direct solution to achieving figure/ground separation: it simply 'pokes around' in a region of visual ambiguity and watches what happens. If the arm passes through an area, that area is recognized as free space. If the arm collides with an object, causing it to move, the robot can use that motion to segment the object from the background. Once the robot can acquire reliable segmented views of objects, it learns from them, and from then on recognizes and segments those objects without further contact. Both low-level and high-level visual features can also be learned in this way, and examples are presented for both: orientation detection and affordance recognition, respectively.
Tracking Interacting People
- Proceedings. Fourth IEEE International Conference on Automatic Face and Gesture Recognition, 2000
Cited by 34 (1 self)
A computer vision system for tracking multiple people in relatively unconstrained environments is described. Tracking is performed at three levels of abstraction: regions, people and groups. A novel, adaptive background subtraction method that combines colour and gradient information is used to cope with shadows and unreliable colour cues. People are tracked through mutual occlusions as they form groups and part from one another. Strong use is made of colour information to disambiguate occlusions and to provide qualitative estimates of depth ordering and position during occlusion. Some simple interactions with objects can also be detected. The system is tested using indoor and outdoor sequences. It is robust and should provide a useful mechanism for boot-strapping and reinitialisation of tracking using more specific but less robust human models.
A Texture-Based Method for Modeling the Background and Detecting Moving Objects
- IEEE Trans. Pattern Anal. Machine Intell., 2006
Cited by 32 (1 self)
This paper presents a novel and efficient texture-based method for modeling the background and detecting moving objects from a video sequence. Each pixel is modeled as a group of adaptive local binary pattern histograms that are calculated over a circular region around the pixel. The approach provides us with many advantages compared to the state-of-the-art. Experimental results clearly justify our model. Index Terms: motion, texture, background subtraction, local binary pattern.
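To make the idea of a local binary pattern (LBP) histogram concrete, here is a minimal sketch of the basic 8-neighbour LBP operator and a histogram collected around a pixel. It simplifies the description above: it uses a square window instead of a circular region, omits the adaptive model update, and uses histogram intersection as the similarity measure, which is an assumption here. All function names are hypothetical.

```python
# Clockwise 8-neighbourhood, starting at the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, y, x):
    """8-bit LBP code: one bit per neighbour that is >= the centre pixel."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img, cy, cx, radius):
    """Histogram of LBP codes in a square window (simplified region)."""
    hist = [0] * 256
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Skip border pixels that lack a full 3x3 neighbourhood.
            if 1 <= y < len(img) - 1 and 1 <= x < len(img[0]) - 1:
                hist[lbp_code(img, y, x)] += 1
    return hist

def histogram_intersection(a, b):
    """Similarity in [0, 1]: shared histogram mass relative to `a`."""
    total = sum(a)
    if total == 0:
        return 0.0
    return sum(min(p, q) for p, q in zip(a, b)) / total
```

Since each code depends only on whether a neighbour is at least as bright as the centre, any strictly increasing change in gray level leaves the histogram unchanged, which is one reason texture-based models of this kind tolerate illumination changes.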

