Results 1 - 10
of
63
CONDENSATION - conditional density propagation for visual tracking
- International Journal of Computer Vision
, 1998
"... The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously applied to the ..."
Abstract
-
Cited by 911 (12 self)
- Add to MetaCart
The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set. Condensation uses learned dynamical models, together with visual observations, to propagate the random set over time. The result is highly robust tracking of agile motion. Notwithstanding the use of stochastic methods, the algorithm runs in near real-time. Contents 1 Tracking curves in clutter 2 2 Discrete-time propagation of state density 3 3 Factored sampling 6 4 The Condensation algorithm 8 5 Stochastic dynamical models for curve motion 10 6 Observation model 13 7 Applying the Condensation algorithm to video-streams 17 8 Conclusions 26 A Non-line...
Contour Tracking By Stochastic Propagation of Conditional Density
, 1996
"... . In Proc. European Conf. Computer Vision, 1996, pp. 343--356, Cambridge, UK The problem of tracking curves in dense visual clutter is a challenging one. Trackers based on Kalman filters are of limited use; because they are based on Gaussian densities which are unimodal, they cannot represent s ..."
Abstract
-
Cited by 488 (23 self)
- Add to MetaCart
. In Proc. European Conf. Computer Vision, 1996, pp. 343--356, Cambridge, UK The problem of tracking curves in dense visual clutter is a challenging one. Trackers based on Kalman filters are of limited use; because they are based on Gaussian densities which are unimodal, they cannot represent simultaneous alternative hypotheses. Extensions to the Kalman filter to handle multiple data associations work satisfactorily in the simple case of point targets, but do not extend naturally to continuous curves. A new, stochastic algorithm is proposed here, the Condensation algorithm --- Conditional Density Propagation over time. It uses `factored sampling', a method previously applied to interpretation of static images, in which the distribution of possible interpretations is represented by a randomly generated set of representatives. The Condensation algorithm combines factored sampling with learned dynamical models to propagate an entire probability distribution for object pos...
Tracking People with Twists and Exponential Maps
, 1998
"... This paper demonstrates a new visual motion estimation technique that is able to recover high degree-of-freedom articulated human body configurations in complex video sequences. We introduce the use of a novel mathematical technique, the product of exponential maps and twist motions, and its integra ..."
Abstract
-
Cited by 327 (2 self)
- Add to MetaCart
This paper demonstrates a new visual motion estimation technique that is able to recover high degree-of-freedom articulated human body configurations in complex video sequences. We introduce the use of a novel mathematical technique, the product of exponential maps and twist motions, and its integration into a differential motion estimation. This results in solving simple linear systems, and enables us to recover robustly the kinematic degrees-offreedom in noise and complex self occluded configurations. We demonstrate this on several image sequences of people doing articulated full body movements, and visualize the results in re-animating an artificial 3D human model. We are also able to recover and re-animate the famous movements of Eadweard Muybridge's motion studies from the last century. To the best of our knowledge, this is the first computer vision based system that is able to process such challenging footage and recover complex motions with such high accuracy.
Learning and Recognizing Human Dynamics in Video Sequences
, 1997
"... This paper describes a probabilistic decomposition of human dynamics at multiple abstractions, and shows how to propagate hypotheses across space, time, and abstraction levels. Recognition in this framework is the succession of very general low level grouping mechanisms to increased specific and lea ..."
Abstract
-
Cited by 240 (2 self)
- Add to MetaCart
This paper describes a probabilistic decomposition of human dynamics at multiple abstractions, and shows how to propagate hypotheses across space, time, and abstraction levels. Recognition in this framework is the succession of very general low level grouping mechanisms to increased specific and learned model based grouping techniques at higher levels. Hard decision thresholds are delayed and resolved by higher level statistical models and temporal context. Low-level primitives are areas of coherent motion found by EM clustering, mid-level categories are simple movements represented by dynamical systems, and highlevel complex gestures are represented by Hidden Markov Models as successive phases of simple movements. We show how such a representation can be learned from training data, and apply it to the example of human gait recognition. 1 Introduction This paper addresses the problem of learning and recognizing human and other biological movements in video sequences of an unconstrai...
Icondensation: Unifying low-level and high-level tracking in a stochastic framework
, 1998
"... . Tracking research has diverged into two camps; low-level approaches which are typically fast and robust but provide little fine-scale information, and high-level approaches which track complex deformations in high-dimensional spaces but must trade off speed against robustness. Real-time high-level ..."
Abstract
-
Cited by 219 (13 self)
- Add to MetaCart
. Tracking research has diverged into two camps; low-level approaches which are typically fast and robust but provide little fine-scale information, and high-level approaches which track complex deformations in high-dimensional spaces but must trade off speed against robustness. Real-time high-level systems perform poorly in clutter and initialisation for most high-level systems is either performed manually or by a separate module. This paper presents a new technique to combine low- and high-level information in a consistent probabilistic framework, using the statistical technique of importance sampling combined with the Condensation algorithm. The general framework, which we term Icondensation, is described, and a hand tracker is demonstrated which combines colour blob-tracking with a contour model. The resulting tracker is robust to rapid motion, heavy clutter and hand-coloured distractors, and re-initialises automatically. The system runs comfortably in real time on an...
Recovering non-rigid 3D shape from image streams
- Conf. on Computer Vision and Pattern Recognition
, 2000
"... This paper addresses the problem of recovering 3D nonrigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full face and its internal modes of variation. Many solutions that recover 3D shape from 2D ..."
Abstract
-
Cited by 148 (6 self)
- Add to MetaCart
This paper addresses the problem of recovering 3D nonrigid shape models from image sequences. For example, given a video recording of a talking person, we would like to estimate a 3D model of the lips and the full face and its internal modes of variation. Many solutions that recover 3D shape from 2D image sequences have been proposed; these so-called structure-from-motion techniques usually assume that the 3D object is rigid. For example Tomasi and Kanade’s factorization technique is based on a rigid shape matrix, which produces a tracking matrix of rank 3 under orthographic projection. We propose a novel technique based on a non-rigid model, where the 3D shape in each frame is a linear combination of a set of basis shapes. Under this model, the tracking matrix is of higher rank, and can be factored in a three step process to yield to pose, configuration and shape. We demonstrate this simple but effective algorithm on video sequences of people and animals. We were able to recover 3D non-rigid facial models with high accuracy. 1
A mixed-state Condensation tracker with automatic model-switching
, 1998
"... There is considerable interest in the computer vision community in representing and modelling motion. Motion models are used as predictors to increase the robustness and accuracy of visual trackers, and as classifiers for gesture recognition. This paper presents a significant development of random s ..."
Abstract
-
Cited by 135 (10 self)
- Add to MetaCart
There is considerable interest in the computer vision community in representing and modelling motion. Motion models are used as predictors to increase the robustness and accuracy of visual trackers, and as classifiers for gesture recognition. This paper presents a significant development of random sampling methods to allow automatic switching between multiple motion models as a natural extension of the tracking process. The Bayesian mixed-state framework is described in its generality, and the example of a bouncing ball is used to demonstrate that a mixed-state model can significantly improve tracking performance in heavy clutter. The relevance of the approach to the problem of gesture recognition is then investigated using a tracker which is able to follow the natural drawing action of a hand holding a pen, and switches state according to the hand's motion. 1 Introduction There is considerable interest in the computer vision community in representing and modelling motion [1, 3, 4]. ...
Tracking and Modeling Non-Rigid Objects with Rank Constraints
, 2001
"... This paper presents a novel solution for flow-based tracking and 3D reconstruction of deforming objects in monocular image sequences. A non-rigid 3D object undergoing rotation and deformation can be effectively approximated using a linear combination of 3D basis shapes. This puts a bound on the rank ..."
Abstract
-
Cited by 104 (6 self)
- Add to MetaCart
This paper presents a novel solution for flow-based tracking and 3D reconstruction of deforming objects in monocular image sequences. A non-rigid 3D object undergoing rotation and deformation can be effectively approximated using a linear combination of 3D basis shapes. This puts a bound on the rank of the tracking matrix. The rank constraint is used to achieve robust and precise low-level optical flow estimation without prior knowledge of the 3D shape of the object. The bound on the rank is also exploited to handle occlusion at the tracking level leading to the possibility of recovering the complete trajectories of occluded/disoccluded points. Following the same lowrank principle, the resulting flow matrix can be factored to get the 3D pose, configuration coefficients, and 3D basis shapes. The flow matrix is factored in an iterative manner, looping between solving for pose, configuration, and basis shapes. The flow-based tracking is applied to several video sequences and provides the input to the 3D non-rigid reconstruction task. Additional results on synthetic data and comparisons to ground truth complete the experiments.
A Probabilistic Framework for Matching Temporal Trajectories: Condensation-Based . . .
, 1998
"... The recognition of human gestures and facial expressions in image sequences is an important and challenging problem that enables a host of human-computer interaction applications. This paper describes a framework for incremental recognition of human motion that extends the "Condensation" algorit ..."
Abstract
-
Cited by 96 (4 self)
- Add to MetaCart
The recognition of human gestures and facial expressions in image sequences is an important and challenging problem that enables a host of human-computer interaction applications. This paper describes a framework for incremental recognition of human motion that extends the "Condensation" algorithm proposed by Isard and Blake (ECCV'96). Humnan motions are
Probabilistic Data Association Methods for Tracking Multiple and Compound Visual Objects
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2000
"... We describe a framework that explicitly reasons about data association to improve tracking performance in many difficult visual environments. A hierarchy of tracking strategies results from ascribing ambiguous or missing data to: (1) noise-like visual occurrences; (2) persistent, known scene element ..."
Abstract
-
Cited by 89 (2 self)
- Add to MetaCart
We describe a framework that explicitly reasons about data association to improve tracking performance in many difficult visual environments. A hierarchy of tracking strategies results from ascribing ambiguous or missing data to: (1) noise-like visual occurrences; (2) persistent, known scene elements (i.e. other tracked objects); or (3) persistent, unknown scene elements. First, we introduce a randomized tracking algorithm adapted from an existing probabilistic data association filter (PDAF) that is resistant to clutter and follows agile motion. The algorithm is applied to three different tracking modalities -- homogeneous regions, textured regions, and snakes -- and extensibly defined for straightforward inclusion of other methods. Second, we add the capacity to track multiple objects by adapting to vision a joint PDAF which oversees correspondence choices between same-modality trackers and image features. We then derive a related technique that allows mixed tracker modalities and handles object...

