Results 1 - 10 of 157
Real-Time Tracking of Non-Rigid Objects using Mean Shift
- IEEE CVPR, 2000
Cited by 424 (16 self)
A new method for real-time tracking of non-rigid objects seen from a moving camera is proposed. The central computational module is based on the mean shift iterations and finds the most probable target position in the current frame. The dissimilarity between the target model (its color distribution) and the target candidates is expressed by a metric derived from the Bhattacharyya coefficient. The theoretical analysis of the approach shows that it relates to the Bayesian framework while providing a practical, fast and efficient solution. The capability of the tracker to handle in real-time partial occlusions, significant clutter, and target scale variations, is demonstrated for several image sequences.
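As a rough illustration of the dissimilarity measure mentioned in this abstract, the sketch below computes the Bhattacharyya coefficient between two discrete color histograms and a distance of the form sqrt(1 - rho) derived from it. The histogram values, bin count, and function names are assumptions made for the example; the abstract does not specify the kernel weighting or the color quantization used in the paper.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Sum over bins of sqrt(p_u * q_u); equals 1 for identical distributions."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


def bhattacharyya_distance(p, q):
    """Distance derived from the coefficient: sqrt(1 - rho), clamped at 0."""
    rho = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - rho))


# Hypothetical 4-bin color histograms (illustrative values only).
target = normalize([8, 4, 2, 2])      # target model
candidate_a = normalize([4, 2, 1, 1]) # same color distribution, fewer pixels
candidate_b = normalize([0, 0, 5, 5]) # mostly different colors

dist_a = bhattacharyya_distance(target, candidate_a)
dist_b = bhattacharyya_distance(target, candidate_b)
```

In a tracker of this kind, the candidate region with the smallest distance to the target model would be preferred; here candidate_a scores closer to the target than candidate_b.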
Implicit Probabilistic Models of Human Motion for Synthesis and Tracking (Hedvig Sidenbladh)
- In European Conference on Computer Vision, 2002
Cited by 131 (3 self)
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution. These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set.
Hierarchical Bayesian Inference in the Visual Cortex
- 2002
Cited by 106 (0 self)
In this paper, we propose a Bayesian theory of hierarchical cortical computation based both on (a) the mathematical and computational ideas of computer vision and pattern theory and on (b) recent neurophysiological experimental evidence. We [1, 2] have proposed that Grenander's pattern theory [3] could potentially model the brain as a generative model in such a way that feedback serves to disambiguate and 'explain away' the earlier representation. The Helmholtz machine [4, 5] was an excellent step towards approximating this proposal, with feedback implementing priors. Its development, however, was rather limited, dealing only with binary images. Moreover, its feedback mechanisms were engaged only during the learning of the feedforward connections but not during perceptual inference, though the Gibbs sampling process for inference can potentially be interpreted as top-down feedback disambiguating low level representations. Rao and Ballard's predictive coding/Kalman filter model [6] did integrate generative feedback in the perceptual inference process, but it was primarily a linear model and thus severely limited in practical utility. The data-driven Markov Chain Monte Carlo approach of Zhu and colleagues [7, 8] might be the most successful recent application of this proposal in solving real and difficult computer vision problems using generative models, though its connection to the visual cortex has not been explored. Here, we bring in a powerful and widely applicable paradigm from artificial intelligence and computer vision to propose some new ideas about the algorithms of visual cortical processing and the nature of representations in the visual cortex. We will review some of our and others' neurophysiological experimental data to lend support to these ideas.
Data Fusion for Visual Tracking with Particles
- Proceedings of the IEEE, 2004
Cited by 91 (2 self)
In this paper we present a particle filter-based visual tracker that fuses three cues in a novel way: color, motion, and sound (Fig. 1). More specifically, we will introduce color as the main visual cue and fuse it, depending on the scenario under consideration, with either sound localization cues or motion activity cues. The generic objective is to track a specified object or region of interest in the sequence of images captured by the camera. We employ weak object models so as not to be too restrictive about the types of objects the algorithm can track, and to achieve robustness to large variations in the object pose, illumination, motion, etc. In this generic context, contour cues are less appropriate than color cues to characterize the visual appearance of tracked entities. The use of edge-based cues indeed requires that the class of objects to be tracked is known a priori and that rather precise silhouette models can be learned beforehand. Note however that such conditions are met in a number of tracking applications where shape cues are routinely used [2], [3], [25], [30], [40], [44], [53].
People Tracking Using Hybrid Monte Carlo Filtering
- 2001
Cited by 86 (5 self)
Particle filters are used for hidden state estimation with nonlinear dynamical systems. The inference of 3-d human motion is a natural application, given the nonlinear dynamics of the body and the nonlinear relation between states and image observations. However, the application of particle filters has been limited to cases where the number of state variables is relatively small, because the number of samples needed with high dimensional problems can be prohibitive. We describe a filter that uses hybrid Monte Carlo (HMC) to obtain samples in high dimensional spaces. It uses multiple Markov chains that use posterior gradients to rapidly explore the state space, yielding fair samples from the posterior. We find that the HMC filter is several thousand times faster than a conventional particle filter on a 28D people tracking problem.
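To make the idea of using posterior gradients to explore the state space more concrete, here is a minimal sketch of a single-chain hybrid (Hamiltonian) Monte Carlo sampler on a one-dimensional toy target. It is not the multi-chain HMC filter described in the abstract: the Gaussian target, the step size, the number of leapfrog steps, and all function names are assumptions chosen only for illustration.

```python
import math
import random


def hmc_step(x, log_prob, grad_log_prob, step_size, n_leapfrog, rng):
    """One HMC transition for a scalar state x; returns (new_x, accepted)."""
    p_start = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
    x_new, p = x, p_start

    # Leapfrog integration: half momentum step, alternating full steps,
    # then a final half momentum step.
    p += 0.5 * step_size * grad_log_prob(x_new)
    for i in range(n_leapfrog):
        x_new += step_size * p
        if i < n_leapfrog - 1:
            p += step_size * grad_log_prob(x_new)
    p += 0.5 * step_size * grad_log_prob(x_new)

    # Metropolis test on the change in total energy H = -log_prob + p^2 / 2.
    h_start = -log_prob(x) + 0.5 * p_start * p_start
    h_end = -log_prob(x_new) + 0.5 * p * p
    if rng.random() < math.exp(min(0.0, h_start - h_end)):
        return x_new, True
    return x, False


def log_prob(x):
    """Toy target (assumption): unnormalized log density of N(3, 1)."""
    return -0.5 * (x - 3.0) ** 2


def grad_log_prob(x):
    """Gradient of the toy log density with respect to x."""
    return -(x - 3.0)


rng = random.Random(0)
x = 0.0
samples = []
for _ in range(2000):
    x, _accepted = hmc_step(x, log_prob, grad_log_prob, 0.3, 5, rng)
    samples.append(x)

kept = samples[200:]  # discard a short burn-in
sample_mean = sum(kept) / len(kept)
```

The gradient steers each proposal toward high-probability regions, which is the property the abstract relies on in high dimensions; a realistic tracker would use a vector state and an image-based posterior instead of this scalar Gaussian.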
Capturing Natural Hand Articulation
- In ICCV, 2001
Cited by 79 (10 self)
Vision-based motion capturing of hand articulation is a challenging task, since the hand presents a motion of high degrees of freedom. Model-based approaches could be taken to approach this problem by searching in a high dimensional hand state space, and matching projections of a hand model and image observations. However, it is highly inefficient due to the curse of dimensionality. Fortunately, natural hand articulation is highly constrained, which largely reduces the dimensionality of hand state space. This paper presents a model-based method to capture hand articulation by learning natural hand constraints. Our study shows that natural hand articulation lies in a lower dimensional configuration space characterized by a union of linear manifolds spanned by a set of basis configurations. By integrating hand motion constraints, an efficient articulated motion-capturing algorithm is proposed based on sequential Monte Carlo techniques. Our experiments show that this algorithm is robust and accurate for tracking natural hand movements. This algorithm is easy to extend to other articulated motion capturing tasks.
Better proposal distributions: Object tracking using unscented particle filter
- 2001
Cited by 60 (2 self)
Tracking objects involves the modeling of non-linear non-Gaussian systems. On one hand, variants of Kalman filters are limited by their Gaussian assumptions. On the other hand, conventional particle filters, e.g., CONDENSATION, use the transition prior as the proposal distribution. The transition prior does not take into account current observation data, and many particles can therefore be wasted in low-likelihood areas. To overcome these difficulties, the unscented particle filter (UPF) has recently been proposed in the field of filtering theory. In this paper, we introduce the UPF framework into audio and visual tracking. The UPF uses the unscented Kalman filter to generate sophisticated proposal distributions that seamlessly integrate the current observation, thus greatly improving the tracking performance. To evaluate the efficacy of the UPF framework, we apply it in two real-world tracking applications. One is the audio-based speaker localization, and the other is the vision-based human tracking. The experimental results are compared against those of the widely used CONDENSATION approach and have demonstrated superior tracking performance.
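The limitation this abstract describes, proposing from the transition prior without looking at the current observation, can be seen in a minimal bootstrap particle filter. The sketch below is an assumed one-dimensional toy model (random-walk dynamics, Gaussian observation noise, hypothetical parameter values), not the UPF and not the CONDENSATION implementation from the paper. It also reports the effective sample size, which drops when many particles land in low-likelihood regions.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bootstrap_step(particles, observation, process_std, obs_std, rng):
    """One predict/weight/resample cycle; returns (particles, effective size)."""
    # 1. Propose from the transition prior (random walk); the observation
    #    is not used here, which is the weakness the abstract points out.
    proposed = [x + rng.gauss(0.0, process_std) for x in particles]

    # 2. Weight each proposed particle by the observation likelihood.
    weights = [gaussian_pdf(observation, x, obs_std) for x in proposed]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(proposed)] * len(proposed)
    else:
        weights = [w / total for w in weights]

    # 3. Effective sample size: near N when weights are even, near 1 when
    #    almost all particles were wasted in low-likelihood areas.
    ess = 1.0 / sum(w * w for w in weights)

    # 4. Multinomial resampling in proportion to the weights.
    resampled = rng.choices(proposed, weights=weights, k=len(proposed))
    return resampled, ess


rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(500)]
for obs in [0.5, 1.0, 1.5, 2.0]:  # hypothetical observations of a drifting target
    particles, ess = bootstrap_step(particles, obs, 0.5, 0.5, rng)

estimate = sum(particles) / len(particles)
```

A proposal that incorporates the current observation, as the UPF does through the unscented Kalman filter, would aim to keep the effective sample size higher for the same number of particles.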
Learning Image Statistics for Bayesian Tracking
- In IEEE International Conference on Computer Vision, 2001
Cited by 58 (6 self)
This paper describes a framework for learning probabilistic models of objects and scenes and for exploiting these models for tracking complex, deformable, or articulated objects in image sequences. We focus on the probabilistic tracking of people and learn models of how they appear and move in images. In particular, we learn the likelihood of observing various spatial and temporal filter responses corresponding to edges, ridges, and motion differences given a model of the person. Similarly, we learn probability distributions over filter responses for general scenes that define a likelihood of observing the filter responses for arbitrary backgrounds. We then derive a probabilistic model for tracking that exploits the ratio between the likelihood that image pixels corresponding to the foreground (person) were generated by an actual person or by some unknown background. The paper extends previous work on learning image statistics and combines it with Bayesian tracking using particle filtering. By combining multiple image cues, and by using learned likelihood models, we demonstrate improved robustness and accuracy when tracking complex objects such as people in monocular image sequences with cluttered scenes and a moving camera.
Learning and classification of complex dynamics
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
Cited by 55 (1 self)
Standard, exact techniques based on likelihood maximization are available for learning Auto-Regressive Process models of dynamical processes. The uncertainty of observations obtained from real sensors means that dynamics can be observed only approximately. Learning can still be achieved via "EM-K": Expectation-Maximization (EM) based on Kalman Filtering. This cannot handle more complex dynamics, however, involving multiple classes of motion. A problem arises also in the case of dynamical processes observed visually: background clutter arising for example, in camouflage, produces non-Gaussian observation noise. Even with a single dynamical class, non-Gaussian observations put the learning problem beyond the scope of EM-K. For those cases, we show here how "EM-C," based on the CONDENSATION algorithm which propagates random "particle-sets," can solve the learning problem. Here, learning in clutter is studied experimentally using visual observations of a hand moving over a desktop. The resulting learned dynamical model is shown to have considerable predictive value: When used as a prior for estimation of motion, the burden of computation in visual observation is significantly reduced. Multiclass dynamics are studied via visually observed juggling; plausible dynamical models have been found to emerge from the learning process, and accurate classification of motion has resulted. In practice, EM-C learning is computationally burdensome and the paper concludes with some discussion of computational complexity. Index Terms: Computer vision, learning dynamics, Auto-Regressive Process, Expectation Maximization.

