Real-Time Tracking of Non-Rigid Objects using Mean Shift
 IEEE CVPR 2000
, 2000
Cited by 581 (18 self)
A new method for real-time tracking of non-rigid objects seen from a moving camera is proposed. The central computational module is based on the mean shift iterations and finds the most probable target position in the current frame. The dissimilarity between the target model (its color distribution) and the target candidates is expressed by a metric derived from the Bhattacharyya coefficient. The theoretical analysis of the approach shows that it relates to the Bayesian framework while providing a practical, fast and efficient solution. The capability of the tracker to handle partial occlusions, significant clutter, and target scale variations in real time is demonstrated for several image sequences.
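The dissimilarity measure mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the color distributions are available as normalized histograms and uses one commonly cited form of the metric, d = sqrt(1 - rho), where rho is the Bhattacharyya coefficient. The function names and toy histograms are illustrative assumptions, not the paper's implementation, and the mean shift localization step itself is not shown.

```python
import math


def normalize(counts):
    """Turn raw bin counts into a discrete probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [c / total for c in counts]


def bhattacharyya_coefficient(p, q):
    """Sum of sqrt(p_u * q_u) over bins; close to 1.0 for identical distributions."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))


def bhattacharyya_distance(p, q):
    """Distance derived from the coefficient: sqrt(1 - rho), clamped at zero."""
    rho = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - rho))


# Toy 3-bin color histograms (hypothetical data).
target = normalize([8, 1, 1])
similar_candidate = normalize([16, 2, 2])    # same shape, different pixel count
different_candidate = normalize([1, 1, 8])   # mass in a different bin

d_similar = bhattacharyya_distance(target, similar_candidate)
d_different = bhattacharyya_distance(target, different_candidate)
```

A tracker built on this measure would prefer the candidate with the smaller distance, here `similar_candidate`.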
Hierarchical Bayesian Inference in the Visual Cortex
, 2002
Cited by 174 (0 self)
In this paper, we propose a Bayesian theory of hierarchical cortical computation based both on (a) the mathematical and computational ideas of computer vision and pattern theory and on (b) recent neurophysiological experimental evidence. We [2] have proposed that Grenander's pattern theory [3] could potentially model the brain as a generative model in such a way that feedback serves to disambiguate and "explain away" the earlier representation. The Helmholtz machine [4, 5] was an excellent step towards approximating this proposal, with feedback implementing priors. Its development, however, was rather limited, dealing only with binary images. Moreover, its feedback mechanisms were engaged only during the learning of the feedforward connections but not during perceptual inference, though the Gibbs sampling process for inference can potentially be interpreted as top-down feedback disambiguating low-level representations. Rao and Ballard's predictive coding/Kalman filter model [6] did integrate generative feedback in the perceptual inference process, but it was primarily a linear model and thus severely limited in practical utility. The data-driven Markov Chain Monte Carlo approach of Zhu and colleagues [7, 8] might be the most successful recent application of this proposal in solving real and difficult computer vision problems using generative models, though its connection to the visual cortex has not been explored. Here, we bring in a powerful and widely applicable paradigm from artificial intelligence and computer vision to propose some new ideas about the algorithms of visual cortical processing and the nature of representations in the visual cortex. We will review some of our and others' neurophysiological experimental data to lend support to these ideas.
Implicit Probabilistic Models of Human Motion for Synthesis and Tracking
Hedvig Sidenblen
 In European Conference on Computer Vision
, 2002
Cited by 165 (4 self)
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high-dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution. These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set.
Visual Tracking and Recognition Using Appearance-Adaptive Models in Particle Filters
 IEEE Transactions on Image Processing
, 2004
Cited by 129 (12 self)
We present an approach that incorporates appearance-adaptive models in a particle filter to realize robust visual tracking and recognition algorithms. Tracking requires modeling inter-frame motion and appearance changes, whereas recognition requires modeling appearance changes between frames and gallery images. In conventional tracking algorithms, the appearance model is either fixed or rapidly changing, and the motion model is simply a random walk with fixed noise variance. Also, the number of particles is typically fixed. All these factors make the visual tracker unstable. To stabilize the tracker, we propose the following modifications: an observation model arising from an adaptive appearance model, an adaptive-velocity motion model with adaptive noise variance, and an adaptive number of particles. The adaptive-velocity model is derived using a first-order linear predictor based on the appearance difference between the incoming observation and the previous particle configuration. Occlusion analysis is implemented using robust statistics. Experimental results on tracking visual objects in long outdoor and indoor video sequences demonstrate the effectiveness and robustness of our tracking algorithm. We then perform simultaneous tracking and recognition by embedding them in a particle filter. For recognition purposes, we model the appearance changes between frames and gallery images by constructing the intra- and extra-personal spaces. Accurate recognition is achieved when confronted by pose and view variations.
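For orientation, the conventional baseline that this abstract contrasts itself with (a random-walk motion model with fixed noise variance and a fixed number of particles) can be sketched as a one-dimensional bootstrap particle filter. This is a toy under stated assumptions, not the authors' adaptive tracker: the scalar state, the Gaussian observation likelihood, and all function names and parameter values are hypothetical.

```python
import math
import random


def gaussian_likelihood(observation, state, obs_std):
    """Unnormalized Gaussian likelihood of an observation given a state."""
    z = (observation - state) / obs_std
    return math.exp(-0.5 * z * z)


def bootstrap_step(particles, observation, motion_std, obs_std, rng):
    """One predict / weight / resample cycle with a fixed particle count."""
    # Predict: random walk with fixed noise variance.
    predicted = [x + rng.gauss(0.0, motion_std) for x in particles]
    # Weight: score each predicted particle against the observation.
    weights = [gaussian_likelihood(observation, x, obs_std) for x in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw the same number of particles in proportion to weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track(observations, n_particles=500, motion_std=0.5, obs_std=1.0, seed=0):
    """Return the posterior-mean estimate after each observation."""
    rng = random.Random(seed)
    particles = [observations[0] + rng.gauss(0.0, obs_std)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        particles = bootstrap_step(particles, z, motion_std, obs_std, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates


# Hypothetical target drifting slowly to the right, observed without noise.
observations = [0.1 * k for k in range(30)]
estimates = track(observations)
```

Because the motion noise and particle count never change, this baseline cannot react to faster motion or harder frames, which is the limitation the adaptive modifications listed in the abstract are meant to address.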
Data fusion for visual tracking with particles
 Proc. IEEE
, 2004
Cited by 128 (2 self)
The effectiveness of probabilistic tracking of objects in image sequences has been revolutionized by the development of particle filtering. Whereas Kalman filters are restricted to Gaussian distributions, particle filters can propagate more general distributions, albeit only approximately. This is of particular benefit in visual tracking because of the inherent ambiguity of the visual world that stems from its richness and complexity. One important advantage of the particle filtering framework is that it allows the information from different measurement sources to be fused in a principled manner. Although this fact has been acknowledged before, it has not been fully exploited within a visual tracking context. Here we introduce generic importance sampling mechanisms for data fusion and discuss them for fusing color with either stereo sound, for teleconferencing, or with motion, for surveillance with a still camera. We show how each of the three cues can be modeled by an appropriate data likelihood function, and how the intermittent cues (sound or motion) are best handled by generating proposal distributions from their likelihood functions. Finally, the effective fusion of the cues by particle filtering is demonstrated on real teleconference and surveillance data. Index Terms: visual tracking, data fusion, particle filters, sound, color, motion.
People Tracking Using Hybrid Monte Carlo Filtering
, 2001
Cited by 97 (6 self)
Particle filters are used for hidden state estimation with nonlinear dynamical systems. The inference of 3D human motion is a natural application, given the nonlinear dynamics of the body and the nonlinear relation between states and image observations. However, the application of particle filters has been limited to cases where the number of state variables is relatively small, because the number of samples needed with high-dimensional problems can be prohibitive. We describe a filter that uses hybrid Monte Carlo (HMC) to obtain samples in high-dimensional spaces. It uses multiple Markov chains that use posterior gradients to rapidly explore the state space, yielding fair samples from the posterior. We find that the HMC filter is several thousand times faster than a conventional particle filter on a 28D people tracking problem.
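The idea of using posterior gradients to move a Markov chain can be illustrated with a minimal hybrid (Hamiltonian) Monte Carlo sampler. The sketch below runs a single chain on a one-dimensional Gaussian target and is not the multi-chain filter described in the abstract; the target density, step size, trajectory length, and function names are all assumptions chosen for illustration.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step=0.2, n_leapfrog=10, seed=0):
    """Draw samples from exp(log_prob) with leapfrog-integrated HMC."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)          # fresh momentum for each trajectory
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, full steps, half step.
        p_new += 0.5 * step * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i < n_leapfrog - 1:
                p_new += step * grad_log_prob(x_new)
        p_new += 0.5 * step * grad_log_prob(x_new)
        # Metropolis correction based on the change in total energy.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Hypothetical target: a unit-variance Gaussian posterior centered at 3.
def log_posterior(x):
    return -0.5 * (x - 3.0) ** 2


def grad_log_posterior(x):
    return -(x - 3.0)


samples = hmc_sample(log_posterior, grad_log_posterior, x0=0.0, n_samples=2000)
kept = samples[500:]                     # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The gradient steers each trajectory toward high-probability regions, which is what lets this family of samplers scale to state spaces where blind random-walk proposals would mostly be rejected.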
Capturing Natural Hand Articulation
 In ICCV
, 2001
Cited by 90 (10 self)
Vision-based motion capturing of hand articulation is a challenging task, since the hand presents a motion of high degrees of freedom. Model-based approaches could be taken to approach this problem by searching in a high-dimensional hand state space, and matching projections of a hand model and image observations. However, it is highly inefficient due to the curse of dimensionality. Fortunately, natural hand articulation is highly constrained, which largely reduces the dimensionality of hand state space. This paper presents a model-based method to capture hand articulation by learning natural hand constraints. Our study shows that natural hand articulation lies in a lower-dimensional configuration space characterized by a union of linear manifolds spanned by a set of basis configurations. By integrating hand motion constraints, an efficient articulated motion-capturing algorithm is proposed based on sequential Monte Carlo techniques. Our experiments show that this algorithm is robust and accurate for tracking natural hand movements. This algorithm is easy to extend to other articulated motion-capturing tasks.
Learning and classification of complex dynamics
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2000
Cited by 78 (1 self)
Standard, exact techniques based on likelihood maximization are available for learning Auto-Regressive Process models of dynamical processes. The uncertainty of observations obtained from real sensors means that dynamics can be observed only approximately. Learning can still be achieved via "EMK", Expectation-Maximization (EM) based on Kalman Filtering. This cannot handle more complex dynamics, however, involving multiple classes of motion. A problem arises also in the case of dynamical processes observed visually: background clutter arising, for example, in camouflage, produces non-Gaussian observation noise. Even with a single dynamical class, non-Gaussian observations put the learning problem beyond the scope of EMK. For those cases, we show here how "EMC", based on the CONDENSATION algorithm, which propagates random "particle-sets," can solve the learning problem. Here, learning in clutter is studied experimentally using visual observations of a hand moving over a desktop. The resulting learned dynamical model is shown to have considerable predictive value: when used as a prior for estimation of motion, the burden of computation in visual observation is significantly reduced. Multiclass dynamics are studied via visually observed juggling; plausible dynamical models have been found to emerge from the learning process, and accurate classification of motion has resulted. In practice, EMC learning is computationally burdensome and the paper concludes with some discussion of computational complexity. Index Terms: computer vision, learning dynamics, Auto-Regressive Process, Expectation-Maximization.
proposal distributions: Object tracking using unscented particle filter
 in Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai
, 2001
Cited by 72 (2 self)
Tracking objects involves the modeling of nonlinear, non-Gaussian systems. On one hand, variants of Kalman filters are limited by their Gaussian assumptions. On the other hand, the conventional particle filter, e.g., CONDENSATION, uses the transition prior as the proposal distribution. The transition prior does not take into account current observation data, and many particles can therefore be wasted in low-likelihood areas. To overcome these difficulties, the unscented particle filter (UPF) has recently been proposed in the field of filtering theory. In this paper, we introduce the UPF framework into audio and visual tracking. The UPF uses the unscented Kalman filter to generate sophisticated proposal distributions that seamlessly integrate the current observation, thus greatly improving the tracking performance. To evaluate the efficacy of the UPF framework, we apply it in two real-world tracking applications. One is audio-based speaker localization, and the other is vision-based human tracking. The experimental results are compared against those of the widely used CONDENSATION approach and have demonstrated superior tracking performance.
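The cost of proposing from the transition prior can be made concrete by comparing effective sample sizes in a toy model. The sketch below assumes a scalar linear-Gaussian system, for which an observation-informed Gaussian proposal follows from an ordinary Kalman update; that update is used here as a simple stand-in and is not the unscented Kalman filter of the paper. The model, the parameter values, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def effective_sample_size(weights):
    """(sum w)^2 / sum w^2; equals the particle count when weights are equal."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def prior_proposal_weights(prev, z, q_std, r_std, rng):
    """Propose from the transition prior; weight by the likelihood alone."""
    weights = []
    for x_prev in prev:
        x = rng.gauss(x_prev, q_std)
        weights.append(normal_pdf(z, x, r_std))
    return weights


def informed_proposal_weights(prev, z, q_std, r_std, rng):
    """Propose from a Gaussian that already accounts for the observation z."""
    gain = q_std ** 2 / (q_std ** 2 + r_std ** 2)    # scalar Kalman gain
    prop_std = math.sqrt(1.0 - gain) * q_std
    weights = []
    for x_prev in prev:
        prop_mean = x_prev + gain * (z - x_prev)
        x = rng.gauss(prop_mean, prop_std)
        # Importance weight: likelihood * transition prior / proposal density.
        w = (normal_pdf(z, x, r_std) * normal_pdf(x, x_prev, q_std)
             / normal_pdf(x, prop_mean, prop_std))
        weights.append(w)
    return weights


rng = random.Random(1)
previous_particles = [rng.gauss(0.0, 0.3) for _ in range(2000)]
observation = 3.0        # far from where the transition prior puts most mass

ess_prior = effective_sample_size(
    prior_proposal_weights(previous_particles, observation, 1.0, 0.2, rng))
ess_informed = effective_sample_size(
    informed_proposal_weights(previous_particles, observation, 1.0, 0.2, rng))
```

In this setup only a small fraction of the prior-proposed particles land near the observation, so `ess_prior` comes out far below `ess_informed`, which mirrors the "wasted particles" argument in the abstract.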