Hierarchical Bayesian Inference in the Visual Cortex
, 2002
Cited by 106 (0 self)
In this paper, we propose a Bayesian theory of hierarchical cortical computation based both on (a) the mathematical and computational ideas of computer vision and pattern theory and on (b) recent neurophysiological experimental evidence. We [2] have proposed that Grenander's pattern theory [3] could potentially model the brain as a generative model in such a way that feedback serves to disambiguate and 'explain away' the earlier representation. The Helmholtz machine [4, 5] was an excellent step towards approximating this proposal, with feedback implementing priors. Its development, however, was rather limited, dealing only with binary images. Moreover, its feedback mechanisms were engaged only during the learning of the feedforward connections but not during perceptual inference, though the Gibbs sampling process for inference can potentially be interpreted as top-down feedback disambiguating low-level representations. Rao and Ballard's predictive coding/Kalman filter model [6] did integrate generative feedback in the perceptual inference process, but it was primarily a linear model and thus severely limited in practical utility. The data-driven Markov Chain Monte Carlo approach of Zhu and colleagues [7, 8] might be the most successful recent application of this proposal in solving real and difficult computer vision problems using generative models, though its connection to the visual cortex has not been explored. Here, we bring in a powerful and widely applicable paradigm from artificial intelligence and computer vision to propose some new ideas about the algorithms of visual cortical processing and the nature of representations in the visual cortex. We will review some of our and others' neurophysiological experimental data to lend support to these ideas.
Nonparametric Belief Propagation for Self-Calibration in Sensor Networks
- In Proceedings of the Third International Symposium on Information Processing in Sensor Networks
, 2004
Cited by 73 (6 self)
Automatic self-calibration of ad-hoc sensor networks is a critical need for their use in military or civilian applications. In general, self-calibration involves the combination of absolute location information (e.g. GPS) with relative calibration information (e.g. time delay or received signal strength between sensors) over regions of the network. Furthermore, it is generally desirable to distribute the computational burden across the network and minimize the amount of inter-sensor communication. We demonstrate that the information used for sensor calibration is fundamentally local with regard to the network topology and use this observation to reformulate the problem within a graphical model framework. We then demonstrate the utility of nonparametric belief propagation (NBP), a recent generalization of particle filtering, for both estimating sensor locations and representing location uncertainties. NBP has the advantage that it is easily implemented in a distributed fashion, admits a wide variety of statistical models, and can represent multi-modal uncertainty. We illustrate the performance of NBP on several example networks while comparing to a previously published nonlinear least squares method.
Discriminative Density Propagation for 3D Human Motion Estimation
- In CVPR
, 2005
Cited by 65 (10 self)
We describe a mixture density propagation algorithm to estimate 3D human motion in monocular video sequences based on observations encoding the appearance of image silhouettes. Our approach is discriminative rather than generative and therefore does not require the probabilistic inversion of a predictive observation model. Instead, it uses a large human motion capture database and a 3D computer graphics human model in order to synthesize training pairs of typical human configurations together with their realistically rendered 2D silhouettes. These are used to directly learn to predict the conditional state distributions required for 3D body pose tracking and thus avoid using the generative 3D model for inference (the learned discriminative predictors can also be used in a complementary way, as importance samplers, in order to improve mixing or initialize generative inference algorithms). We aim for probabilistically motivated tracking algorithms and for models that can represent complex multivalued mappings common in inverse, uncertain perception inferences. Our paper has three contributions: (1) we establish the density propagation rules for discriminative inference in continuous, temporal chain models; (2) we propose flexible algorithms for learning multimodal state distributions based on compact, conditional Bayesian mixture of experts models; and (3) we demonstrate the algorithms empirically on real and motion capture-based test sequences and compare against nearest-neighbor and regression methods.
PAMPAS: Real-Valued Graphical Models for Computer Vision
, 2003
Cited by 64 (2 self)
Probabilistic models have been adopted for many computer vision applications; however, inference in high-dimensional spaces remains problematic. As the state space of a model grows, the dependencies between the dimensions lead to an exponential growth in computation when performing inference. Many common computer vision problems naturally map onto the graphical model framework; the representation is a graph where each node contains a portion of the state space and there is an edge between two nodes only if they are not independent conditional on the other nodes in the graph. When this graph is sparsely connected, belief propagation algorithms can turn an exponential inference computation into one which is linear in the size of the graph. However, belief propagation is only applicable when the variables in the nodes are discrete-valued or jointly represented by a single multivariate Gaussian distribution, and this rules out many computer vision applications.
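The exponential-versus-linear claim in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the simplest sparse graph (a chain), discrete states, and one shared pairwise potential, and all names (`chain_marginals_bp`, `chain_marginals_brute_force`, `unary`, `pairwise`) are hypothetical. Sum-product message passing costs on the order of n·k² operations for n nodes with k states each, while brute-force enumeration visits all kⁿ joint states.

```python
import itertools


def chain_marginals_bp(unary, pairwise):
    """Sum-product on a chain x_0 - x_1 - ... - x_{n-1}.

    unary[i][a] is the local potential for x_i = a; pairwise[a][b] is the
    potential shared by every neighbouring pair (left state a, right state b).
    Messages are left unnormalised, which is fine for short chains only.
    """
    n, k = len(unary), len(unary[0])
    # fwd[i][b]: message arriving at node i from its left neighbour.
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        for b in range(k):
            fwd[i][b] = sum(
                fwd[i - 1][a] * unary[i - 1][a] * pairwise[a][b] for a in range(k)
            )
    # bwd[i][a]: message arriving at node i from its right neighbour.
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for a in range(k):
            bwd[i][a] = sum(
                bwd[i + 1][b] * unary[i + 1][b] * pairwise[a][b] for b in range(k)
            )
    marginals = []
    for i in range(n):
        belief = [fwd[i][a] * unary[i][a] * bwd[i][a] for a in range(k)]
        z = sum(belief)
        marginals.append([v / z for v in belief])
    return marginals


def chain_marginals_brute_force(unary, pairwise):
    """Reference answer: enumerate all k**n joint states."""
    n, k = len(unary), len(unary[0])
    totals = [[0.0] * k for _ in range(n)]
    for states in itertools.product(range(k), repeat=n):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pairwise[states[i]][states[i + 1]]
        for i, s in enumerate(states):
            totals[i][s] += weight
    return [[v / sum(row) for v in row] for row in totals]


# Illustrative potentials (made up for this sketch).
unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8], [0.6, 0.4]]
pairwise = [[0.9, 0.1], [0.1, 0.9]]

bp = chain_marginals_bp(unary, pairwise)
exact = chain_marginals_brute_force(unary, pairwise)
max_gap = max(abs(x - y) for r1, r2 in zip(bp, exact) for x, y in zip(r1, r2))
```

On a chain (or any tree) the two routines agree up to floating-point error, so `max_gap` should be negligible. The sketch covers only the discrete case that the abstract says standard belief propagation already handles, not the real-valued setting the paper targets.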
Tracking Articulated Body by Dynamic Markov Network
- Proc. IEEE Int'l Conf. on Computer Vision, Nice, France
, 2003
Cited by 46 (9 self)
A new method for visual tracking of articulated objects is presented. Analyzing articulated motion is challenging because the increase in dimensionality potentially demands a tremendous increase in computation. To ease this problem, we propose an approach that analyzes subparts locally while reinforcing the structural constraints at the same time. The computational model of the proposed approach is based on a dynamic Markov network, a generative model which characterizes the dynamics and the image observations of each individual subpart as well as the motion constraints among different subparts. Probabilistic variational analysis of the model reveals a mean field approximation to the posterior densities of each subpart given visual evidence, and provides a computationally efficient way to solve such a difficult Bayesian inference problem. In addition, we design mean field Monte Carlo (MFMC) algorithms, in which a set of low-dimensional particle filters interact with each other and solve the high-dimensional problem collaboratively. Extensive experiments on tracking human body parts demonstrate the effectiveness, significance and computational efficiency of the proposed method.
Loopy belief propagation: Convergence and effects of message errors
- Journal of Machine Learning Research
, 2005
Cited by 40 (7 self)
Belief propagation (BP) is an increasingly popular method of performing approximate inference on arbitrary graphical models. At times, even further approximations are required, whether due to quantization of the messages or model parameters, from other simplified message or model representations, or from stochastic approximation methods. The introduction of such errors into the BP message computations has the potential to affect the solution obtained adversely. We analyze the effect resulting from message approximation under two particular measures of error, and show bounds on the accumulation of errors in the system. This analysis leads to convergence conditions for traditional BP message passing, and both strict bounds and estimates of the resulting error in systems of approximate BP message passing.
Location-based activity recognition
- In Advances in Neural Information Processing Systems (NIPS)
, 2005
Cited by 39 (5 self)
Learning patterns of human behavior from sensor data is extremely important for high-level activity inference. We show how to extract and label a person’s activities and significant places from traces of GPS data. In contrast to existing techniques, our approach simultaneously detects and classifies the significant locations of a person and takes the high-level context into account. Our system uses relational Markov networks to represent the hierarchical activity model that encodes the complex relations among GPS readings, activities and significant places. We apply FFT-based message passing to perform efficient summation over large numbers of nodes in the networks. We present experiments that show significant improvements over existing techniques.
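The abstract above mentions FFT-based message passing for summation over many nodes but gives no detail. The sketch below is therefore not the authors' algorithm. It only illustrates a general fact that such methods can build on: the distribution of a sum of independent discrete variables is the convolution of their individual distributions, and convolutions can be computed with an FFT. All names (`fft`, `convolve_fft`, `convolve_direct`, `sum_distribution`) and the example probabilities are assumptions made for illustration.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    With invert=True the inverse transform is returned unscaled, so the
    caller divides by the length.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_fft(p, q):
    """Linear convolution of two real sequences via zero-padded FFTs."""
    size = len(p) + len(q) - 1
    n = 1
    while n < size:
        n *= 2
    fp = fft([complex(v) for v in p] + [0j] * (n - len(p)))
    fq = fft([complex(v) for v in q] + [0j] * (n - len(q)))
    product = fft([a * b for a, b in zip(fp, fq)], invert=True)
    return [product[i].real / n for i in range(size)]


def convolve_direct(p, q):
    """Quadratic-time reference convolution."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sum_distribution(dists, convolve):
    """Distribution of the sum of independent variables (non-empty list).

    Distributions are merged pairwise so that the later convolutions act
    on long sequences, which is where the FFT pays off.
    """
    dists = [list(d) for d in dists]
    while len(dists) > 1:
        merged = [
            convolve(dists[i], dists[i + 1]) for i in range(0, len(dists) - 1, 2)
        ]
        if len(dists) % 2:
            merged.append(dists[-1])
        dists = merged
    return dists[0]


# Five independent binary indicators with made-up probabilities of being 1.
bernoullis = [[1 - p, p] for p in (0.1, 0.5, 0.9, 0.3, 0.7)]
count_fft = sum_distribution(bernoullis, convolve_fft)
count_direct = sum_distribution(bernoullis, convolve_direct)
```

Here `count_fft[m]` approximates the probability that exactly `m` of the five indicators are 1, and it should match `count_direct` up to floating-point error. For inputs this small the direct convolution is faster. The FFT route only becomes worthwhile when the sequences being convolved are long.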
Distributed occlusion reasoning for tracking with nonparametric belief propagation
- In NIPS
, 2004
Cited by 39 (0 self)
We describe a three-dimensional geometric hand model suitable for visual tracking applications. The kinematic constraints implied by the model’s joints have a probabilistic structure which is well described by a graphical model. Inference in this model is complicated by the hand’s many degrees of freedom, as well as multimodal likelihoods caused by ambiguous image measurements. We use nonparametric belief propagation (NBP) to develop a tracking algorithm which exploits the graph’s structure to control complexity, while avoiding costly discretization. While kinematic constraints naturally have a local structure, self-occlusions created by the imaging process lead to complex interdependencies in color and edge-based likelihood functions. However, we show that local structure may be recovered by introducing binary hidden variables describing the occlusion state of each pixel. We augment the NBP algorithm to infer these occlusion variables in a distributed fashion, and then analytically marginalize over them to produce hand position estimates which properly account for occlusion events. We provide simulations showing that NBP may be used to refine inaccurate model initializations, as well as track hand motion through extended image sequences.
Attractive People: Assembling Loose-Limbed Models Using Non-parametric Belief Propagation
- In NIPS
, 2003
Cited by 37 (2 self)
The detection and pose estimation of people in images and video is made challenging by the variability of human appearance, the complexity of natural scenes, and the high dimensionality of articulated body models.
Tracking people by learning their appearance
- IEEE Trans. Pattern Anal. Mach. Intell
Cited by 36 (3 self)
An open vision problem is to automatically track the articulations of people from a video sequence. This problem is difficult because one needs to determine both the number of people in each frame and estimate their configurations. But finding people and localizing their limbs is hard because people can move fast and unpredictably, can appear in a variety of poses and clothes, and are often surrounded by limb-like clutter. We develop a completely automatic system that works in two stages: it first builds a model of appearance of each person in a video and then it tracks by detecting those models in each frame (“tracking by model-building and detection”). We develop two algorithms that build models. One is a bottom-up approach that groups together candidate body parts found throughout a sequence. The other is a top-down approach that automatically builds people-models by detecting convenient key poses within a sequence. We finally show that building a discriminative model of appearance is quite helpful since it exploits structure in a background (without background-subtraction). We demonstrate the resulting tracker on hundreds of thousands of frames of unscripted indoor and outdoor activity, a feature-length film (“Run Lola Run”), and legacy sports footage (from the 2002 World Series and 1998 Winter Olympics). Experiments suggest that our system 1) can count distinct individuals, 2) can identify and track them, 3) can recover when it loses track, for example, if individuals are occluded or briefly leave the view, 4) can identify body configuration accurately, and 5) is not dependent on particular models of human motion. Index Terms: People tracking, motion capture, surveillance.

