Distributed occlusion reasoning for tracking with nonparametric belief propagation
In NIPS, 2004
Cited by 39 (0 self)
We describe a three-dimensional geometric hand model suitable for visual tracking applications. The kinematic constraints implied by the model’s joints have a probabilistic structure which is well described by a graphical model. Inference in this model is complicated by the hand’s many degrees of freedom, as well as multimodal likelihoods caused by ambiguous image measurements. We use nonparametric belief propagation (NBP) to develop a tracking algorithm which exploits the graph’s structure to control complexity, while avoiding costly discretization. While kinematic constraints naturally have a local structure, self-occlusions created by the imaging process lead to complex interdependencies in color and edge-based likelihood functions. However, we show that local structure may be recovered by introducing binary hidden variables describing the occlusion state of each pixel. We augment the NBP algorithm to infer these occlusion variables in a distributed fashion, and then analytically marginalize over them to produce hand position estimates which properly account for occlusion events. We provide simulations showing that NBP may be used to refine inaccurate model initializations, as well as track hand motion through extended image sequences.
A Linear Programming Approach for Multiple Object Tracking
Cited by 20 (0 self)
We propose a linear programming relaxation scheme for the class of multiple object tracking problems where the inter-object interaction metric is convex and the intra-object term quantifying object state continuity may use any metric. The proposed scheme models object tracking as a multi-path search problem. It explicitly models track interaction, such as object spatial layout consistency or mutual occlusion, and optimizes multiple object tracks simultaneously. The proposed scheme does not rely on track initialization or complex heuristics. It has much lower average complexity than previous efficient exhaustive search methods such as extended dynamic programming, and is found to reach the global optimum with high probability. We have successfully applied the proposed method to multiple object tracking in video streams.
Visual recognition of grasps for human-to-robot mapping
In IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008
Cited by 5 (3 self)
This paper presents a vision-based method for grasp classification. It is developed as part of a Programming by Demonstration (PbD) system for which recognition of objects and pick-and-place actions represent basic building blocks for task learning. In contrast to earlier approaches, no articulated 3D reconstruction of the hand over time is performed. The input consists of a single image of the human hand. A 2D representation of the hand shape, based on gradient orientation histograms, is extracted from the image. The hand shape is then classified as one of six grasps by finding similar hand shapes in a large database of grasp images. The database search is performed using Locality Sensitive Hashing (LSH), an approximate k-nearest neighbor approach. The nearest neighbors also give an estimated hand orientation with respect to the camera. The six human grasps are mapped to three Barrett hand grasps. Depending on the type of robot grasp, a precomputed grasp strategy is selected. The strategy is further parameterized by the orientation of the hand relative to the object. To evaluate the potential for the method to be part of a robust vision system, experiments were performed comparing classification results to a baseline of human classification performance. The experiments showed the LSH recognition performance to be comparable to human performance.
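The database lookup described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random-hyperplane hash family, the table and bit counts, the toy four-bin histograms, and the labels `grasp_a` and `grasp_b` are all assumptions chosen for illustration. The sketch hashes each stored feature vector into several tables, collects the query's bucket mates as candidates, ranks them by squared Euclidean distance, and takes a majority vote over the k nearest.

```python
import random
from collections import Counter, defaultdict


def make_hyperplanes(dim, n_bits, seed):
    """Draw n_bits random hyperplanes through the origin (assumed hash family)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_key(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0.0 else 0
        for plane in planes
    )


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class LSHIndex:
    """Several independent hash tables over the same labelled vectors."""

    def __init__(self, dim, n_tables=4, n_bits=6, seed=0):
        self.items = []  # list of (vector, label) pairs
        self.tables = [
            (make_hyperplanes(dim, n_bits, seed + t), defaultdict(list))
            for t in range(n_tables)
        ]

    def add(self, vec, label):
        idx = len(self.items)
        self.items.append((list(vec), label))
        for planes, buckets in self.tables:
            buckets[hash_key(vec, planes)].append(idx)

    def query(self, vec, k=3):
        """Return up to k stored items that share a bucket with vec, nearest first."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(hash_key(vec, planes), []))
        ranked = sorted(candidates, key=lambda i: sq_dist(vec, self.items[i][0]))
        return [self.items[i] for i in ranked[:k]]


def classify(index, vec, k=3):
    """Majority vote over the approximate k nearest neighbours; None if no candidates."""
    neighbours = index.query(vec, k)
    if not neighbours:
        return None
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy usage with hypothetical orientation histograms and labels.
index = LSHIndex(dim=4, n_tables=3, n_bits=4, seed=1)
index.add([0.9, 0.1, 0.0, 0.0], "grasp_a")
index.add([0.0, 0.1, 0.9, 0.0], "grasp_b")
print(classify(index, [0.9, 0.1, 0.0, 0.0], k=1))
```

Because the search only inspects vectors that collide with the query in at least one table, it can miss true neighbours; using more tables raises recall at the cost of memory and lookup time.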
Hands in action: Real-time 3D reconstruction of hands in interaction with objects
In IEEE International Conference on Robotics and Automation, 2010
Cited by 3 (1 self)
This paper presents a method for vision-based estimation of the pose of human hands in interaction with objects. Despite the fact that most robotics applications of human hand tracking involve grasping and manipulation of objects, the majority of methods in the literature assume a free hand, isolated from the surrounding environment. Our hand tracking method is non-parametric, performing a nearest neighbor search in a large database (100,000 entries) of hand poses with and without grasped objects. The system operates in real time, is robust to self-occlusions, object occlusions and segmentation errors, and provides full hand pose reconstruction from markerless video. Temporal consistency in hand pose is taken into account, without explicitly tracking the hand in the high-dimensional pose space.
Handy AR: Markerless Inspection of Augmented Reality Objects Using Fingertip Tracking
Cited by 3 (0 self)
We present a markerless camera tracking and user interface methodology for readily inspecting augmented reality (AR) objects in wearable computing applications. Instead of a marker, we use the human hand as a distinctive pattern that almost all wearable computer users have readily available. We present a robust real-time algorithm that recognizes fingertips to reconstruct the 6DOF camera pose relative to the user’s outstretched hand. A hand pose model is constructed in a one-time calibration step by measuring the fingertip positions in the presence of ground-truth scale information. Through frame-by-frame reconstruction of the camera pose relative to the hand, we can stabilize 3D graphics annotations on top of the hand, allowing the user to inspect such virtual objects conveniently from different viewing angles in AR. We evaluate our approach with regard to speed and accuracy, and compare it to state-of-the-art marker-based AR systems. We demonstrate the robustness and usefulness of our approach in an example AR application for selecting and inspecting world-stabilized virtual objects.
Human body posture refinement by nonparametric belief propagation
Int. Conf. Image Processing, 2005
Cited by 2 (2 self)
Accurate human body posture refinement from single or multiple images is essential in many applications. The two main difficulties in solving the refinement problem are the many degrees of freedom of the human body and self-occlusion. One of the most recent algorithms is nonparametric belief propagation (NBP), which solves the problem in a lower-dimensional state space. However, it has difficulty handling self-occlusion. This paper presents an NBP-based algorithm that can refine body posture even under self-occlusion, as shown by experimental results. The experimental results also show that our algorithm can accurately refine body posture even if the initial posture differs greatly from the true posture.
Manipulator and Object Tracking for In Hand Model Acquisition
Cited by 2 (0 self)
Recognizing and manipulating objects is an important task for mobile robots performing useful services in everyday environments. While existing techniques for object recognition related to manipulation provide very good results even for noisy and incomplete data, they are typically trained using data generated in an offline process. As a result, they do not enable a robot to acquire new object models as it operates in an environment. In this paper, we develop an approach to building 3D models of unknown objects based on a depth camera observing the robot’s hand while moving an object. The approach integrates both shape and appearance information into an articulated ICP approach to track the robot’s manipulator and the object. Objects are modeled by sets of surfels, which are small patches providing occlusion and appearance information. Experiments show that our approach provides very good 3D models even when the object is highly symmetric and lacking visual features and the manipulator motion is noisy.
On-line Simultaneous Learning and Tracking of Visual Feature Graphs
Cited by 1 (1 self)
Model learning and tracking are two important topics in computer vision. While there are many applications where one of them is used to support the other, there are currently only a few where both aid each other simultaneously. In this work, we seek to incrementally learn a graphical model from tracking and to simultaneously use whatever has been learned to improve the tracking in the next frames. The main problem encountered in this situation is that the current intermediate model may be inconsistent with future observations, creating a bias in the tracking results. We propose an uncertain model that explicitly accounts for such uncertainties by representing relations by an appropriately weighted sum of informative (parametric) and uninformative (uniform) components. The method is completely unsupervised and operates in real time.
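The weighted sum of informative and uninformative components mentioned in this abstract can be sketched as a two-part mixture density. This is an illustrative reading only: the one-dimensional Gaussian as the parametric component, the bounded uniform range, and every function and parameter name below are assumptions, not details taken from the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian, standing in for the informative component."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def uniform_pdf(x, low, high):
    """Density of a uniform distribution on [low, high], the uninformative component."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def uncertain_relation_pdf(x, weight, mean, std, low, high):
    """Mixture: weight * informative + (1 - weight) * uninformative.

    A weight near 0 means the learned relation is not yet trusted, so the
    density stays close to uniform; a weight near 1 trusts the learned model.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    informative = gaussian_pdf(x, mean, std)
    uninformative = uniform_pdf(x, low, high)
    return weight * informative + (1.0 - weight) * uninformative


# Hypothetical relation: expected offset 0.0 with unit spread, range [-5, 5].
for w in (0.0, 0.5, 1.0):
    print(w, uncertain_relation_pdf(3.0, w, 0.0, 1.0, -5.0, 5.0))
```

Keeping some mass on the uniform component means an observation far from the learned mean still receives nonzero density, which is one way a model can stay open to observations that contradict what it has learned so far.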
Human Posture Sequence Estimation Using Two Un-calibrated Cameras
Cited by 1 (0 self)
Estimating a 3D human posture sequence from single or multiple image sequences is essential in many applications. However, a 3D posture sequence cannot be accurately estimated from a single image sequence due to depth ambiguity or self-occlusion, and camera calibration is often required before estimating a 3D posture sequence from multiple image sequences. In this paper, we present an algorithm to accurately estimate a 3D human posture sequence from two uncalibrated image sequences. The algorithm combines a modified Nonparametric Belief Propagation (mNBP) method with an improved camera self-calibration method. The previously developed mNBP can estimate posture even under partial self-occlusion, and here it is improved to estimate posture when the scale of the human model differs from that of the body in the image sequences. The improved self-calibration is guaranteed to find the optimal rotation and relative scale between two fixed but uncalibrated scaled orthographic cameras, without a nonlinear optimization process. Quantitative and qualitative experimental results show that the algorithm is able to estimate a 3D posture sequence from a pair of uncalibrated image sequences.
Gradient-Based Hand Tracking Using Silhouette Data
Cited by 1 (1 self)
Optical motion capture can be classified as an inference problem: given the data produced by a set of cameras, the aim is to extract the hidden state, which in this case encodes the posture of the subject’s body. Problems with motion capture arise due to the multi-modal nature of the likelihood distribution, the extremely large dimensionality of its state space, and the narrow region of support of local modes. There are also problems with the size of the data, the difficulty with which useful visual cues can be extracted from it, and how informative these cues might be. Several algorithms exist that use stochastic methods to extract the hidden state, but although highly parallelisable in theory, such methods produce a heavy computational overhead even with the power of today’s computers. In this paper we assume a set of pre-calibrated cameras and only extract the subject’s silhouette as a visual cue. In order to describe the 2D silhouette data we define a 2D model consisting of conic fields. The resulting likelihood distribution is differentiable w.r.t. the state, meaning that its global maximum can be located quickly using gradient ascent search, given manual initialisation at the first frame. In this paper we explain the construction of the model for tracking a human hand, describe the formulation of the derivatives needed, and present initial results on both real and simulated data.
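The gradient ascent search mentioned in this abstract can be sketched on a stand-in objective. The conic-field silhouette likelihood is not specified here, so the sketch swaps in a toy quadratic log-likelihood with an analytic gradient; the step size, the stopping tolerance, and the choice to start each frame from the previous frame's estimate are assumptions made for illustration.

```python
def log_likelihood(state, target):
    """Toy differentiable log-likelihood, peaked where state equals target."""
    return -sum((s - t) ** 2 for s, t in zip(state, target))


def gradient(state, target):
    """Analytic gradient of the toy log-likelihood with respect to the state."""
    return [-2.0 * (s - t) for s, t in zip(state, target)]


def gradient_ascent(initial_state, target, step=0.1, max_iters=200, tol=1e-12):
    """Climb the log-likelihood from initial_state until the gradient is tiny."""
    state = list(initial_state)
    for _ in range(max_iters):
        grad = gradient(state, target)
        if sum(g * g for g in grad) < tol:
            break
        state = [s + step * g for s, g in zip(state, grad)]
    return state


def track(frames, first_frame_init):
    """Estimate the state per frame, warm-starting from the previous estimate.

    frames is a list of per-frame targets standing in for image evidence;
    only the first frame uses the manually supplied initialisation.
    """
    estimates = []
    state = list(first_frame_init)
    for target in frames:
        state = gradient_ascent(state, target)
        estimates.append(state)
    return estimates


# Hypothetical two-parameter state drifting slowly over three frames.
frames = [[1.0, 2.0], [1.1, 2.1], [1.3, 2.0]]
for estimate in track(frames, first_frame_init=[0.0, 0.0]):
    print([round(v, 4) for v in estimate])
```

Gradient ascent only climbs to the nearest mode, so on a multi-modal likelihood the result depends on the starting point, which is why the initialisation at the first frame matters.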

