Monocular Model-Based 3D Tracking of Rigid Objects: A Survey
- Foundations and Trends in Computer Graphics and Vision
, 2005
Cited by 54 (3 self)
Many applications require tracking of complex 3D objects. These include visual servoing of robotic arms on specific target objects, Augmented Reality systems that require real-time registration of the object to be augmented, and head tracking systems that sophisticated interfaces can use. Computer Vision offers solutions that are cheap, practical and non-invasive. This survey reviews the different techniques and approaches that have been developed by industry and research. First, important mathematical tools are introduced: camera representation, robust estimation and uncertainty estimation. Then a comprehensive study is given of the numerous approaches developed by the Augmented Reality and Robotics communities, beginning with those that are based on point or planar fiducial marks and moving on to those that avoid the need to engineer the environment by relying on natural features such as edges, texture or interest points. Recent advances that avoid manual initialization and failures due to fast motion are also presented. The survey concludes with the different possible choices that should be made when implementing a 3D tracking system and a discussion of the future of vision-based 3D tracking. Because it encompasses many computer vision techniques from low-level vision to 3D geometry and includes a comprehensive study of the massive literature on the subject, this survey should be the handbook of the student, the researcher, or the engineer who wants to implement a 3D tracking system.
Fusing Points and Lines for High Performance Tracking
- In International Conference on Computer Vision
, 2005
Cited by 49 (5 self)
This paper addresses the problem of real-time 3D model-based tracking by combining point-based and edge-based tracking systems. We present a careful analysis of the properties of these two sensor systems and show that this leads to some non-trivial design choices that collectively yield extremely high performance. In particular, we present a method for integrating the two systems and robustly combining the pose estimates they produce. Further, we show how on-line learning can be used to improve the performance of feature tracking. Finally, to aid real-time performance, we introduce the FAST feature detector, which can perform full-frame feature detection at 400 Hz. The combination of these techniques results in a system which is capable of tracking average prediction errors of 200 pixels. This level of robustness allows us to track very rapid motions, such as 50° camera shake at 6 Hz.
Object recognition and full pose registration from a single image for robotic manipulation
- in IEEE ICRA. Kobe: IEEE
, 2009
Cited by 20 (8 self)
Robust perception is a vital capability for robotic manipulation in unstructured scenes. In this context, full pose estimation of relevant objects in a scene is a critical step towards the introduction of robots into household environments. In this paper, we present an approach for building metric 3D models of objects using local descriptors from several images. Each model is optimized to fit a set of calibrated training images, thus obtaining the best possible alignment between the 3D model and the real object. Given a new test image, we match the local descriptors to our stored models online, using a novel combination of the RANSAC and Mean Shift algorithms to register multiple instances of each object. A robust initialization step allows for arbitrary rotation, translation and scaling of objects in the test images. The resulting system provides markerless 6-DOF pose estimation for complex objects in cluttered scenes. We provide experimental results demonstrating orientation and translation accuracy, as well as a physical implementation of the pose output being used by an autonomous robot to perform grasping in highly cluttered scenes.
Combining Head Tracking and Mouse Input for a GUI on Multiple Monitors
- In Extended abstracts of CHI ’05
, 2005
Cited by 19 (0 self)
The use of multiple LCD monitors is becoming popular as prices are reduced, but this creates problems for window management and switching between applications. For a single monitor, eye tracking can be combined with the mouse to reduce the amount of mouse movement, but with several monitors the head is moved through a large range of positions and angles which makes eye tracking difficult. We thus use head tracking to switch the mouse pointer between monitors and use the mouse to move within each monitor. In our experiment users required significantly less mouse movement with the tracking system, and preferred using it, although task time actually increased. A graphical prompt (flashing star) prevented the user losing the pointer when switching monitors. We present discussions on our results and ideas for further developments.
Author Keywords: Gaze-contingent display, attentive user interface, head tracking, multiple monitors.
Real-time non-rigid shape recovery via active appearance models for augmented reality
- Proceedings of the 9th European Conference on Computer Vision (ECCV 2006)
, 2006
Cited by 10 (5 self)
One main challenge in Augmented Reality (AR) applications is to accurately keep track of the movement, orientation, size, and position of video objects. Recovering non-rigid shape and global pose in real-time AR applications is therefore a challenging task. This paper proposes a novel two-stage scheme for online non-rigid shape recovery toward AR applications using Active Appearance Models (AAMs). First, we construct 3D shape models from AAMs offline, which does not involve processing of 3D scan data. Based on the computed 3D shape models, we propose an efficient online algorithm to estimate both 3D pose and non-rigid shape parameters via local bundle adjustment for building up point correspondences. Our approach, without manual intervention, can recover the 3D non-rigid shape effectively from either real-time video sequences or a single image. The recovered 3D pose parameters can be used for AR registration. Furthermore, facial features can be tracked simultaneously, which is critical for many face-related applications. We evaluate our algorithms on several video sequences. Promising experimental results demonstrate that our proposed scheme is effective and significant for real-time AR applications.
Automated initialization for marker-less tracking: A sensor fusion approach
- In IEEE/ACM International Symposium on Mixed and Augmented Reality
, 2004
Cited by 9 (1 self)
We introduce a novel sensor fusion approach for automated initialization of marker-less tracking systems. It is not limited in tracking range and working environment, given a 3D model of the objects or the real scene. This is achieved based on a statistical analysis and probabilistic estimation of the uncertainties of the tracking sensors. The explicit representation of the error distribution allows the fusion of different sensor data, e.g. of mobile tracking sensors with stationary sensors, in order to estimate the initial pose and improve the registration accuracy. This methodology was applied to an augmented reality system, using a mobile camera and several stationary tracking sensors, and can be easily extended to the case of any additional sensors. The initialization consists of an iterative pose estimation and refinement process using both stationary and mobile cameras. Thereby the registration error is minimized in 3D object space rather than in the 2D image. Experimental results show how complex objects can be registered efficiently and accurately to a single initial image.
Tracking by an optimal sequence of linear predictors
- IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 9 (0 self)
We propose a learning approach to tracking explicitly minimizing the computational complexity of the tracking process subject to user-defined probability of failure (loss-of-lock) and precision. The tracker is formed by a Number of Sequences of Learned Linear Predictors (NoSLLiP). Robustness of NoSLLiP is achieved by modeling the object as a collection of local motion predictors; object motion is estimated by the outlier-tolerant RANSAC algorithm from local predictions. The efficiency of the NoSLLiP tracker stems 1) from the simplicity of the local predictors and 2) from the fact that all design decisions (the number of local predictors used by the tracker, their computational complexity, i.e., the number of observations the prediction is based on, their locations, and the number of RANSAC iterations) are subject to the optimization (learning) process. All time-consuming operations are performed during the learning stage; tracking is reduced to only a few hundred integer multiplications in each step. On a PC with 1xK8 3200+, a predictor evaluation requires about 30 μs. The proposed approach is verified on publicly available sequences with approximately 12,000 frames with ground truth. Experiments demonstrate superiority in frame rates and robustness with respect to the SIFT detector, Lucas-Kanade tracker, and other trackers.
Index Terms: Image processing and computer vision, scene analysis, tracking.
Real-time 3D model-based tracking: Combining edge and texture information
- ICRA
Cited by 8 (1 self)
This paper proposes a real-time, robust and efficient 3D model-based tracking algorithm. A non-linear minimization approach is used to register 2D and 3D cues for monocular 3D tracking. Integrating texture information into a classical non-linear edge-based pose computation greatly increases the reliability of a conventional edge-based 3D tracker. Robustness is enforced by integrating an M-estimator into the minimization process via an iteratively re-weighted least squares implementation. The method presented in this paper has been validated on several video sequences as well as in visual servoing experiments considering various objects. Results show the method to be robust to large motions and textured environments.
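The M-estimator and iteratively re-weighted least squares (IRLS) combination mentioned in this abstract can be illustrated on a much simpler problem than pose computation. The sketch below is not the paper's tracker: it fits a 2D line, and the Huber weight function, the tuning constant 1.345, the MAD-based scale estimate, and the names `huber_weight` and `irls_line_fit` are all assumptions chosen for illustration.

```python
import statistics


def huber_weight(residual, scale, k=1.345):
    """Huber weight: 1 inside the inlier band, k*scale/|r| outside it."""
    r = abs(residual)
    threshold = k * scale
    return 1.0 if r <= threshold else threshold / r


def irls_line_fit(xs, ys, iterations=20):
    """Fit y = a*x + b by iteratively re-weighted least squares (assumed toy problem)."""
    weights = [1.0] * len(xs)
    a = b = 0.0
    for _ in range(iterations):
        # Solve the weighted normal equations for the two line parameters.
        sw = sum(weights)
        sx = sum(w * x for w, x in zip(weights, xs))
        sy = sum(w * y for w, y in zip(weights, ys))
        sxx = sum(w * x * x for w, x in zip(weights, xs))
        sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
        det = sw * sxx - sx * sx
        a = (sw * sxy - sx * sy) / det
        b = (sxx * sy - sx * sxy) / det
        residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
        # Robust scale from the median absolute residual.
        scale = 1.4826 * statistics.median([abs(r) for r in residuals])
        if scale < 1e-12:
            break  # residuals are essentially zero; nothing left to re-weight
        weights = [huber_weight(r, scale) for r in residuals]
    return a, b


# Usage: nine points on y = 2x + 1 plus one gross outlier at x = 9.
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
ys[9] = 100.0
slope, intercept = irls_line_fit(xs, ys)
print(slope, intercept)
```

Each pass solves an ordinary weighted least-squares problem, then shrinks the weight of points whose residuals are large relative to the robust scale, so a gross outlier loses influence over successive iterations. In a tracker the same loop would wrap a non-linear pose update rather than a closed-form line fit.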
HERB: a home exploring robotic butler
, 2010
Cited by 8 (5 self)
We describe the architecture, algorithms, and experiments with HERB, an autonomous mobile manipulator that performs useful manipulation tasks in the home. We present new algorithms for searching for objects, learning to navigate in cluttered dynamic indoor scenes, recognizing and registering objects accurately in high clutter using vision, manipulating doors and other constrained objects using caging grasps, grasp planning and execution in clutter, and manipulation on pose and torque constraint manifolds. We also
Real-time Hybrid Tracking using Edge and Texture Information
, 2007
Cited by 6 (2 self)
This paper proposes a real-time, robust and effective tracking framework for visual servoing applications. The algorithm is based on the fusion of visual cues and on the estimation of a transformation (either a homography or a 3D pose). The parameters of this transformation are estimated using a non-linear minimization of a unique criterion that integrates information both on the texture and the edges of the tracked object. The proposed tracker is more robust and performs well in conditions where methods based on a single cue fail. The framework has been tested for 2D object motion estimation and pose computation. The method presented in this paper has been validated on several video sequences as well as in visual servoing experiments considering various objects. Results show the method to be robust to occlusions or textured backgrounds and suitable for visual servoing applications.

