Visual odometry for ground vehicle applications
Journal of Field Robotics, 2006
Cited by 67 (5 self)
We present a system that estimates the motion of a stereo head or a single moving camera based on video input. The system operates in real-time with low delay and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image trajectories at video rate. Robust estimates of the camera motion are then produced from the feature tracks using a geometric hypothesize-and-test architecture. This generates motion estimates from visual input alone. No prior knowledge of the scene or the motion is necessary. The visual estimates can also be used in conjunction with information from other sources such as GPS, inertial sensors, wheel encoders, etc. The pose estimation method has been applied successfully to video from aerial, automotive and handheld platforms. We focus on results obtained with a stereo head mounted on an autonomous ground vehicle. We give examples of camera trajectories estimated in real-time purely from images over previously unseen distances (600 meters) and periods of time.
Stable real-time 3D tracking using online and offline information
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
Cited by 51 (4 self)
We propose an efficient real-time solution for tracking rigid objects in 3D using a single camera that can handle large camera displacements, drastic aspect changes, and partial occlusions. While commercial products are already available for offline camera registration, robust online tracking remains an open issue because many real-time algorithms described in the literature still lack robustness and are prone to drift and jitter. To address these problems, we have formulated the tracking problem in terms of local bundle adjustment and have developed a method for establishing image correspondences that can handle short- and wide-baseline matching equally well. We can then merge the information from preceding frames with that provided by a very limited number of keyframes created during a training stage, which results in a real-time tracker that does not jitter or drift and can deal with significant aspect changes.
Index Terms: computer vision, real-time systems, tracking.
Outdoor SLAM using visual appearance and laser ranging
In IEEE International Conference on Robotics and Automation, 2006
Cited by 50 (4 self)
This paper describes a 3D SLAM system using information from an actuated laser scanner and camera installed on a mobile robot. The laser samples the local geometry of the environment and is used to incrementally build a 3D point-cloud map of the workspace. Sequences of images from the camera are used to detect loop closure events (without reference to the internal estimates of vehicle location) using a novel appearance-based retrieval system. The loop closure detection is robust to repetitive visual structure and provides a probabilistic measure of confidence. The images suggesting loop closure are then further processed with their corresponding local laser scans to yield putative Euclidean image-image transformations. We show how naive application of this transformation to effect the loop closure can lead to catastrophic linearization errors and go on to describe a way in which gross, pre-loop closing errors can be successfully annulled. We demonstrate our system working in a challenging, outdoor setting containing substantial loops and beguiling, gently curving traversals. The results are overlaid on an aerial image to provide a ground truth comparison with the estimated map. The paper concludes with an extension into the multi-robot domain in which 3D maps resulting from distinct SLAM sessions (no common reference frame) are combined without recourse to mutual observation.
Modeling the World from Internet Photo Collections
INT J COMPUT VIS, 2007
Cited by 45 (1 self)
There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like “Notre Dame” or “Trevi Fountain.” This approach, which we call Photo Tourism, has enabled reconstructions of numerous well-known world sites. This paper presents these algorithms and results as a first step towards 3D modeling of the world’s well-photographed sites, cities, and landscapes from Internet imagery, and discusses key open problems and challenges for the research community.
Recent Developments on Direct Relative Orientation
2006
Cited by 42 (0 self)
This paper presents a novel version of the five-point relative orientation algorithm given in Nister (2004). The name of the algorithm arises from the fact that it can operate even on the minimal five point correspondences required for a finite number of solutions to relative orientation. For the minimal five correspondences the algorithm returns up to ten real solutions. The algorithm can also operate on many points. Like the previous version of the five-point algorithm, our method can operate correctly even in the face of critical surfaces, including planar and ruled quadric scenes. The paper
Modeling and recognition of landmark image collections using iconic scene graphs
In ECCV
Cited by 41 (6 self)
This paper presents an approach for modeling landmark sites such as the Statue of Liberty based on large-scale contaminated image collections gathered from the Internet. Our system combines 2D appearance and 3D geometric constraints to efficiently extract scene summaries, build 3D models, and recognize instances of the landmark in new test images. We start by clustering images using low-dimensional global “gist” descriptors. Next, we perform geometric verification to retain only the clusters whose images share a common 3D structure. Each valid cluster is then represented by a single iconic view, and geometric relationships between iconic views are captured by an iconic scene graph. In addition to serving as a compact scene summary, this graph is used to guide structure from motion to efficiently produce 3D models of the different aspects of the landmark. The set of iconic images is also used for recognition, i.e., determining whether new test images contain the landmark. Results on three data sets consisting of tens of thousands of images demonstrate the potential of the proposed approach.
Towards urban 3D reconstruction from video
In 3DPVT, 2006
Cited by 30 (4 self)
The paper introduces a data collection system and a processing pipeline for automatic geo-registered 3D reconstruction of urban scenes from video. The system collects multiple video streams, as well as GPS and INS measurements, in order to place the reconstructed models in geo-registered coordinates. Besides high quality in terms of both geometry and appearance, we aim at real-time performance. Even though our processing pipeline is currently far from being real-time, we select techniques and design processing modules that can achieve fast performance on multiple CPUs and GPUs, aiming at real-time performance in the near future. We present the main considerations in designing the system and the steps of the processing pipeline. We show results on real video sequences captured by our system.
Bundle adjustment rules
In Photogrammetric Computer Vision, 2006
Cited by 28 (0 self)
In this paper we investigate the status of bundle adjustment as a component of a real-time camera tracking system and show that with current computing hardware a significant amount of bundle adjustment can be performed every time a new frame is added, even under stringent real-time constraints. We also show, by quantifying the failure rate over long video sequences, that bundle adjustment is able to significantly decrease the rate of gross failures in the camera tracking. Thus, bundle adjustment does not only bring accuracy improvements. The accuracy improvements also suppress error buildup in a way that is crucial for the performance of the camera tracker. Our experimental study is performed in the setting of tracking the trajectory of a calibrated camera moving in 3D for various types of motion, showing that bundle adjustment should be considered an important component for a state-of-the-art real-time camera tracking system.
3D urban scene modeling integrating recognition and reconstruction
IJCV, 2008
Cited by 26 (1 self)
Supplying realistically textured 3D city models at ground level promises to be useful for pre-visualizing upcoming traffic situations in car navigation systems. Because this pre-visualization can be rendered from the expected future viewpoints of the driver, the required maneuver will be more easily understandable. 3D city models can be reconstructed from the imagery recorded by surveying vehicles. The vastness of image material gathered by these vehicles, however, puts extreme demands on vision algorithms to ensure their practical usability. Algorithms need to be as fast as possible and should result in compact, memory-efficient 3D city models for future ease of distribution and visualization. For the considered application, these are not contradictory demands. Simplified geometry assumptions can speed up vision algorithms while automatically guaranteeing compact geometry models. In this paper, we present a novel city modeling framework which builds upon this philosophy to create 3D content at high speed. Objects in the environment, such as cars and pedestrians, may however disturb the reconstruction, as they violate the simplified geometry assumptions, leading to visually unpleasant artifacts and degrading the visual realism of the resulting 3D city model. Unfortunately, such objects are prevalent in urban scenes. We therefore extend the reconstruction framework by integrating it with an object recognition module that automatically detects cars in the input video streams and localizes them in 3D. The two components of our system are tightly integrated and benefit from each other’s continuous input. 3D reconstruction delivers geometric scene context, which greatly helps improve detection precision. The detected car locations, on the other hand, are used to instantiate virtual placeholder models which augment the visual realism of the reconstructed city model.
Index Terms: city modeling, structure from motion, 3D reconstruction, object detection, temporal integration
Monocular vision for mobile robot localization and autonomous navigation
JOURNAL OF COMPUTER VISION, 2007
Cited by 24 (0 self)
This paper presents a new real-time localization system for a mobile robot. We show that autonomous navigation is possible in outdoor situations with the use of a single camera and natural landmarks. To do that, we use a three-step approach. In a learning step, the robot is manually guided on a path and a video sequence is recorded with a front-looking camera. Then a structure from motion algorithm is used to build a 3D map from this learning sequence. Finally, in the navigation step, the robot uses this map to compute its localization in real-time and it follows the learning path or a slightly different path if desired. The vision algorithms used for map building and localization are first detailed. Then a large part of the paper is dedicated to the experimental evaluation of the accuracy and robustness of our algorithms based on experimental data collected over two years in various environments.

