A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
- International Journal of Computer Vision
, 2002
Cited by 709 (18 self)
Abstract. Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today’s best-performing stereo algorithms.
Photo Tourism: Exploring Photo Collections in 3D
- ACM TRANSACTIONS ON GRAPHICS
, 2006
Cited by 232 (20 self)
We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image-to-model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.
Robust mapping and localization in indoor environments using sonar data
- Int. J. Robotics Research
, 2002
Cited by 109 (24 self)
In this paper we describe a new technique for the creation of feature-based stochastic maps using standard Polaroid sonar sensors. The fundamental contributions of our proposal are: (1) a perceptual grouping process that permits the robust identification and localization of environmental features, such as straight segments and corners, from the sparse and noisy sonar data; (2) a map joining technique that allows the system to build a sequence of independent limited-size stochastic maps and join them in a globally consistent way; (3) a robust mechanism to determine which features in a stochastic map correspond to the same environment feature, allowing the system to update the stochastic map accordingly, and perform tasks such as revisiting and loop closing. We demonstrate the practicality of this approach by building a geometric map of a medium-size, real indoor environment, with several people moving around the robot. Maps built from laser data for the same experiment are provided for comparison.
Visual odometry for ground vehicle applications
- Journal of Field Robotics
, 2006
Cited by 67 (5 self)
We present a system that estimates the motion of a stereo head or a single moving camera based on video input. The system operates in real-time with low delay and the motion estimates are used for navigational purposes. The front end of the system is a feature tracker. Point features are matched between pairs of frames and linked into image trajectories at video rate. Robust estimates of the camera motion are then produced from the feature tracks using a geometric hypothesize-and-test architecture. This generates motion estimates from visual input alone. No prior knowledge of the scene nor the motion is necessary. The visual estimates can also be used in conjunction with information from other sources such as GPS, inertia sensors, wheel encoders, etc. The pose estimation method has been applied successfully to video from aerial, automotive and handheld platforms. We focus on results obtained with a stereo head mounted on an autonomous ground vehicle. We give examples of camera trajectories estimated in real-time purely from images over previously unseen distances (600 meters) and periods of time.
A Survey of Methods for Volumetric Scene Reconstruction from Photographs
Cited by 59 (1 self)
Scene reconstruction, the task of generating a 3D model of a scene given multiple 2D photographs taken of the scene, is an old and difficult problem in computer vision. Since its introduction, scene reconstruction has found application in many fields, including robotics, virtual reality, and entertainment. Volumetric models are a natural choice for scene reconstruction. Three broad classes of volumetric reconstruction techniques have been developed based on geometric intersections, color consistency, and pair-wise matching. Some of these techniques have spawned a number of variations and undergone considerable refinement. This paper is a survey of techniques for volumetric scene reconstruction.
Stable real-time 3d tracking using online and offline information
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2004
Cited by 51 (4 self)
We propose an efficient real-time solution for tracking rigid objects in 3D using a single camera that can handle large camera displacements, drastic aspect changes, and partial occlusions. While commercial products are already available for offline camera registration, robust online tracking remains an open issue because many real-time algorithms described in the literature still lack robustness and are prone to drift and jitter. To address these problems, we have formulated the tracking problem in terms of local bundle adjustment and have developed a method for establishing image correspondences that can equally well handle short- and wide-baseline matching. We then can merge the information from preceding frames with that provided by a very limited number of keyframes created during a training stage, which results in a real-time tracker that does not jitter or drift and can deal with significant aspect changes. Index Terms: Computer vision, real-time systems, tracking.
Tracking multiple humans in complex situations
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2004
Cited by 51 (0 self)
Abstract. Tracking multiple humans in complex situations is challenging. The difficulties are tackled with appropriate knowledge in the form of various models in our approach. Human motion is decomposed into its global motion and limb motion. In the first part, we show how multiple human objects are segmented and their global motions are tracked in 3D using ellipsoid human shape models. Experiments show that it successfully applies to the cases where a small number of people move together, have occlusion, and cast shadow or reflection. In the second part, we estimate the modes (e.g., walking, running, standing) of the locomotion and 3D body postures by making inference in a prior locomotion model. Camera model and ground plane assumptions provide geometric constraints in both parts. Robust results are shown on some difficult sequences. Index Terms: Multiple-human segmentation, multiple-human tracking, visual surveillance, human shape model, human locomotion model.
Modeling the World from Internet Photo Collections
- INT J COMPUT VIS
, 2007
Cited by 45 (1 self)
There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like “Notre Dame” or “Trevi Fountain.” This approach, which we call Photo Tourism, has enabled reconstructions of numerous well-known world sites. This paper presents these algorithms and results as a first step towards 3D modeling of the world’s well-photographed sites, cities, and landscapes from Internet imagery, and discusses key open problems and challenges for the research community.
Structure and Motion from Uncalibrated Catadioptric Views
- In Proc. CVPR
, 2001
Cited by 41 (3 self)
In this paper we present a new algorithm for structure from motion from point correspondences in images taken from uncalibrated catadioptric cameras with parabolic mirrors. We assume that the unknown intrinsic parameters are three: the combined focal length of the mirror and lens and the intersection of the optical axis with the image. We introduce a new representation for images of points and lines in catadioptric images which we call the circle space. This circle space includes imaginary circles, one of which is the image of the absolute conic. We formulate the epipolar constraint in this space and establish a new 4 × 4 catadioptric fundamental matrix. We show that the image of the absolute conic belongs to the kernel of this matrix. This enables us to prove that Euclidean reconstruction is feasible from two views with constant parameters and from three views with varying parameters. In both cases, it is one less than the number of views necessary with perspective cameras.
The dual-bootstrap iterative closest point algorithm with application to retinal image registration
- IEEE Trans. Med. Img
, 2003
Cited by 39 (18 self)
Abstract. Motivated by the problem of retinal image registration, this paper introduces and analyzes a new registration algorithm called Dual-Bootstrap Iterative Closest Point (Dual-Bootstrap ICP). The approach is to start from one or more initial, low-order estimates that are only accurate in small image regions, called bootstrap regions. In each bootstrap region, the algorithm iteratively: 1) refines the transformation estimate using constraints only from within the bootstrap region; 2) expands the bootstrap region; and 3) tests to see if a higher order transformation model can be used, stopping when the region expands to cover the overlap between images. Steps 1) and 3), the bootstrap steps, are governed by the covariance matrix of the estimated transformation. Estimation refinement [Step 2)] uses a novel robust version of the ICP algorithm. In registering retinal image pairs, Dual-Bootstrap ICP is initialized by automatically matching individual vascular landmarks, and it aligns images based on detected blood vessel centerlines. The resulting quadratic transformations are accurate to less than a pixel. On tests involving approximately 6000 image pairs, it successfully registered 99.5% of the pairs containing at least one common landmark, and 100% of the pairs containing at least one common landmark and at least 35% image overlap. Index Terms: Iterative closest point, medical imaging, registration, retinal imaging, robust estimation.

