Determining the Epipolar Geometry and its Uncertainty: A Review
 International Journal of Computer Vision
, 1998
Two images of a single scene/object are related by the epipolar geometry, which can be described by a 3×3 singular matrix called the essential matrix if images' internal parameters are known, or the fundamental matrix otherwise. It captures all geometric information contained in two images, and its determination is very important in many applications such as scene modeling and vehicle navigation. This paper gives an introduction to the epipolar geometry, and provides a complete review of the current techniques for estimating the fundamental matrix and its uncertainty. A well-founded measure is proposed to compare these techniques. Projective reconstruction is also reviewed. The software which we have developed for this review is available on the Internet.
Automatic Camera Recovery for Closed or Open Image Sequences
 In Proc. ECCV
, 1998
We describe progress in completely automatically recovering 3D scene structure together with 3D camera positions from a sequence of images acquired by an unknown camera undergoing unknown movement. The main departure from previous structure from motion strategies is that processing is not sequential. Instead a hierarchical approach is employed building from image triplets and associated trifocal tensors. This is advantageous both in obtaining correspondences and also in optimally distributing error over the sequence. The major step forward is that closed sequences can now be dealt with easily. That is, sequences where part of a scene is revisited at a later stage in the sequence. Such sequences contain additional constraints, compared to open sequences, from which the reconstruction can now benefit. The computed cameras and structure are the backbone of a system to build texture mapped graphical models directly from image sequences.
3D Model Acquisition from Extended Image Sequences
, 1995
This paper describes the extraction of 3D geometrical data from image sequences, for the purpose of creating 3D models of objects in the world. The approach is uncalibrated: camera internal parameters and camera motion are not known or required. Processing an image sequence is underpinned by token correspondences between images. We utilise matching techniques which are both robust (detecting and discarding mismatches) and fully automatic. The matched tokens are used to compute 3D structure, which is initialised as it appears and then recursively updated over time. We describe a novel robust estimator of the trifocal tensor, based on a minimum number of token correspondences across an image triplet; and a novel tracking algorithm in which corners and line segments are matched over image triplets in an integrated framework. Experimental results are provided for a variety of scenes, including outdoor scenes taken with a handheld camcorder. Quantitative statistics are included to asses...
A Space-Sweep Approach to True Multi-Image Matching
, 1996
The problem of determining feature correspondences across multiple views is considered. The term "true multi-image" matching is introduced to describe techniques that make full and efficient use of the geometric relationships between multiple images and the scene. A true multi-image technique must generalize to any number of images, be of linear algorithmic complexity in the number of images, and use all the images in an equal manner. A new space-sweep approach to true multi-image matching is presented that simultaneously determines 2D feature correspondences and the 3D positions of feature points in the scene. The method is based on the premise that areas of space where several viewing rays intersect are the likely locations of observed 3D scene features. It is shown that the intersections of viewing rays with a plane sweeping through space can be determined very efficiently, and a statistical model is developed to tell how likely it is that a given number of viewing rays will pass th...
Sequential updating of projective and affine structure from motion
 International Journal of Computer Vision
, 1997
A structure from motion algorithm is described which recovers structure and camera position, modulo a projective ambiguity. Camera calibration is not required, and camera parameters such as focal length can be altered freely during motion. The structure is updated sequentially over an image sequence, in contrast to schemes which employ a batch process. A specialisation of the algorithm to recover structure and camera position modulo an affine transformation is described, together with a method to periodically update the affine coordinate frame to prevent drift over time. We describe the constraint used to obtain this specialisation. Structure is recovered from image corners detected and matched automatically and reliably in real image sequences. Results are shown for reference objects and indoor environments, and accuracy of recovered structure is fully evaluated and compared for a number of reconstruction schemes. A specific application of the work is demonstrated: affine structure is used to compute free space maps enabling navigation through unstructured environments and avoidance of obstacles. The path planning involves only affine constructions.
Occlusions and Binocular Stereo
, 1995
Binocular stereo is the process of obtaining depth information from a pair of cameras. In the past, stereo algorithms have had problems at occlusions and have tended to fail there (though sometimes postprocessing has been added to mitigate the worst effects). We show that, on the contrary, occlusions can help stereo computation by providing cues for depth discontinuities. We describe a theory for stereo based on the Bayesian approach, using adaptive windows and a prior weak smoothness constraint, which incorporates occlusion. Our model assumes that a disparity discontinuity, along the epipolar line, in one eye always corresponds to an occluded region in the other eye, thus leading to an occlusion constraint. This constraint restricts the space of possible disparity values, thereby simplifying the computations. An estimation of the disparity at occluded features is also discussed in light of psychophysical experiments. Using dynamic programming we can find the optimal solution to our s...
Occlusions, discontinuities, and epipolar lines in stereo
 In European Conference on Computer Vision
, 1998
Binocular stereo is the process of obtaining depth information from a pair of left and right views of a scene. We present a new approach to compute the disparity map by solving a global optimization problem that models occlusions, discontinuities, and epipolar-line interactions. In the model, geometric constraints require every disparity discontinuity along the epipolar line in one eye to always correspond to an occluded region in the other eye, while at the same time encouraging smoothness across epipolar lines. Smoothing coefficients are adjusted according to the edge and junction information. For some well-defined set of optimization functions, we can map the optimization problem to a maximum-flow problem on a directed graph in a novel way, which enables us to obtain a global solution in a polynomial time. Experiments confirm the validity of this approach.
The Problem of Degeneracy in Structure and Motion Recovery from Uncalibrated Image Sequences
 International Journal of Computer Vision
, 2000
The aim of this work is the recovery of 3D structure and camera projection matrices for each frame of an uncalibrated image sequence. In order to achieve this, correspondences are required throughout the sequence. A significant and successful mechanism for automatically establishing these correspondences is by the use of geometric constraints arising from scene rigidity. However, problems arise with such geometry guided matching if general viewpoint and general structure are assumed whilst frames in the sequence and/or scene structure do not conform to these assumptions. Such cases are termed degenerate. In this paper we describe two important cases of degeneracy and their effects on geometry guided matching. The cases are a motion degeneracy where the camera does not translate between frames, and a structure degeneracy where the viewed scene structure is planar. The effects include the loss of correspondences due to under or over fitting of geometric models estimated from image dat...
Active Visual Navigation using Non-Metric Structure
 in Proceedings of the 5th International Conference on Computer Vision
, 1995
This paper demonstrates a method of using non-metric visual information derived from an uncalibrated active vision system to navigate an autonomous vehicle through free-space regions detected in a cluttered environment. The structure of 3-space is recovered modulo an affine transformation using an uncalibrated active stereo head carried by the vehicle. The plane at infinity, necessary for recovering affine structure from projective structure, is found in a novel manner by making controlled rotations of the head. The structure is composed of 3D points obtained by detecting and matching image corners through the stereo image sequence. Considerable care has been taken to ensure that the processing is reliable, robust and automatic. Driveable regions are determined from the projection of the affine structure onto a plane parallel to the ground determined using projective constructs. Two methods of negotiating the regions are explored. The first introduces metric information to allow contro...
A Method for Recognition and Localization of Generic Objects for Indoor Navigation
 IMAGE AND VISION COMPUTING
, 1994
