Stratification of 3-D vision: Projective, affine, and metric representations
| Citations: | 47 - 4 self |
BibTeX
@MISC{Faugeras_stratificationof,
author = {Olivier Faugeras},
title = {Stratification of 3-D vision: Projective, affine, and metric representations},
year = {}
}
Abstract
In this article we provide a conceptual framework in which to think of the relationships between the three-dimensional structure of the physical space and the geometric properties of a set of cameras which provide pictures from which measurements can be made. We usually think of the physical space as being embedded in a three-dimensional euclidean space where measurements of lengths and angles do make sense. It turns out that for artificial systems, such as robots, this is not a mandatory viewpoint and that it is sometimes sufficient to think of the physical space as being embedded in an affine or even projective space. The question then arises of how to relate these models to image measurements and to geometric properties of sets of cameras. We show that in the case of two cameras, a stereo rig, the projective structure of the world can be recovered as soon as the epipolar geometry of the stereo rig is known and that this geometry is summarized by a single 3 × 3 matrix, which we called the fundamental matrix [1, 2]. The affine structure can then be recovered if we add to this information a projective transformation between the two images which is induced by the plane at infinity. Finally, the euclidean structure (up to a similitude) can be recovered if we add to these two elements the knowledge of two conics (one for each camera) which are the images of the absolute conic, a circle of radius √−1 in the plane at infinity. In all three cases we show how the three-dimensional information can be recovered directly from the images without explicitly reconstructing the scene structure. This defines a natural hierarchy of geometric structures, a set of three strata, that we overlay on the physical world and which we show to be recoverable by simple procedures relying on two items, the physical space itself together with possibly, but not necessarily, some a priori information about it, and some voluntary motions of the set of cameras.
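
As a rough illustration of the three strata named in the abstract, the sketch below builds the object attached to each one for a synthetic stereo rig and checks the property that object is meant to encode. It is not the article's procedure: the article recovers these objects from images, whereas here they are computed from a calibration assumed known. The intrinsic matrices `K1` and `K2`, the rotation `R`, the translation `t`, the test points, and all helper names are hypothetical values chosen for the example, and the formulas used are the standard textbook ones for two pinhole cameras.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    """Product of a matrix and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inv3(M):
    """Inverse of a 3 x 3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

def skew(v):
    """Matrix [v]_x such that [v]_x u equals the cross product v x u."""
    return [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]

# Hypothetical synthetic rig: camera 1 is K1 [I | 0], camera 2 is K2 [R | t].
K1 = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
K2 = [[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]]
c, s = math.cos(0.1), math.sin(0.1)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [-0.5, 0.0, 0.05]
K1_inv = inv3(K1)

# Projective stratum: fundamental matrix F = K2^-T [t]_x R K1^-1.
# Corresponding image points m1, m2 should satisfy m2^T F m1 = 0.
F = matmul(matmul(transpose(inv3(K2)), matmul(skew(t), R)), K1_inv)
X = [0.3, -0.2, 4.0]                      # a 3-D point in camera-1 coordinates
m1 = matvec(K1, X)
m2 = matvec(K2, [a + b for a, b in zip(matvec(R, X), t)])
residual = dot(m2, matvec(F, m1))

# Affine stratum: homography induced by the plane at infinity,
# H_inf = K2 R K1^-1. It maps the image of a direction d in view 1
# to its image in view 2, independently of the translation t.
H_inf = matmul(matmul(K2, R), K1_inv)
d = [0.2, 0.1, 1.0]
transferred = matvec(H_inf, matvec(K1, d))
expected = matvec(K2, matvec(R, d))
transfer_error = max(abs(a - b) for a, b in zip(transferred, expected))

# Metric stratum: image of the absolute conic, omega = K1^-T K1^-1.
# It gives the angle between the viewing rays of two pixels.
omega = matmul(transpose(K1_inv), K1_inv)

def ray_angle_deg(p, q):
    """Angle between the viewing rays of pixels p and q, using omega."""
    num = dot(p, matvec(omega, q))
    den = math.sqrt(dot(p, matvec(omega, p)) * dot(q, matvec(omega, q)))
    return math.degrees(math.acos(num / den))

# Rays along (1, 0, 1) and (0, 0, 1) are 45 degrees apart by construction.
angle_deg = ray_angle_deg(matvec(K1, [1.0, 0.0, 1.0]), matvec(K1, [0.0, 0.0, 1.0]))

print("epipolar residual:", residual)
print("H_inf transfer error:", transfer_error)
print("angle from omega (degrees):", angle_deg)
```

Each stratum adds one object to the previous ones: `F` alone fixes the projective structure, `H_inf` upgrades it to affine, and a conic such as `omega` (one per camera in the article; only camera 1 is shown here) gives access to angles and hence to the euclidean structure up to a similitude.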







