The Fundamental matrix: theory, algorithms, and stability analysis
International Journal of Computer Vision, 1995
Cited by 204 (13 self)
In this paper we analyze in some detail the geometry of a pair of cameras, i.e. a stereo rig. Contrary to what has been done in the past and is still done currently, for example in stereo or motion analysis, we do not assume that the intrinsic parameters of the cameras are known (coordinates of the principal points, pixel aspect ratio and focal lengths). This is important for two reasons. First, it is more realistic in applications where these parameters may vary according to the task (active vision). Second, the general case considered here captures all the relevant information that is necessary for establishing correspondences between two pairs of images. This information is fundamentally projective and is hidden in a confusing manner in the commonly used formalism of the Essential matrix introduced by Longuet-Higgins [40]. This paper clarifies the projective nature of the correspondence problem in stereo and shows that the epipolar geometry can be summarized in one 3 × 3 ma...
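The idea that a single 3 × 3 matrix summarizes the epipolar geometry can be illustrated with a small sketch. It is not taken from the paper: it assumes the textbook convention in which the first camera is P = [I | 0] and the second is P' = [M | e2], so that F = [e2]_x M and corresponding points satisfy x2^T F x1 = 0. The values of `M`, `e2`, and the test points are hypothetical, chosen as integers so the arithmetic is exact.

```python
def skew(v):
    """Return the 3x3 matrix [v]_x such that [v]_x w equals the cross product v x w."""
    x, y, z = v
    return [[0, -z, y],
            [z, 0, -x],
            [-y, x, 0]]

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

# Hypothetical uncalibrated second camera P' = [M | e2]; the first is P = [I | 0].
M = [[2, 0, 1], [0, 3, -1], [1, 1, 4]]
e2 = [5, -2, 1]  # fourth column of P', i.e. the epipole in image 2

# Fundamental matrix for this camera pair (assumed convention: x2^T F x1 = 0).
F = matmul(skew(e2), M)

def project_pair(X):
    """Project the 3D point (X, 1) into both images, in homogeneous coordinates."""
    x1 = list(X)                                      # [I | 0] (X, 1)
    x2 = [m + t for m, t in zip(matvec(M, X), e2)]    # [M | e2] (X, 1)
    return x1, x2

points = [(1, 2, 3), (-4, 0, 7), (2, -5, 1)]
residuals = [dot(x2, matvec(F, x1)) for x1, x2 in map(project_pair, points)]
```

In this setup every residual vanishes and `det3(F)` is zero, reflecting the rank-2 property of a fundamental matrix. No intrinsic parameters were needed to write down the constraint.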
Canonic Representations for the Geometries of Multiple Projective Views
Computer Vision and Image Understanding, 1994
Cited by 171 (8 self)
This work is in the context of motion and stereo analysis. It presents a new unified representation which will be useful when dealing with multiple views in the case of uncalibrated cameras. Several levels of information might be considered, depending on the availability of information. Among other things, an algebraic description of the epipolar geometry of N views is introduced, as well as a framework for camera self-calibration, calibration updating, and structure from motion in an image sequence taken by a camera which is zooming and moving at the same time. We show how a special decomposition of a set of two or three general projection matrices, called canonical, enables us to build geometric descriptions for a system of cameras which are invariant with respect to a given group of transformations. These representations are minimal and capture completely the properties of each level of description considered: Euclidean (in the context of calibration, and in the context of structure from motion, which we distinguish clearly), affine, and projective, that we also relate to each other. In the last case, a new decomposition of the well-known fundamental matrix is obtained. Dependencies, which appear when three or more views are available, are studied in the context of the canonic decomposition, and new composition formulas are established. The theory is illustrated by tutorial examples with real images.
Algebraic Functions For Recognition
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994
Cited by 132 (29 self)
In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment --- yielding a direct reprojection method that cuts through the computations of camera transformation, scene structure and epipolar geometry. Moreover, the direct method is linear and sets a new lower theoretical bound on the minimal number of points that are required for a linear solution for the task of reprojection. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in re-projection tasks. Keywords--- Visual Recognition, Al...
A Probabilistic Approach to Object Recognition Using Local Photometry and Global Geometry
1998
Cited by 111 (9 self)
Many object classes, including human faces, can be modeled as a set of characteristic parts arranged in a variable spatial configuration. We introduce a simplified model of a deformable object class and derive the optimal detector for this model. However, the optimal detector is not realizable except under special circumstances (independent part positions). A cousin of the optimal detector is developed which uses "soft" part detectors with a probabilistic description of the spatial arrangement of the parts. Spatial arrangements are modeled probabilistically using shape statistics to achieve invariance to translation, rotation, and scaling. Improved recognition performance over methods based on "hard" part detectors is demonstrated for the problem of face detection in cluttered scenes. 1 Introduction Visual recognition of objects (chairs, sneakers, faces, cups, cars) is one of the most challenging problems in computer vision and artificial intelligence. Historically, there has been a...
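One common way to obtain a description of part positions that is invariant to translation, rotation, and scaling is to express all parts relative to two reference parts. The sketch below does this with complex numbers. It is only an illustration of the invariance, under the assumption that two parts are always available as references, and is not necessarily the shape representation used in the paper. The function names and sample coordinates are hypothetical.

```python
import cmath

def shape_coordinates(parts, ref_a=0, ref_b=1):
    """Map 2D part positions to similarity-invariant coordinates.

    Part ref_a is sent to 0 and part ref_b to 1; every other part is
    expressed in the frame those two define.
    """
    z = [complex(x, y) for x, y in parts]
    origin = z[ref_a]
    unit = z[ref_b] - z[ref_a]
    if unit == 0:
        raise ValueError("reference parts must be distinct")
    return [(p - origin) / unit for p in z]

def apply_similarity(parts, scale, angle, shift):
    """Rotate by angle (radians), scale, then translate a list of 2D points."""
    a = cmath.rect(scale, angle)   # complex number encoding rotation and scale
    b = complex(*shift)
    moved = [a * complex(x, y) + b for x, y in parts]
    return [(w.real, w.imag) for w in moved]

# Hypothetical part positions (e.g. two eyes, nose, mouth corner), in pixels.
parts = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0), (1.0, -1.0)]
moved = apply_similarity(parts, scale=2.0, angle=0.7, shift=(3.0, -4.0))

shape_original = shape_coordinates(parts)
shape_moved = shape_coordinates(moved)
max_difference = max(abs(u - v) for u, v in zip(shape_original, shape_moved))
```

The two shape vectors agree up to floating-point error, so statistics collected on them do not depend on where, at what size, or at what in-plane orientation the object appears.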
What Can Two Images Tell Us About a Third One?
International Journal of Computer Vision, 1996
Cited by 99 (6 self)
This paper discusses the problem of predicting image features in an image from image features in two other images and the epipolar geometry between the three images. We adopt the most general camera model of perspective projection and show that a point can be predicted in the third image as a bilinear function of its images in the first two cameras, that the tangents to three corresponding curves are related by a trilinear function, and that the curvature of a curve in the third image is a linear function of the curvatures at the corresponding points in the other two images. Our analysis relies heavily on the use of the fundamental matrix which has been recently introduced [7] and on the properties of a special plane which we call the trifocal plane. We thus answer completely the following question: given two views of an object, what would a third view look like? The question and its answer bear upon several areas of computer vision, stereo, motion analysis, and model-based object re...
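A standard way to realize such a bilinear prediction is epipolar transfer: the point in the third image is the intersection of the two epipolar lines induced by its images in the first two views, x3 ~ (F31 x1) × (F32 x2). The sketch below is an illustration under stated assumptions and may differ in detail from the paper's formulation. The cameras P1 = [I | 0], P2 = [I | t2], P3 = [M3 | t3] and all numeric values are hypothetical, and the second camera is given an identity left block only so that F32 has a simple closed form.

```python
def skew(v):
    """Return [v]_x, the matrix form of the cross product with v."""
    x, y, z = v
    return [[0, -z, y],
            [z, 0, -x],
            [-y, x, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

# Hypothetical cameras: P1 = [I | 0], P2 = [I | t2], P3 = [M3 | t3].
t2 = [1, 0, 0]
M3 = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
t3 = [0, 1, 0]

# F31 maps a point in image 1 to its epipolar line in image 3.
F31 = matmul(skew(t3), M3)
# In the frame of camera 2, camera 3 becomes [M3 | t3 - M3 t2], which gives F32.
t32 = [a - b for a, b in zip(t3, matvec(M3, t2))]
F32 = matmul(skew(t32), M3)

def transfer(x1, x2):
    """Predict the homogeneous point in image 3 from its images in views 1 and 2."""
    return cross(matvec(F31, x1), matvec(F32, x2))

# A 3D point, used only to generate consistent test observations.
X = [1, 2, 3]
x1 = X
x2 = [a + b for a, b in zip(X, t2)]
x3_true = [a + b for a, b in zip(matvec(M3, X), t3)]
x3_pred = transfer(x1, x2)
```

The prediction is homogeneous, so it matches `x3_true` only up to scale. The construction breaks down when the two epipolar lines coincide, which happens for points on the plane through the three optical centers (the trifocal plane mentioned in the abstract).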
Motion estimation via dynamic vision
In Proc. European Conf. on Computer Vision, 1994
Cited by 62 (8 self)
Estimating the three-dimensional motion of an object from a sequence of projections is of paramount importance in a variety of applications in control and robotics, such as autonomous navigation, manipulation, servo, tracking, docking, planning, and surveillance. Although “visual motion estimation” is an old problem (the first formulations date back to the beginning of the century), only recently have tools from nonlinear systems estimation theory hinted at acceptable solutions. In this paper we formulate the visual motion estimation problem in terms of identification of nonlinear implicit systems with parameters on a topological manifold and propose a dynamic solution either in the local coordinates or in the embedding space of the parameter manifold. Such a formulation has structural advantages over previous recursive schemes, since the estimation of motion is decoupled from the estimation of the structure of
Projective Structure from Uncalibrated Images: Structure from Motion and Recognition
1994
Cited by 56 (14 self)
We address the problem of reconstructing 3D space in a projective framework from two or more views, and the problem of artificially generating novel views of the scene from two given views (re-projection). We describe an invariance relation which provides a new description of structure, which we call projective depth, and which is captured by a single equation relating image point correspondences across two or more views and the homographies of two arbitrary virtual planes. The framework is based on knowledge of correspondence of features across views, is linear, extremely simple, and the computation of structure readily extends to over-determination using multiple views. Experimental results demonstrate a high degree of accuracy in both tasks - reconstruction and re-projection. Keywords---Visual Recognition, 3D Reconstruction from 2D Views, Projective Geometry, Algebraic and Geometric Invariants. I. Introduction The geometric relation between objects (or scenes) in the world and their imag...
Relative Affine Structure: Canonical Model for 3D from 2D Geometry and Applications
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1996
Cited by 54 (9 self)
We propose an affine framework for perspective views, captured by a single extremely simple equation based on a viewer-centered invariant we call relative affine structure. Via a number of corollaries of our main results we show that our framework unifies previous work -- including Euclidean, projective and affine -- in a natural and simple way, and introduces new, extremely simple, algorithms for the tasks of reconstruction from multiple views, recognition by alignment, and certain image coding applications.
Relative Affine Structure: Theory and Application to 3D Reconstruction From Perspective Views
In IEEE Conference on Computer Vision and Pattern Recognition, 1994
Cited by 52 (12 self)
We propose an affine framework for perspective views, captured by a single extremely simple equation based on a viewer-centered invariant we call relative affine structure. Via a number of corollaries of our main results we show that our framework unifies previous work --- including Euclidean, projective and affine --- in a natural and simple way. Finally, the main results were applied to a real image sequence for the purpose of 3D reconstruction from 2D views. 1 Introduction The introduction of affine and projective tools into the field of computer vision has brought increased activity in the fields of structure from motion and recognition by alignment in the last few years. The emerging realization is that non-metric information, although weaker than the information provided by depth maps and rigid camera geometries, is nonetheless useful in the sense that the framework may provide simpler algorithms, camera calibration is not required, more freedom in picture-taking is allowed --- ...
Rendering Real-World Objects Using View Interpolation
1996
Cited by 45 (6 self)
This paper gives an overview of the theoretical background and describes preliminary experiments with interpolated view synthesis, which indicate that our approach is robust and feasible.

