Large scale 6DOF SLAM with stereo-in-hand
- IEEE Transactions on Robotics, 2008
Cited by 8 (0 self)
In this paper we describe a system that can carry out SLAM in large indoor and outdoor environments using a stereo pair moving with 6DOF as the only sensor. Unlike current visual SLAM systems that use either bearing-only monocular information or 3D stereo information, our system accommodates both monocular and stereo information. Textured point features are extracted from the images and stored as 3D points if seen in both images with sufficient disparity, or stored as inverse depth points otherwise. This allows the system to map both near and far features: the first provide distance and orientation, and the second orientation information. Unlike other vision-only SLAM systems, stereo does not suffer from 'scale drift' because of unobservability problems, and thus no other information such as gyroscopes or accelerometers is required in our system. Our SLAM algorithm generates sequences of conditionally independent local maps that can share information related to the camera motion and common features being tracked. The system computes the full map using the novel Conditionally Independent Divide and Conquer algorithm, which allows constant time operation most of the time, with linear time updates to compute the full map. To demonstrate the robustness and scalability of our system, we show experimental results in indoor and outdoor urban environments of 210 m and 140 m loop trajectories, with the stereo camera being carried in hand by a person walking at normal walking speeds of 4-5 km/hour.
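The near/far feature split described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pinhole model, the function name `classify_feature`, the `min_disparity_px` threshold and the simplified (bearing, inverse depth) tuple are all assumptions made for illustration, and the paper's actual inverse depth parameterization may differ.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    kind: str      # "xyz" for a near 3D point, "inverse_depth" for a far point
    params: tuple  # (X, Y, Z) in metres, or (x, y, rho) with rho = 1 / depth


def classify_feature(u_left, v_left, u_right, focal_px, baseline_m, cx, cy,
                     min_disparity_px=2.0):
    """Pick a parameterization from the stereo disparity (hypothetical threshold)."""
    disparity = u_left - u_right
    # Normalized ray direction through the left-image pixel (pinhole model).
    x = (u_left - cx) / focal_px
    y = (v_left - cy) / focal_px
    if disparity >= min_disparity_px:
        # Enough parallax: triangulate depth and store a Euclidean 3D point.
        depth = focal_px * baseline_m / disparity
        return Feature("xyz", (x * depth, y * depth, depth))
    # Too little parallax: keep the bearing plus an inverse depth estimate,
    # which stays well behaved as the point approaches infinity (rho -> 0).
    rho = max(disparity, 0.0) / (focal_px * baseline_m)
    return Feature("inverse_depth", (x, y, rho))


near = classify_feature(420.0, 240.0, 400.0, 500.0, 0.1, 320.0, 240.0)
far = classify_feature(320.5, 240.0, 320.0, 500.0, 0.1, 320.0, 240.0)
```

Here `near` becomes a 3D point at roughly 2.5 m depth, while `far` keeps only its bearing and a small inverse depth, matching the abstract's point that far features mainly contribute orientation information.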
Motion and Structure from Time-Varying Optical Flow
- In Vision Interface, 1995
Cited by 6 (3 self)
We present a computational framework for recovering 1st-order motion parameters (observer direction of translation and observer rotation), 2nd-order motion parameters (observer rotational acceleration) and relative depth maps from time-varying optical flow. We cannot recover absolute observer translational speed or translational acceleration because only relative depth, which is the ratio of current translational speed and 3D depth, is affected by these parameters. Our assumption is that the observer rotational motion is no more than "second order"; in other words, observer motion is either constant or has at most constant acceleration. We examine the effect of noise, which is ubiquitous in optical flow data, on the solution of the motion and structure parameters. This ensemble of unknowns comprises a solution to the classical 'structure-and-motion from optical flow' problem. Our complete framework utilizes a simple method for interpreting the bilinear image...
Uncalibrated 1D Projective Camera and 3D Affine Reconstruction of Lines
- In Proc. CVPR, pages 60-65, 1997
Cited by 4 (1 self)
We describe a linear algorithm to recover 3D affine shape/motion from line correspondences over three views with uncalibrated affine cameras. The key idea is the introduction of a one-dimensional projective camera. This converts the 3D affine reconstruction of "lines" into 2D projective reconstruction of "points". Using the full tensorial representation of three uncalibrated 1D views, we prove that the 3D affine reconstruction of lines from minimal data is unique up to a re-ordering of the views. 3D affine line reconstruction can be performed by properly rescaling image coordinates instead of using projection matrices. The algorithm is validated on both simulated and real image sequences. 1. Introduction Using line segments instead of points as features has attracted the attention of many researchers [11, 2, 29, 28, 27, 1] for various tasks such as pose estimation, stereo and structure from motion. In this paper, we are interested in structure from motion using line correspondences a...
On the Fourier Properties of Discontinuous Visual Motion
- Journal of Mathematical Imaging and Vision, 2000
Cited by 3 (1 self)
Retinal image motion and optical flow as its approximation are fundamental concepts in the field of vision, perceptual and computational. However, the computation of optical flow remains a challenging problem as image motion includes discontinuities and multiple values mostly due to scene geometry, surface translucency and various photometric effects such as surface reflectance. In this contribution, we analyze image motion in the frequency space with respect to motion discontinuities and surface translucence. We derive, under models of constant and linear optical flow, the frequency structure of motion discontinuities due to occlusion and we demonstrate its various geometrical properties. The aperture problem is investigated and we show that the information content of an occlusion almost always disambiguates the velocity of an occluding signal suffering from the aperture problem. In addition, the theoretical framework can describe the exact frequency structure of Non-Fourier motion an...
Motion segmentation in long image sequences
- In Proceedings of the 11th British Machine Vision Conference, 2000
Cited by 2 (0 self)
Long image sequences provide a wealth of information, which means that a compact representation is needed to efficiently process them. In this paper a novel representation for motion segmentation in long image sequences is presented. This representation, the feature interval graph, measures the pairwise rigidity of features in the scene. The feature interval graph is recursively computed, making it a compact representation, and uses an interval model of uncertainty. The feature interval graph forms the basis for new algorithms for motion segmentation and occlusion analysis. Results of these algorithms are presented on synthetic and laboratory scenes.
Establishing Motion Correspondence using Extended Temporal Scope
- 2003
Cited by 2 (1 self)
This paper addresses the motion correspondence problem: the problem of finding corresponding point measurements in an image sequence solely based on positional information. The motion correspondence problem is most difficult when the target points are densely moving. It becomes even harder when the point detection scheme is imperfect or when points are temporarily occluded. Available motion constraints should be exploited in order to rule out physically impossible assignments of measurements to point tracks. The performance can be further increased by deferring the correspondence decisions, that is, by examining whether the consequences of candidate correspondences lead to alternate and better solutions. In this paper, we concentrate on the latter by introducing a scheme that extends the temporal scope over which the correspondences are optimized. The resulting problem is a multi-dimensional assignment problem, which is known to be NP-hard. To restrict the consequent increase in computation time, the candidate solutions are suitably ordered and then additional combined motion constraints are imposed. Experiments show the appropriateness of the proposed extension, with respect to both performance and computational aspects.
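As a point of reference for the problem statement above, the following is a minimal sketch of the basic two-frame correspondence problem with a single motion constraint. It is not the extended-temporal-scope scheme of the paper: the function name `match_points`, the `max_step` speed limit, the equal-point-count assumption and the brute-force search are all illustrative assumptions, and the search is only practical for a handful of points.

```python
from itertools import permutations
from math import dist, inf


def match_points(prev_pts, curr_pts, max_step):
    """Return a tuple where entry i is the index in curr_pts matched to prev_pts[i].

    Assignments that move any point farther than max_step are treated as
    physically impossible and skipped. Returns None if nothing is feasible.
    """
    if len(prev_pts) != len(curr_pts):
        raise ValueError("this sketch assumes no missing or spurious detections")
    best_cost, best = inf, None
    for perm in permutations(range(len(curr_pts))):
        steps = [dist(prev_pts[i], curr_pts[j]) for i, j in enumerate(perm)]
        if max(steps, default=0.0) > max_step:
            continue  # violates the motion constraint
        cost = sum(steps)
        if cost < best_cost:
            best_cost, best = cost, perm
    return best


prev_pts = [(0.0, 0.0), (10.0, 0.0)]
curr_pts = [(11.0, 0.0), (1.0, 0.0)]
assignment = match_points(prev_pts, curr_pts, max_step=5.0)
```

With these inputs the only feasible assignment pairs each point with its neighbour one unit away. Extending the decision over more frames, as the abstract describes, turns this into a multi-dimensional assignment problem, which is why the number of candidates grows so quickly.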
Segment-Based Structure from an Imprecisely Located Moving Camera
- In IEEE Int. Symposium on Computer Vision, 1995
Cited by 2 (2 self)
A probabilistic geometric model for 2D image line segments is first presented. We then propose a method, using 2D segments, to accumulate evidence along an image sequence of a polyhedral scene. The proposed method degrades gracefully with camera location noise. The matched 2D segments are fused with an extended Kalman filter to reconstruct the scene structure. The correspondences are established using sequential data association. The matching function encodes the consistency between the 2D segments and the 3D segment computed from their fusion. The computation of a 3D segment from two 2D segments is overconstrained, and so using our model, the after-fusion consistency test can reject false matches even from two images, allowing the pruning of false matching hypotheses at early stages. Two examples are provided. The first determines a scene structure when the camera location is known precisely; the structure is then compared with that obtained by trinocular stereo. The proposed method i...
Real Time Motion Detection System and Scene Segmentation
- 1998
Cited by 1 (0 self)
We address two issues in this report. One is the real time implementation of feature tracking and motion estimation. As a fundamental problem in the vision field, feature tracking needs to be implemented in real time so that researchers can do further analysis on motion, such as building real time visual navigation systems. Our implementation of the algorithm on the C4x board with parallel processors and its performance are described. Another part of the paper presents a 3D motion segmentation scheme. We propose an EM approach combined with the modified separation matrix scheme to perform 3D motion segmentation of the image sequence that contains multiple moving objects. We observe that, given the detected features and their 2D optical flow, in most cases the objects or their flow are separated very well from each other in space. The separation matrix method modified by using normalized cuts achieves expected grouping results for these cases. However, when the objects are overlapped s...
Multi-agent 3D tracking and segmentation using stereo
- 1993
Cited by 1 (0 self)
We describe the current state of the 3D Feature-Based Tracker (3DFBT), a system for tracking and segmenting the 3D motion of objects using image input from a calibrated stereo pair of video cameras. After an initialization step (in which the cameras are calibrated and interest-point features are extracted), the system runs in a multi-level cycle of prediction and verification or correction. The currently modelled 3D positions and velocities of the feature points are extrapolated a short time into the future to yield predictions of 3D position. These 3D predictions are projected into the two stereo views, and are used to guide a fast and highly focussed visual search for the feature points. The image positions at which the features are re-acquired are back-projected in 3D space in order to update the 3D positions and velocities. At a higher level, features are dynamically grouped into clusters with common 3D motion. Predictions from the cluster level can be fed down to the lower level t...
Binocular Estimation Of Motion And Structure From Long Sequences Using Optical Flow Without Correspondence
- In Proc. ICIP, 1995
Cited by 1 (0 self)
We use the left and right monocular motion and structure parameters of two stereo image sequences (direction of translation, relative depth, observer rotation and rotational acceleration) to compute absolute depth, absolute translation and absolute translational acceleration for each pair of left and right images. Individual translation parameters computed at each frame are integrated over time using a Kalman filter to provide more accuracy and a "best" estimate of absolute translation at each time. 1. INTRODUCTION Recently, we described a monocular motion and structure algorithm that computes the observer's heading, $u$; rotation, $\vec{\omega}$; rotational acceleration, $\delta\vec{\omega}$; and a relative-depth map (the ratio of translational speed and 3D depth, $\mu$, at each pixel in the image) by solving simple linear systems of equations [1, 2]. These parameters are computed in a camera-centered coordinate system using adjacent 3-tuples of flow fields from a long monocular flow sequence and are then integra...
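The idea of integrating per-frame estimates over time with a Kalman filter can be sketched for a single scalar component. This is only an illustration under strong assumptions, not the filter used in the paper: the state is one scalar, the true value is taken as constant (or slowly drifting via `process_var`), the measurement variance `meas_var` is assumed known, and the function name `kalman_fuse` is hypothetical.

```python
def kalman_fuse(measurements, meas_var, process_var=0.0, init_var=1e6):
    """Fuse noisy per-frame scalar estimates; return (estimate history, final variance)."""
    estimate, var = 0.0, init_var  # a large initial variance means a weak prior
    history = []
    for z in measurements:
        var += process_var                 # predict: the state may drift
        gain = var / (var + meas_var)      # Kalman gain for a direct measurement
        estimate += gain * (z - estimate)  # correct with the innovation
        var *= 1.0 - gain                  # uncertainty shrinks after the update
        history.append(estimate)
    return history, var


# Three hypothetical per-frame estimates of one translation component.
history, final_var = kalman_fuse([1.0, 1.2, 0.8], meas_var=0.04)
```

With no process noise and a weak prior, the filter output approaches the running mean of the measurements, and the final variance falls to about a third of the single-frame variance, which is the sense in which integration over time provides more accuracy.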

