Results 1–10 of 13
The Statistics of Optical Flow
 Computer Vision and Image Understanding
, 1999
Abstract

Cited by 34 (6 self)
When processing image sequences, some representation of image motion must be derived as a first stage. The most often used such representation is the optical flow field, which is a set of velocity measurements of image patterns. It is well known that it is very difficult to estimate accurate optical flow at locations in an image which correspond to scene discontinuities. What is less well known, however, is that even at locations corresponding to smooth scene surfaces, the optical flow field often cannot be estimated accurately. Noise in the data causes many optical flow estimation techniques to give biased flow estimates. Very often there is consistent bias: the estimate tends to be an underestimate in length and to point in a direction closer to the majority of the gradients in the patch. This paper studies all three major categories of flow estimation methods (gradient-based, energy-based, and correlation methods), and it analyzes different ways of compounding one-dimensional motion…
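The consistent underestimation of flow length described in this abstract can be illustrated with a small Monte Carlo sketch. This is a hypothetical setup, not the paper's experiments: when the spatial gradients entering a least-squares gradient-based estimator are themselves noisy, the recovered velocity is systematically attenuated (a standard errors-in-variables effect).

```python
import numpy as np

rng = np.random.default_rng(0)
v_true = np.array([1.0, 0.0])  # true image velocity of a patch
sigma = 0.5                    # std of the noise on the measured gradients

speeds = []
for _ in range(300):
    g = rng.normal(size=(200, 2))                    # spatial gradients (Ix, Iy) in the patch
    it = -(g @ v_true)                               # It from brightness constancy: Ix*u + Iy*v + It = 0
    g_noisy = g + sigma * rng.normal(size=g.shape)   # noise corrupts the gradient measurements
    v_est, *_ = np.linalg.lstsq(g_noisy, -it, rcond=None)
    speeds.append(np.linalg.norm(v_est))

mean_speed = float(np.mean(speeds))
print(f"true speed 1.0, mean estimated speed {mean_speed:.2f}")
```

With unit-variance gradients and noise variance sigma², the expected attenuation factor is roughly 1/(1 + sigma²), so the mean estimated speed lands noticeably below the true speed of 1.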
Aloimonos, Ambiguity in structure from motion: Sphere versus plane
 Internat. J. Comput. Vision
, 1998
Abstract

Cited by 21 (6 self)
If 3D rigid motion can be correctly estimated from image sequences, the structure of the scene can be correctly derived using the equations for image formation. However, an error in the estimation of 3D motion will result in the computation of a distorted version of the scene structure. Of computational interest are those regions in space where the distortions are such that the depths become negative, because for the scene to be visible it has to lie in front of the image, and thus the corresponding depth estimates have to be positive. The stability analysis for the structure-from-motion problem presented in this paper investigates the optimal relationship between the errors in the estimated translational and rotational parameters of a rigid motion, that is, the relationship which results in the estimation of a minimum number of negative depth values. The input used is the value of the flow along some direction, which is more general than optic flow or correspondence. For a planar retina, it is shown that the optimal configuration is achieved when the projections of the translational and rotational errors on the image plane are perpendicular; furthermore, the projections of the actual and the estimated translation lie on a line through the center. For a spherical retina, given a rotational error, the optimal translation is the correct one; given a translational error, the optimal rotational error depends both in direction and in magnitude on the actual and estimated translation as well as on the scene in view. The proofs, besides illuminating the confounding of translation and rotation in structure from motion, have an important application to ecological optics. The same analysis provides a computational explanation of why it is…
The Ouchi illusion as an artifact of biased flow estimation
, 2000
Abstract

Cited by 13 (8 self)
A pattern by Ouchi has the surprising property that small motions can cause illusory relative motion between the inset and background regions. The effect can be attained with small retinal motions or a slight jiggling of the paper, and it is robust over large changes in the patterns, frequencies, and boundary shapes. In this paper, we explain that the cause of the illusion lies in the statistical difficulty of integrating local one-dimensional motion signals into two-dimensional image velocity measurements. The estimation of image velocity is generally biased, and for the particular spatial gradient distributions of the Ouchi pattern the bias is highly pronounced, giving rise to a large difference in the velocity estimates in the two regions. The computational model introduced to describe the statistical estimation of image velocity also accounts for the findings of psychophysical studies with variations of the Ouchi pattern and for various findings on the perception of moving plaids. The insight gained from this computational study challenges the current models used to explain biological vision systems and to construct robotic vision systems. Considering the statistical difficulties in image velocity estimation in conjunction with the problem of discontinuity detection in motion fields suggests that, theoretically, the process of optical flow computation should not be carried out in isolation but in conjunction with the higher-level processes of 3D motion estimation, segmentation, and shape computation.
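The mechanism this abstract proposes — the same motion producing differently biased velocity estimates in regions with different dominant gradient orientations — can be sketched numerically. The setup below is illustrative only (the orientation spreads and noise level are invented, not taken from the paper): two simulated regions see the same true motion, but their gradient orientations cluster around perpendicular axes, and a least-squares estimate from noisy 1D constraints is pulled toward each region's dominant gradient direction.

```python
import numpy as np

rng = np.random.default_rng(1)
v_true = np.array([1.0, 1.0]) / np.sqrt(2)  # identical true motion in both regions
sigma = 0.4                                  # gradient measurement noise

def biased_estimate(mean_angle, trials=400, n=200):
    """Mean least-squares velocity from noisy 1D motion constraints whose
    gradient orientations cluster around mean_angle (one 'region')."""
    est = np.zeros(2)
    for _ in range(trials):
        theta = rng.normal(mean_angle, 0.3, size=n)
        g = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # unit gradients
        it = -(g @ v_true)                                    # brightness constancy
        g_noisy = g + sigma * rng.normal(size=g.shape)
        v, *_ = np.linalg.lstsq(g_noisy, -it, rcond=None)
        est += v
    return est / trials

v_a = biased_estimate(0.0)        # region A: gradients mostly horizontal
v_b = biased_estimate(np.pi / 2)  # region B: gradients mostly vertical
angle = lambda v: float(np.arctan2(v[1], v[0]))
print(angle(v_a), angle(v_b))  # estimates rotated toward different axes
```

Although both regions move identically at 45 degrees, region A's estimate is rotated toward the horizontal and region B's toward the vertical, so the two regions appear to move relative to one another.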
Analyzing Action Representations
 Workshop on Algebraic Frames for the Perception-Action Cycle, LNCS 1888
, 2000
Abstract

Cited by 2 (1 self)
We argue that actions represent the basic seed of intelligence underlying perception of the environment, and that the representations encoding actions should be the starting point upon which further studies of cognition are built. In this paper we make a first effort at characterizing these action representations. In particular, from the study of simple actions related to 3D rigid motion interpretation, we deduce a number of principles for the possible computations responsible for the interpretation of space-time geometry. Using these principles, we then discuss possible avenues for analyzing the representations of more complex human actions.

1 Introduction and Motivation. During the late eighties, with the emergence of active vision, it was realized that vision should not be studied in a vacuum but in conjunction with action. Adopting a purposive viewpoint makes visual computations easier by placing them in the context of larger processes that accomplish tasks…
Effects of Errors in the Viewing Geometry on Shape Estimation (Article No. IV970649)
, 1996
Abstract
A sequence of images acquired by a moving sensor contains information about the three-dimensional motion of the sensor and the shape of the imaged scene. Interesting research during the past few years has attempted to characterize the errors that arise in computing 3D motion (egomotion estimation) as well as the errors that result in the estimation of the scene's structure (structure from motion). Previous research is characterized by the use of optic flow or correspondence of features in the analysis, as well as by the employment of particular algorithms and models of the scene in recovering expressions for the resulting errors. This paper presents a geometric framework that characterizes the relationship between 3D motion and shape in the presence of errors. We examine how the three-dimensional space recovered by a moving monocular observer, whose 3D motion is estimated with some error, is distorted. We characterize the space of distortions by its level sets; that is, we characterize the systematic distortion via a family of iso-distortion surfaces, which describes the locus over which the depths of points in the scene in view are distorted by the same multiplicative factor. The framework introduced in this way has a number of applications: Since the visible surfaces have positive depth (visibility constraint), by analyzing the geometry of the regions where the distortion factor is negative, that is, where the visibility constraint is violated, we make explicit the situations which are likely to give rise to ambiguities in motion estimation, independent of the algorithm used. We provide a uniqueness analysis for 3D motion estimation from normal flow. We study the constraints on egomotion, object motion, and depth for an independently moving object to be detectable by a moving observer, and we offer a quantitative account of the precision needed in an inertial sensor for accurate estimation of 3D motion. © 1998 Academic Press
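The iso-distortion idea summarized in this abstract can be sketched in standard perspective motion-field notation. The symbols below are illustrative and need not match the paper's exact conventions:

```latex
% The measured flow at image point x splits into a translational part scaled
% by inverse depth and a depth-independent rotational part:
u(x) = \frac{1}{Z(x)}\, u_{t}(x) + u_{\omega}(x)

% An observer using erroneous motion estimates (\hat t, \hat\omega) assigns
% the depth that best explains the flow measured along a direction n:
\hat Z(x) = \frac{\hat u_{t}(x)\cdot n}{\bigl(u(x) - \hat u_{\hat\omega}(x)\bigr)\cdot n}

% Substituting the true flow yields \hat Z = Z \cdot D(x), with distortion factor
D(x) = \frac{\hat u_{t}\cdot n}{\,u_{t}\cdot n \;+\; Z\,\bigl(u_{\omega} - \hat u_{\hat\omega}\bigr)\cdot n\,}
```

The iso-distortion surfaces are the level sets of D; wherever D is negative, the estimated depth is negative and the visibility constraint is violated, which is exactly the region the paper's ambiguity analysis exploits.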
On the Statistics of Optical Flow
, 2000
Abstract
When processing image sequences, some representation of image motion must be derived as a first stage. The most often used representation is the optical flow field, which is a set of velocity measurements of image patterns. It is well known that it is very difficult to estimate accurate optical flow at locations in an image which correspond to scene discontinuities. What is less well known, however, is that even at locations corresponding to smooth scene surfaces, the optical flow field often cannot be estimated accurately. Noise in the data causes many optical flow estimation techniques to give biased flow estimates. Very often there is consistent bias: the estimate tends to be an underestimate in length and to point in a direction closer to the majority of the gradients in the patch. This paper studies all three major categories of flow estimation methods (gradient-based, energy-based, and correlation methods), and it analyzes different ways of compounding one-dimensional motion estimates (image gradients, spatiotemporal frequency triplets, local correlation estimates) into two-dimensional velocity estimates, including linear and nonlinear methods. Correcting for the bias would require knowledge of the noise parameters. In many situations, however, these are difficult to estimate accurately, as they change with the dynamic imagery in unpredictable and complex ways. Thus, the bias really is a problem inherent to optical flow estimation. We argue that the bias is also integral to the human visual system: it is the cause of the illusory perception of motion in the Ouchi pattern, and it also explains various psychophysical studies of the perception of moving plaids. © 2001 Academic Press. Key Words: analysis of optical flow estimation algorithms; bias; optical illusion.
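The abstract notes that correcting the bias would require knowledge of the noise parameters. The sketch below illustrates that point with a textbook errors-in-variables correction; it is a generic statistical device under invented noise settings, not the paper's own method. When the gradient-noise variance is known exactly, subtracting its expected contribution from the normal matrix removes the attenuation; when it is unknown or misestimated, the bias remains.

```python
import numpy as np

rng = np.random.default_rng(2)
v_true = np.array([1.0, 0.0])
sigma = 0.5   # gradient noise std, assumed perfectly known here
n = 300       # constraints per patch

naive, corrected = [], []
for _ in range(300):
    g = rng.normal(size=(n, 2))                 # true spatial gradients
    b = g @ v_true                              # equals -It under brightness constancy
    gn = g + sigma * rng.normal(size=g.shape)   # measured (noisy) gradients
    v_ls, *_ = np.linalg.lstsq(gn, b, rcond=None)
    # errors-in-variables correction: remove the known noise covariance n*sigma^2*I
    v_c = np.linalg.solve(gn.T @ gn - n * sigma**2 * np.eye(2), gn.T @ b)
    naive.append(np.linalg.norm(v_ls))
    corrected.append(np.linalg.norm(v_c))

mean_naive = float(np.mean(naive))
mean_corrected = float(np.mean(corrected))
print(mean_naive, mean_corrected)  # attenuated vs. close to the true speed 1.0
```

The correction is exact only because sigma is known; this is precisely the knowledge the abstract argues is unavailable in practice, which is why the bias is inherent.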
Visual SpaceTime Geometry and Statistics
Abstract
Although the fundamental ideas underlying research efforts in the field of computer vision have not radically changed in the past two decades, there has been a transformation in the way work in this field is conducted. This is primarily due to the emergence of a number of tools, of both a practical and a theoretical nature. One such tool, celebrated throughout the nineties, is the geometry of visual space-time. It is known under a variety of headings, such as multiple view geometry, structure from motion, and model building. It is a mathematical theory relating multiple views (images) of a scene taken from different viewpoints to three-dimensional models of the (possibly dynamic) scene. This mathematical theory gave rise to algorithms that take images (or video) as input and provide a model of the scene as output. Such algorithms are one of the biggest successes of the field, and they have many applications in other disciplines, such as graphics (image-based rendering, motion capture) and robotics (navigation). One of the difficulties, however, is that the current tools cannot yet be fully automated, and they do not provide very accurate results. More research is required for automation and high precision. During the past few years we have investigated a number of basic questions underlying the structure-from-motion problem. Our investigations resulted in a small number of principles that characterize the problem. These principles, which give rise to automatic procedures and point to new avenues for studying the next level of the structure-from-motion problem, are the subject of this paper.
New Eyes for Building Models from Video
Abstract
Models of real-world objects and actions for use in graphics, virtual and augmented reality, and related fields can only be obtained through the use of visual data, and particularly video. This paper examines the question of recovering shape models from video information. Given video of an object or a scene captured by a moving camera, a prerequisite for model building is to recover the three-dimensional (3D) motion of the camera, which consists of a rotation and a translation at each instant. It is shown here that a spherical eye (an eye or system of eyes providing panoramic vision) is superior to a camera-type eye (an eye with a restricted field of view, such as a common video camera) as regards the competence of 3D motion estimation. This result is derived from a geometric/statistical analysis of all the possible computational models that can be used for estimating 3D motion from an image sequence. Regardless of the estimation procedure for a camera-type eye, the parameters of the 3D rigid…
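The geometric intuition behind the spherical eye's advantage can be sketched with the standard spherical motion-field equation (written here up to sign conventions; the notation is generic, not taken from this paper):

```latex
% For a spherical eye with unit viewing directions x (|x| = 1), a rigid motion
% with translation t and rotation \omega induces the motion field
\dot{\mathbf x} \;=\; -\,\boldsymbol\omega \times \mathbf x
      \;+\; \frac{1}{R(\mathbf x)}\Bigl((\mathbf x\cdot \mathbf t)\,\mathbf x \;-\; \mathbf t\Bigr),
% where R(x) is the distance to the scene point seen along x.
```

The rotational term is independent of depth and is defined over the entire sphere, while the translational term vanishes only at the antipodal pair of directions parallel to t. With a panoramic field of view both of these singular points are visible and rotation is constrained globally, whereas a narrow-field camera sees at most one focus of expansion and can trade translational for rotational error, which is one way to read the confounding analyzed in the entries above.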
The Statistics of Visual Correspondence: Insights into the Visual System
, 1999
Abstract
A pattern by Ouchi (Figure 1) has the surprising property that small motions can cause illusory relative motion between the inset and background regions. The effect can be attained with small retinal motions or a slight jiggling of the paper, and it is robust over large changes in the patterns, frequencies, and boundary shapes. In this paper, we explain that the cause of the illusion lies in the statistical difficulty of integrating local one-dimensional motion signals into two-dimensional image velocity measurements. The estimation of image velocity is generally biased, and for the particular spatial gradient distributions of the Ouchi pattern the bias is highly pronounced, giving rise to a large difference in the velocity estimates in the two regions. The computational model introduced to describe the statistical estimation of image velocity also accounts for the findings of psychophysical studies with variations of the Ouchi pattern and for various findings on the perception of moving plaids. The insight gained from this computational study challenges the current models used to explain biological vision systems and to construct robotic vision systems. Considering the statistical difficulties in image velocity estimation in conjunction with the problem of discontinuity detection in motion fields suggests that, theoretically, the process of optical flow computation should not be carried out in isolation but in conjunction with the higher-level processes of 3D motion estimation, segmentation, and shape computation.
Motion Segmentation for a Binocular Observer
, 1998
Abstract
Since estimation of camera motion requires knowledge of independent motion, and moving object detection and localization require knowledge about the camera motion, the two problems of motion estimation and segmentation need to be solved together in a synergistic manner. This paper provides an approach to treating both of these problems for a binocular observer. The technique introduced here is based on a novel concept, "scene smoothness," which parameterizes the variation in estimated scene depth with the error in the underlying 3D motion. The idea is that incorrect 3D motion estimates cause distortions in the estimated depth map, and as a result smooth scene patches are computed as unsmooth, i.e., rugged, surfaces. The correct 3D motion can be distinguished, as it does not cause any distortion and thus gives rise to the smoothest background patches, with the locations corresponding to independent motion remaining unsmooth. The observer's binocular nature is exploited in the extraction of de…