Results 1 - 10 of 15
Finding the largest unambiguous component of stereo matching
In Proc. European Conf. on Computer Vision, 2002
Cited by 37 (0 self)
Abstract. Stereo matching is an ill-posed problem for at least two principal reasons: (1) the random nature of the match similarity measure and (2) structural ambiguity due to repetitive patterns. Both ambiguities require the problem to be posed in a regularization framework. Continuity is a natural choice for a prior model, but this model may fail in low signal-to-noise-ratio regions, and the resulting artefacts may then completely spoil the subsequent visual task. A question arises whether one could (1) find the unambiguous component of matching and, simultaneously, (2) identify the ambiguous component of the solution and then, optionally, (3) regularize the task for the ambiguous component only. Some authors have already taken this view. In this paper we define a new stability property, a condition a set of matches must satisfy to be considered unambiguous at a given confidence level. It turns out that for a given matching problem this set is (1) unique and (2) already a matching. We give a fast algorithm that finds the largest stable matching. The algorithm is then used to show on real scenes that the unambiguous component is quite dense (10–80%) and nearly error-free (total error rate of 0.3–1.4%), both depending on the confidence level chosen.
Combining Stereo and Visual Hull Information for On-line Reconstruction and Rendering of Dynamic Scenes
Cited by 25 (8 self)
In this paper, we present a novel system which combines depth from stereo and visual hull reconstruction for acquiring dynamic real-world scenes at interactive rates. First, we use the silhouettes from multiple views to construct a polyhedral visual hull as an initial estimate of the object in the scene. The visual hull is used to limit the disparity range during stereo estimation. The use of the constraints imposed by the silhouettes from the earlier processing stage reduces the amount of computation significantly. The restricted search range improves both the speed and the quality of the stereo reconstruction. In return, stereo information can compensate for some of the inherent drawbacks of the visual hull method, such as the inability to reconstruct some concave regions.
When is the Shape of a Scene Unique Given its Light-Field: A Fundamental Theorem of 3D Vision?
In PAMI, 2002
Cited by 17 (2 self)
The complete set of measurements that could ever be used by a passive 3D vision algorithm is the plenoptic function or light-field. We give a concise characterization of when the light-field of a Lambertian scene uniquely determines its shape and, conversely, when the shape is inherently ambiguous. In particular, we show that stereo computed from the light-field is ambiguous if and only if the scene is radiating light of a constant intensity (and color, etc.) over an extended region.
Computational Cameras: Convergence of Optics and Processing
2011
Cited by 7 (0 self)
A computational camera uses a combination of optics and processing to produce images that cannot be captured with traditional cameras. In the last decade, computational imaging has emerged as a vibrant field of research. A wide variety of computational cameras has been demonstrated to encode more useful visual information in the captured images, as compared with conventional cameras. In this paper, we survey computational cameras from two perspectives. First, we present a taxonomy of computational camera designs according to the coding approaches, including object side coding, pupil plane coding, sensor side coding, illumination coding, camera arrays and clusters, and unconventional imaging systems. Second, we use the abstract notion of light field representation as a general tool to describe computational camera designs, where each camera can be formulated as a projection of a high-dimensional light field to a 2-D image sensor. We show how individual optical devices transform light fields and use these transforms to illustrate how different computational camera designs (collections of optical devices) capture and encode useful visual information.
Accurate Correspondences From Epipolar Plane
In Proc. Computer Vision Winter Workshop, 2002
Cited by 4 (1 self)
We present an algorithm for finding correspondences in epipolar plane images (EPIs). An EPI is a 2-dimensional spatio-temporal image obtained from a dense image sequence that is rectified so that each scene point is projected to the same row in all frames. Scenes with opaque Lambertian surfaces without occlusions are assumed. The approach is based on finding lines with similar intensities in an EPI for each image row separately, by dynamic programming. We focus on correspondence accuracy. The high accuracy is enabled by a wide baseline, as in stereo, and by the larger amount of data available. However, the matching is easier than in stereo because the displacement between neighboring frames is very small. No feature extraction is used; the algorithm is purely signal-based.
Visual hull construction, alignment and refinement across time
2002
Cited by 3 (2 self)
Visual Hull (VH) construction is a popular method of shape estimation. The method, also known as Shape from Silhouette (SFS), approximates the shape of an object from multiple silhouette images by constructing an upper bound of the shape called the Visual Hull. SFS is used in many applications such as non-invasive 3D object digitization, 3D object recognition, and, more recently, human motion tracking and analysis. Though SFS is straightforward to implement and easy to use, it has several limitations. Most existing SFS methods are too slow for real-time applications, and the estimated shape is sensitive to silhouette noise and camera calibration errors. Moreover, the VH is only a conservative approximation of the actual shape of the object, and the approximation can be very coarse when there are only a few cameras. In my thesis, I propose to investigate some of these shortcomings and suggest solutions to overcome them. First, a voxel-based real-time SFS algorithm called SPOT is proposed and its behavior under noisy silhouette images is analyzed. Secondly, the conservative nature of SFS is improved by incorporating silhouette images across time. The improvement is achieved by first estimating the rigid motions between visual hulls formed at different time instants (visual hull alignment) and then combining them (visual hull refinement) to get a tighter bound on the object’s shape. The ambiguity issue of visual
A Fundamental Theorem of Stereo?
2001
Cited by 3 (1 self)
The complete set of measurements that could ever be used by a passive 3D vision algorithm is the plenoptic function or light-field. We give a concise characterization of when the light-field of a Lambertian scene uniquely determines its shape and, conversely, when the shape is inherently ambiguous. In particular, we show that stereo computed from the light-field is ambiguous if and only if the scene is radiating light of a constant intensity (and color, etc.) over an extended region. Keywords: Stereo, inherent ambiguities, uniqueness, light-fields, the plenoptic function.
Robust Correspondence Recognition for Computer Vision
Cited by 3 (0 self)
Summary. In this paper we introduce a new robust framework suitable for the task of finding correspondences in computer vision. This task lies at the heart of many problems like stereovision, 3D model reconstruction, image stitching, camera autocalibration, recognition, image retrieval, and a host of others. If the problem domain is general enough, the correspondence problem can seldom employ any well-structured prior knowledge. This leads to tasks that have to find maximum-cardinality solutions satisfying some weak optimality condition and a set of constraints. To avoid artifacts, robustness is required to cope with decisions under occlusion, uncertainty or insufficiency of data, and local violations of the prior model. The proposed framework is based on a robust modification of a graph-theoretic notion known as the digraph kernel. Key words: computer vision, robust matching, digraph kernel
Multi-Image Matching using Invariant Features
2005
Cited by 3 (0 self)
Multi-Modal Statistics of Local Image Structures . . .
2007
Cited by 2 (0 self)
Processing in most artificial vision systems and in the human vision system starts with early vision, which involves the extraction of local visual modalities (like optical flow, disparity, and contrast transition) and local image structures (edge-like, junction-like, and texture-like structures). Since information in early vision is processed only locally, it is inherently ambiguous. For example, estimation of optical flow faces the aperture problem, and thus only the flow along the intensity gradient is computable for edge-like structures. Moreover, the extracted flow information at weakly textured image areas is unreliable. Analogously, stereopsis needs to deal with the correspondence problem: since correspondences at weakly textured image areas cannot be found, the disparity information at such places is not accurate. One way to deal with the missing and ambiguous information is to make use of the redundancy of visual information by exploiting the statistical regularities of natural scenes. Such regularities are carried in the visual system by feedback mechanisms between different layers, or by lateral connections within a layer. This thesis is concerned with the ambiguities and the biased and missing information in the processing of optic flow, stereo, and junctions, using statistical means. It uses statistical properties of images to analyze the extent of the ambiguity in optical flow estimation and whether the missing information in stereo can be recovered by interpolation of depth information at edge-like structures. Moreover, it proposes a feedback mechanism for dealing with the bias in junction detection, and another model for recovering the missing depth information in stereo computation using only the depth information at the edges.