A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
International Journal of Computer Vision, 2002
Cited by 708 (18 self)
Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods. Our taxonomy is designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms. We have also produced several new multi-frame stereo data sets with ground truth and are making both the code and data sets available on the Web. Finally, we include a comparative evaluation of a large set of today’s best-performing stereo algorithms.
Computing Visual Correspondence with Occlusions using Graph Cuts
Cited by 195 (11 self)
Several new algorithms for visual correspondence based on graph cuts [7, 14, 17] have recently been developed. While these methods give very strong results in practice, they do not handle occlusions properly. Specifically, they treat the two input images asymmetrically, and they do not ensure that a pixel corresponds to at most one pixel in the other image. In this paper, we present a new method which properly addresses occlusions, while preserving the advantages of graph cut algorithms. We give experimental results for stereo as well as motion, which demonstrate that our method performs well both at detecting occlusions and computing disparities.
Multi-camera Scene Reconstruction via Graph Cuts
In European Conference on Computer Vision, 2002
Cited by 190 (9 self)
We address the problem of computing the 3-dimensional shape of an arbitrary scene from a set of images taken at known viewpoints.
A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms
2006
Cited by 189 (12 self)
This paper presents a quantitative comparison of several multi-view stereo reconstruction algorithms. Until now, the lack of suitable calibrated multi-view image datasets with known ground truth (3D shape models) has prevented such direct comparisons. In this paper, we first survey multi-view stereo algorithms and compare them qualitatively using a taxonomy that differentiates their key properties. We then describe our process for acquiring and calibrating multiview image datasets with high-accuracy ground truth and introduce our evaluation methodology. Finally, we present the results of our quantitative comparison of state-of-the-art multi-view stereo reconstruction algorithms on six benchmark datasets. The datasets, evaluation details, and instructions for submitting new models are available online at http://vision.middlebury.edu/mview.
Stereo matching using belief propagation
2003
Cited by 173 (3 self)
In this paper, we formulate the stereo matching problem as a Markov network and solve it using Bayesian belief propagation. The stereo Markov network consists of three coupled Markov random fields that model the following: a smooth field for depth/disparity, a line process for depth discontinuity, and a binary process for occlusion. After eliminating the line process and the binary process by introducing two robust functions, we apply the belief propagation algorithm to obtain the maximum a posteriori (MAP) estimation in the Markov network. Other low-level visual cues (e.g., image segmentation) can also be easily incorporated in our stereo model to obtain better stereo results. Experiments demonstrate that our methods are comparable to the state-of-the-art stereo algorithms for many test cases.
Multi-View Stereo for Community Photo Collections
Cited by 80 (14 self)
We present a multi-view stereo algorithm that addresses the extreme changes in lighting, scale, clutter, and other effects in large online community photo collections. Our idea is to intelligently choose images to match, both at a per-view and per-pixel level. We show that such adaptive view selection enables robust performance even with dramatic appearance variability. The stereo matching technique takes as input sparse 3D points reconstructed from structure-from-motion methods and iteratively grows surfaces from these points. Optimizing for surface normals within a photoconsistency measure significantly improves the matching results. While the focus of our approach is to estimate high-quality depth maps, we also show examples of merging the resulting depth maps into compelling scene reconstructions. We demonstrate our algorithm on standard multi-view stereo datasets and on casually acquired photo collections of famous scenes gathered from the Internet.
Non-photorealistic camera: Depth edge detection and stylized rendering using multi-flash imaging
ACM Trans. Graph.
Cited by 69 (16 self)
Figure 1: (a) A photo of a car engine. (b) Stylized rendering highlighting boundaries between geometric shapes; notice the four spark plugs and the dip-stick, which are now clearly visible. (c) Photo of a flower plant. (d) Texture de-emphasized rendering.
We present a non-photorealistic rendering approach to capture and convey shape features of real-world scenes. We use a camera with multiple flashes that are strategically positioned to cast shadows along depth discontinuities in the scene. The projective-geometric relationship of the camera-flash setup is then exploited to detect depth discontinuities and distinguish them from intensity edges due to material discontinuities. We introduce depiction methods that utilize the detected edge features to generate stylized static and animated images. We can highlight the detected features, suppress unnecessary details or combine features from multiple images. The resulting images more clearly convey the 3D structure of the imaged scenes. We take a very different approach to capturing geometric features of a scene than traditional approaches that require reconstructing a 3D model. This results in a method that is both surprisingly simple and computationally efficient. The entire hardware/software setup can conceivably be packaged into a self-contained device no larger than existing digital cameras.
Motion layer extraction in the presence of occlusion using graph cut
In CVPR (2, 2004
Cited by 57 (7 self)
Extracting layers from video is very important for video representation, analysis, compression, and synthesis. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust and novel approach to automatically extract a set of affine or projective transformations induced by these regions, detect the occlusion pixels over multiple consecutive frames, and segment the scene into several motion layers. First, after determining a number of seed regions using correspondences in two frames, we expand the seed regions and reject the outliers employing the graph cuts method integrated with level set representation. Next, these initial regions are merged into several initial layers according to the motion similarity. Third, an occlusion order constraint on multiple frames is explored, which enforces that the occlusion area increases with the temporal order in a short period and effectively maintains segmentation consistency over multiple consecutive frames. Then the correct layer segmentation is obtained by using a graph cuts algorithm, and the occlusions between the overlapping layers are explicitly determined. Several experimental results are demonstrated to show that our approach is effective and robust. Index Terms: Layer-based motion segmentation, video analysis, graph cuts, level set representation, occlusion order constraint.
Fast Variable Window for Stereo Correspondence using Integral Images
Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2003
Cited by 40 (0 self)
We develop a fast and accurate variable window approach. The two main ideas for achieving accuracy are choosing a useful range of window sizes/shapes for evaluation and developing a new window cost which is particularly suitable for comparing windows of different sizes. The speed of our approach is due to the Integral Image technique, which allows computation of our window cost over any rectangular window in constant time, regardless of window size. Our method ranks in the top four on the Middlebury stereo database with ground truth, and performs best out of methods which have comparable efficiency.
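The constant-time window sum mentioned in this abstract can be illustrated with a generic integral image (summed-area table). The following is a minimal sketch of that general technique only; the paper's actual window cost is not given in the abstract, and the function names `integral_image` and `window_sum` are hypothetical.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed-area table for a 2D list of numbers.

    sat[y][x] holds the sum of all img values in rows < y and columns < x.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def window_sum(sat, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1.

    Uses four table lookups, so the cost does not depend on window size.
    """
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    table = integral_image(image)
    print(window_sum(table, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

Building the table takes one pass over the image; after that, any rectangular window is summed with the same four lookups, which is what makes evaluating many window sizes per pixel affordable.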
Adaptive support-weight approach for correspondence search
IEEE Trans. PAMI, 2006
Cited by 36 (0 self)
We present a new window-based method for correspondence search using varying support-weights. We adjust the support-weights of the pixels in a given support window based on color similarity and geometric proximity to reduce the image ambiguity. Our method outperforms other local methods on standard stereo benchmarks. Index Terms: Stereo, 3D/stereo scene analysis.
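As a rough illustration of weighting window pixels by color similarity and geometric proximity, the sketch below assumes an exponential falloff in both terms. The abstract does not state the weight formula, so the functional form, the parameters `gamma_c` and `gamma_g` with their default values, and the function names are all assumptions made for illustration. Weights are computed from a single window here, which is a simplification.

```python
import math


def support_weight(color_p, color_q, pos_p, pos_q, gamma_c=7.0, gamma_g=36.0):
    """Weight of neighbor q in the support window centered at p.

    Assumed form: the weight decays exponentially with color difference
    and with spatial distance. gamma_c and gamma_g are hypothetical
    tuning parameters, not values taken from the abstract.
    """
    color_diff = math.dist(color_p, color_q)    # Euclidean distance in color space
    spatial_dist = math.dist(pos_p, pos_q)      # Euclidean distance in the image plane
    return math.exp(-(color_diff / gamma_c + spatial_dist / gamma_g))


def aggregate_cost(center, neighbors, raw_costs):
    """Weighted average of per-pixel matching costs over a support window.

    center is a (color, position) pair, neighbors is a list of such pairs,
    and raw_costs holds one matching cost per neighbor.
    """
    weights = [support_weight(center[0], c, center[1], p) for c, p in neighbors]
    total = sum(weights)
    return sum(w * cost for w, cost in zip(weights, raw_costs)) / total
```

Under this assumed form, a neighbor with the same color and position as the center pixel gets weight 1, and neighbors that differ in color or lie farther away contribute less to the aggregated cost.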

