3-D depth reconstruction from a single still image
, 2006
Cited by 38 (12 self)
We consider the task of 3-d depth estimation from a single still image. We take a supervised learning approach to this problem, in which we begin by collecting a training set of monocular images (of unstructured indoor and outdoor environments which include forests, sidewalks, trees, buildings, etc.) and their corresponding ground-truth depthmaps. Then, we apply supervised learning to predict the value of the depthmap as a function of the image. Depth estimation is a challenging problem, since local features alone are insufficient to estimate depth at a point, and one needs to consider the global context of the image. Our model uses a hierarchical, multiscale Markov Random Field (MRF) that incorporates multiscale local- and global-image features, and models the depths and the relation between depths at different points in the image. We show that, even on unstructured scenes, our algorithm is frequently able to recover fairly accurate depthmaps. We further propose a model that incorporates both monocular cues and stereo (triangulation) cues, to obtain significantly more accurate depth estimates than is possible using either monocular or stereo cues alone.
Daisy: An efficient dense descriptor applied to wide baseline stereo
- IEEE TRANS. PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2010
Cited by 27 (9 self)
In this paper, we introduce a local image descriptor, DAISY, which is very efficient to compute densely. We also present an EM-based algorithm to compute dense depth and occlusion maps from wide-baseline image pairs using this descriptor. This yields much better results in wide-baseline situations than the pixel- and correlation-based algorithms that are commonly used in narrow-baseline stereo. Also, using a descriptor makes our algorithm robust against many photometric and geometric transformations. Our descriptor is inspired by earlier ones such as SIFT and GLOH but can be computed much faster for our purposes. Unlike SURF, which can also be computed efficiently at every pixel, it does not introduce artifacts that degrade the matching performance when used densely. It is important to note that our approach is the first algorithm that attempts to estimate dense depth maps from wide-baseline image pairs, and we show that it performs well through experiments on depth estimation accuracy, occlusion detection, and comparison against other descriptors on laser-scanned ground-truth scenes. We also tested our approach on a variety of indoor and outdoor scenes with different photometric and geometric transformations, and our experiments support our claim of robustness against these.
Classification and evaluation of cost aggregation methods for stereo correspondence
, 2008
Fast Global Labeling for Real-Time Stereo Using Multiple Plane Sweeps
, 2008
Cited by 9 (0 self)
This work presents a real-time, data-parallel approach for global label assignment on regular grids. The labels are selected according to a Markov random field energy with a Potts prior term for binary interactions. We apply the proposed method to accelerate the cleanup step of a real-time dense stereo method based on plane sweeping with multiple sweeping directions, where the label set directly corresponds to the employed directions. In this setting the Potts smoothness model is suitable, since the set of labels does not possess an intrinsic metric or total order. The observed run-times are approximately 30 times faster than the ones obtained by graph cut approaches.
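As a rough illustration of the kind of energy this abstract refers to, the sketch below evaluates a Markov random field energy with a Potts prior on a 4-connected grid. It is not the paper's data-parallel solver; the function name `potts_energy`, the nested-list layout of `labels` and `data_cost`, and the single `smoothness_weight` parameter are all assumptions made for this example.

```python
def potts_energy(labels, data_cost, smoothness_weight):
    """Energy of a labeling on a 4-connected grid with a Potts prior.

    labels[y][x] is an integer label and data_cost[y][x][l] is the cost of
    assigning label l to pixel (x, y). Both layouts are assumptions of this
    sketch, not something specified in the abstract.
    """
    height, width = len(labels), len(labels[0])
    energy = 0.0
    for y in range(height):
        for x in range(width):
            # Unary (data) term for the chosen label.
            energy += data_cost[y][x][labels[y][x]]
            # Potts term: a constant penalty whenever two neighbours disagree,
            # independent of how "far apart" the two labels are.
            if x + 1 < width and labels[y][x] != labels[y][x + 1]:
                energy += smoothness_weight
            if y + 1 < height and labels[y][x] != labels[y + 1][x]:
                energy += smoothness_weight
    return energy


# Hypothetical 2x2 grid with two labels per pixel.
example_costs = [[[1.0, 3.0], [1.0, 3.0]], [[1.0, 3.0], [1.0, 3.0]]]
print(potts_energy([[0, 0], [0, 1]], example_costs, 2.0))
```

Because the penalty depends only on whether neighbouring labels differ, the model needs no metric or ordering on the label set, which matches the abstract's reason for choosing it when labels are sweeping directions.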
An energy minimisation approach to stereo-temporal dense reconstruction
- Proc. International Conference on Pattern Recognition
, 2004
Cited by 9 (1 self)
We propose a novel energy minimisation framework for the dense reconstruction of stereo image sequences that incorporates data fidelity as well as spatial and temporal regularity. An iterated dynamic programming scheme is proposed to minimise the energy function. We also present an efficient implementation of the minimisation scheme by introducing morphological decomposition techniques to solve the dynamic programming subproblem. Our proposed method is capable of reconstructing dynamic scenes with complex motion. Results are presented demonstrating the strength of our proposed algorithm.
The infection algorithm: An artificial epidemic approach to dense stereo matching
- In X. Yao et al. (Eds.), Parallel Problem Solving from Nature VIII. Lecture
, 2004
Cited by 7 (4 self)
We present a new bio-inspired approach applied to the problem of stereo image matching. This approach is based on an artificial epidemic process that we call “the infection algorithm.” The problem at hand is a basic one in computer vision for 3D scene reconstruction. It has many complex aspects and is known to be extremely difficult. The aim is to match the contents of two images in order to obtain 3D information that allows the generation of simulated projections from a viewpoint different from those of the initial photographs. This process is known as view synthesis. The algorithm we propose exploits the image contents in order to produce only the necessary 3D depth information, while saving computational time. It is based on a set of distributed rules that propagate like an artificial epidemic over the images. Experiments on a pair of real images are presented, and realistic reprojected images have been generated.
Fast stereo matching by Iterated Dynamic Programming and quadtree subregioning
- British Machine Vision Conference
, 2004
Cited by 6 (1 self)
The application of energy minimisation methods for stereo matching has been demonstrated to produce high-quality disparity maps. However, the majority of these methods are known to be computationally expensive, requiring minutes or even hours of computation. We propose a fast minimisation scheme that produces strongly competitive results for significantly reduced computation, requiring only a few seconds. In this paper, we present our iterated dynamic programming algorithm along with a quadtree subregioning process for fast stereo matching.
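For readers unfamiliar with dynamic programming in stereo, the sketch below shows the classic single-scanline formulation, which is a simpler relative of the approach named in this abstract. It is not the paper's iterated scheme and does no quadtree subregioning. The function name, the absolute-difference matching cost, the linear smoothness penalty, and the convention that left pixel `x` matches right pixel `x - d` are all assumptions chosen for this example.

```python
def scanline_dp_disparity(left_row, right_row, max_disparity, smoothness_weight):
    """Minimise matching cost plus a linear smoothness penalty along one scanline.

    Assumed convention: left pixel x corresponds to right pixel x - d.
    """
    n = len(left_row)
    num_d = max_disparity + 1
    inf = float("inf")

    def match_cost(x, d):
        # Disparities that point outside the right row are not allowed.
        if x - d < 0:
            return inf
        return abs(left_row[x] - right_row[x - d])

    cost = [[inf] * num_d for _ in range(n)]
    back = [[0] * num_d for _ in range(n)]
    for d in range(num_d):
        cost[0][d] = match_cost(0, d)

    # Forward pass: best cumulative cost of ending at pixel x with disparity d.
    for x in range(1, n):
        for d in range(num_d):
            m = match_cost(x, d)
            if m == inf:
                continue
            best_prev, best_val = 0, inf
            for p in range(num_d):
                v = cost[x - 1][p] + smoothness_weight * abs(d - p)
                if v < best_val:
                    best_val, best_prev = v, p
            cost[x][d] = m + best_val
            back[x][d] = best_prev

    # Backward pass: follow the stored predecessors from the cheapest end state.
    d = min(range(num_d), key=lambda k: cost[n - 1][k])
    disparities = [0] * n
    for x in range(n - 1, -1, -1):
        disparities[x] = d
        d = back[x][d]
    return disparities


# Hypothetical rows: the left row is the right row shifted by two pixels.
right = [10, 20, 30, 40, 50, 60, 70, 80]
left = [0, 0, 10, 20, 30, 40, 50, 60]
print(scanline_dp_disparity(left, right, 3, 1))
```

This version costs on the order of `n * num_d * num_d` operations per scanline and treats each row independently, which gives a sense of why faster schemes and inter-row consistency are active topics in this literature.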
Stereo Using Monocular Cues within the Tensor Voting Framework
- Proc. European Conf. Computer Vision
, 2004
Cited by 6 (1 self)
We address the fundamental problem of matching in two static images. The remaining challenges are related to occlusion and lack of texture. Our approach addresses these difficulties within a perceptual organization framework, considering both binocular and monocular cues. Initially, matching candidates for all pixels are generated by a combination of matching techniques. The matching candidates are then embedded in disparity space, where perceptual organization takes place in 3D neighborhoods and, thus, does not suffer from problems associated with scanline or image neighborhoods. The assumption is that correct matches produce salient, coherent surfaces, while wrong ones do not. Matching candidates that are consistent with the surfaces are kept and grouped into smooth layers. Thus, we achieve surface segmentation based on geometric and not photometric properties. Surface overextensions, which are due to occlusion, can be corrected by removing matches whose projections are not consistent in color with their neighbors of the same surface in both images. Finally, the projections of the refined surfaces on both images are used to obtain disparity hypotheses for unmatched pixels. The final disparities are selected after a second tensor voting stage, during which information is propagated from more reliable pixels to less reliable ones. We present results on widely used benchmark stereo pairs. Index Terms: Stereo, occlusion, pixel correspondence, computer vision, perceptual organization, tensor voting.
Asymmetrical occlusion handling using graph cut for multi-view stereo
- In CVPR
, 2005
Cited by 6 (0 self)
Occlusion is usually modelled symmetrically in two images in previous stereo algorithms, which cannot work efficiently for multi-view stereo. In this paper, we present a novel formulation that handles occlusion using only one depth map in an asymmetrical way. Consequently, multi-view information is efficiently accumulated to achieve high accuracy. The resulting energy function is complex, and approximate graph-cut-based solutions are proposed. Our approach complements the theory and extends the applicability of using graph cut in stereo. The experiments demonstrate that the approach is comparable with the state of the art and potentially more efficient for multi-view stereo.
Motion Detail Preserving Optical Flow Estimation
Cited by 6 (2 self)
We discuss the cause of a severe optical flow estimation problem: fine motion structures cannot always be correctly reconstructed in the commonly employed multiscale variational framework. Our major finding is that significant and abrupt displacement transition wrecks small-scale motion structures in the coarse-to-fine refinement. A novel optical flow estimation method is proposed in this paper to address this issue, which reduces the reliance of the flow estimates on their initial values propagated from the coarser level and enables recovering many motion details in each scale. The contribution of this paper also includes adaptation of the objective function and development of a new optimization procedure. The effectiveness of our method is borne out by experiments for both large- and small-displacement optical flow estimation.

