Results 1 - 10 of 116
PatchMatch Stereo - Stereo Matching with Slanted Support Windows
"... Common local stereo methods match support windows at integer-valued disparities. The implicit assumption that pixels within the support region have constant disparity does not hold for slanted surfaces and leads to a bias towards reconstructing frontoparallel surfaces. This work overcomes this bias ..."
Abstract
-
Cited by 41 (5 self)
- Add to MetaCart
(Show Context)
Common local stereo methods match support windows at integer-valued disparities. The implicit assumption that pixels within the support region have constant disparity does not hold for slanted surfaces and leads to a bias towards reconstructing frontoparallel surfaces. This work overcomes this bias by estimating an individual 3D plane at each pixel onto which the support region is projected. The major challenge of this approach is to find a pixel's optimal 3D plane among the infinitely many possible planes. We show that an ideal algorithm for this problem is PatchMatch [1], which we extend to find an approximate nearest neighbor according to a plane. In addition to PatchMatch's spatial propagation scheme, we propose (1) view propagation, where planes are propagated between the left and right views of the stereo pair, and (2) temporal propagation, where planes are propagated from preceding and consecutive frames of a video when doing temporal stereo. Adaptive support weights are used in matching cost aggregation to improve results at disparity borders. We also show that our slanted support windows can be used to compute a cost volume for global stereo methods, which allows for explicit treatment of occlusions and can handle large untextured regions. In the results we demonstrate that our method reconstructs highly slanted surfaces and achieves impressive disparity details with sub-pixel precision. In the Middlebury table, our method is currently the top performer among local methods and ranks second among approximately 110 competitors when sub-pixel precision is considered.
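To make the plane-based search concrete, here is a minimal Python sketch of the random initialization and spatial propagation steps, assuming a per-pixel plane (a, b, c) with disparity d(x, y) = a*x + b*y + c; the single-pixel matching cost stands in for the paper's slanted-window aggregation, and plane refinement as well as the view/temporal propagation steps are omitted. All parameter values are our own.

    import numpy as np

    def plane_disp(plane, x, y):
        # Disparity of pixel (x, y) under the plane d = a*x + b*y + c.
        a, b, c = plane
        return a * x + b * y + c

    def cost(left, right, x, y, d):
        # Toy single-pixel cost; the paper instead aggregates absolute
        # differences over a slanted support window with adaptive weights.
        xr = int(round(x - d))
        if xr < 0 or xr >= right.shape[1]:
            return np.inf
        return abs(float(left[y, x]) - float(right[y, xr]))

    def patchmatch_stereo(left, right, max_disp, iters=3, seed=0):
        rng = np.random.default_rng(seed)
        h, w = left.shape
        # Random init: random slopes plus a random disparity at the pixel.
        xs0, ys0 = np.meshgrid(np.arange(w), np.arange(h))
        z0 = rng.uniform(0, max_disp, (h, w))
        planes = np.empty((h, w, 3))
        planes[..., 0] = rng.uniform(-0.5, 0.5, (h, w))
        planes[..., 1] = rng.uniform(-0.5, 0.5, (h, w))
        planes[..., 2] = z0 - planes[..., 0] * xs0 - planes[..., 1] * ys0
        for it in range(iters):
            # Alternate scan order; propagate from already-visited neighbours.
            rev = it % 2 == 1
            ys = range(h - 1, -1, -1) if rev else range(h)
            xs = range(w - 1, -1, -1) if rev else range(w)
            offs = ((0, 1), (1, 0)) if rev else ((0, -1), (-1, 0))
            for y in ys:
                for x in xs:
                    best = planes[y, x].copy()
                    best_c = cost(left, right, x, y, plane_disp(best, x, y))
                    for dy, dx in offs:
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            cand = planes[ny, nx]
                            c = cost(left, right, x, y, plane_disp(cand, x, y))
                            if c < best_c:
                                best, best_c = cand.copy(), c
                    planes[y, x] = best
            # (Plane refinement and view/temporal propagation omitted.)
        return planes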
Integrating Visual and Range Data for Robotic Object Detection
"... Abstract. The problem of object detection and recognition is a notoriously difficult one, and one that has been the focus of much work in the computer vision and robotics communities. Most work has concentrated on systems that operate purely on visual inputs (i.e., images) and largely ignores other ..."
Abstract
-
Cited by 38 (3 self)
- Add to MetaCart
(Show Context)
The problem of object detection and recognition is a notoriously difficult one, and one that has been the focus of much work in the computer vision and robotics communities. Most work has concentrated on systems that operate purely on visual inputs (i.e., images) and largely ignores other sensor modalities. However, despite the great progress made down this track, the goal of high-accuracy object detection for robotic platforms in cluttered real-world environments remains elusive. Instead of relying on information from the image alone, we present a method that exploits the multiple sensor modalities available on a robotic platform. In particular, our method augments a 2-d object detector with 3-d information from a depth sensor to produce a "multi-modal object detector." We demonstrate our method on a working robotic system and evaluate its performance on a number of common household/office objects.
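One plausible reading of such a fusion step, sketched below in Python under our own assumptions (a pinhole camera with a known focal length and a known expected object height): a 2-d detection is down-weighted when the physical size implied by its bounding box and median depth is implausible for the object class. The paper's actual detector and fusion model may differ.

    import numpy as np

    def plausible_size(depth_patch, box_height_px, focal_px, expected_m, tol=0.5):
        # Median depth inside the detection box (metres, hypothetical calibration).
        z = float(np.median(depth_patch))
        # Physical height implied by the box under a pinhole model: h = z * h_px / f.
        implied_m = z * box_height_px / focal_px
        return abs(implied_m - expected_m) / expected_m < tol

    def fused_score(score_2d, depth_patch, box_height_px, focal_px, expected_m,
                    penalty=0.5):
        # Down-weight detections whose implied physical size is implausible
        # for the object class; otherwise keep the 2-d detector's score.
        if plausible_size(depth_patch, box_height_px, focal_px, expected_m):
            return score_2d
        return score_2d * penalty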
Time-of-Flight Sensors in Computer Graphics
- EUROGRAPHICS 2009 / M. PAULY AND G. GREINER STAR – STATE OF THE ART REPORT
, 2009
"... A growing number of applications depend on accurate and fast 3D scene analysis. Examples are model and lightfield acquisition, collision prevention, mixed reality, and gesture recognition. The estimation of a range map by image analysis or laser scan techniques is still a time-consuming and expensiv ..."
Abstract
-
Cited by 37 (0 self)
- Add to MetaCart
A growing number of applications depend on accurate and fast 3D scene analysis. Examples are model and lightfield acquisition, collision prevention, mixed reality, and gesture recognition. The estimation of a range map by image analysis or laser scan techniques is still a time-consuming and expensive part of such systems. A lower-priced, fast and robust alternative for distance measurements is the Time-of-Flight (ToF) camera. Recently, significant improvements have been made in order to achieve low-cost and compact ToF devices that have the potential to revolutionize many fields of research, including Computer Graphics, Computer Vision and Man-Machine Interaction (MMI). These technologies are starting to have an impact on research and commercial applications. The upcoming generation of ToF sensors, however, will be even more powerful and will have the potential to become "ubiquitous real-time geometry devices" for gaming, web-conferencing, and numerous other applications. This STAR gives an account of recent developments in ToF technology and discusses the current state of the integration of this technology into various graphics-related applications.
Realtime spatiotemporal stereo matching using the dual-cross-bilateral Grid
- in Proc. ECCV2010, LNCS 6313. 2010
"... Abstract. We introduce a real-time stereo matching technique based on a reformulation of Yoon and Kweon’s adaptive support weights algorithm [1]. Our implementation uses the bilateral grid to achieve a speedup of 200 × compared to a straightforward full-kernel GPU implementation, making it the faste ..."
Abstract
-
Cited by 35 (0 self)
- Add to MetaCart
(Show Context)
We introduce a real-time stereo matching technique based on a reformulation of Yoon and Kweon's adaptive support weights algorithm [1]. Our implementation uses the bilateral grid to achieve a speedup of 200× compared to a straightforward full-kernel GPU implementation, making it the fastest technique on the Middlebury website. We introduce a colour component into our greyscale approach to recover precision and increase discriminability. Using our implementation, we speed up spatial-depth superresolution 100×. We further present a spatiotemporal stereo matching approach based on our technique that incorporates temporal evidence in real time (>14 fps). Our technique visibly reduces flickering and outperforms per-frame approaches in the presence of image noise. We have created five synthetic stereo videos, with ground-truth disparity maps, to quantitatively evaluate depth estimation from stereo video. Source code and datasets are available on our project website.
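As a rough illustration of grid-based aggregation (not the paper's exact dual-cross-bilateral formulation, which draws range weights from both stereo images and adds a colour component), the following Python sketch splats one disparity's cost slice into a downsampled (x, y, intensity) grid, blurs it, and slices the result back out. Nearest-neighbour splatting and slicing replace the usual trilinear interpolation for brevity, and all parameter values are illustrative.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def bilateral_grid_aggregate(cost, guide, sigma_s=16, sigma_r=16):
        # cost, guide: HxW float arrays, guide in [0, 255] (greyscale).
        h, w = cost.shape
        gh, gw, gr = h // sigma_s + 2, w // sigma_s + 2, 256 // sigma_r + 2
        grid = np.zeros((gh, gw, gr))     # accumulated cost
        weight = np.zeros((gh, gw, gr))   # homogeneous coordinate
        ys, xs = np.mgrid[0:h, 0:w]
        gy = (ys // sigma_s).ravel()
        gx = (xs // sigma_s).ravel()
        gz = (guide // sigma_r).astype(int).ravel()
        np.add.at(grid, (gy, gx, gz), cost.ravel())
        np.add.at(weight, (gy, gx, gz), 1.0)
        # Low-pass filter the coarse grid; this replaces the full-kernel
        # bilateral sum and is where the speedup comes from.
        grid = gaussian_filter(grid, 1.0)
        weight = gaussian_filter(weight, 1.0)
        # Slice: read back each pixel's aggregated, normalized cost.
        out = grid[gy, gx, gz] / np.maximum(weight[gy, gx, gz], 1e-6)
        return out.reshape(h, w)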
Upsampling Range Data in Dynamic Environments
"... We present a flexible method for fusing information from optical and range sensors based on an accelerated highdimensional filtering approach. Our system takes as input a sequence of monocular camera images as well as a stream of sparse range measurements as obtained from a laser or other sensor sys ..."
Abstract
-
Cited by 34 (0 self)
- Add to MetaCart
(Show Context)
We present a flexible method for fusing information from optical and range sensors based on an accelerated high-dimensional filtering approach. Our system takes as input a sequence of monocular camera images as well as a stream of sparse range measurements as obtained from a laser or other sensor system. In contrast to existing approaches, we do not assume that the depth and color data streams have the same data rates or that the observed scene is fully static. Our method produces a dense, high-resolution depth map of the scene, automatically generating confidence values for every interpolated depth point. We describe how to integrate priors on object motion and appearance and how to achieve an efficient implementation using parallel processing hardware such as GPUs.
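The unaccelerated core of such a fusion filter can be written as a joint bilateral filter over the sparse samples. The sketch below (our own simplification, without the paper's high-dimensional acceleration, motion priors, or calibrated confidence model) densifies sparse depth guided by a float greyscale image and returns the accumulated filter weight as a crude confidence proxy.

    import numpy as np

    def joint_bilateral_upsample(sparse_depth, mask, image,
                                 sigma_s=8.0, sigma_r=20.0, radius=12):
        # sparse_depth: HxW, valid where mask is True; image: HxW float guide.
        # Direct O(H*W*radius^2) formulation, deliberately unoptimized.
        h, w = image.shape
        out = np.zeros((h, w))
        conf = np.zeros((h, w))
        for y in range(h):
            for x in range(w):
                y0, y1 = max(0, y - radius), min(h, y + radius + 1)
                x0, x1 = max(0, x - radius), min(w, x + radius + 1)
                m = mask[y0:y1, x0:x1]
                if not m.any():
                    continue
                yy, xx = np.mgrid[y0:y1, x0:x1]
                # Spatial closeness and guide-image similarity weights.
                ws = np.exp(-((yy - y)**2 + (xx - x)**2) / (2 * sigma_s**2))
                wr = np.exp(-((image[y0:y1, x0:x1] - image[y, x])**2)
                            / (2 * sigma_r**2))
                wgt = (ws * wr)[m]
                out[y, x] = np.sum(wgt * sparse_depth[y0:y1, x0:x1][m]) / np.sum(wgt)
                conf[y, x] = np.sum(wgt)   # crude stand-in for a confidence model
        return out, conf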
Multi-view Image and ToF Sensor Fusion for Dense 3D Reconstruction
, 2009
"... Multi-view stereo methods frequently fail to properly reconstruct 3D scene geometry if visible texture is sparse or the scene exhibits difficult self-occlusions. Time-of-Flight (ToF) depth sensors can provide 3D information regardless of texture but with only limited resolution and accuracy. To find ..."
Abstract
-
Cited by 33 (1 self)
- Add to MetaCart
(Show Context)
Multi-view stereo methods frequently fail to properly reconstruct 3D scene geometry if visible texture is sparse or the scene exhibits difficult self-occlusions. Time-of-Flight (ToF) depth sensors can provide 3D information regardless of texture, but with only limited resolution and accuracy. To find an optimal reconstruction, we propose an integrated multi-view sensor fusion approach that combines information from multiple color cameras and multiple ToF depth sensors. First, multi-view ToF sensor measurements are combined to obtain a coarse but complete model. Then, the initial model is refined by means of a probabilistic multi-view fusion framework, optimizing over an energy function that aggregates ToF depth sensor information with multi-view stereo and silhouette constraints. We obtain high-quality, dense and detailed 3D models of scenes challenging for stereo alone, while simultaneously reducing the complex noise of ToF sensors.
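In our own notation (not the paper's exact formulation), such a fusion energy has the general form

    E(D) = \sum_{p} \Big[ \lambda_{\mathrm{ToF}}\, C_{\mathrm{ToF}}(d_p)
         + \lambda_{\mathrm{st}}\, C_{\mathrm{stereo}}(d_p)
         + \lambda_{\mathrm{sil}}\, C_{\mathrm{sil}}(d_p) \Big]
         + \sum_{(p,q) \in \mathcal{N}} V(d_p, d_q),

where the three data terms score agreement with the ToF measurements, photo-consistency across the colour views, and the silhouette constraints, and the pairwise term V enforces smoothness between neighbouring depth estimates.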
Time-of-Flight Cameras in Computer Graphics
, 2011
"... A growing number of applications depend on accurate and fast 3D scene analysis. Examples are model and lightfield acquisition, collision prevention, mixed reality, and gesture recognition. The estimation of a range map by image analysis or laser scan techniques is still a time-consuming and expensiv ..."
Abstract
-
Cited by 31 (4 self)
- Add to MetaCart
A growing number of applications depend on accurate and fast 3D scene analysis. Examples are model and lightfield acquisition, collision prevention, mixed reality, and gesture recognition. The estimation of a range map by image analysis or laser scan techniques is still a time-consuming and expensive part of such systems. A lower-priced, fast and robust alternative for distance measurements is the Time-of-Flight (ToF) camera. Recently, significant advances have been made in producing low-cost and compact ToF devices, which have the potential to revolutionize many fields of research, including Computer Graphics, Computer Vision and Human-Machine Interaction (HMI). These technologies are starting to have an impact on research and commercial applications. The upcoming generation of ToF sensors, however, will be even more powerful and will have the potential to become "ubiquitous real-time geometry devices" for gaming, web-conferencing, and numerous other applications. This paper gives an account of recent developments in ToF technology and discusses the current state of the integration of this technology into various graphics-related applications.
LidarBoost: Depth Superresolution for ToF 3D Shape Scanning
"... Depth maps captured with time-of-flight cameras have very low data quality: the image resolution is rather limited and the level of random noise contained in the depth maps is very high. Therefore, such flash lidars cannot be used out of the box for high-quality 3D object scanning. To solve this pro ..."
Abstract
-
Cited by 29 (3 self)
- Add to MetaCart
(Show Context)
Depth maps captured with time-of-flight cameras have very low data quality: the image resolution is rather limited and the level of random noise contained in the depth maps is very high. Therefore, such flash lidars cannot be used out of the box for high-quality 3D object scanning. To solve this problem, we present LidarBoost, a 3D depth superresolution method that combines several low-resolution, noisy depth images of a static scene taken from slightly displaced viewpoints and merges them into a high-resolution depth image. We have developed an optimization framework that uses a data fidelity term and a geometry prior term tailored to the specific characteristics of flash lidars. We demonstrate both visually and quantitatively that LidarBoost produces better results than previous methods from the literature.
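In generic superresolution notation (ours, not necessarily the paper's), such an objective reads

    \min_{X} \; \sum_{k=1}^{N} \big\| W_k \odot (S\, T_k\, X - Y_k) \big\|_2^2
    \;+\; \lambda\, \Gamma(X),

where X is the sought high-resolution depth map, Y_k the recorded low-resolution inputs, T_k the per-frame alignment warps, S the downsampling operator, W_k confidence masks for unreliable lidar samples, and \Gamma the geometry prior that the paper tailors to flash-lidar noise characteristics.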
A noise-aware filter for real-time depth upsampling
- IN WORKSHOP ON MULTI-CAMERA AND MULTI-MODAL SENSOR FUSION ALGORITHMS AND APPLICATIONS
, 2008
"... A new generation of active 3D range sensors, such as time-of-flight cameras, enables recording of full-frame depth maps at video frame rate. Unfortunately, the captured data are typically starkly contaminated by noise and the sensors feature only a rather limited image resolution. We therefore pres ..."
Abstract
-
Cited by 28 (5 self)
- Add to MetaCart
(Show Context)
A new generation of active 3D range sensors, such as time-of-flight cameras, enables recording of full-frame depth maps at video frame rate. Unfortunately, the captured data are typically heavily contaminated by noise, and the sensors offer only limited image resolution. We therefore present a pipeline to enhance the quality and increase the spatial resolution of range data in real time by upsampling the range information with the data from a high-resolution video camera. Our algorithm is an adaptive multi-lateral upsampling filter that takes into account the inherently noisy nature of real-time depth data. Thus, we can greatly improve reconstruction quality, boost the resolution of the data to that of the video sensor, and prevent unwanted artifacts like texture copy into geometry. Our technique improves depth map quality while maintaining the high computational efficiency required for real-time applications. Implemented on the GPU, our approach makes a real-time 3D camera with video-camera resolution feasible.
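The noise-aware blending idea can be sketched as follows in Python (single-scale, filtering an already-upsampled depth map, with a hard edge switch instead of the paper's smooth blend, and invented parameter values): in locally smooth depth regions the filter's range term is computed on depth, which suppresses texture copy, while near depth discontinuities it switches to the colour term to keep edges sharp.

    import numpy as np

    def noise_aware_filter(depth, grey, radius=5, sigma_s=3.0, sigma_c=12.0,
                           sigma_d=25.0, edge_thresh=40.0):
        # depth: HxW float (nearest-neighbour upsampled); grey: HxW float guide.
        h, w = depth.shape
        out = depth.copy()
        yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
        ws = np.exp(-(yy**2 + xx**2) / (2 * sigma_s**2))   # spatial weights
        for y in range(radius, h - radius):
            for x in range(radius, w - radius):
                dwin = depth[y - radius:y + radius + 1, x - radius:x + radius + 1]
                cwin = grey[y - radius:y + radius + 1, x - radius:x + radius + 1]
                wd = np.exp(-((dwin - depth[y, x])**2) / (2 * sigma_d**2))
                wc = np.exp(-((cwin - grey[y, x])**2) / (2 * sigma_c**2))
                # Edge indicator: large local depth spread suggests a true
                # discontinuity, so trust the colour term there; in smooth
                # regions the depth term avoids copying image texture.
                alpha = 1.0 if (dwin.max() - dwin.min()) > edge_thresh else 0.0
                wgt = ws * (alpha * wc + (1 - alpha) * wd)
                out[y, x] = np.sum(wgt * dwin) / np.sum(wgt)
        return out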
A Constant-Space Belief Propagation Algorithm for Stereo Matching
"... In this paper, we consider the problem of stereo matching using loopy belief propagation. Unlike previous methods which focus on the original spatial resolution, we hierarchically reduce the disparity search range. By fixing the number of disparity levels on the original resolution, our method solve ..."
Abstract
-
Cited by 26 (4 self)
- Add to MetaCart
(Show Context)
In this paper, we consider the problem of stereo matching using loopy belief propagation. Unlike previous methods, which focus on the original spatial resolution, we hierarchically reduce the disparity search range. By fixing the number of disparity levels at the original resolution, our method solves the message-updating problem in time linear in the number of pixels in the image and requires only constant memory space. Specifically, for an 800 × 600 image with 300 disparities, our message-updating method is about 30× faster (1.5 seconds) than the standard method and requires only about 0.6% of the memory (9 MB). Our algorithm also lends itself to a parallel implementation: our GPU implementation (NVIDIA GeForce 8800 GTX) is about 10× faster than our CPU implementation. Given the trend toward higher-resolution images, the ability to run belief propagation with a large number of disparity levels as efficiently as with a small number makes our method future-proof. In addition to the computational and memory advantages, our method is straightforward to implement.
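The disparity-range reduction at the heart of the method can be illustrated with the following Python sketch, which selects the k lowest-cost disparity hypotheses per pixel on a half-resolution cost volume and evaluates only their scaled versions at full resolution. The actual algorithm additionally runs belief propagation at each hierarchy level while keeping a constant number of levels per pixel, which this sketch omits; all parameter values are illustrative.

    import numpy as np

    def best_k_disparities(cost_volume, k):
        # Per pixel, keep the k disparity hypotheses with the lowest cost.
        return np.argsort(cost_volume, axis=2)[:, :, :k]

    def coarse_to_fine_disparity(costs_coarse, costs_fine, k=5):
        # costs_coarse: (H/2)x(W/2)xD' volume; costs_fine: HxWxD volume.
        cand = best_k_disparities(costs_coarse, k)
        h, w, d = costs_fine.shape
        best = np.zeros((h, w), dtype=int)
        for y in range(h):
            for x in range(w):
                cy = min(y // 2, cand.shape[0] - 1)
                cx = min(x // 2, cand.shape[1] - 1)
                # Scale coarse candidates to the fine disparity range and
                # pick the cheapest; only k hypotheses are ever evaluated.
                ds = np.clip(cand[cy, cx] * 2, 0, d - 1)
                best[y, x] = ds[np.argmin(costs_fine[y, x, ds])]
        return best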