A theory of shape by space carving
- In Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV-99), volume I, pages 307–314, Los Alamitos, CA, 1999
Cited by 363 (14 self)
In this paper we consider the problem of computing the 3D shape of an unknown, arbitrarily-shaped scene from multiple photographs taken at known but arbitrarily-distributed viewpoints. By studying the equivalence class of all 3D shapes that reproduce the input photographs, we prove the existence of a special member of this class, the photo hull, that (1) can be computed directly from photographs of the scene, and (2) subsumes all other members of this class. We then give a provably-correct algorithm, called Space Carving, for computing this shape and present experimental results on complex real-world scenes. The approach is designed to (1) build photorealistic shapes that accurately model scene appearance from a wide range of viewpoints, and (2) account for the complex interactions between occlusion, parallax, shading, and their effects on arbitrary views of a 3D scene.
Bundle adjustment – a modern synthesis
- Vision Algorithms: Theory and Practice, LNCS, 2000
Cited by 284 (11 self)
This paper is a survey of the theory and methods of photogrammetric bundle adjustment, aimed at potential implementors in the computer vision community. Bundle adjustment is the problem of refining a visual reconstruction to produce jointly optimal structure and viewing parameter estimates. Topics covered include: the choice of cost function and robustness; numerical optimization including sparse Newton methods, linearly convergent approximations, updating and recursive methods; gauge (datum) invariance; and quality control. The theory is developed for general robust cost functions rather than restricting attention to traditional nonlinear least squares.
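The contrast between a traditional least-squares cost and a robust cost can be illustrated with a small sketch. It is not taken from the survey: the Huber cost is just one commonly used robust choice, and the single-focal-length pinhole model, the assumption that points are already expressed in the camera frame, and all function names are simplifications introduced here for illustration.

```python
import math


def project(point, focal):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) to (u, v).

    Assumption: one focal length, principal point at the origin.
    """
    X, Y, Z = point
    return (focal * X / Z, focal * Y / Z)


def reprojection_residual(point, observed, focal):
    """Euclidean distance between the projected and the observed image point."""
    u, v = project(point, focal)
    return math.hypot(u - observed[0], v - observed[1])


def squared_cost(r):
    """Traditional least-squares cost: grows quadratically with the residual."""
    return 0.5 * r * r


def huber_cost(r, delta):
    """Huber cost: quadratic for |r| <= delta, linear beyond it.

    Large residuals (likely outliers) are penalized less than under
    squared_cost, which limits their influence on the estimate.
    """
    a = abs(r)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def total_cost(points, observations, focal, cost):
    """Sum a per-residual cost over all point/observation pairs."""
    return sum(
        cost(reprojection_residual(p, o, focal))
        for p, o in zip(points, observations)
    )
```

A call such as `total_cost(points, observations, 100.0, lambda r: huber_cost(r, 2.0))` evaluates the robust variant, and passing `squared_cost` instead gives the least-squares variant. A real bundle adjuster would minimize this sum over both structure and camera parameters rather than only evaluate it.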
A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms
, 2006
Cited by 189 (12 self)
This paper presents a quantitative comparison of several multi-view stereo reconstruction algorithms. Until now, the lack of suitable calibrated multi-view image datasets with known ground truth (3D shape models) has prevented such direct comparisons. In this paper, we first survey multi-view stereo algorithms and compare them qualitatively using a taxonomy that differentiates their key properties. We then describe our process for acquiring and calibrating multi-view image datasets with high-accuracy ground truth and introduce our evaluation methodology. Finally, we present the results of our quantitative comparison of state-of-the-art multi-view stereo reconstruction algorithms on six benchmark datasets. The datasets, evaluation details, and instructions for submitting new models are available online at http://vision.middlebury.edu/mview.
M2Tracker: A Multi-View Approach to Segmenting and Tracking People in a Cluttered Scene Using Region-Based Stereo
- International Journal of Computer Vision, 2002
Cited by 132 (10 self)
We present a system that is capable of segmenting, detecting and tracking multiple people in a cluttered scene using multiple synchronized cameras located far from each other. The system improves upon existing systems in many ways including: (1) We do not assume that a foreground connected component belongs to only one object; rather, we segment the views taking into account color models for the objects and the background. This helps us not only to separate foreground regions belonging to different objects, but also to obtain better background regions than traditional background subtraction methods (as it uses foreground color models in the algorithm). (2) It is fully automatic and does not require any manual input or initialization of any kind. (3) Instead of making decisions about object detection and tracking from a single view or camera pair, we collect evidence from each pair and combine it to obtain a decision in the end. This helps us to obtain much better detection and tracking than traditional systems.
Object-Centered Surface Reconstruction: Combining Multi-Image Stereo and Shading
- International Journal of Computer Vision, 1995
Cited by 103 (19 self)
Our goal is to reconstruct both the shape and reflectance properties of surfaces from multiple images. We argue that an object-centered representation is most appropriate for this purpose because it naturally accommodates multiple sources of data, multiple images (including motion sequences of a rigid object), and self-occlusions. We then present a specific object-centered reconstruction method and its implementation. The method begins with an initial estimate of surface shape provided, for example, by triangulating the result of conventional stereo. The surface shape and reflectance properties are then iteratively adjusted to minimize an objective function that combines information from multiple input images. The objective function is a weighted sum of stereo, shading, and smoothness components, where the weight varies over the surface. For example, the stereo component is weighted more strongly where the surface projects onto highly textured areas in the images, and less strongly othe...
A stereo machine for video-rate dense depth mapping and its new applications
- in Proc. IEEE Computer Vision and Pattern Recognition
Cited by 84 (5 self)
The CMU RSTA Project has been developing a video-rate stereo machine that can generate a dense depth map at the video rate. The performance benchmarks of the CMU video-rate stereo machine are: 1) multi-image input of up to 6 cameras; 2) throughput of 30 million point × disparity range per second; 3) frame rate of 30 frames/sec; 4) a dense depth map of up to 256 × 240 pixels; 5) disparity search range of up to 60 pixels; 6) high precision of depth output up to 8 bits (with interpolation). The capability of passively producing such a dense depth map (3D representation) of a scene at the video rate can open up a new class of applications of 3D vision: merging real and virtual worlds in real time.
Task Parallelism in a High Performance Fortran Framework
- IEEE Parallel and Distributed Technology, 1994
Cited by 83 (18 self)
High Performance Fortran (HPF) has emerged as a standard dialect of Fortran for data parallel computing. However, for a wide variety of applications, both task and data parallelism must be exploited to achieve the best possible performance on a multicomputer. We present the design and implementation of a Fortran compiler that integrates task and data parallelism in an HPF framework. A small set of simple directives allows users to express task parallel programs in a variety of domains. The user identifies opportunities for task parallelism, and the compiler handles task creation and management, as well as communication between tasks. Since a unified compiler handles both task parallelism and data parallelism, existing data parallel programs and libraries can serve as the building blocks for constructing larger task parallel programs. This paper concludes with a description of several parallel application kernels that were developed with the compiler. The examples demonstrate that exploi...
Stereo Matching with Non-Linear Diffusion
- International Journal of Computer Vision, 1998
Cited by 82 (14 self)
One of the central problems in stereo matching (and other image registration tasks) is the selection of optimal window sizes for comparing image regions. This paper addresses this problem with some novel algorithms based on iteratively diffusing support at different disparity hypotheses, and locally controlling the amount of diffusion based on the current quality of the disparity estimate. It also develops a novel Bayesian estimation technique which significantly outperforms techniques based on area-based matching (SSD) and regular diffusion. We provide experimental results on both synthetic and real stereo image pairs.
1 Introduction and related work
Most area-based approaches to the stereo correspondence problem perform the following three tasks:
1. For each disparity under consideration, compute a per-pixel matching cost;
2. Aggregate support spatially (e.g. by summing over a window, or by diffusion);
3. Across all disparities, find the best match based on the aggregated support. ...
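The three tasks listed above can be sketched in a few lines of plain Python. This is a minimal illustration of the generic area-based baseline only, not the paper's diffusion or Bayesian method: it assumes rectified grayscale images stored as lists of rows, a squared-difference matching cost, a fixed square summation window, and a winner-take-all selection. All function and parameter names are hypothetical.

```python
def ssd_cost(left, right, x, y, d):
    """Task 1: squared difference between left(x, y) and right(x - d, y)."""
    xr = x - d
    if xr < 0:
        return float("inf")  # the candidate match falls outside the right image
    diff = left[y][x] - right[y][xr]
    return float(diff * diff)


def aggregate(cost, radius):
    """Task 2: sum costs over a (2*radius + 1)^2 window, clipped at the borders."""
    h, w = len(cost), len(cost[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += cost[yy][xx]
            out[y][x] = total
    return out


def disparity_map(left, right, max_disp, radius=1):
    """Task 3: per pixel, keep the disparity with the lowest aggregated cost."""
    h, w = len(left), len(left[0])
    best_cost = [[float("inf")] * w for _ in range(h)]
    best_disp = [[0] * w for _ in range(h)]
    for d in range(max_disp + 1):
        raw = [[ssd_cost(left, right, x, y, d) for x in range(w)]
               for y in range(h)]
        agg = aggregate(raw, radius)
        for y in range(h):
            for x in range(w):
                # Strict comparison: ties go to the smaller disparity.
                if agg[y][x] < best_cost[y][x]:
                    best_cost[y][x] = agg[y][x]
                    best_disp[y][x] = d
    return best_disp
```

Because out-of-image matches get an infinite cost, a pixel whose window touches columns below `d` cannot be assigned disparity `d`, so estimates near the left border are unreliable in this sketch. The fixed `radius` is exactly the window-size choice that the abstract identifies as the central difficulty.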
Toward Automatic Robot Instruction from Perception - Mapping Human Grasps to Manipulator Grasps
- IEEE Transactions on Robotics and Automation, 1997
Cited by 78 (4 self)
Conventional methods for programming a robot are either inflexible or demand significant expertise. While the notion of automatic programming by high-level goal specification addresses these issues, the overwhelming complexity of planning manipulator grasps and paths remains a formidable obstacle to practical implementation. Our approach is to program the robot by direct human demonstration. Our system observes a human performing the task, recognizes the human grasp, and maps it onto the manipulator. Using human actions to guide robot execution greatly reduces the planning complexity. After the human task execution is recorded, temporal task segmentation is carried out to identify task breakpoints. This step facilitates human grasp recognition and object motion extraction for robot execution of the task. This paper describes how an observed human grasp can be mapped to that of a given general-purpose manipulator for task replication. Planning the manipulator grasp based upon the observed human grasp is done at two levels: the functional and physical levels. Initially, at the functional level, grasp mapping is achieved at the virtual finger level; the virtual finger is a group of fingers acting against an object surface in a similar manner. Subsequently, at the physical level, the geometric properties of the object and manipulator are considered in fine-tuning the manipulator grasp. Our work concentrates on power or enveloping grasps and fingertip precision grasps. We conclude by showing an example of an entire programming cycle from human demonstration to robot execution.

