## A theory of shape by space carving (1999)

Venue: | In Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV-99), volume I, pages 307–314, Los Alamitos, CA |

Citations: | 457 - 14 self |

@INPROCEEDINGS{Kutulakos99atheory,

author = {Kiriakos N. Kutulakos and Steven M. Seitz},

title = {A theory of shape by space carving},

booktitle = {In Proceedings of the 7th IEEE International Conference on Computer Vision (ICCV-99), volume I, pages 307–314, Los Alamitos, CA},

year = {1999},

publisher = {IEEE}

}

### Abstract

In this paper we consider the problem of computing the 3D shape of an unknown, arbitrarily-shaped scene from multiple photographs taken at known but arbitrarily-distributed viewpoints. By studying the equivalence class of all 3D shapes that reproduce the input photographs, we prove the existence of a special member of this class, the photo hull, that (1) can be computed directly from photographs of the scene, and (2) subsumes all other members of this class. We then give a provably-correct algorithm, called Space Carving, for computing this shape and present experimental results on complex real-world scenes. The approach is designed to (1) build photorealistic shapes that accurately model scene appearance from a wide range of viewpoints, and (2) account for the complex interactions between occlusion, parallax, shading, and their effects on arbitrary views of a 3D scene.
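The carving idea summarized in the abstract can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: the scene is a 2D grid of cells, there are four orthographic views (one per grid side), each cell has a single view-independent intensity, and the names `make_views`, `photograph`, and `space_carve` are hypothetical. A cell is removed when the pixels of the views that currently see it disagree, or when one of those pixels is background; whatever survives contains every shape that reproduces the input images.

```python
# Toy 2D illustration of carving with a photo-consistency test.
# The grid world, the four orthographic views, and all names below are
# illustrative assumptions, not the implementation from the paper.

BACKGROUND = None  # pixel value for rays that hit nothing


def make_views(size):
    """Return, per view, a list of rays; each ray lists cells from near to far."""
    n = range(size)
    return {
        "left": [[(x, y) for x in n] for y in n],
        "right": [[(x, y) for x in reversed(n)] for y in n],
        "bottom": [[(x, y) for y in n] for x in n],
        "top": [[(x, y) for y in reversed(n)] for x in n],
    }


def first_hit(ray, occupied):
    """First cell along the ray that belongs to `occupied`, or None."""
    for cell in ray:
        if cell in occupied:
            return cell
    return None


def photograph(scene, views):
    """Render a scene {cell: intensity} into one 1D image per view."""
    return {
        name: [scene.get(first_hit(ray, scene), BACKGROUND) for ray in rays]
        for name, rays in views.items()
    }


def space_carve(images, views, size, tol=1e-6):
    """Carve a full grid until every visible cell is photo-consistent."""
    volume = {(x, y) for x in range(size) for y in range(size)}
    changed = True
    while changed:
        changed = False
        # Collect, for each currently visible cell, the pixels that see it.
        seen = {}
        for name, rays in views.items():
            for pixel, ray in zip(images[name], rays):
                cell = first_hit(ray, volume)
                if cell is not None:
                    seen.setdefault(cell, []).append(pixel)
        # Remove cells whose observations cannot come from one intensity.
        for cell, pixels in seen.items():
            if BACKGROUND in pixels or max(pixels) - min(pixels) > tol:
                volume.discard(cell)
                changed = True
    return volume


views = make_views(4)
scene = {(1, 1): 0.2, (2, 2): 0.9}
images = photograph(scene, views)
hull = space_carve(images, views, 4)
print(sorted(hull))
```

The sketch recomputes visibility by brute force on every pass, which is only workable for tiny grids; the implementation described in the paper instead sweeps planes through the voxel volume. Removing several cells in one pass is safe here because a cell seen by a view in the current volume is also seen by that view in any smaller shape that contains it, so an inconsistent cell cannot belong to any shape that reproduces the images.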

893 | Shape and motion from image streams under orthography: a factorization method
- Tomasi, Kanade
- 1992
Citation Context ... changes in visibility, due to the complete rotation of the object in front of the camera. The low-texture and occlusion properties also cause problems for feature-based structure-from-motion methods =-=[37, 43, 60, 61]-=-, due to the difficulty of locating and tracking a sufficient number of features throughout the sequence. While volume intersection [10, 21, 56] and other contour-based techniques [6, 8, 9, 41, 42, 62... |

889 | Modeling and Rendering Architecture from Photographs: A Hybrid Geometry
- Debevec, Taylor, et al.
- 1996
Citation Context ... smoothness constraints or other geometric heuristics, there are many cases where it may be advantageous to impose a priori constraints, especially when the scene is known to have a certain structure =-=[53, 54]-=-. Least-commitment reconstruction suggests a new way of incorporating such constraints: rather than imposing them as early as possible in the reconstruction process, we can impose them after first rec... |

794 | A volumetric method for building complex models from range images
- Curless, Levoy
- 1996
Citation Context ...o global reconstruction algorithms [12, 37] that recover 3D shape information from all photographs in a single step. This eliminates the need for complex partial reconstruction and merging operations =-=[38, 39]-=- in which partial 3D shape information is extracted from subsets of the photographs [32, 40–42], and where global consistency with the entire set of photographs is not guaranteed for the final shape. ... |

454 |
Zippered Polygon Meshes from Range Images
- Turk, Levoy
- 1994
Citation Context ...o global reconstruction algorithms [12, 37] that recover 3D shape information from all photographs in a single step. This eliminates the need for complex partial reconstruction and merging operations =-=[38, 39]-=- in which partial 3D shape information is extracted from subsets of the photographs [32, 40–42], and where global consistency with the entire set of photographs is not guaranteed for the final shape. ... |

387 |
The visual hull concept for silhouette-based image understanding
- Laurentini
- 1994
Citation Context ...o a cone defined by - and the photograph’s nonbackground pixels. + Given such photographs, the scene is restricted to the visual hull, which is the volume of intersection of their corresponding cones =-=[5]-=-. When no a priori information is available about the scene’s radiance, the visual hull defines all the shape constraints in the input photographs. This is because there is always an assignment of rad... |

387 | Photorealistic scene reconstruction by voxel coloring
- Seitz, Dyer
- 1999
Citation Context ... they rely on the presence of specific image features such as edges and hence generate only sparse reconstructions [28], or they place strong constraints on the input viewpoints relative to the scene =-=[29, 30]-=-. Our implementation of the Space Carving Algorithm also uses plane sweeps, but unlike all previous methods the algorithm guarantees complete reconstructions in the general case. Our approach offers s... |

347 |
Theory for off-specular reflection from roughened surfaces
- Torrance, Sparrow
- 1967
Citation Context ... such as shadows, transparencies and inter-reflections can be ignored, and is sufficiently general to include scenes with parameterized radiance models (e.g., Lambertian, Phong [16], Torrance-Sparrow =-=[17]-=-). Using this observation as a starting point, we show how to compute, from arbitrary photographs of an unknown scene, a maximal photo-consistent shape that encloses the set of all photo-consistent re... |

322 | What is the set of images of an object under all possible lighting conditions - Belhumeur, Kriegman - 1996 |

292 |
Affine structure from motion
- Koenderink, Doorn
- 1991
Citation Context ...ape that encloses the set of all photo-consistent reconstructions. The only requirements are that (1) the viewpoint of each photograph is known in a common 3D world reference frame (Euclidean, affine =-=[18]-=-, or projective [19]), and (2) scene radiance follows a known, locally-computable radiance function. Experimental results illustrating our method’s performance are given for both real and simulated ge... |

283 |
Self-Calibration and Metric Reconstruction in spite of Varying and
- Pollefeys, Koch, et al.
- 1997
Citation Context ... changes in visibility, due to the complete rotation of the object in front of the camera. The low-texture and occlusion properties also cause problems for feature-based structure-from-motion methods =-=[37, 43, 60, 61]-=-, due to the difficulty of locating and tracking a sufficient number of features throughout the sequence. While volume intersection [10, 21, 56] and other contour-based techniques [6, 8, 9, 41, 42, 62... |

265 |
A multiple-baseline stereo
- Okutomi, Kanade
- 1993
Citation Context ...al problem in computer vision is reconstructing the shape of a complex 3D scene from multiple photographs. While current techniques work well under controlled conditions (e.g., small stereo baselines =-=[1]-=-, active viewpoint control [2], spatial and temporal smoothness [3], or scenes containing linear features or texture-less surfaces [4–6]), very little is known about scene reconstruction under general... |

239 |
Computational Vision and Regularization Theory
- Poggio, Koch, et al.
- 1985
Citation Context ...f shapes each of which reproduces all the input photographs exactly. This result is yet another manifestation of the well-known fact that 3D shape recovery from a set of images is generally ill-posed =-=[3]-=-, i.e., there may be multiple shapes that are consistent with the same set of images. 2 Reconstruction methods must therefore choose 2 Faugeras [48] has recently proposed the term metameric to describ... |

234 | A maximum-flow formulation of the n-camera stereo correspondence problem - Roy, Cox - 1998 |

222 |
Rapid Octree Construction from Image Sequences
- Szeliski
- 1993
Citation Context ...r notion of photo-consistency implicitly ensures that all these 3D shape cues are taken into account in the recovery process, our approach is related to work on stereo [1, 14, 20], shape-from-contour =-=[8, 9, 21]-=-, as well as shape-from-shading [22–24]. These approaches rely on studying a single 3D shape cue under the assumptions that (1) other sources of variability can be safely ignored, and (2) the input 2... |

208 | Epipolar-plane image analysis: An approach to determining structure from motion. IJCV
- Bolles, Baker, et al.
- 1987
Citation Context ...lex 3D scene from multiple photographs. While current techniques work well under controlled conditions (e.g., small stereo baselines [1], active viewpoint control [2], spatial and temporal smoothness =-=[3]-=-, or scenes containing linear features or texture-less surfaces [4–6]), very little is known about scene reconstruction under general conditions. In particular, in the absence of a priori geometric in... |

203 | 3D Model Acquisition from Extended Image Sequences
- Beardsley, Torr, et al.
- 1996
Citation Context ... changes in visibility, due to the complete rotation of the object in front of the camera. The low-texture and occlusion properties also cause problems for feature-based structure-from-motion methods =-=[37, 43, 60, 61]-=-, due to the difficulty of locating and tracking a sufficient number of features throughout the sequence. While volume intersection [10, 21, 56] and other contour-based techniques [6, 8, 9, 41, 42, 62... |

196 | A maximum likelihood stereo algorithm. Computer Vision and Image Understanding
- Cox, Hingorani, et al.
- 1996
Citation Context ...ependence on viewpoint. Since our notion of photo-consistency implicitly ensures that all these 3D shape cues are taken into account in the recovery process, our approach is related to work on stereo =-=[1, 14, 20]-=-, shape-from-contour [8, 9, 21], as well as shape-from-shading [22–24]. These approaches rely on studying a single 3D shape cue under the assumptions that (1) other sources of variability can be safel... |

194 | Variational principles, surface evolution, PDE’s, level set methods and the stereo problem
- Faugeras, Keriven
- 1998
Citation Context ...te in a 3D scene space and is therefore related to other scene-space stereo algorithms that have been recently proposed [27–34]. Of these, most closely related are recent mesh-based [27] and level-set =-=[35]-=- algorithms, as well as algorithms that sweep a plane or other manifold through a discretized scene 1 Examples include the use of the small baseline assumption in stereo to simplify correspondence-fin... |

190 | Object shape and reflectance modeling from observation
- Sato, Wheeler, et al.
- 1997
Citation Context ...ific examples include (1) using a mobile camera mounted with a light source to capture photographs of a scene whose reflectance can be expressed in closed form (e.g., using the Torrance-Sparrow model =-=[17, 47]-=-), and (2) using multiple cameras to capture photographs of an approximately Lambertian scene under arbitrary unknown illumination (Fig. 1). ... |

183 | A space-sweep approach to true multi-image matching
- Collins
- 1996
Citation Context ...ave many attractive properties, existing algorithms [28–30, 33] are not general i.e., they rely on the presence of specific image features such as edges and hence generate only sparse reconstructions =-=[28]-=-, or they place strong constraints on the input viewpoints relative to the scene [29, 30]. Our implementation of the Space Carving Algorithm also uses plane sweeps, but unlike all previous methods the... |

134 | Wide baseline stereo matching
- Pritchett, Zisserman
- 1998
Citation Context ... techniques work well under controlled conditions (e.g., small stereo baselines [1], active viewpoint control [2], spatial and temporal smoothness [3–5], or scenes containing curved lines [6], planes =-=[7]-=-, or texture-less surfaces [8–12]), very little is known about scene reconstruction under general conditions. In particular, in the absence of a priori geometric information, what can we infer about t... |

132 |
Motion from point matches: Multiplicity of solutions
- Faugeras, Maybank
- 1990
Citation Context ...nalyzing the general properties of recently-proposed scene-space stereo techniques [27–34]. In this respect, our analysis has goals similar to those of theoretical approaches to structure-from-motion =-=[36]-=-, although the different assumptions employed (i.e., unknown vs. known correspondences, known vs. unknown camera motion), make the 1 Examples include the use of the small baseline assumption in stereo... |

125 |
Blue screen matting
- Smith, Blinn
- 1996
Citation Context ...nd pixels in these photographs. Unfortunately, these constraints become useless when photographs contain no background pixels (i.e., the visual hull degenerates to ) or when background identification =-=[59]-=- cannot be performed accurately. Below we study the picture constraints provided by non-background pixels when the scene’s radiance is restricted to a special class of radiance models. The resulting p... |

124 | Constructing virtual worlds using dense stereo - Narayanan, Rander, et al. - 1998 |

123 |
Surface shape from deformation of apparent contours
- Cipolla, Blake
- 1992
Citation Context ...r notion of photo-consistency implicitly ensures that all these 3D shape cues are taken into account in the recovery process, our approach is related to work on stereo [1, 14, 20], shape-from-contour =-=[8, 9, 21]-=-, as well as shape-from-shading [22–24]. These approaches rely on studying a single 3D shape cue under the assumptions that (1) other sources of variability can be safely ignored, and (2) the input 2... |

121 | 3-D scene data recovery using omnidirectional multibaseline stereo - Kang, Szeliski - 1997 |

119 | Object-centered surface reconstruction: Combining multiimage stereo and shading
- Fua, Leclerc
- 1995
Citation Context ...owever, does operate in a 3D scene space and is therefore related to other scene-space stereo algorithms that have been recently proposed [27–34]. Of these, most closely related are recent mesh-based =-=[27]-=- and level-set [35] algorithms, as well as algorithms that sweep a plane or other manifold through a discretized scene space [28–30, 33]. While the algorithms in [27, 35] generate high-quality reconst... |

114 | 3D human body model acquisition from multiple views. ICCV'95
- Kakadiaris, Metaxas
Citation Context ...ng smoothness constraints or other geometric heuristics, there are many cases where it may be advantageous to impose a priori constraints, especially when the scene is known to have a certain structure =-=[53, 54]-=-. Least-commitment reconstruction suggests a new way of incorporating such constraints: rather than imposing them as early as possible in the reconstruction process, we can impose them after first rec... |

107 | A Bayesian approach to binocular stereopsis - Belhumeur - 1996 |

107 |
Volumetric description of objects from multiple views
- Martin, Aggarwal
- 1983
Citation Context ...ce Carving Algorithm that iteratively “carves” out the scene from an initial set of voxels. This implementation can be seen as a generalization of silhouette-based techniques like volume intersection =-=[21, 44, 56, 57]-=- to the case of gray-scale and full-color images, and extends voxel coloring [29] and plenoptic decomposition [30] to the case of arbitrary camera geometries. Section 5 c... |

103 |
Using Extremal Boundaries for 3-D Object Modeling
- Vaillant, Faugeras
- 1992
Citation Context ...r notion of photo-consistency implicitly ensures that all these 3D shape cues are taken into account in the recovery process, our approach is related to work on stereo [1, 14, 20], shape-from-contour =-=[8, 9, 21]-=-, as well as shape-from-shading [22–24]. These approaches rely on studying a single 3D shape cue under the assumptions that (1) other sources of variability can be safely ignored, and (2) the input 2... |

94 | A Stereo Machine For Videorate Dense Depth Mapping And
- Kanade, Yoshida, et al.
- 1996
Citation Context ...nces, known vs. unknown camera motion), make the 1 Examples include the use of the small baseline assumption in stereo to simplify correspondence-finding and maximize joint visibility of scene points =-=[26]-=-, the availability of easily-detectable image contours in shape-from-contour reconstruction [9], and the assumption that all views are taken from the same viewpoint in photometric stereo [24]. 3geome... |

89 | Stereo matching with transparency and matting - Szeliski, Golland - 1998 |

77 | Surfaces from stereo: Integrating feature matching, disparity estimation, and contour detection
- Hoff, Ahuja
- 1989
Citation Context ...ependence on viewpoint. Since our notion of photo-consistency implicitly ensures that all these 3D shape cues are taken into account in the recovery process, our approach is related to work on stereo =-=[1, 14, 20]-=-, shape-from-contour [8, 9, 21], as well as shape-from-shading [22–24]. These approaches rely on studying a single 3D shape cue under the assumptions that (1) other sources of variability can be safel... |

72 |
Stratification of Three Dimensional Vision: Projective, Affine and Metric Representations
- Faugeras
- 1995
Citation Context ... where these constraints are used only to choose among shapes within the class of photo-consistent reconstructions. This approach is similar in spirit to “stratification” approaches of shape recovery =-=[18, 55]-=-, where 3D shape is first recovered modulo an equivalence class of reconstructions and is then refined within that class at subsequent stages of processing. The remainder of this paper is structured a... |

62 | Minpran: A New Robust Estimator for Computer Vision - Stewart - 1995 |

57 | Virtualized Reality: Concepts and Early Results - Kanade, Narayanan, et al. - 1995 |

55 | Recovering Shape by Purposive Viewpoint Adjustment, Int
- Kutulakos, Dyer
- 1994
Citation Context ...is reconstructing the shape of a complex 3D scene from multiple photographs. While current techniques work well under controlled conditions (e.g., small stereo baselines [1], active viewpoint control =-=[2]-=-, spatial and temporal smoothness [3], or scenes containing linear features or texture-less surfaces [4–6]), very little is known about scene reconstruction under general conditions. In particular, in... |

46 | Robust shape recovery from occluding contours using a linear smoother
- Szeliski, Weiss
- 1998
Citation Context ...ods [37, 43, 60, 61], due to the difficulty of locating and tracking a sufficient number of features throughout the sequence. While volume intersection [10, 21, 56] and other contour-based techniques =-=[6, 8, 9, 41, 42, 62]-=- are often used successfully in similar experiments, they require the detection of silhouettes or occluding contours. For the gargoyle sequence, the background was unknown and heterogeneous, making th... |

35 | Global surface reconstruction by purposive control of observer motion
- Kutulakos, Dyer
- 1995
Citation Context ...om cameras distributed throughout the inside and outside of the house. 4. Because no constraints on the camera viewpoints are imposed, our approach leads naturally to global reconstruction algorithms =-=[12, 37]-=- that recover 3D shape information from all photographs in a single step. This eliminates the need for complex partial reconstruction and merging operations [38, 39] in which partial 3D shape informat... |

35 | Plenoptic image editing - Seitz, Kutulakos - 1998 |

34 |
Reality modeling and visualization from multiple video sequences
- Moezzi, Katkere, et al.
- 1996
Citation Context ...ce Carving Algorithm that iteratively “carves” out the scene from an initial set of voxels. This implementation can be seen as a generalization of silhouette-based techniques like volume intersection =-=[21, 44, 56, 57]-=- to the case of gray-scale and full-color images, and extends voxel coloring [29] and plenoptic decomposition [30] to the case of arbitrary camera geometries. Section 5 c... |

33 | A Viewpoint Dependent Stereoscopic Display Method with Interpolation and - Katayama, Tanaka, et al. - 1996 |

25 | Learning Object Representations from Lighting Variation, ECCV - Epstein, Yuille, et al. |

25 | Photometric stereo: Lambertian reflectance and light sources with unknown direction and strength
- Woodham, Iwahori, et al.
- 1991
Citation Context ...e points [26], the availability of easily-detectable image contours in shape-from-contour reconstruction [9], and the assumption that all views are taken from the same viewpoint in photometric stereo =-=[24]-=-. geometry, solution space, and underlying techniques completely different. 2. Our analysis provides a volume which is the tightest possible bound on the shape of the true scene that can be inferred... |

19 |
Visual shape computing
- Aloimonos
- 1988
Citation Context ...r scene to reconstruct from the space of all consistent shapes. Traditionally, the most common way of dealing with this ambiguity has been to apply smoothness heuristics and regularization techniques =-=[3, 51]-=- to obtain reconstructions that are as smooth as possible. A drawback of this type of approach is that it typically penalizes discontinuities and sharp edges, features that are very common in real sce... |

15 |
Building three-dimensional object models from image sequences
- Seales, Faugeras
- 1995
Citation Context ...ods [37, 43, 60, 61], due to the difficulty of locating and tracking a sufficient number of features throughout the sequence. While volume intersection [10, 21, 56] and other contour-based techniques =-=[6, 8, 9, 41, 42, 62]-=- are often used successfully in similar experiments, they require the detection of silhouettes or occluding contours. For the gargoyle sequence, the background was unknown and heterogeneous, making th... |

15 | Image-based geometrically-correct photorealistic scene/object modeling (IBPhM): a review - Zhang - 1998 |

14 |
Stereo matching, reconstruction and refinement of 3d curves using deformable contours
- Bascle, Deriche
- 1993
Citation Context ...hile current techniques work well under controlled conditions (e.g., small stereo baselines [1], active viewpoint control [2], spatial and temporal smoothness [3–5], or scenes containing curved lines =-=[6]-=-, planes [7], or texture-less surfaces [8–12]), very little is known about scene reconstruction under general conditions. In particular, in the absence of a priori geometric information, what can we i... |

14 | Complete scene structure from four point correspondences
- Seitz, Dyer
- 1995
Citation Context ...om cameras distributed throughout the inside and outside of the house. 4. Because no constraints on the camera viewpoints are imposed, our approach leads naturally to global reconstruction algorithms =-=[12, 37]-=- that recover 3D shape information from all photographs in a single step. This eliminates the need for complex partial reconstruction and merging operations [38, 39] in which partial 3D shape informat... |