## Manhattan-world Stereo

Citations: 41 (6 self)

### BibTeX

@MISC{Furukawa_manhattan-worldstereo,
    author = {Yasutaka Furukawa and Brian Curless and Steven M. Seitz and Richard Szeliski},
    title = {Manhattan-world Stereo},
    year = {}
}

### Abstract

Multi-view stereo (MVS) algorithms now produce reconstructions that rival laser range scanner accuracy. However, stereo algorithms require textured surfaces, and therefore work poorly for many architectural scenes (e.g., building interiors with textureless, painted walls). This paper presents a novel MVS approach to overcome these limitations for Manhattan World scenes, i.e., scenes that consist of piecewise planar surfaces with dominant directions. Given a set of calibrated photographs, we first reconstruct textured regions using an existing MVS algorithm, then extract dominant plane directions, generate plane hypotheses, and recover per-view depth maps using Markov random fields. We have tested our algorithm on several datasets ranging from office interiors to outdoor buildings, and demonstrate results that outperform the current state of the art for such texture-poor scenes.
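The plane-hypothesis step mentioned in the abstract is described in more detail in the mean shift citation context listed under Citations: for each dominant axis $\vec{d}_k$, the offsets $\vec{d}_k \cdot P_i$ of the reconstructed points are clustered in 1D, and candidate planes are placed at the peaks. The following is a minimal Python sketch of that idea, not the authors' implementation. The Gaussian kernel, the `bandwidth` and `min_support` parameters, the mode-merging rule, and all function names are assumptions introduced for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_shift_1d(samples, bandwidth, iters=50, tol=1e-6):
    """Shift each sample toward a nearby density peak (Gaussian kernel, assumed)."""
    modes = []
    for x in samples:
        for _ in range(iters):
            weights = [math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples]
            new_x = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
            converged = abs(new_x - x) < tol
            x = new_x
            if converged:
                break
        modes.append(x)
    return modes


def plane_offsets(points, axis, bandwidth=0.1, min_support=2):
    """Return (offset, support) pairs for candidate planes axis . X = offset.

    points: iterable of 3D points P_i; axis: unit vector d_k.
    Clusters with fewer than min_support samples are dropped (hypothetical rule).
    """
    offsets = [dot(axis, p) for p in points]
    modes = sorted(mean_shift_1d(offsets, bandwidth))
    clusters = []  # each entry is [sum of modes, sample count]
    for m in modes:
        # Merge modes that converged to (almost) the same peak.
        if clusters and abs(m - clusters[-1][0] / clusters[-1][1]) < bandwidth / 2:
            clusters[-1][0] += m
            clusters[-1][1] += 1
        else:
            clusters.append([m, 1])
    return [(total / n, n) for total, n in clusters if n >= min_support]


if __name__ == "__main__":
    # Toy data: two walls perpendicular to the x axis plus one stray point.
    pts = [(0.00, 0, 0), (0.02, 1, 0), (-0.02, 0, 1),
           (1.00, 0, 0), (1.01, 1, 1), (0.99, 2, 0), (5.0, 0, 0)]
    for offset, support in plane_offsets(pts, (1.0, 0.0, 0.0)):
        print(f"plane x = {offset:.3f} supported by {support} points")
```

Running the same function once per dominant axis would give the full set of candidate planes that the per-pixel MRF labeling then chooses among. How weakly supported clusters are actually treated is not specified in the text shown here, so the `min_support` filter should be read as a placeholder.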

### Citations

1619 | Mean shift: A robust approach toward feature space analysis
- Comaniciu, Meer
- 2002
Citation Context: ...rough $P_i$ has an offset $\vec{d}_k \cdot P_i$; i.e., the plane equation is $\vec{d}_k \cdot X = \vec{d}_k \cdot P_i$. For each axis direction $\vec{d}_k$ we compute the set of offsets $\{\vec{d}_k \cdot P_i\}$ and perform a 1D mean shift clustering [7] to extract clusters and peaks. The candidate planes are generated at the offsets of the peaks. Some clusters may contain a small number of samples, thus providing only weak support for the correspond...

1485 | Fast approximate energy minimization via graph cuts
- Boykov, Veksler, et al.
- 2001
Citation Context: ... geometry. We then recover a depth map for each image by assigning one of the candidate planes to each pixel in the image. This step is posed as a Markov random field (MRF) and solved with graph cuts [4, 5, 13] (Fig. 2). 1.1. Related work. Our work builds upon a long tradition of piecewise-planar stereo, beginning with the seminal work of Wang [...] [Figure labels: Oriented points reconstructed by MVS; Dominant axes extracted fro...]

1129 | A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms
- Scharstein, Szeliski
- 2002
Citation Context: ...vector and then find one or two dominant plane directions orthogonal to this vector using low-level cues such as reconstructed 3D points or lines. They then sweep families of planes through the scene [6, 16] and measure the photoconsistency or correlation at each pixel in order to estimate depth maps. There also exist approaches specifically designed for architectural scenes. Cornelis et al. [9] estimate...

870 | An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision
- Boykov, Kolmogorov
- 2004
Citation Context: ... geometry. We then recover a depth map for each image by assigning one of the candidate planes to each pixel in the image. This step is posed as a Markov random field (MRF) and solved with graph cuts [4, 5, 13] (Fig. 2). 1.1. Related work. Our work builds upon a long tradition of piecewise-planar stereo, beginning with the seminal work of Wang [...] [Figure labels: Oriented points reconstructed by MVS; Dominant axes extracted fro...]

749 | What Energy Functions can be Minimized via Graph Cuts?
- Kolmogorov, Zabih
- 2004
Citation Context: ... geometry. We then recover a depth map for each image by assigning one of the candidate planes to each pixel in the image. This step is posed as a Markov random field (MRF) and solved with graph cuts [4, 5, 13] (Fig. 2). 1.1. Related work. Our work builds upon a long tradition of piecewise-planar stereo, beginning with the seminal work of Wang [...] [Figure labels: Oriented points reconstructed by MVS; Dominant axes extracted fro...]

483 | A Theory of Shape by Space Carving
- Kutulakos, Seitz
- 2000
Citation Context: ...e this assumption may seem to be overly restrictive, note that any scene can be arbitrarily-well approximated (to first order) by axis-aligned geometry, as in the case of a high resolution voxel grid [14, 17]. While the Manhattan-world model may be reminiscent of blocksworld models from the 70’s and 80’s, we demonstrate state-of-the-art results on very complex environments. Our approach, within the constra...

476 | Representing moving images with layers
- Wang, Adelson
- 1994
Citation Context: ... dominant axes $d_1, d_2, d_3$. Hypothesis planes are found by finding point density peaks along each axis $d_i$. These planes are then used as per-pixel labels in an MRF. [...] and Adelson on layered motion models [20]. Several authors, including Baker et al. [1], Birchfield and Tomasi [3], and Tao et al. [19], have specialized the 2D affine motion models first suggested by Wang and Adelson to the rigid multi-view ...

411 | Photorealistic scene reconstruction by voxel coloring
- Seitz, Dyer
- 1997
Citation Context: ...e this assumption may seem to be overly restrictive, note that any scene can be arbitrarily-well approximated (to first order) by axis-aligned geometry, as in the case of a high resolution voxel grid [14, 17]. While the Manhattan-world model may be reminiscent of blocksworld models from the 70’s and 80’s, we demonstrate state-of-the-art results on very complex environments. Our approach, within the constra...

230 | Poisson surface reconstruction
- Kazhdan, Bolitho, et al.
- 2006
Citation Context: ... algorithm with a state of the art MVS approach, where PMVS software [11] is used to recover oriented points, which are converted into a mesh model using Poisson Surface Reconstruction software (PSR) [12]. The first row of the figure shows PSR reconstructions. PSR fills all holes with curved surfaces that do not respect the architectural structure of the scenes. PSR also generates closed surfaces, inc...

192 | A space-sweep approach to true multi-image matching
- Collins
- 1996
Citation Context: ...vector and then find one or two dominant plane directions orthogonal to this vector using low-level cues such as reconstructed 3D points or lines. They then sweep families of planes through the scene [6, 16] and measure the photoconsistency or correlation at each pixel in order to estimate depth maps. There also exist approaches specifically designed for architectural scenes. Cornelis et al. [9] estimate...

119 | Multiway cut for stereo and motion with slanted surfaces
- Birchfield, Tomasi
- 1999
Citation Context: ...sity peaks along each axis $d_i$. These planes are then used as per-pixel labels in an MRF. [...] and Adelson on layered motion models [20]. Several authors, including Baker et al. [1], Birchfield and Tomasi [3], and Tao et al. [19], have specialized the 2D affine motion models first suggested by Wang and Adelson to the rigid multi-view stereo setting. What all these algorithms have in common is that they al...

107 | A global matching framework for stereo computation
- Tao, Sawhney, et al.
- 2001
Citation Context: ...h axis $d_i$. These planes are then used as per-pixel labels in an MRF. [...] and Adelson on layered motion models [20]. Several authors, including Baker et al. [1], Birchfield and Tomasi [3], and Tao et al. [19], have specialized the 2D affine motion models first suggested by Wang and Adelson to the rigid multi-view stereo setting. What all these algorithms have in common is that they alternate between assig...

106 | A layered approach to stereo reconstruction
- Baker, Szeliski, et al.
- 1998
Citation Context: ... found by finding point density peaks along each axis $d_i$. These planes are then used as per-pixel labels in an MRF. [...] and Adelson on layered motion models [20]. Several authors, including Baker et al. [1], Birchfield and Tomasi [3], and Tao et al. [19], have specialized the 2D affine motion models first suggested by Wang and Adelson to the rigid multi-view stereo setting. What all these algorithms hav...

92 | New techniques for automated architecture reconstruction from photographs
- Werner, Zisserman
- 2002
Citation Context: ...ted research uses dominant plane orientations in outdoor architectural models to perform plane sweep stereo reconstruction. Notable examples are the work of Coorg and Teller [8], Werner and Zisserman [21], and Pollefeys et al. [15]. These approaches first estimate the gravity (up) vector and then find one or two dominant plane directions orthogonal to this vector using low-level cues such as reconstru...

69 | Manhattan world: Compass direction from a single image by Bayesian inference
- Coughlan, Yuille
- 1999
Citation Context: ...rnet are images of architectural scenes with texture-poor but highly structured surfaces. [...] methods with priors that are more appropriate. To this end we invoke the so-called Manhattan-world assumption [10], which states that all surfaces in the world are aligned with three dominant directions, typically corresponding to the X, Y, and Z axes; i.e., the world is piecewise-axis-aligned-planar. We call the ...

52 | 3D urban scene modeling integrating recognition and reconstruction
- Cornelis, Leibe, et al.
- 2008
Citation Context: ...cene [6, 16] and measure the photoconsistency or correlation at each pixel in order to estimate depth maps. There also exist approaches specifically designed for architectural scenes. Cornelis et al. [9] estimate ruled vertical facades in urban street scenes by correlating complete vertical scanlines in images. Barinova et al. [2] also use vertical facades to reconstruct city building models from a s...

44 | Extracting textured vertical facades from controlled close-range imagery
- Coorg, Teller
- 1999
Citation Context: ...ions. Another line of related research uses dominant plane orientations in outdoor architectural models to perform plane sweep stereo reconstruction. Notable examples are the work of Coorg and Teller [8], Werner and Zisserman [21], and Pollefeys et al. [15]. These approaches first estimate the gravity (up) vector and then find one or two dominant plane directions orthogonal to this vector using low-l...

33 | Fusion of feature- and area-based information for urban buildings modeling from aerial imagery
- Zebedin, Bauer, et al.
- 2008
Citation Context: ...e measurements to dense depth maps. We demonstrate good reconstruction results on challenging complex indoor scenes with many small axis-aligned surfaces such as tables and appliances. Zebedin et al. [22] also use an MRF to reconstruct building models, where they segment out buildings from images based on a height field, a rough building mask, and 3D lines, then recover roof shapes. Their system produ...

17 | Detailed real-time urban 3D reconstruction from video
- Pollefeys, et al.
Citation Context: ...plane orientations in outdoor architectural models to perform plane sweep stereo reconstruction. Notable examples are the work of Coorg and Teller [8], Werner and Zisserman [21], and Pollefeys et al. [15]. These approaches first estimate the gravity (up) vector and then find one or two dominant plane directions orthogonal to this vector using low-level cues such as reconstructed 3D points or lines. Th...

14 | Fast automatic single-view 3-d reconstruction of urban scenes
- Barinova, Konushin, et al.
- 2008
Citation Context: ...oaches specifically designed for architectural scenes. Cornelis et al. [9] estimate ruled vertical facades in urban street scenes by correlating complete vertical scanlines in images. Barinova et al. [2] also use vertical facades to reconstruct city building models from a single image. However, these approaches that estimate vertical facades cannot handle more complex scenes consisting of mixed verti...

6 | Bundler: Structure from Motion for Unordered Image Collections, Version 0.4; Available online: http://phototour.cs.washington.edu/bundler/ (accessed on January 29
- Snavely
- 2010
Citation Context: ...faces, sharp corners – that are challenging for standard stereo and MVS approaches. The camera parameters for each dataset were recovered using publicly available structure-from-motion (SfM) software [18]. Table 1 summarizes some characteristics of the datasets, along with the choice of the parameters in our algorithm. [Figure labels: MVS points; Point clusters extracted by the mean shift algorithm for each dominant a...]