3D Urban Scene Modeling Integrating Recognition and Reconstruction (0)

by N Cornelis, B Leibe, K Cornelis, L Van Gool

Results 1 - 10 of 25

Reconstructing building interiors from images

by Yasutaka Furukawa, Brian Curless, Steven M. Seitz, Richard Szeliski - In Proc. of the International Conference on Computer Vision (ICCV), 2009
Abstract - Cited by 25 (6 self)
This paper proposes a fully automated 3D reconstruction and visualization system for architectural scenes (interiors and exteriors). The reconstruction of indoor environments from photographs is particularly challenging due to texture-poor planar surfaces such as uniformly-painted walls. Our system first uses structure-from-motion, multiview stereo, and a stereo algorithm specifically designed for Manhattan-world scenes (scenes consisting predominantly of piece-wise planar surfaces with dominant directions) to calibrate the cameras and to recover initial 3D geometry in the form of oriented points and depth maps. Next, the initial geometry is fused into a 3D model with a novel depth-map integration algorithm that, again, makes use of Manhattan-world assumptions and produces simplified 3D models. Finally, the system enables the exploration of reconstructed environments with an interactive, image-based 3D viewer. We demonstrate results on several challenging datasets, including a 3D reconstruction and image-based walk-through of an entire floor of a house, the first result of this kind from an automated computer vision system.

Recovering the Spatial Layout of Cluttered Rooms

by Varsha Hedau, Derek Hoiem, David Forsyth
Abstract - Cited by 23 (5 self)
In this paper, we consider the problem of recovering the spatial layout of indoor scenes from monocular images. The presence of clutter is a major problem for existing single-view 3D reconstruction algorithms, most of which rely on finding the ground-wall boundary. In most rooms, this boundary is partially or entirely occluded. We gain robustness to clutter by modeling the global room space with a parametric 3D “box” and by iteratively localizing clutter and refitting the box. To fit the box, we introduce a structured learning algorithm that chooses the set of parameters to minimize error, based on global perspective cues. On a dataset of 308 images, we demonstrate the ability of our algorithm to recover spatial layout in cluttered rooms and show several examples of estimated free space.

Manhattan-world Stereo

by Yasutaka Furukawa, Brian Curless, Steven M. Seitz, Richard Szeliski
Abstract - Cited by 17 (4 self)
Multi-view stereo (MVS) algorithms now produce reconstructions that rival laser range scanner accuracy. However, stereo algorithms require textured surfaces, and therefore work poorly for many architectural scenes (e.g., building interiors with textureless, painted walls). This paper presents a novel MVS approach to overcome these limitations for Manhattan World scenes, i.e., scenes that consist of piece-wise planar surfaces with dominant directions. Given a set of calibrated photographs, we first reconstruct textured regions using an existing MVS algorithm, then extract dominant plane directions, generate plane hypotheses, and recover per-view depth maps using Markov random fields. We have tested our algorithm on several datasets ranging from office interiors to outdoor buildings, and demonstrate results that outperform the current state of the art for such texture-poor scenes.
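The "generate plane hypotheses" step in the abstract above can be illustrated with a minimal sketch. It assumes the dominant directions are already known and that the textured-region reconstruction yields oriented points with unit normals. The histogram peak picking, the function name `plane_hypotheses`, and the parameters `bin_width`, `min_votes`, and `cos_thresh` are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def plane_hypotheses(points, normals, axes, bin_width=0.05, min_votes=3, cos_thresh=0.9):
    """Return {axis_index: [offset, ...]} for candidate planes axis . x = offset.

    Hypothetical sketch: offsets are found as histogram peaks, which is an
    assumption made for illustration. Normals are assumed to be unit length.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    hypotheses = {}
    for k, axis in enumerate(axes):
        length = math.sqrt(dot(axis, axis))
        unit = [c / length for c in axis]
        # Signed offsets of the points whose normal is roughly parallel to this axis.
        offsets = [dot(p, unit) for p, n in zip(points, normals)
                   if abs(dot(n, unit)) > cos_thresh]
        # Vote for quantized offsets along the axis.
        votes = Counter(math.floor(d / bin_width) for d in offsets)
        peaks = []
        for b, count in votes.items():
            # Keep bins with enough votes that are not smaller than either neighbour.
            if count >= min_votes and count >= votes.get(b - 1, 0) and count >= votes.get(b + 1, 0):
                members = [d for d in offsets if math.floor(d / bin_width) == b]
                peaks.append(sum(members) / len(members))
        hypotheses[k] = sorted(peaks)
    return hypotheses
```

Each returned offset defines one axis-aligned candidate plane; in a pipeline like the one described, a per-pixel labeling step (the abstract mentions Markov random fields) would then choose among such candidates.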

Image-based façade modeling

by Jianxiong Xiao, Tian Fang, Ping Tan, Peng Zhao, Eyal Ofek, Long Quan - Proc. of SIGGRAPH Asia 2008, 2008
Abstract - Cited by 13 (3 self)
We propose in this paper a semi-automatic image-based approach to façade modeling that uses images captured along streets and relies on structure from motion to recover camera positions and point clouds automatically as the initial stage for modeling. We start by considering a building façade as a flat rectangular plane or a developable surface with an associated texture image composited from the multiple visible images. A façade is then decomposed and structured into a Directed Acyclic Graph of rectilinear elementary patches. The decomposition is carried out top-down by a recursive subdivision, and followed by a bottom-up merging with the detection of the architectural bilateral symmetry and repetitive patterns. Each subdivided patch of the flat façade is augmented with a depth optimized using the 3D point cloud. Our system also allows for easy user feedback in the 2D image space for the proposed decomposition and augmentation. Finally, our approach is demonstrated on a large number of façades from a variety of street-side images.
Figure 1: A few façade modeling examples from the two sides of a street with 614 captured images: some input images in the bottom row, the recovered model rendered in the middle row, and three zoomed sections of the recovered model rendered in the top row.

Image-based Street-side City Modeling

by Jianxiong Xiao, Tian Fang, Peng Zhao, Maxime Lhuillier (LASMEA, Université Blaise Pascal), Long Quan
Abstract - Cited by 12 (2 self)
We propose an automatic approach to generate street-side 3D photo-realistic models from images captured along the streets at ground level. We first develop a multi-view semantic segmentation method that recognizes and segments each image at pixel level into semantically meaningful areas, each labeled with a specific object class, such as building, sky, ground, vegetation and car. A partition scheme is then introduced to separate buildings into independent blocks using the major line structures of the scene. Finally, for each block, we propose an inverse patch-based orthographic composition and structure analysis method for façade modeling that efficiently regularizes the noisy and missing reconstructed 3D data. Our system has the distinct advantage of producing visually compelling results by imposing strong priors of building regularity. We demonstrate the fully automatic system on a typical city example to validate our methodology.
Keywords: Image-based modeling, street view, street-side, building modeling, façade modeling, city modeling, 3D reconstruction.
Figure 1: Two close-ups of the parts 1 and 2 of a modeled city area shown in the first two rows. All the models are automatically generated from input images, exemplified by the bottom row. The close-up of the part 3 is shown in Figure 15.

Coupled Object Detection and Tracking from Static Cameras and Moving Vehicles

by Bastian Leibe, Konrad Schindler, Nico Cornelis, Luc Van Gool, 2008
Abstract - Cited by 7 (0 self)
We present a novel approach for multi-object tracking which considers object detection and spacetime trajectory estimation as a coupled optimization problem. Our approach is formulated in a Minimum Description Length hypothesis selection framework, which allows our system to recover from mismatches and temporarily lost tracks. Building upon a state-of-the-art object detector, it performs multiview/multicategory object recognition to detect cars and pedestrians in the input images. The 2D object detections are checked for their consistency with (automatically estimated) scene geometry and are converted to 3D observations which are accumulated in a world coordinate frame. A subsequent trajectory estimation module analyzes the resulting 3D observations to find physically plausible spacetime trajectories. Tracking is achieved by performing model selection after every frame. At each time instant, our approach searches for the globally optimal set of spacetime trajectories which provides the best explanation for the current image and for all evidence collected so far while satisfying the constraints that no two objects may occupy the same physical space nor explain the same image pixels at any point in time. Successful trajectory hypotheses are then fed back to guide object detection in future frames. The optimization procedure is kept efficient through incremental computation and conservative hypothesis pruning. We evaluate our approach on several challenging video sequences and demonstrate its performance on both a surveillance-type scenario and a scenario where the input videos are taken from inside a moving vehicle passing through crowded city areas.

3D reconstruction using an n-layer heightmap

by David Gallup, Marc Pollefeys, Jan-Michael Frahm - In Proceedings of the DAGM Symposium on Pattern Recognition
Abstract - Cited by 4 (1 self)
We present a novel method for 3D reconstruction of urban scenes extending a recently introduced heightmap model. Our model has several advantages for 3D modeling of urban scenes: it naturally enforces vertical surfaces, has no holes, leads to an efficient algorithm, and is compact in size. We remove the major limitation of the heightmap by enabling modeling of overhanging structures. Our method is based on an n-layer heightmap with each layer representing a surface between full and empty space. The configuration of layers can be computed optimally using a dynamic programming method. Our cost function is derived from probabilistic occupancy, and incorporates the Bayesian Information Criterion (BIC) for selecting the number of layers to use at each pixel. 3D surface models are extracted from the heightmap. We show results from a variety of datasets including Internet photo collections. Our method runs on the GPU and the complete system processes video at 13 Hz.
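The dynamic programming idea in the abstract above can be sketched for a single heightmap pixel. This is a simplified illustration under stated assumptions: each voxel in the vertical column carries a log-odds occupancy value (positive means likely occupied), the bottom voxel is forced to be occupied, and a constant `layer_penalty` per full/empty transition stands in for the BIC term. The function name `label_column` and this exact cost are hypothetical and not taken from the paper.

```python
def label_column(log_odds, layer_penalty=1.0):
    """Label one vertical column of voxels, bottom to top.

    Returns (labels, layers): labels[z] is 1 (occupied) or 0 (empty), and
    layers lists the heights z where the label changes. Hypothetical sketch;
    the constant transition penalty is an assumed stand-in for a
    model-complexity term.
    """
    n = len(log_odds)
    if n == 0:
        return [], []
    inf = float("inf")

    def unary(z, state):
        # Disagreeing with the occupancy evidence costs its magnitude.
        e = log_odds[z]
        return max(0.0, -e) if state == 1 else max(0.0, e)

    cost = [[inf, inf] for _ in range(n)]
    back = [[0, 0] for _ in range(n)]
    cost[0][1] = unary(0, 1)  # assumption: the bottom voxel is solid ground
    for z in range(1, n):
        for state in (0, 1):
            stay = cost[z - 1][state]
            switch = cost[z - 1][1 - state] + layer_penalty
            if stay <= switch:
                best, prev = stay, state
            else:
                best, prev = switch, 1 - state
            cost[z][state] = best + unary(z, state)
            back[z][state] = prev

    # Backtrack from the cheaper final state.
    state = 0 if cost[n - 1][0] < cost[n - 1][1] else 1
    labels = [0] * n
    for z in range(n - 1, -1, -1):
        labels[z] = state
        state = back[z][state]
    layers = [z for z in range(1, n) if labels[z] != labels[z - 1]]
    return labels, layers
```

For example, `label_column([2, 2, 2, -2, -2, 2, 2, -2, -2, -2])` returns layers `[3, 5, 7]`, which keeps the overhang, while `label_column([2, 2, -0.5, 2, -2, -2])` returns the single layer `[4]` because the weak empty reading at height 2 is cheaper to override than to model with two extra layers.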

Scene shape from texture of objects

by Nadia Payet, Sinisa Todorovic - In CVPR, 2011
Abstract - Cited by 4 (0 self)
Joint reasoning about objects and 3D scene layout has shown great promise in scene interpretation. One visual cue that has been overlooked is texture arising from a spatial repetition of objects in the scene (e.g., windows of a building). Such texture provides scene-specific constraints among objects, and thus facilitates scene interpretation. We present an approach to: (1) detecting distinct textures of objects in a scene, (2) reconstructing the 3D shape of detected texture surfaces, and (3) combining object detections and shape-from-texture toward a globally consistent scene interpretation. Inference is formulated within the reinforcement learning framework as a sequential interpretation of image regions, starting from confident regions to guide the interpretation of other regions. Our algorithm finds an optimal policy that maps states of detected objects and reconstructed surfaces to actions which ought to be taken in those states, including detecting new objects and identifying new textures, so as to minimize a long-term loss. Tests against ground truth obtained from stereo images demonstrate that we can coarsely reconstruct a 3D model of the scene from a single image, without learning the layout of common scene surfaces, as done in prior work. We also show that reasoning about texture of objects improves object detection.

Structure-and-Motion Pipeline on a Hierarchical Cluster Tree

by Michela Farenzena, Andrea Fusiello, Riccardo Gherardi
Abstract - Cited by 3 (2 self)
This paper introduces a novel hierarchical scheme for computing Structure and Motion. The images are organized into a tree with agglomerative clustering, using a measure of overlap as the distance. The reconstruction then follows this tree from the leaves to the root. As a result, the problem is broken into smaller instances, which are then separately solved and combined. Compared to the standard sequential approach, this framework has a lower computational complexity, it is independent from the initial pair of views, and copes better with drift problems. A formal complexity analysis and some experimental results support these claims.
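The tree construction described in the abstract above can be sketched as plain agglomerative clustering. The sketch assumes image overlap is given as a score in [0, 1] per image pair, uses distance = 1 - overlap, and merges with single linkage; the linkage rule, the function names `build_cluster_tree` and `merge_order`, and the input format are illustrative assumptions, not details confirmed by the abstract.

```python
from itertools import combinations


def leaves(tree):
    """List the image indices under a tree node (an int leaf or a pair of subtrees)."""
    return [tree] if isinstance(tree, int) else leaves(tree[0]) + leaves(tree[1])


def build_cluster_tree(n_images, overlap):
    """Build a binary tree over images 0..n_images-1 (n_images >= 1).

    overlap maps frozenset({i, j}) to an overlap score in [0, 1]; missing
    pairs count as zero overlap. Single linkage is an assumed choice.
    """
    def dist(a, b):
        return min(1.0 - overlap.get(frozenset((i, j)), 0.0)
                   for i in leaves(a) for j in leaves(b))

    clusters = list(range(n_images))
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        merged = (clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


def merge_order(tree):
    """Return internal nodes in leaves-to-root order, i.e. the order of partial reconstructions."""
    if isinstance(tree, int):
        return []
    return merge_order(tree[0]) + merge_order(tree[1]) + [tree]
```

With four images where pairs (0, 1) and (2, 3) overlap strongly, `build_cluster_tree` yields `((0, 1), (2, 3))`, and `merge_order` lists the two small reconstructions before the final merge, which matches the leaves-to-root traversal the abstract describes.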

Building Reconstruction using Manhattan-World Grammars

by Carlos A. Vanegas, Daniel G. Aliaga, Bedřich Beneš
Abstract - Cited by 3 (1 self)
We present a passive computer vision method that exploits existing mapping and navigation databases in order to automatically create 3D building models. Our method defines a grammar for representing changes in building geometry that approximately follow the Manhattan-world assumption, which states that there is a predominance of three mutually orthogonal directions in the scene. By using multiple calibrated aerial images, we extend previous Manhattan-world methods to robustly produce a single, coherent, complete geometric model of a building with partial textures. Our method uses an optimization to discover a 3D building geometry that produces the same set of façade orientation changes observed in the captured images. We have applied our method to several real-world buildings and have analyzed our approach using synthetic buildings.
Figure 1. System Pipeline. The input to our system consists of one or more calibrated aerial images of a Manhattan-world building. After color segmentation and background/windows removal, our grammar-based algorithm adapts the geometry of the building that produces the façade orientation changes observed in the photos. The input photos are projected as textures onto the reconstructed model. The result is an automatically-generated complete, closed 3D model of the observed building.