Results 1–10 of 36
Geometric Context from a Single Image
In ICCV, 2005
"... Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearancebased models of geometric classes, even in cluttered natural scenes. Geometric classe ..."
Abstract

Cited by 169 (32 self)
 Add to MetaCart
Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes. Geometric classes describe the 3D orientation of an image region with respect to the camera. We provide a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label. These confidences can then be used to improve the performance of many other applications. We provide a thorough quantitative evaluation of our algorithm on a set of outdoor images and demonstrate its usefulness in two applications: object detection and automatic single-view reconstruction.
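A minimal sketch of how the multiple-hypothesis framework's per-region confidences might be combined, assuming each segmentation hypothesis carries a weight and per-class likelihoods. All names and numbers below are illustrative, not taken from the paper:

```python
# Hedged sketch: marginalize geometric-class likelihoods over several
# weighted segmentation hypotheses to get per-region confidences.
# The class set and numbers are toy stand-ins, not the paper's.

CLASSES = ["support", "vertical", "sky"]

def label_confidences(hypotheses):
    """hypotheses: list of (weight, {class: likelihood}); weights sum to 1.
    Returns the marginal confidence for each geometric class."""
    conf = {c: 0.0 for c in CLASSES}
    for weight, likelihoods in hypotheses:
        for c in CLASSES:
            conf[c] += weight * likelihoods.get(c, 0.0)
    return conf

# Two segmentation hypotheses voting on one image region:
hyps = [
    (0.6, {"support": 0.7, "vertical": 0.2, "sky": 0.1}),
    (0.4, {"support": 0.3, "vertical": 0.6, "sky": 0.1}),
]
print(label_confidences(hyps))
```

The resulting confidences are exactly the soft labels the abstract says downstream applications can consume.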
Image Parsing: Unifying Segmentation, Detection, and Recognition
2005
"... In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation in a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The ..."
Abstract

Cited by 162 (18 self)
 Add to MetaCart
In this paper we present a Bayesian framework for parsing images into their constituent visual patterns. The parsing algorithm optimizes the posterior probability and outputs a scene representation in a "parsing graph", in a spirit similar to parsing sentences in speech and natural language. The algorithm constructs the parsing graph and reconfigures it dynamically using a set of reversible Markov chain jumps. This computational framework integrates two popular inference approaches: generative (top-down) methods and discriminative (bottom-up) methods. The former formulates the posterior probability in terms of generative models for images defined by likelihood functions and priors. The latter computes discriminative probabilities based on a sequence (cascade) of bottom-up tests/filters.
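The reversible Markov chain jumps rest on a Metropolis-Hastings acceptance test. A toy sketch of that step, with placeholder posterior and proposal densities (the paper's actual moves over parse graphs are far richer):

```python
# Hedged sketch of the Metropolis-Hastings acceptance step behind
# reversible jumps between parse graphs. post_* are posterior values,
# q_fwd/q_bwd are proposal densities; all are placeholders here.
import random

def mh_accept(post_cur, post_prop, q_fwd, q_bwd, rng=random.random):
    """Accept a proposed move with probability
    min(1, [p(prop)/p(cur)] * [q(cur|prop)/q(prop|cur)])."""
    ratio = (post_prop / post_cur) * (q_bwd / q_fwd)
    return rng() < min(1.0, ratio)
```

With a symmetric proposal (q_fwd == q_bwd) this reduces to the plain Metropolis rule; the discriminative, bottom-up tests the abstract mentions would shape the proposal densities.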
Make3D: Learning 3D Scene Structure from a Single Still Image
"... We consider the problem of estimating detailed 3d structure from a single still image of an unstructured environment. Our goal is to create 3d models which are both quantitatively accurate as well as visually pleasing. For each small homogeneous patch in the image, we use a Markov Random Field (M ..."
Abstract

Cited by 72 (18 self)
 Add to MetaCart
We consider the problem of estimating detailed 3d structure from a single still image of an unstructured environment. Our goal is to create 3d models which are both quantitatively accurate as well as visually pleasing. For each small homogeneous patch in the image, we use a Markov Random Field (MRF) to infer a set of “plane parameters” that capture both the 3d location and 3d orientation of the patch. The MRF, trained via supervised learning, models both image depth cues as well as the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene; this enables the algorithm to capture much more detailed 3d structure than does prior art, and also give a much richer experience in the 3d fly-throughs created using image-based rendering, even for scenes with significant non-vertical structure. Using this approach, we have created qualitatively correct 3d models for 64.9% of 588 images downloaded from the internet. We have also extended our model to produce large-scale 3d models from a few images.
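The "plane parameters" admit a compact geometric reading: if a patch lies on the plane {x : a·x = 1}, then a jointly encodes orientation and distance, and the depth along a unit camera ray r is 1/(a·r). A small sketch under that assumption (the MRF inference that estimates the parameters is not shown):

```python
# Hedged sketch of the plane-parameter idea: a patch on the plane
# {x : a.x = 1} has depth d along unit ray r satisfying a.(d*r) = 1,
# i.e. d = 1 / (a.r). Inference of `a` per patch is omitted.

def depth_along_ray(a, r):
    """Depth at which unit ray r from the camera meets the plane a.x = 1."""
    denom = sum(ai * ri for ai, ri in zip(a, r))
    if denom <= 0:
        raise ValueError("ray does not meet the plane in front of the camera")
    return 1.0 / denom

# Fronto-parallel plane 5 units away: a = (0, 0, 0.2) gives a.x = 1 at z = 5.
print(depth_along_ray((0.0, 0.0, 0.2), (0.0, 0.0, 1.0)))  # ≈ 5.0
```

One vector per patch thus yields a full depth map once rays are known, which is what makes the representation attractive for rendering fly-throughs.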
Recovering the Spatial Layout of Cluttered Rooms
"... In this paper, we consider the problem of recovering the spatial layout of indoor scenes from monocular images. The presence of clutter is a major problem for existing singleview 3D reconstruction algorithms, most of which rely on finding the groundwall boundary. In most rooms, this boundary is par ..."
Abstract

Cited by 47 (6 self)
 Add to MetaCart
In this paper, we consider the problem of recovering the spatial layout of indoor scenes from monocular images. The presence of clutter is a major problem for existing single-view 3D reconstruction algorithms, most of which rely on finding the ground-wall boundary. In most rooms, this boundary is partially or entirely occluded. We gain robustness to clutter by modeling the global room space with a parametric 3D “box” and by iteratively localizing clutter and refitting the box. To fit the box, we introduce a structured learning algorithm that chooses the set of parameters to minimize error, based on global perspective cues. On a dataset of 308 images, we demonstrate the ability of our algorithm to recover spatial layout in cluttered rooms and show several examples of estimated free space.
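The iterate-and-refit idea can be sketched as: score candidate box layouts, flag high-error pixels as clutter, and rescore ignoring them. The scoring and clutter test below are crude stand-ins for the paper's learned structured predictor:

```python
# Hedged sketch of "localize clutter, refit the box". Candidates and
# the per-pixel error function are placeholders; the paper learns its
# scoring from global perspective cues.

def fit_box(candidates, pixel_error, clutter_rounds=2):
    """candidates: list of box hypotheses; pixel_error(box) -> list of
    per-pixel errors. Returns (best_box, clutter_pixel_indices)."""
    clutter = set()
    best = None
    for _ in range(clutter_rounds):
        def score(box):
            errs = pixel_error(box)
            return sum(e for i, e in enumerate(errs) if i not in clutter)
        best = min(candidates, key=score)
        errs = pixel_error(best)
        thresh = 2 * sum(errs) / len(errs)  # crude outlier test for clutter
        clutter = {i for i, e in enumerate(errs) if e > thresh}
    return best, clutter
```

A box that fits everywhere except an occluded region wins once the occluding pixels are discounted, which is the robustness-to-clutter behavior the abstract describes.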
A dynamic Bayesian network model for autonomous 3d reconstruction from a single indoor image
In CVPR, 2006
"... indoor image ..."
Image-based plant modeling
ACM Trans. Graph.
"... Figure 1: Imagebased modeling of poinsettia plant. (a) An input image out of 35 images, (b) recovered model rendered at the same viewpoint as (a), (c) recovered model rendered at a different viewpoint, (d) recovered model with modified leaf textures. In this paper, we propose a semiautomatic techn ..."
Abstract

Cited by 34 (5 self)
 Add to MetaCart
Figure 1: Image-based modeling of a poinsettia plant. (a) An input image out of 35 images, (b) recovered model rendered at the same viewpoint as (a), (c) recovered model rendered at a different viewpoint, (d) recovered model with modified leaf textures.

In this paper, we propose a semi-automatic technique for modeling plants directly from images. Our image-based approach has the distinct advantage that the resulting model inherits the realistic shape and complexity of a real plant. We designed our modeling system to be interactive, automating the process of shape recovery while relying on the user to provide simple hints on segmentation. Segmentation is performed in both image and 3D spaces, allowing the user to easily visualize its effect immediately. Using the segmented image and 3D data, the geometry of each leaf is then automatically recovered from the multiple views by fitting a deformable leaf model. Our system also allows the user to easily reconstruct branches in a similar manner. We show realistic reconstructions of a variety of plants, and demonstrate examples of plant editing.
Image-based tree modeling
ACM Trans. Graph., 2007
"... Figure 1: Imagebased modeling of a tree. From left to right: A source image (out of 18 images), reconstructed branch structure rendered at the same viewpoint, tree model rendered at the same viewpoint, and tree model rendered at a different viewpoint. In this paper, we propose an approach for gener ..."
Abstract

Cited by 26 (3 self)
 Add to MetaCart
Figure 1: Image-based modeling of a tree. From left to right: a source image (out of 18 images), reconstructed branch structure rendered at the same viewpoint, tree model rendered at the same viewpoint, and tree model rendered at a different viewpoint.

In this paper, we propose an approach for generating 3D models of natural-looking trees from images that has the additional benefit of requiring little user intervention. While our approach is primarily image-based, we do not model each leaf directly from images due to the large leaf count, small image footprint, and widespread occlusions. Instead, we populate the tree with leaf replicas from segmented source images to reconstruct the overall tree shape. In addition, we use the shape patterns of visible branches to predict those of obscured branches. We demonstrate our approach on a variety of trees.
Automatic single-image 3d reconstructions of indoor Manhattan world scenes
In ISRR, 2005
"... Summary. 3d reconstruction from a single image is inherently an ambiguous problem. Yet when we look at a picture, we can often infer 3d information about the scene. Humans perform singleimage 3d reconstructions by using a variety of singleimage depth cues, for example, by recognizing objects and su ..."
Abstract

Cited by 17 (7 self)
 Add to MetaCart
Summary. 3d reconstruction from a single image is inherently an ambiguous problem. Yet when we look at a picture, we can often infer 3d information about the scene. Humans perform single-image 3d reconstructions by using a variety of single-image depth cues, for example, by recognizing objects and surfaces, and reasoning about how these surfaces are connected to each other. In this paper, we focus on the problem of automatic 3d reconstruction of indoor scenes, specifically ones (sometimes called “Manhattan worlds”) that consist mainly of orthogonal planes. We use a Markov random field (MRF) model to identify the different planes and edges in the scene, as well as their orientations. Then, an iterative optimization algorithm is applied to infer the most probable position of all the planes, and thereby obtain a 3d reconstruction. Our approach is fully automatic: given an input image, no human intervention is necessary to obtain an approximate 3d reconstruction.
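The Manhattan-world assumption can be illustrated by snapping an estimated surface normal to the nearest of three mutually orthogonal scene axes. This is a toy stand-in for the paper's MRF over planes and edges:

```python
# Hedged sketch of the Manhattan-world constraint: every plane normal
# is assigned to one of three orthogonal scene directions. The joint
# MRF labeling of planes and edges is not reproduced here.

AXES = {"x": (1, 0, 0), "y": (0, 1, 0), "z": (0, 0, 1)}

def snap_normal(n):
    """Return the name of the Manhattan axis most aligned with normal n."""
    def alignment(axis):
        return abs(sum(a * b for a, b in zip(AXES[axis], n)))
    return max(AXES, key=alignment)

print(snap_normal((0.1, 0.05, 0.99)))  # prints "z"
```

Restricting normals to three directions is exactly what turns an ambiguous per-plane orientation estimate into a small discrete labeling problem.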
Parsing images of architectural scenes
In CVPR, 2007
"... We address image parsing in the setting of architectural scenes. Our goal is to parse an image into regions of various types such as sky, foliage, buildings, and street. Furthermore we parse the building regions at a finer level of detail, identifying the positions of windows, doors, and rooflines, ..."
Abstract

Cited by 15 (1 self)
 Add to MetaCart
We address image parsing in the setting of architectural scenes. Our goal is to parse an image into regions of various types such as sky, foliage, buildings, and street. Furthermore, we parse the building regions at a finer level of detail, identifying the positions of windows, doors, and rooflines, the colors of walls, and the spatial extent of particular buildings. Recognizing these individual elements is often impossible without the context provided by the initial parsing of the image; for instance, a roofline is only defined in relation to the building below and the sky above. Our approach is driven by recognition of generic classes of visual appearance, e.g. for foliage. The generic recognition results bootstrap an image-specific model that provides refined estimates to use for matting, segmentation, and more detailed parsing.
Parsing Images into Regions, Curves, and Curve Groups
"... In this paper, we present an algorithm for parsing natural images into middle level vision representations – regions, curves, and curve groups (parallel curves and trees). This algorithm is targeted for an integrated solution to image segmentation and curve grouping through Bayesian inference. The ..."
Abstract

Cited by 12 (4 self)
 Add to MetaCart
In this paper, we present an algorithm for parsing natural images into middle-level vision representations – regions, curves, and curve groups (parallel curves and trees). This algorithm is targeted for an integrated solution to image segmentation and curve grouping through Bayesian inference. The paper makes the following contributions. (1) It adopts a layered (or 2.1D-sketch) representation integrating both region and curve models, which compete to explain an input image. The curve layer occludes the region layer, and curves observe a partial-order occlusion relation. (2) A Markov chain search scheme, the Metropolized Gibbs Sampler (MGS), is studied. It consists of several pairs of reversible jumps to traverse the complex solution space. An MGS proposes the next state within the jump scope of the current state according to a conditional probability, like a Gibbs sampler, and then accepts the proposal with a Metropolis-Hastings step. This paper discusses systematic design strategies for devising reversible jumps for a complex inference task. (3) The proposal probability ratios in jumps are factorized into ratios of discriminative probabilities. The latter are computed in a bottom-up process, and they drive the Markov chain dynamics in a data-driven Markov chain Monte Carlo framework. We demonstrate the performance of the algorithm in experiments with a number of natural images.
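One Metropolized Gibbs move can be sketched as: propose the next state from a conditional distribution over the current state's jump scope, then accept with a Metropolis-Hastings step. Everything below is a toy stand-in for the paper's reversible jumps over parse representations:

```python
# Hedged sketch of a single MGS move. `scope` enumerates states
# reachable by one jump; `weight` is an unnormalized probability.
# Both are placeholders for the paper's far richer jump design.
import random

def mgs_step(state, scope, weight, rng=random):
    """Propose from the conditional over scope(state), then MH-accept."""
    cands = scope(state)
    total = sum(weight(s) for s in cands)
    # Gibbs-style proposal: q(s' | state) proportional to weight(s')
    proposal = rng.choices(cands, weights=[weight(s) for s in cands])[0]
    back = scope(proposal)
    back_total = sum(weight(s) for s in back)
    q_fwd = weight(proposal) / total
    q_bwd = weight(state) / back_total
    ratio = (weight(proposal) / weight(state)) * (q_bwd / q_fwd)
    return proposal if rng.random() < min(1.0, ratio) else state
```

In the paper's framework the proposal ratios are factorized into ratios of discriminative probabilities computed bottom-up, which is what makes the chain data-driven.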