Geometric Context from a Single Image
- In ICCV, 2005
- Cited by 111 (27 self)
Many computer vision algorithms limit their performance by ignoring the underlying 3D geometric structure in the image. We show that we can estimate the coarse geometric properties of a scene by learning appearance-based models of geometric classes, even in cluttered natural scenes. Geometric classes describe the 3D orientation of an image region with respect to the camera. We provide a multiple-hypothesis framework for robustly estimating scene structure from a single image and obtaining confidences for each geometric label. These confidences can then be used to improve the performance of many other applications. We provide a thorough quantitative evaluation of our algorithm on a set of outdoor images and demonstrate its usefulness in two applications: object detection and automatic single-view reconstruction.
Putting objects in perspective
- In CVPR, 2006
- Cited by 106 (10 self)
Image understanding requires not only individually estimating elements of the visual world but also capturing the interplay among them. In this paper, we provide a framework for placing local object detection in the context of the overall 3D scene by modeling the interdependence of objects, surface orientations, and camera viewpoint. Most object detection methods consider all scales and locations in the image as equally likely. We show that with probabilistic estimates of 3D geometry, both in terms of surfaces and world coordinates, we can put objects into perspective and model the scale and location variance in the image. Our approach reflects the cyclical nature of the problem by allowing probabilistic object hypotheses to refine geometry and vice versa. Our framework allows painless substitution of almost any object detector and is easily extended to include other aspects of image understanding. Our results confirm the benefits of our integrated approach.
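To illustrate why camera viewpoint constrains plausible object scale and location, here is a minimal sketch of the standard pinhole-geometry relation between an object's pixel height, the horizon row, and the camera height. It is not taken from the abstract: the function name, the level-camera assumption, and the requirement that the object rests on the ground plane are all assumptions made for illustration.

```python
def estimate_object_height(bottom_row, top_row, horizon_row, camera_height_m):
    """Approximate world height (metres) of an object standing on the ground.

    Assumptions (illustrative, not from the abstract): a level pinhole camera,
    image rows increasing downward, and an object whose base touches the
    ground plane below the horizon.
    """
    # Distance in pixels from the horizon down to the object's base.
    pixels_below_horizon = bottom_row - horizon_row
    if pixels_below_horizon <= 0:
        raise ValueError("object base must lie below the horizon row")
    # Height of the object in pixels.
    pixel_height = bottom_row - top_row
    # Similar triangles: world_height / camera_height
    #   = pixel_height / pixels_below_horizon.
    return camera_height_m * pixel_height / pixels_below_horizon
```

Under these assumptions, a detection whose implied height is far from typical for its class (say, a 6 m pedestrian) can be down-weighted, which is one simple way to read "putting objects into perspective".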
Automatic photo pop-up
- In ACM SIGGRAPH, 2005
- Cited by 75 (8 self)
Im2gps: estimating geographic information from a single image
- In IEEE Conference on Computer Vision and Pattern Recognition, 2008
- Cited by 55 (8 self)
Estimating geographic information from an image is an excellent, difficult high-level computer vision problem whose time has come. The emergence of vast amounts of geographically-calibrated image data is a great reason for computer vision to start looking globally – on the scale of the entire planet! In this paper, we propose a simple algorithm for estimating a distribution over geographic locations from a single image using a purely data-driven scene matching approach. For this task, we will leverage a dataset of over 6 million GPS-tagged images from the Internet. We represent the estimated image location as a probability distribution over the Earth’s surface. We quantitatively evaluate our approach in several geolocation tasks and demonstrate encouraging performance (up to 30 times better than chance). We show that geolocation estimates can provide the basis for numerous other image understanding tasks such as population density estimation, land cover estimation or urban/rural classification.
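As a rough illustration of data-driven scene matching, the sketch below finds the k nearest GPS-tagged images to a query in feature space and turns their locations into a coarse probability distribution over latitude/longitude grid cells. The feature vectors, the Euclidean distance, the grid-cell voting, and all names are assumptions for illustration; the abstract does not specify the features or the density estimator the authors use.

```python
import math
from collections import Counter


def estimate_location_distribution(query, database, k=3, cell_deg=10.0):
    """Return {grid_cell: probability} from the k nearest scene matches.

    database is a list of (feature_vector, (lat, lon)) pairs. This is
    hypothetical toy data, not the 6-million-image dataset from the abstract.
    """
    # Rank database images by Euclidean distance to the query features.
    ranked = sorted(database, key=lambda item: math.dist(query, item[0]))
    votes = Counter()
    for _, (lat, lon) in ranked[:k]:
        # Quantize each neighbor's GPS tag into a coarse lat/lon cell.
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        votes[cell] += 1
    total = sum(votes.values())
    # Normalize the votes so they sum to one.
    return {cell: count / total for cell, count in votes.items()}
```

Returning a distribution rather than a single point mirrors the abstract's framing: when the nearest matches disagree, the probability mass is split across several cells instead of being forced onto one guess.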
An Image-Based System for Urban Navigation
- In BMVC, 2004
- Cited by 25 (0 self)
We describe the prototype of a system intended to allow a user to navigate in an urban environment using a mobile telephone equipped with a camera. The system
Recovering the Spatial Layout of Cluttered Rooms
- Cited by 23 (5 self)
In this paper, we consider the problem of recovering the spatial layout of indoor scenes from monocular images. The presence of clutter is a major problem for existing single-view 3D reconstruction algorithms, most of which rely on finding the ground-wall boundary. In most rooms, this boundary is partially or entirely occluded. We gain robustness to clutter by modeling the global room space with a parametric 3D “box” and by iteratively localizing clutter and refitting the box. To fit the box, we introduce a structured learning algorithm that chooses the set of parameters to minimize error, based on global perspective cues. On a dataset of 308 images, we demonstrate the ability of our algorithm to recover spatial layout in cluttered rooms and show several examples of estimated free space.
Geometric reasoning for single image structure recovery
- In Proc. CVPR, 2009
- Cited by 23 (4 self)
We study the problem of generating plausible interpretations of a scene from a collection of line segments automatically extracted from a single indoor image. We show that we can recognize the three-dimensional structure of the interior of a building, even in the presence of occluding objects. Several physically valid structure hypotheses are proposed by geometric reasoning and verified to find the best fitting model to line segments, which is then converted to a full 3D model. Our experiments demonstrate that our structure recovery from line segments is comparable with methods using full image appearance. Our approach shows how a set of rules describing geometric constraints between groups of segments can be used to prune scene interpretation hypotheses and to generate the most plausible interpretation.
Using Geometric Constraints through Parallelepipeds for Calibration and 3D Modelling
- 2005
- Cited by 21 (3 self)
This paper concerns the incorporation of geometric information in camera calibration and 3D modeling. Using geometric constraints enables more stable results and allows us to perform tasks with fewer images. Our approach is motivated and developed within a framework of semi-automatic 3D modeling, where the user defines geometric primitives and constraints between them. It is based on the observation that constraints, such as coplanarity, parallelism, or orthogonality, are often embedded intuitively in parallelepipeds. Moreover, parallelepipeds are easy to delineate by a user and are well adapted to model the main structure of, e.g., architectural scenes. In this paper, first a duality that exists between the shape parameters of a parallelepiped and the intrinsic parameters of a camera is described. Then, a factorization-based algorithm exploiting this relation is developed. Using images of parallelepipeds, it allows us to simultaneously calibrate cameras, recover shapes of parallelepipeds, and estimate the relative pose of all entities. Besides geometric constraints expressed via parallelepipeds, our approach simultaneously takes into account the usual self-calibration constraints on cameras. The proposed algorithm is completed by a study of the singular cases of the calibration method. A complete method for the reconstruction of scene primitives that are not modeled by parallelepipeds is also briefly described. The proposed methods are validated by various experiments with real and simulated data, for single-view as well as multiview cases.
Atlanta World: An Expectation Maximization Framework for Simultaneous Low-level Edge Grouping and Camera Calibration in Complex Man-made Environments
- In Int. Conf. on Computer Vision and Pattern Recognition, 2004
- Cited by 15 (0 self)
Edges in man-made environments, grouped according to vanishing point directions, provide single-view constraints that have been exploited before as a precursor to both scene understanding and camera calibration. A Bayesian approach to edge grouping was proposed in the "Manhattan World" paper by Coughlan and Yuille, where they assume the existence of three mutually orthogonal vanishing directions in the scene. We extend the thread of work spawned by Coughlan and Yuille in several significant ways. We propose to use the expectation maximization (EM) algorithm to perform the search over all continuous parameters that influence the location of the vanishing points in a scene. Because EM behaves well in high-dimensional spaces, our method can optimize over many more parameters than the exhaustive and stochastic algorithms used previously for this task. Among other things, this lets us optimize over multiple groups of orthogonal vanishing directions, each of which induces one additional degree of freedom. EM is also well suited to recursive estimation of the kind needed for image sequences and/or in mobile robotics. We present experimental results on images of "Atlanta worlds," complex urban scenes with multiple orthogonal edge-groups, that validate our approach. We also show results for continuous relative orientation estimation on a mobile robot.

