Constrained Parametric Min-Cuts for Automatic Object Segmentation
- 2010
- Cited by 24 (4 self)
We present a novel framework for generating and ranking plausible object hypotheses in an image using bottom-up processes and mid-level cues. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge about properties of individual object classes, by solving a sequence of constrained parametric min-cut problems (CPMC) on a regular image grid. We then learn to rank the object hypotheses by training a continuous model to predict how plausible the segments are, given their mid-level region properties. We show that this algorithm significantly outperforms the state of the art for low-level segmentation in the VOC09 segmentation dataset. It achieves the same average best segmentation covering as the best performing technique to date [2], 0.61, when using just the top 7 ranked segments, instead of the full hierarchy in [2]. Our method achieves 0.78 average best covering using 154 segments. In a companion paper [18], we also show that the algorithm achieves state-of-the-art results when used in a segmentation-based recognition pipeline.
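The "average best covering" figure quoted above can be illustrated with a small sketch. The abstract does not give the exact definition, so this assumes the simplest reading: for each ground-truth object, take the best intersection-over-union achieved by any candidate segment, then average over objects without size weighting. The function names and toy pixel sets are hypothetical.

```python
def overlap(a, b):
    """Intersection-over-union of two sets of pixel indices."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_best_covering(ground_truth, candidates):
    """Mean, over ground-truth objects, of the best overlap any candidate achieves.

    Assumption: unweighted mean; the paper may weight objects differently.
    """
    if not ground_truth:
        return 0.0
    best = [
        max((overlap(gt, c) for c in candidates), default=0.0)
        for gt in ground_truth
    ]
    return sum(best) / len(best)


# Toy example: two ground-truth objects and three candidate segments.
gt_objects = [{0, 1, 2, 3}, {10, 11}]
segments = [{0, 1, 2}, {10, 11, 12}, {5}]
score = average_best_covering(gt_objects, segments)
```

Under this reading, adding more candidate segments can only keep the score the same or raise it, which fits the abstract reporting a higher covering for 154 segments than for the top 7.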
Contour Detection and Hierarchical Image Segmentation
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010
- Cited by 23 (3 self)
This paper investigates two fundamental problems in computer vision: contour detection and image segmentation. We present state-of-the-art algorithms for both of these tasks. Our contour detector combines multiple local cues into a globalization framework based on spectral clustering. Our segmentation algorithm consists of generic machinery for transforming the output of any contour detector into a hierarchical region tree. In this manner, we reduce the problem of image segmentation to that of contour detection. Extensive experimental evaluation demonstrates that both our contour detection and segmentation methods significantly outperform competing algorithms. The automatically generated hierarchical segmentations can be interactively refined by user-specified annotations. Computation at multiple image resolutions provides a means of coupling our system to recognition applications.
Object recognition by integrating multiple image segmentations
- Proceedings of the European Conference on Computer Vision, 2008
- Cited by 18 (0 self)
The joint tasks of object recognition and object segmentation from a single image are complex in their requirement of not only correct classification, but also deciding exactly which pixels belong to the object. Exploring all possible pixel subsets is prohibitively expensive, leading to recent approaches which use unsupervised image segmentation to reduce the size of the configuration space. Image segmentation, however, is known to be unstable, strongly affected by small image perturbations, feature choices, or different segmentation algorithms. This instability has led to advocacy for using multiple segmentations of an image. In this paper, we explore the question of how to best integrate the information from multiple bottom-up segmentations of an image to improve object recognition robustness. By integrating the image partition hypotheses in an intuitive combined top-down and bottom-up recognition approach, we improve object and feature support. We further explore possible extensions of our method and whether they provide improved performance. Results are presented on the MSRC 21-class data set and the Pascal VOC2007 object segmentation challenge.
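One simple way to picture combining several segmentations is sketched below. It is not the paper's method, which the abstract does not specify. The assumption here is that each segment carries per-class scores from some classifier, and each pixel sums the scores of every segment that contains it across all segmentations. All names and numbers are hypothetical.

```python
def integrate_segmentations(segmentations, num_pixels):
    """Label each pixel by summing class scores over the segments that contain it.

    segmentations: list of segmentations; each segmentation is a list of
    (pixel_set, {label: score}) pairs. This data layout is an assumption.
    """
    totals = [dict() for _ in range(num_pixels)]
    for segmentation in segmentations:
        for pixels, scores in segmentation:
            for p in pixels:
                for label, s in scores.items():
                    totals[p][label] = totals[p].get(label, 0.0) + s
    # Pixels not covered by any segment get no label.
    return [max(t, key=t.get) if t else None for t in totals]


# Two toy segmentations of a 4-pixel image that disagree about pixel 1.
seg_a = [({0, 1}, {"cow": 0.9, "grass": 0.1}),
         ({2, 3}, {"cow": 0.2, "grass": 0.8})]
seg_b = [({0}, {"cow": 0.6, "grass": 0.4}),
         ({1, 2, 3}, {"cow": 0.3, "grass": 0.7})]
labels = integrate_segmentations([seg_a, seg_b], num_pixels=4)
```

In the toy data, pixel 1 is grouped with the cow in one segmentation and with the grass in the other; summing the evidence lets the more confident segment decide.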
From appearance to context-based recognition: Dense labeling in small images
- 2008
- Cited by 14 (5 self)
Traditionally, object recognition is performed based solely on the appearance of the object. However, relevant information also exists in the scene surrounding the object. As supported by our human studies, this contextual information is necessary for accurate recognition in low resolution images. This scenario with impoverished appearance information, as opposed to using images of higher resolution, provides an appropriate venue for studying the role of context in recognition. In this paper, we explore the role of context for dense scene labeling in small images. Given a segmentation of an image, our algorithm assigns each segment to an object category based on the segment’s appearance and contextual information. We explicitly model context between object categories through the use of relative location and relative scale, in addition to co-occurrence. We perform recognition tests on low and high resolution images, which vary significantly in the amount of appearance information present, using just the object appearance information, the combination of appearance and context, as well as just context without object appearance information (blind recognition). We also perform these tests in human studies and analyze our findings to reveal interesting patterns. With the use of our context model, our algorithm achieves state-of-the-art performance on the MSRC and Corel datasets.
Category Independent Object Proposals
- Cited by 8 (0 self)
We propose a category-independent method to produce a bag of regions and rank them, such that top-ranked regions are likely to be good segmentations of different objects. Our key objectives are completeness and diversity: every object should have at least one good proposed region, and a diverse set should be top-ranked. Our approach is to generate a set of segmentations by performing graph cuts based on a seed region and a learned affinity function. Then, the regions are ranked using structured learning based on various cues. Our experiments on BSDS and PASCAL VOC 2008 demonstrate our ability to find most objects within a small bag of proposed regions.
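The diversity objective can be illustrated with a greedy ranking sketch. This is a stand-in, not the structured-learning ranker the abstract describes: it assumes each region already has a quality score and repeatedly picks the region whose score, minus a penalty for overlap with regions already ranked, is highest. The function names, the `penalty` parameter, and the toy data are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two sets of pixel indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diverse(regions, scores, penalty=1.0):
    """Greedy ranking: prefer high scores, discount overlap with earlier picks."""
    remaining = list(range(len(regions)))
    ranked = []
    while remaining:
        def adjusted(i):
            # Largest overlap with any region that has already been ranked.
            max_overlap = max(
                (iou(regions[i], regions[j]) for j in ranked), default=0.0
            )
            return scores[i] - penalty * max_overlap

        best = max(remaining, key=adjusted)
        ranked.append(best)
        remaining.remove(best)
    return ranked


# Regions 0 and 1 cover nearly the same object; region 2 is a different one.
regions = [{1, 2, 3, 4}, {1, 2, 3}, {7, 8}]
scores = [0.9, 0.8, 0.5]
order = rank_diverse(regions, scores)
```

In the toy data, the lower-scored but distinct region 2 is ranked ahead of the near-duplicate region 1, which is the behaviour the diversity objective asks for.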
3D Laser Scan Classification Using Web Data and Domain Adaptation
- Cited by 6 (1 self)
In recent years, object recognition has become an increasingly active field of research in robotics. An important problem in object recognition is the need for sufficient labeled training data to learn good classifiers. In this paper we show how to significantly reduce the need for manually labeled training data by leveraging data sets available on the World Wide Web. Specifically, we show how to use objects from Google’s 3D Warehouse to train classifiers for 3D laser scans collected by a robot navigating through urban environments. In order to deal with the different characteristics of the web data and the real robot data, we additionally use a small set of labeled 3D laser scans and perform domain adaptation. Our experiments demonstrate that additional data taken from the 3D Warehouse along with our domain adaptation greatly improves the classification accuracy on real laser scans.
Natural Image Segmentation with Adaptive Texture and Boundary Encoding
- 2009
- Cited by 6 (3 self)
We present a novel algorithm for unsupervised segmentation of natural images that harnesses the principle of minimum description length (MDL). Our method is based on observations that a homogeneously textured region of a natural image can be well modeled by a Gaussian distribution and the region boundary can be effectively coded by an adaptive chain code. The optimal segmentation of an image is the one that gives the shortest coding length for encoding all textures and boundaries in the image, and is obtained via an agglomerative clustering process applied to a hierarchy of decreasing window sizes. The optimal segmentation also provides an accurate estimate of the overall coding length and hence the true entropy of the image. We test our algorithm on two publicly available databases: Berkeley Segmentation Dataset and MSRC Object Recognition Database. It achieves state-of-the-art segmentation results compared to other popular methods.
Occlusion boundaries from motion: Low-level detection and mid-level reasoning
- International Journal of Computer Vision, 2009
- Cited by 6 (0 self)
The boundaries of objects in an image are often considered a nuisance to be “handled” due to the occlusion they exhibit. Since most, if not all, computer vision techniques aggregate information spatially within a scene, information spanning these boundaries, and therefore from different physical surfaces, is invariably and erroneously considered together. In addition, these boundaries convey important perceptual information about 3D scene structure and shape. Consequently, their identification can benefit many different computer vision pursuits, from low-level processing techniques to high-level reasoning tasks. While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In this paper, we focus on the exploitation of subtle relative-motion cues present at occlusion boundaries. When combined with more standard appearance information, we demonstrate these cues’ utility in detecting occlusion boundaries locally. We also present a novel, mid-level model for reasoning more globally about object boundaries and propagating such local information to extract improved, extended boundaries.
Can Similar Scenes help Surface Layout Estimation?
- Cited by 5 (1 self)
We describe a preliminary investigation of utilising large amounts of unlabelled image data to help in the estimation of rough scene layout. We take the single-view geometry estimation system of Hoiem et al. [3] as the baseline and see if it is possible to improve its performance by considering a set of similar scenes gathered from the web. The two complementary approaches being considered are 1) improving surface classification by using average geometry estimated from the matches, and 2) improving surface segmentation by injecting segments generated from the average of the matched images. The system is evaluated using the labelled 300-image dataset of Hoiem et al. and shows promising results.
Figure 1. In this paper we show that Geometric Context [3] performance can be improved by using a set of scene matches drawn from a large unlabelled image collection.
M.: Modeling the temporal extent of actions
- 2010
- Cited by 4 (0 self)
In this paper, we present a framework for estimating what portions of videos are most discriminative for the task of action recognition. We explore the impact of the temporal cropping of training videos on the overall accuracy of an action recognition system, and we formalize what makes a set of croppings optimal. In addition, we present an algorithm to determine the best set of croppings for a dataset, and experimentally show that our approach increases the accuracy of various state-of-the-art action recognition techniques.

