TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context
, 2007
Cited by 44 (5 self)
This paper details a new approach for learning a discriminative model of object classes, incorporating texture, layout, and context information efficiently. The learned model is used for automatic visual understanding and semantic segmentation of photographs. Our discriminative model exploits texture-layout filters, novel features based on textons, which jointly model patterns of texture and their spatial layout. Unary classification and feature selection is achieved using shared boosting to give an efficient classifier which can be applied to a large number of classes. Accurate image segmentation is achieved by incorporating the unary classifier in a conditional random field, which (i) captures the spatial interactions between class labels of neighboring pixels, and (ii) improves the segmentation of specific object instances. Efficient training of the model on large datasets is achieved by exploiting both random feature selection and piecewise training methods. High classification and segmentation accuracy is
From Contours to Regions: An Empirical Evaluation
Cited by 40 (6 self)
We propose a generic grouping algorithm that constructs a hierarchy of regions from the output of any contour detector. Our method consists of two steps, an Oriented Watershed Transform (OWT) to form initial regions from contours, followed by construction of an Ultrametric Contour Map (UCM) defining a hierarchical segmentation. We provide extensive experimental evaluation to demonstrate that, when coupled to a high-performance contour detector, the OWT-UCM algorithm produces state-of-the-art image segmentations. These hierarchical segmentations can optionally be further refined by user-specified annotations.
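To illustrate how a single boundary-strength map can define a whole hierarchy of regions, here is a minimal sketch. It is a simplification and not the OWT-UCM algorithm itself: it assumes the initial regions and the strength of each shared boundary are already given, and it simply merges every pair of regions whose boundary is weaker than a threshold. The function name and the toy data are hypothetical.

```python
def segment_at_threshold(num_regions, boundaries, threshold):
    """Merge regions across every boundary weaker than `threshold`.

    boundaries: list of (strength, region_a, region_b) tuples (assumed input).
    Returns one representative label per initial region.
    """
    parent = list(range(num_regions))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for strength, a, b in boundaries:
        if strength < threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a
    return [find(r) for r in range(num_regions)]


# Four initial regions in a row, with three boundaries of varying strength.
boundaries = [(0.1, 0, 1), (0.8, 1, 2), (0.3, 2, 3)]
fine = segment_at_threshold(4, boundaries, 0.2)
coarse = segment_at_threshold(4, boundaries, 0.5)
```

Raising the threshold can only merge regions, never split them, so the segmentations obtained at increasing thresholds are nested. That nesting is the property that makes the result a hierarchy.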
Four-Chamber Heart Modeling and Automatic Segmentation for 3D Cardiac CT Volumes Using Marginal Space Learning and Steerable Features
- IEEE TRANSACTIONS ON MEDICAL IMAGING
, 2008
Cited by 34 (23 self)
We propose an automatic four-chamber heart segmentation system for the quantitative functional analysis of the heart from cardiac computed tomography (CT) volumes. Two topics are discussed: heart modeling and automatic model fitting to an unseen volume. Heart modeling is a non-trivial task since the heart is a complex nonrigid organ. The model must be anatomically accurate, allow manual editing, and provide sufficient information to guide automatic detection and segmentation. Unlike previous work, we explicitly represent important landmarks (such as the valves and the ventricular septum cusps) among the control points of the model. The control points can be detected reliably to guide the automatic model fitting process. Using this model, we develop an efficient and robust approach for automatic heart chamber segmentation in 3D CT volumes. We formulate the segmentation as a two-step learning problem: anatomical structure localization and boundary delineation. In both steps, we exploit the recent advances in learning discriminative models. A novel algorithm, marginal space learning (MSL), is introduced to solve the 9-dimensional similarity transformation search problem for localizing the heart chambers. After determining the pose of the heart chambers, we estimate the 3D shape through learning-based boundary delineation. The proposed method has been extensively tested on the largest dataset (with 323 volumes from 137 patients) ever reported in the literature. To the best of our knowledge, our system is the fastest with a speed of 4.0 seconds per volume (on a dual-core 3.2 GHz processor) for the automatic segmentation of all four chambers.
Constrained Parametric Min-Cuts for Automatic Object Segmentation
, 2010
Cited by 24 (4 self)
We present a novel framework for generating and ranking plausible object hypotheses in an image using bottom-up processes and mid-level cues. The object hypotheses are represented as figure-ground segmentations, and are extracted automatically, without prior knowledge about properties of individual object classes, by solving a sequence of constrained parametric min-cut problems (CPMC) on a regular image grid. We then learn to rank the object hypotheses by training a continuous model to predict how plausible the segments are, given their mid-level region properties. We show that this algorithm significantly outperforms the state of the art for low-level segmentation in the VOC09 segmentation dataset. It achieves the same average best segmentation covering as the best performing technique to date [2], 0.61 when using just the top 7 ranked segments, instead of the full hierarchy in [2]. Our method achieves 0.78 average best covering using 154 segments. In a companion paper [18], we also show that the algorithm achieves state-of-the-art results when used in a segmentation-based recognition pipeline.
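The "best segmentation covering" score quoted above can be sketched as follows. This assumes the commonly used size-weighted definition: for each ground-truth region, take the best intersection-over-union achieved by any candidate segment, then average weighted by region size. The abstract does not spell out the formula, so treat the definition, the function names, and the toy data as assumptions.

```python
def iou(a, b):
    """Intersection-over-union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def best_covering(ground_truth, candidates):
    """Size-weighted average, over ground-truth regions, of the best IoU
    achieved by any candidate segment (assumed definition)."""
    total = sum(len(region) for region in ground_truth)
    if total == 0:
        return 0.0
    score = 0.0
    for region in ground_truth:
        best = max((iou(region, seg) for seg in candidates), default=0.0)
        score += len(region) * best
    return score / total


# Toy 1-D "image" with pixels 0..9 and two ground-truth regions.
ground_truth = [frozenset(range(0, 6)), frozenset(range(6, 10))]
candidates = [
    frozenset(range(0, 5)),
    frozenset(range(6, 10)),
    frozenset(range(3, 8)),
]
covering = best_covering(ground_truth, candidates)
```

Because only the best-matching candidate counts for each region, adding more segments to the pool can never lower the score. This is why the abstract reports the covering together with the number of segments used.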
Contour Detection and Hierarchical Image Segmentation
- IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
, 2010
Cited by 23 (3 self)
This paper investigates two fundamental problems in computer vision: contour detection and image segmentation. We present state-of-the-art algorithms for both of these tasks. Our contour detector combines multiple local cues into a globalization framework based on spectral clustering. Our segmentation algorithm consists of generic machinery for transforming the output of any contour detector into a hierarchical region tree. In this manner, we reduce the problem of image segmentation to that of contour detection. Extensive experimental evaluation demonstrates that both our contour detection and segmentation methods significantly outperform competing algorithms. The automatically generated hierarchical segmentations can be interactively refined by user-specified annotations. Computation at multiple image resolutions provides a means of coupling our system to recognition applications.
Learning to find object boundaries using motion cues
- In IEEE International Conference on Computer Vision (ICCV)
, 2007
Cited by 13 (3 self)
While great strides have been made in detecting and localizing specific objects in natural images, the bottom-up segmentation of unknown, generic objects remains a difficult challenge. We believe that occlusion can provide a strong cue for object segmentation and “pop-out”, but detecting an object’s occlusion boundaries using appearance alone is a difficult problem in itself. If the camera or the scene is moving, however, that motion provides an additional powerful indicator of occlusion. Thus, we use standard appearance cues (e.g. brightness/color gradient) in addition to motion cues that capture subtle differences in the relative surface motion (i.e. parallax) on either side of an occlusion boundary. We describe a learned local classifier and global inference approach which provide a framework for combining and reasoning about these appearance and motion cues to estimate which region boundaries of an initial over-segmentation correspond to object/occlusion boundaries in the scene. Through results on a dataset which contains short videos with labeled boundaries, we demonstrate the effectiveness of motion cues for this task.
3D ultrasound tracking of the left ventricles using one-step forward prediction and data fusion of collaborative trackers
- In Proc. IEEE Conf. Computer Vision and Pattern Recognition
, 2008
Cited by 12 (9 self)
Tracking the left ventricle (LV) in 3D ultrasound data is a challenging task because of the poor image quality and speed requirements. Many previous algorithms applied standard 2D tracking methods to tackle the 3D problem. However, the performance is limited due to increased data size, landmark ambiguity, signal drop-out or non-rigid deformation. In this paper we present a robust, fast and accurate 3D LV tracking algorithm. We propose a novel one-step forward prediction to generate the motion prior using motion manifold learning, and introduce two collaborative trackers to achieve both temporal consistency and failure recovery. Compared with tracking by detection and 3D optical flow, our algorithm provides the best results and subvoxel accuracy. The new tracking algorithm is completely automatic and computationally efficient. It requires less than 1.5 seconds to process a 3D volume which contains 4,925,440 voxels.
Torr Learning Class-specific Edges for Object Detection and Segmentation
Cited by 10 (1 self)
Recent research into recognizing object classes (such as humans, cows and hands) has made use of edge features to hypothesize and localize class instances. However, for the most part, these edge-based methods operate solely on the geometric shape of edges, treating them equally and ignoring the fact that for certain object classes, the appearance of the object on the “inside” of the edge may provide valuable recognition cues. We show how, for such object classes, small regions around edges can be used to classify the edge into object or non-object. This classifier may then be used to prune edges which are not relevant to the object class, and thereby improve the performance of subsequent processing. We demonstrate learning class-specific edges for a number of object classes — oranges, bananas and bottles — under challenging scale and illumination variation. Because class-specific edge classification provides a low-level analysis of the image, it may be integrated into any edge-based recognition strategy without significant change in the high-level algorithms. We illustrate its application to two algorithms: (i) chamfer matching for object detection, and (ii) modulating contrast terms in MRF-based object-specific segmentation. We show that performance of both algorithms (matching and segmentation) is considerably improved by the class-specific edge labelling.
A Supervised Approach for Detecting Boundaries in Music Using Difference Features and Boosting
- In Proceedings of the 5th International Conference on Music Information Retrieval (ISMIR)
, 2007
Cited by 8 (2 self)
A musical boundary is a transition between two musical segments such as a verse and a chorus. Our goal is to automatically detect musical boundaries using temporally local audio features. We develop a set of difference features that indicate when there are changes in perceptual aspects (e.g., timbre, harmony, melody, rhythm) of the music. We show that many individual difference features are useful for detecting boundaries. By combining these features and formulating the problem as a supervised learning problem, we can further improve performance. This is an alternative to previous work on music segmentation which has focused on unsupervised approaches based on notions of self-similarity computed over an entire song. We evaluate performance using a publicly available data set of 100 copyright-cleared pop/rock songs, each of which has been segmented by a human expert.
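One plausible form of a temporally local difference feature is sketched below: at each frame, compare the mean feature vector in a short window before the frame with the mean in a short window after it. The abstract does not define the features, so the window scheme, the Euclidean distance, and all names here are assumptions. The sketch also covers only a single feature, not the supervised combination of features the paper describes.

```python
def mean_vector(frames):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dims = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dims)]


def difference_feature(frames, t, window):
    """Euclidean distance between the mean of the `window` frames before t
    and the mean of the `window` frames starting at t (hypothetical feature)."""
    before = frames[max(0, t - window):t]
    after = frames[t:t + window]
    if not before or not after:
        return 0.0
    mean_before, mean_after = mean_vector(before), mean_vector(after)
    return sum((x - y) ** 2 for x, y in zip(mean_before, mean_after)) ** 0.5


# Toy feature sequence: four frames of one "segment", then four of another.
frames = [[0.0, 1.0]] * 4 + [[3.0, 5.0]] * 4
scores = [difference_feature(frames, t, window=2) for t in range(len(frames))]
boundary = max(range(len(scores)), key=scores.__getitem__)
```

In this toy sequence the score peaks at frame 4, where the second segment begins. In a supervised setting such scores, computed for several perceptual features, would be the inputs to a learned classifier rather than being thresholded directly.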
Occlusion boundaries from motion: Low-level detection and mid-level reasoning
- International Journal of Computer Vision
, 2009
Cited by 6 (0 self)
The boundaries of objects in an image are often considered a nuisance to be “handled” due to the occlusion they exhibit. Since most, if not all, computer vision techniques aggregate information spatially within a scene, information spanning these boundaries, and therefore from different physical surfaces, is invariably and erroneously considered together. In addition, these boundaries convey important perceptual information about 3D scene structure and shape. Consequently, their identification can benefit many different computer vision pursuits, from low-level processing techniques to high-level reasoning tasks. While much focus in computer vision is placed on the processing of individual, static images, many applications actually offer video, or sequences of images, as input. The extra temporal dimension of the data allows the motion of the camera or the scene to be used in processing. In this paper, we focus on the exploitation of subtle relative-motion cues present at occlusion boundaries. When combined with more standard appearance information, we demonstrate these cues’ utility in detecting occlusion boundaries locally. We also present a novel, mid-level model for reasoning more globally about object boundaries and propagating such local information to extract improved, extended boundaries.

