Results 1  10
of
99
Fast approximate energy minimization with label costs
, 2010
"... The αexpansion algorithm [7] has had a significant impact in computer vision due to its generality, effectiveness, and speed. Thus far it can only minimize energies that involve unary, pairwise, and specialized higherorder terms. Our main contribution is to extend αexpansion so that it can simult ..."
Abstract

Cited by 104 (8 self)
 Add to MetaCart
(Show Context)
The αexpansion algorithm [7] has had a significant impact in computer vision due to its generality, effectiveness, and speed. Thus far it can only minimize energies that involve unary, pairwise, and specialized higherorder terms. Our main contribution is to extend αexpansion so that it can simultaneously optimize “label costs ” as well. An energy with label costs can penalize a solution based on the set of labels that appear in it. The simplest special case is to penalize the number of labels in the solution. Our energy is quite general, and we prove optimality bounds for our algorithm. A natural application of label costs is multimodel fitting, and we demonstrate several such applications in vision: homography detection, motion segmentation, and unsupervised image segmentation. Our C++/MATLAB implementation is publicly available.
Category Independent Object Proposals
"... Abstract. We propose a categoryindependent method to produce a bag of regions and rank them, such that topranked regions are likely to be good segmentations of different objects. Our key objectives are completeness and diversity: every object should have at least one good proposed region, and a di ..."
Abstract

Cited by 101 (4 self)
 Add to MetaCart
(Show Context)
Abstract. We propose a categoryindependent method to produce a bag of regions and rank them, such that topranked regions are likely to be good segmentations of different objects. Our key objectives are completeness and diversity: every object should have at least one good proposed region, and a diverse set should be topranked. Our approach is to generate a set of segmentations by performing graph cuts based on a seed region and a learned affinity function. Then, the regions are ranked using structured learning based on various cues. Our experiments on BSDS and PASCAL VOC 2008 demonstrate our ability to find most objects within a small bag of proposed regions. 1
Maxmargin hidden conditional random fields for human action recognition
, 2008
"... We present a new method for classification with structured latent variables. Our model is formulated using the maxmargin formalism in the discriminative learning literature. We propose an efficient learning algorithm based on the cutting plane method and decomposed dual optimization. We apply our m ..."
Abstract

Cited by 50 (9 self)
 Add to MetaCart
(Show Context)
We present a new method for classification with structured latent variables. Our model is formulated using the maxmargin formalism in the discriminative learning literature. We propose an efficient learning algorithm based on the cutting plane method and decomposed dual optimization. We apply our model to the problem of recognizing human actions from video sequences, where we model a human action as a global root template and a constellation of several “parts”. We show that our model outperforms another similar method that uses hidden conditional random fields, and is comparable to other stateoftheart approaches. More importantly, our proposed work is quite general and can potentially be applied in a wide variety of vision problems that involve various complex, interdependent latent structures. 1.
Decision Tree Fields
"... This paper introduces a new formulation for discrete image labeling tasks, the Decision Tree Field (DTF), that combines and generalizes random forests and conditional random fields (CRF) which have been widely used in computer vision. In a typical CRF model the unary potentials are derived from soph ..."
Abstract

Cited by 42 (9 self)
 Add to MetaCart
(Show Context)
This paper introduces a new formulation for discrete image labeling tasks, the Decision Tree Field (DTF), that combines and generalizes random forests and conditional random fields (CRF) which have been widely used in computer vision. In a typical CRF model the unary potentials are derived from sophisticated random forest or boosting based classifiers, however, the pairwise potentials are assumed to (1) have a simple parametric form with a prespecified and fixed dependence on the image data, and (2) to be defined on the basis of a small and fixed neighborhood. In contrast, in DTF, local interactions between multiple variables are determined by means of decision trees evaluated on the image data, allowing the interactions to be adapted to the image content. This results in powerful graphical models which are able to represent complex label structure. Our key technical contribution is to show that the DTF model can be trained efficiently and jointly using a convex approximate likelihood function, enabling us to learn over a million free model parameters. We show experimentally that for applications which have a rich and complex label structure, our model achieves excellent results. 1.
Segmenting salient objects from images and videos
 in ECCV, 2010
"... Abstract. In this paper we introduce a new salient object segmentation method, which is based on combining a saliency measure with a conditional random field (CRF) model. The proposed saliency measure is formulated using a statistical framework and local feature contrast in illumination, color, and ..."
Abstract

Cited by 40 (1 self)
 Add to MetaCart
(Show Context)
Abstract. In this paper we introduce a new salient object segmentation method, which is based on combining a saliency measure with a conditional random field (CRF) model. The proposed saliency measure is formulated using a statistical framework and local feature contrast in illumination, color, and motion information. The resulting saliency map is then used in a CRF model to define an energy minimization based segmentation approach, which aims to recover welldefined salient objects. The method is efficiently implemented by using the integral histogram approach and graph cut solvers. Compared to previous approaches the introduced method is among the few which are applicable to both still images and videos including motion cues. The experiments show that our approach outperforms the current stateoftheart methods in both qualitative and quantitative terms.
Geodesic Star Convexity for Interactive Image Segmentation
"... In this paper we introduce a new shape constraint for interactive image segmentation. It is an extension of Veksler’s [25] starconvexity prior, in two ways: from a single star to multiple stars and from Euclidean rays to Geodesic paths. Global minima of the energy function are obtained subject to t ..."
Abstract

Cited by 34 (2 self)
 Add to MetaCart
(Show Context)
In this paper we introduce a new shape constraint for interactive image segmentation. It is an extension of Veksler’s [25] starconvexity prior, in two ways: from a single star to multiple stars and from Euclidean rays to Geodesic paths. Global minima of the energy function are obtained subject to these new constraints. We also introduce Geodesic Forests, which exploit the structure of shortest paths in implementing the extended constraints. The starconvexity prior is used here in an interactive setting and this is demonstrated in a practical system. The system is evaluated by means of a “robot user ” to measure the amount of interaction required in a precise way. We also introduce a new and harder dataset which augments the existing Grabcut dataset [1] with images and ground truth taken from the PASCAL VOC segmentation challenge [7]. 1.
Maxmargin weight learning for Markov logic networks
 In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD09). Bled
, 2009
"... Abstract. Markov logic networks (MLNs) are an expressive representation for statistical relational learning that generalizes both firstorder logic and graphical models. Existing discriminative weight learning methods for MLNs all try to learn weights that optimize the Conditional Log Likelihood (CL ..."
Abstract

Cited by 28 (6 self)
 Add to MetaCart
(Show Context)
Abstract. Markov logic networks (MLNs) are an expressive representation for statistical relational learning that generalizes both firstorder logic and graphical models. Existing discriminative weight learning methods for MLNs all try to learn weights that optimize the Conditional Log Likelihood (CLL) of the training examples. In this work, we present a new discriminative weight learning method for MLNs based on a maxmargin framework. This results in a new model, MaxMargin Markov Logic Networks (M3LNs), that combines the expressiveness of MLNs with the predictive accuracy of structural Support Vector Machines (SVMs). To train the proposed model, we design a new approximation algorithm for lossaugmented inference in MLNs based on Linear Programming (LP). The experimental result shows that the proposed approach generally achieves higher F1 scores than the current best discriminative weight learner for MLNs. 1
Segmentation Propagation in ImageNet
"... Abstract. ImageNet is a largescale hierarchical database of object classes. We propose to automatically populate it with pixelwise segmentations, by leveraging existing manual annotations in the form of class labels and boundingboxes. The key idea is to recursively exploit images segmented so far ..."
Abstract

Cited by 26 (0 self)
 Add to MetaCart
(Show Context)
Abstract. ImageNet is a largescale hierarchical database of object classes. We propose to automatically populate it with pixelwise segmentations, by leveraging existing manual annotations in the form of class labels and boundingboxes. The key idea is to recursively exploit images segmented so far to guide the segmentation of new images. At each stage this propagation process expands into the images which are easiest to segment at that point in time, e.g. by moving to the semantically most related classes to those segmented so far. The propagation of segmentation occurs both (a) at the image level, by transferring existing segmentations to estimate the probability of a pixel to be foreground, and (b) at the class level, by jointly segmenting images of the same class and by importing the appearance models of classes that are already segmented. Through an experiment on 577 classes and 500k images we show that our technique (i) annotates a wide range of classes with accurate segmentations; (ii) effectively exploits the hierarchical structure of ImageNet; (iii) scales efficiently; (iv) outperforms a baseline GrabCut [1] initialized on the image center, as well as our recent segmentation transfer technique [2] on which this paper is based. Moreover, our method also delivers stateoftheart results on the recent iCoseg dataset for cosegmentation. 1
On Parameter Learning in CrfBased Approaches to Object Class Image Segmentation
 ECCV
, 2010
"... Recent progress in perpixel object class labeling of natural images can be attributed to the use of multiple types of image features and sound statistical learning approaches. Within the latter, Conditional Random Fields (CRF) are prominently used for their ability to represent interactions betwee ..."
Abstract

Cited by 25 (4 self)
 Add to MetaCart
(Show Context)
Recent progress in perpixel object class labeling of natural images can be attributed to the use of multiple types of image features and sound statistical learning approaches. Within the latter, Conditional Random Fields (CRF) are prominently used for their ability to represent interactions between random variables. Despite their popularity in computer vision, parameter learning for CRFs has remained difficult, popular approaches being crossvalidation and piecewise training. In this work, we propose a simple yet expressive treestructured CRF based on a recent hierarchical image segmentation method. Our model combines and weights multiple image features within a hierarchical representation and allows simple and efficient globallyoptimal learning of ≈ 10 5 parameters. The tractability of our model allows us to pose and answer some of the open questions regarding parameter learning applying to CRFbased approaches. The key findings for learning CRF models are, from the obvious to the surprising, i) multiple image features always help, ii) the limiting dimension with respect to current models is the amount of training data, iii) piecewise training is competitive, iv) current methods for maxmargin training fail for models with many parameters.
A Pylon Model for Semantic Segmentation
"... Graph cut optimization is one of the standard workhorses of image segmentation since for binary random field representations of the image, it gives globally optimal results and there are efficient polynomial time implementations. Often, the random field is applied over a flat partitioning of the ima ..."
Abstract

Cited by 25 (0 self)
 Add to MetaCart
(Show Context)
Graph cut optimization is one of the standard workhorses of image segmentation since for binary random field representations of the image, it gives globally optimal results and there are efficient polynomial time implementations. Often, the random field is applied over a flat partitioning of the image into nonintersecting elements, such as pixels or superpixels. In the paper we show that if, instead of a flat partitioning, the image is represented by a hierarchical segmentation tree, then the resulting energy combining unary and boundary terms can still be optimized using graph cut (with all the corresponding benefits of global optimality and efficiency). As a result of such inference, the image gets partitioned into a set of segments that may come from different layers of the tree. We apply this formulation, which we call the pylon model, to the task of semantic segmentation where the goal is to separate an image into areas belonging to different semantic classes. The experiments highlight the advantage of inference on a segmentation tree (over a flat partitioning) and demonstrate that the optimization in the pylon model is able to flexibly choose the level of segmentation across the image. Overall, the proposed system has superior segmentation accuracy on several datasets (Graz02, Stanford background) compared to previously suggested approaches. 1