Fast approximate energy minimization via graph cuts
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
Cited by 905 (38 self)
In this paper we address the problem of minimizing a large class of energy functions that occur in early vision. The major restriction is that the energy function’s smoothness term must only involve pairs of pixels. We propose two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed. The first move we consider is an α-β swap: for a pair of labels α, β, this move exchanges the labels between an arbitrary set of pixels labeled α and another arbitrary set labeled β. Our first algorithm generates a labeling such that there is no swap move that decreases the energy. The second move we consider is an α-expansion: for a label α, this move assigns an arbitrary set of pixels the label α. Our second
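The α-expansion move described in this abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's implementation: it assumes an energy made of per-pixel data costs plus pairwise smoothness costs, and it finds the best expansion move by exhaustive search where the paper uses a graph cut. All names (`energy`, `best_expansion_move`, `expansion_minimize`) and the numbers in the example are hypothetical.

```python
from itertools import product


def energy(labels, data_cost, smooth_cost, edges):
    """Per-pixel data costs plus pairwise smoothness costs over `edges`."""
    data = sum(data_cost[p][l] for p, l in enumerate(labels))
    smooth = sum(smooth_cost(labels[p], labels[q]) for p, q in edges)
    return data + smooth


def best_expansion_move(labels, alpha, data_cost, smooth_cost, edges):
    """Lowest-energy labeling reachable by one alpha-expansion.

    Each pixel either keeps its label or switches to alpha. Exhaustive
    search stands in for the graph cut and is only viable on tiny inputs.
    """
    best = list(labels)
    best_e = energy(best, data_cost, smooth_cost, edges)
    for switch in product([False, True], repeat=len(labels)):
        cand = [alpha if s else l for s, l in zip(switch, labels)]
        e = energy(cand, data_cost, smooth_cost, edges)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e


def expansion_minimize(labels, num_labels, data_cost, smooth_cost, edges):
    """Cycle over labels until no expansion move lowers the energy."""
    labels = list(labels)
    current = energy(labels, data_cost, smooth_cost, edges)
    improved = True
    while improved:
        improved = False
        for alpha in range(num_labels):
            cand, e = best_expansion_move(
                labels, alpha, data_cost, smooth_cost, edges)
            if e < current:
                labels, current, improved = cand, e, True
    return labels, current


def potts(a, b):
    """Potts smoothness: constant penalty when neighbouring labels differ."""
    return 0 if a == b else 2


# Toy 1-D "image": 4 pixels in a chain, 3 labels (illustrative numbers only).
data_cost = [[0, 5, 5], [4, 1, 5], [5, 0, 5], [5, 5, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
result, value = expansion_minimize([0, 0, 0, 0], 3, data_cost, potts, edges)
```

The loop stops at a labeling that no single expansion move can improve, which is the kind of local minimum the abstract refers to.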
An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2001
Cited by 471 (36 self)
After [10, 15, 12, 2, 4], minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision.
What energy functions can be minimized via graph cuts
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2004
Cited by 424 (19 self)
In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available.
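The abstract does not spell out the characterization. In the published paper, for energies that are sums of terms of one or two binary variables, the condition is that every pairwise term is regular (submodular): E(0,0) + E(1,1) ≤ E(0,1) + E(1,0). A minimal sketch of that check follows; the function names and the example terms are hypothetical, and terms of three variables are not covered.

```python
def is_regular_pairwise(term):
    """term[a][b] is the cost E(x_i = a, x_j = b) for binary a, b."""
    return term[0][0] + term[1][1] <= term[0][1] + term[1][0]


def all_terms_regular(pairwise_terms):
    """True when every pairwise term passes the regularity inequality."""
    return all(is_regular_pairwise(t) for t in pairwise_terms.values())


# Illustrative terms: one rewards agreement, the other rewards disagreement.
agree = [[0, 1], [1, 0]]      # 0 + 0 <= 1 + 1, regular
disagree = [[1, 0], [0, 1]]   # 1 + 1 >  0 + 0, not regular

smooth_energy = {(0, 1): agree, (1, 2): agree}
mixed_energy = {(0, 1): agree, (1, 2): disagree}
```

Under this condition, `smooth_energy` passes the check, while `mixed_energy` fails it because of its single non-regular term.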
Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images
, 2001
Cited by 413 (8 self)
In this paper we describe a new technique for general purpose interactive segmentation of N-dimensional images. The user marks certain pixels as “object” or “background” to provide hard constraints for segmentation. Additional soft constraints incorporate both boundary and region information. Graph cuts are used to find the globally optimal segmentation of the N-dimensional image. The obtained solution gives the best balance of boundary and region properties among all segmentations satisfying the constraints. The topology of our segmentation is unrestricted and both “object” and “background” segments may consist of several isolated parts. Some experimental results are presented in the context of photo/video editing and medical image segmentation. We also demonstrate an interesting Gestalt example. A fast implementation of our segmentation method is possible via a new max-flow algorithm in [2].
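The seed-constrained min-cut idea in this abstract can be sketched on a one-dimensional "image". This is an illustration under stated assumptions, not the authors' code: the regional soft term is omitted, the boundary weight is assumed to be a Gaussian of the intensity difference with a hypothetical `sigma`, seeds are tied to the terminals with a capacity larger than any possible cut, and a plain Edmonds-Karp max-flow stands in for the algorithm cited as [2].

```python
import math
from collections import defaultdict, deque


def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (flow, nodes on the source side)."""
    residual = defaultdict(lambda: defaultdict(float))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0.0  # make sure the reverse edge exists
    flow = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)  # reachable set defines the cut
        # Find the bottleneck along the path, then push flow through it.
        bottleneck, v = math.inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def segment_1d(intensity, object_seeds, background_seeds, sigma=0.2):
    """Label pixels of a 1-D signal as object, given seed pixels."""
    cap = defaultdict(dict)
    total = 0.0
    for p in range(len(intensity) - 1):
        diff = intensity[p] - intensity[p + 1]
        w = math.exp(-diff ** 2 / (2 * sigma ** 2))  # assumed boundary weight
        cap[p][p + 1] = w
        cap[p + 1][p] = w
        total += w
    hard = 1.0 + total  # exceeds the cost of cutting every neighbour link
    for p in object_seeds:
        cap["S"][p] = hard
    for p in background_seeds:
        cap[p]["T"] = hard
    _, source_side = min_cut_source_side(cap, "S", "T")
    return sorted(p for p in source_side if p != "S")


# Dark pixels on the left, bright on the right; one seed of each kind.
signal = [0.1, 0.2, 0.15, 0.9, 0.85, 0.95]
object_pixels = segment_1d(signal, object_seeds=[4], background_seeds=[0])
```

With these inputs the cheapest link to cut is the one across the large intensity jump, so the object segment consists of the bright pixels on the seeded side of that jump.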
Pictorial Structures for Object Recognition
- IJCV
, 2003
Cited by 305 (13 self)
In this paper we present a statistical framework for modeling the appearance of objects. Our work is motivated by the pictorial structure models introduced by Fischler and Elschlager. The basic idea is to model an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. We use these models to address the problem of detecting an object in an image as well as the problem of learning an object model from training examples, and present efficient algorithms for both these problems. We demonstrate the techniques by learning models that represent faces and human bodies and using the resulting models to locate the corresponding objects in novel images.
Computing Visual Correspondence with Occlusions using Graph Cuts
Cited by 195 (11 self)
Several new algorithms for visual correspondence based on graph cuts [7, 14, 17] have recently been developed. While these methods give very strong results in practice, they do not handle occlusions properly. Specifically, they treat the two input images asymmetrically, and they do not ensure that a pixel corresponds to at most one pixel in the other image. In this paper, we present a new method which properly addresses occlusions, while preserving the advantages of graph cut algorithms. We give experimental results for stereo as well as motion, which demonstrate that our method performs well both at detecting occlusions and computing disparities.
Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields
- IN IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE
, 1999
Cited by 131 (1 self)
In a traditional classification problem, we wish to assign one of k labels (or classes) to each of n objects, in a way that is consistent with some observed data that we have about the problem. An active line of research in this area is concerned with classification when one has information about pairwise relationships among the objects to be classified; this issue is one of the principal motivations for the framework of Markov random fields, and it arises in areas such as image processing, biometry, and document analysis. In its most basic form, this style of analysis seeks a classification that optimizes a combinatorial function consisting of assignment costs --- based on the individual choice of label we make for each object --- and separation costs --- based on the pair of choices we make for two "related" objects. We formulate a general classification problem of this type, the metric labeling problem; we show that it contains as special cases a number of standard classification f...
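The objective described in this abstract, assignment costs plus separation costs, can be written down directly. The sketch below is only an illustration of that cost function, not of the paper's approximation algorithm. It assumes weighted edges between related objects and a distance `d` on labels, as in the standard metric labeling formulation; the function names and the numbers are hypothetical.

```python
def is_metric(d, labels):
    """Check zero self-distance, symmetry, non-negativity, triangle inequality."""
    for a in labels:
        if d(a, a) != 0:
            return False
        for b in labels:
            if d(a, b) != d(b, a) or d(a, b) < 0:
                return False
            for c in labels:
                if d(a, c) > d(a, b) + d(b, c):
                    return False
    return True


def labeling_cost(assignment, assign_cost, edges, d):
    """Assignment costs per object plus weighted separation costs per edge."""
    assign = sum(assign_cost[obj][lab] for obj, lab in assignment.items())
    separate = sum(w * d(assignment[u], assignment[v]) for u, v, w in edges)
    return assign + separate


def truncated_linear(a, b):
    """A metric on integer labels: absolute difference capped at 2."""
    return min(abs(a - b), 2)


def squared(a, b):
    """Not a metric: violates the triangle inequality."""
    return (a - b) ** 2


labels = [0, 1, 2, 3]
assign_cost = {"x": [0, 2, 4, 6], "y": [6, 4, 2, 0]}
edges = [("x", "y", 1.5)]  # (object, object, edge weight)
cost = labeling_cost({"x": 0, "y": 3}, assign_cost, edges, truncated_linear)
```

Here `cost` is 0 + 0 for the two assignments plus 1.5 × 2 for the separation, since the truncated distance between labels 0 and 3 is 2.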
Markov random fields with efficient approximations
- In IEEE Conference on Computer Vision and Pattern Recognition
, 1998
Cited by 123 (21 self)
Markov Random Fields (MRF’s) can be used for a wide variety of vision problems. In this paper we focus on MRF’s with two-valued clique potentials, which form a generalized Potts model. We show that the maximum a posteriori estimate of such an MRF can be obtained by solving a multiway minimum cut problem on a graph. We develop efficient algorithms for computing good approximations to the minimum multiway cut. The visual correspondence problem can be formulated as an MRF in our framework; this yields quite promising results on real data with ground truth. We also apply our techniques to MRF’s with linear clique potentials.
Efficient Matching of Pictorial Structures
- Proc. IEEE Computer Vision and Pattern Recognition Conf.
, 2000
Cited by 114 (9 self)
A pictorial structure is a collection of parts arranged in a deformable configuration. Each part is represented using a simple appearance model and the deformable configuration is represented by spring-like connections between pairs of parts. While pictorial structures were introduced a number of years ago, they have not been broadly applied to matching and recognition problems. This has been due in part to the computational difficulty of matching pictorial structures to images. In this paper we present an efficient algorithm for finding the best global match of a pictorial structure to an image. The running time of the algorithm is optimal and it takes only a few seconds to match a model with five to ten parts. With this improved algorithm, pictorial structures provide a practical and powerful framework for qualitative descriptions of objects and scenes, and are suitable for many generic image recognition problems. We illustrate the approach using simple models of a person and a car.
Multiway cut for stereo and motion with slanted surfaces
- In International Conference on Computer Vision
, 1999
Cited by 93 (2 self)
Slanted surfaces pose a problem for correspondence algorithms utilizing search because of the greatly increased number of possibilities, when compared with frontoparallel surfaces. In this paper we propose an algorithm to compute correspondence between stereo images or between frames of a motion sequence by minimizing an energy functional that accounts for slanted surfaces. The energy is minimized in a greedy strategy that alternates between segmenting the image into a number of non-overlapping regions (using the multiway-cut algorithm of Boykov, Veksler, and Zabih) and finding the affine parameters describing the displacement function of each region. A follow-up step enables the algorithm to escape local minima due to oversegmentation. Experiments on real images show the algorithm’s ability to find an accurate segmentation and displacement map, as well as discontinuities and creases, from a wide variety of stereo and motion imagery.

