Results 1-10 of 395
PatchMatch: A Randomized Correspondence Algorithm for …
, 2009
Cited by 239 (9 self)

Abstract
This paper presents interactive image editing tools using a new randomized algorithm for quickly finding approximate nearest-neighbor matches between image patches. Previous research in graphics and vision has leveraged such nearest-neighbor searches to provide a variety of high-level digital image editing tools. However, the cost of computing a field of such matches for an entire image has eluded previous efforts to provide interactive performance. Our algorithm offers substantial performance improvements over the previous state of the art (20-100x), enabling its use in interactive editing tools. The key insights driving the algorithm are that some good patch matches can be found via random sampling, and that natural coherence in the imagery allows us to propagate such matches quickly to surrounding areas. We offer theoretical analysis of the convergence properties of the algorithm, as well as empirical and practical evidence for its high quality and performance. This one simple algorithm forms the basis for a variety of tools – image retargeting, completion and reshuffling – that can be used together in the context of a high-level image editing application. Finally, we propose additional intuitive constraints on the synthesis process that offer the user a level of control unavailable in previous methods.
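The two insights above (random sampling finds some good matches; image coherence propagates them) translate into a very short algorithm. The following is a minimal sketch in Python/NumPy, not the authors' implementation: patch size, SSD distance, iteration count, and the random-search schedule are illustrative choices.

```python
import numpy as np

def patch_dist(a, b, ay, ax, by, bx, p):
    """Sum of squared differences between the p x p patches at (ay, ax) and (by, bx)."""
    d = a[ay:ay+p, ax:ax+p] - b[by:by+p, bx:bx+p]
    return float((d * d).sum())

def patchmatch(a, b, p=3, iters=4, seed=0):
    rng = np.random.default_rng(seed)
    H, W = a.shape[0] - p + 1, a.shape[1] - p + 1    # valid patch corners in a
    Hb, Wb = b.shape[0] - p + 1, b.shape[1] - p + 1  # valid patch corners in b
    # 1) Random initialization of the nearest-neighbor field (NNF).
    nnf = np.stack([rng.integers(0, Hb, (H, W)),
                    rng.integers(0, Wb, (H, W))], axis=-1)
    cost = np.array([[patch_dist(a, b, y, x, nnf[y, x, 0], nnf[y, x, 1], p)
                      for x in range(W)] for y in range(H)])

    def try_improve(y, x, by, bx):
        if 0 <= by < Hb and 0 <= bx < Wb:
            c = patch_dist(a, b, y, x, by, bx, p)
            if c < cost[y, x]:
                cost[y, x], nnf[y, x] = c, (by, bx)

    for it in range(iters):
        step = 1 if it % 2 == 0 else -1  # alternate raster / reverse-raster scans
        ys = range(H) if step == 1 else range(H - 1, -1, -1)
        xs = range(W) if step == 1 else range(W - 1, -1, -1)
        for y in ys:
            for x in xs:
                # 2) Propagation: adopt the shifted match of an already-visited neighbor.
                for dy, dx in ((-step, 0), (0, -step)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        try_improve(y, x, nnf[ny, nx, 0] - dy, nnf[ny, nx, 1] - dx)
                # 3) Random search: sample around the current best at shrinking radii.
                r = max(Hb, Wb)
                while r >= 1:
                    try_improve(y, x,
                                nnf[y, x, 0] + rng.integers(-r, r + 1),
                                nnf[y, x, 1] + rng.integers(-r, r + 1))
                    r //= 2
    return nnf, cost
```

On identical images the field collapses to near-exact matches within a few iterations, which is the coherence effect the abstract describes.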
Modeling the World from Internet Photo Collections
 Int. J. Comput. Vis.
, 2007
Cited by 236 (6 self)

Abstract
There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like “Notre Dame” or “Trevi Fountain.” This approach, which we call Photo Tourism, has enabled reconstructions of numerous well-known world sites. This paper presents these algorithms and results as a first step towards 3D modeling of the world’s well-photographed sites, cities, and landscapes from Internet imagery, and discusses key open problems and challenges for the research community.
Optimizing binary MRFs via extended roof duality
 In Proc. CVPR
, 2007
Cited by 167 (13 self)

Abstract
Many computer vision applications rely on the efficient optimization of challenging, so-called non-submodular, binary pairwise MRFs. A promising graph-cut-based approach for optimizing such MRFs, known as “roof duality,” was recently introduced into computer vision. We study two methods which extend this approach. First, we discuss an efficient implementation of the “probing” technique introduced recently by Boros et al. [5]. It simplifies the MRF while preserving the global optimum. Our code is 400-700x faster on some graphs than the implementation of [5]. Second, we present a new technique which takes an arbitrary input labeling and tries to improve its energy. We give theoretical characterizations of local minima of this procedure. We applied both techniques to many applications, including image segmentation, new view synthesis, super-resolution, diagram recognition, parameter learning, texture restoration, and image deconvolution. For several applications we see that we are able to find the global minimum very efficiently, and considerably outperform the original roof duality approach. In comparison to existing techniques, such as graph cut, TRW, BP, ICM, and simulated annealing, we nearly always find a lower energy.
Auto-context and its Application to High-level Vision Tasks
 In Proc. CVPR
Cited by 147 (4 self)

Abstract
The notion of using context information for solving high-level vision and medical image segmentation problems has been increasingly realized in the field. However, how to learn an effective and efficient context model, together with an image appearance model, remains mostly unknown. The current literature using Markov Random Fields (MRFs) and Conditional Random Fields (CRFs) often involves specific algorithm design, in which the modeling and computing stages are studied in isolation. In this paper, we propose the auto-context algorithm. Given a set of training images and their corresponding label maps, we first learn a classifier on local image patches. The discriminative probability (or classification confidence) maps created by the learned classifier are then used as context information, in addition to the original image patches, to train a new classifier. The algorithm then iterates until convergence. Auto-context integrates low-level and context information by fusing a large number of low-level appearance features with context and implicit shape information. The resulting discriminative algorithm is general and easy to implement. Under nearly the same parameter settings in training, we apply the algorithm to three challenging vision applications: foreground/background segregation, human body configuration estimation, and scene region labeling. Moreover, context also plays a very important role in medical/brain images where the anatomical structures are mostly constrained to relatively fixed positions. With only some slight changes resulting from using 3D instead of 2D features, the auto-context algorithm applied to brain MRI image segmentation is shown to outperform state-of-the-art algorithms specifically designed for this domain. Furthermore, the scope of the proposed algorithm goes beyond image analysis and it has the potential to be used for a wide variety of problems in multivariate labeling.
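The training loop described above (learn a classifier, feed its confidence map back as extra features, repeat) can be sketched on a toy 1-D labeling task. Everything here is a hedged simplification: the paper uses boosting over large feature pools, whereas this sketch substitutes a hand-rolled logistic regression and a handful of context offsets.

```python
import numpy as np

def fit_logreg(X, y, lr=0.5, steps=300):
    """Plain batch-gradient logistic regression (a stand-in for the paper's classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))

def context_features(prob, offsets=(-3, -2, -1, 1, 2, 3)):
    # Sample the current probability map at fixed neighborhood offsets
    # (wrap-around at the ends keeps the toy example simple).
    return np.stack([np.roll(prob, -o) for o in offsets], axis=1)

def auto_context(x, y, rounds=3):
    feats = x[:, None]            # appearance feature: the raw pixel value
    prob, classifiers = None, []
    for _ in range(rounds):
        X = feats if prob is None else np.hstack([feats, context_features(prob)])
        w = fit_logreg(X, y)
        prob = predict(w, X)      # confidence map fed to the next round
        classifiers.append(w)
    return classifiers, prob
```

Round 1 sees only appearance; later rounds additionally see the previous round's confidences at neighboring positions, which lets the classifier exploit label coherence.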
Global Stereo Reconstruction under Second Order Smoothness Priors
Cited by 120 (8 self)

Abstract
Second-order priors on the smoothness of 3D surfaces are a better model of typical scenes than first-order priors. However, stereo reconstruction using global inference algorithms, such as graph cuts, has not been able to incorporate second-order priors because the triple cliques needed to express them yield intractable (non-submodular) optimization problems. This paper shows that inference with triple cliques can be effectively optimized. Our optimization strategy is a development of recent extensions to α-expansion, based on the “QPBO” algorithm [5, 14, 26]. The strategy is to repeatedly merge proposal depth maps using a novel extension of QPBO. Proposal depth maps can come from any source, for example fronto-parallel planes as in α-expansion, or indeed any existing stereo algorithm, with arbitrary parameter settings. Experimental results demonstrate the usefulness of the second-order prior and the efficacy of our optimization framework. An implementation of our stereo framework is available online [34].
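A short comparison shows why second-order priors fit slanted surfaces better: a linear disparity ramp is penalized by a first-order (difference) prior but costs nothing under a second-order (second-difference) prior, which only penalizes curvature. The L1 penalties below are illustrative, not the paper's exact energy.

```python
import numpy as np

def first_order(d):
    """Total first-order smoothness penalty on a 1-D disparity profile: sum |d[i+1] - d[i]|."""
    return float(np.abs(np.diff(d)).sum())

def second_order(d):
    """Total second-order penalty: sum |d[i-1] - 2 d[i] + d[i+1]|."""
    return float(np.abs(np.diff(d, n=2)).sum())
```

A fronto-parallel surface (constant disparity) is free under both priors, but a slanted plane (a disparity ramp) is free only under the second-order one.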
SIFT Flow: Dense Correspondence across Scenes and its Applications
Cited by 114 (4 self)

Abstract
While image alignment has been studied in different areas of computer vision for decades, aligning images depicting different scenes remains a challenging problem. Analogous to optical flow, where an image is aligned to its temporally adjacent frame, we propose SIFT flow, a method to align an image to its nearest neighbors in a large image corpus containing a variety of scenes. The SIFT flow algorithm consists of matching densely sampled, pixel-wise SIFT features between two images, while preserving spatial discontinuities. The SIFT features allow robust matching across different scene/object appearances, whereas the discontinuity-preserving spatial model allows matching of objects located at different parts of the scene. Experiments show that the proposed approach robustly aligns complex scene pairs containing significant spatial differences. Based on SIFT flow, we propose an alignment-based large database framework for image analysis and synthesis, where image information is transferred from the nearest neighbors to a query image according to the dense scene correspondence. This framework is demonstrated through concrete applications, such as motion field prediction from a single image, motion synthesis via object transfer, satellite image registration and face recognition.
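The objective sketched above (a data term on dense descriptors plus a discontinuity-preserving smoothness term on the flow) can be solved exactly on one scanline by dynamic programming. This is only a 1-D analogue for illustration, not the paper's dual-layer belief propagation: it matches raw intensities instead of SIFT descriptors, and `max_d` and `alpha` are arbitrary choices.

```python
import numpy as np

def flow_1d(f1, f2, max_d=3, alpha=0.1):
    """Exact DP (Viterbi) over per-pixel displacements w_i in [-max_d, max_d],
    minimizing sum_i |f1[i] - f2[i + w_i]| + alpha * sum_i |w_i - w_{i+1}|."""
    n, m = len(f1), len(f2)
    ds = list(range(-max_d, max_d + 1))
    D, INF = len(ds), 1e18

    def data(i, w):
        j = i + w
        return abs(f1[i] - f2[j]) if 0 <= j < m else INF

    C = np.full((n, D), INF)            # C[i, k]: best cost ending with w_i = ds[k]
    back = np.zeros((n, D), dtype=int)  # backpointers for recovering the flow
    for k, w in enumerate(ds):
        C[0, k] = data(0, w)
    for i in range(1, n):
        for k, w in enumerate(ds):
            trans = [C[i - 1, k2] + alpha * abs(ds[k2] - w) for k2 in range(D)]
            back[i, k] = int(np.argmin(trans))
            C[i, k] = data(i, w) + trans[back[i, k]]
    k = int(np.argmin(C[-1]))
    flow = [0] * n
    for i in range(n - 1, -1, -1):      # backtrack the optimal displacement path
        flow[i] = ds[k]
        k = back[i, k]
    return flow
```

When the second signal is a shifted copy of the first, the recovered flow is the constant shift, since that zeroes both the data and smoothness terms.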
MRF energy minimization and beyond via dual decomposition
 In IEEE PAMI, 2011
Cited by 107 (10 self)

Abstract
This paper introduces a new rigorous theoretical framework to address discrete MRF-based optimization in computer vision. Such a framework exploits the powerful technique of Dual Decomposition. It is based on a projected subgradient scheme that attempts to solve an MRF optimization problem by first decomposing it into a set of appropriately chosen subproblems and then combining their solutions in a principled way. In order to determine the limits of this method, we analyze the conditions that these subproblems have to satisfy and we demonstrate the extreme generality and flexibility of such an approach. We thus show that, by appropriately choosing what subproblems to use, one can design novel and very powerful MRF optimization algorithms. For instance, in this manner we are able to derive algorithms that: 1) generalize and extend state-of-the-art message-passing methods, 2) optimize very tight LP relaxations to MRF optimization, and 3) take full advantage of the special structure that may exist in particular MRFs, allowing the use of efficient inference techniques such as, e.g., graph-cut-based methods. Theoretical analysis of the bounds related with the different algorithms derived from our framework and experimental results/comparisons using synthetic and real data for a variety of tasks in computer vision demonstrate the great potential of our approach.
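The projected-subgradient scheme can be illustrated on the smallest possible decomposition: several "slave" subproblems that each hold a copy of one shared discrete variable and must be pushed into agreement via Lagrange multipliers. This is a toy instance of the framework, not the tree-structured decompositions used for real MRFs.

```python
import numpy as np

def dual_decomposition(thetas, iters=300):
    """Projected subgradient ascent on the dual of the simplest decomposition:
    each cost vector in `thetas` is a slave over the same label set, and the
    multipliers (kept summing to zero) coordinate their choices."""
    L = len(thetas[0])
    lams = [np.zeros(L) for _ in thetas]
    best = -np.inf
    for t in range(1, iters + 1):
        # Each slave solves its subproblem independently.
        xs = [int(np.argmin(th + lam)) for th, lam in zip(thetas, lams)]
        dual = sum(float((th + lam)[x]) for th, lam, x in zip(thetas, lams, xs))
        best = max(best, dual)            # any dual value lower-bounds the optimum
        gs = [np.eye(L)[x] for x in xs]   # subgradient: indicator of each minimizer
        gbar = sum(gs) / len(gs)
        step = 1.0 / t
        # Projected step: moving along (g_i - gbar) preserves sum(lams) == 0
        # and pushes disagreeing slaves toward a common label.
        for i in range(len(lams)):
            lams[i] = lams[i] + step * (gs[i] - gbar)
    return best, xs
```

With two slaves whose costs sum to [3, 3, 7] over three labels, the dual values climb toward the primal optimum 3 while never exceeding it (weak duality).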
Fast approximate energy minimization with label costs
, 2010
Cited by 104 (8 self)

Abstract
The α-expansion algorithm [7] has had a significant impact in computer vision due to its generality, effectiveness, and speed. Thus far it can only minimize energies that involve unary, pairwise, and specialized higher-order terms. Our main contribution is to extend α-expansion so that it can simultaneously optimize “label costs” as well. An energy with label costs can penalize a solution based on the set of labels that appear in it. The simplest special case is to penalize the number of labels in the solution. Our energy is quite general, and we prove optimality bounds for our algorithm. A natural application of label costs is multi-model fitting, and we demonstrate several such applications in vision: homography detection, motion segmentation, and unsupervised image segmentation. Our C++/MATLAB implementation is publicly available.
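What a "label cost" adds to the energy is easy to state in code. The sketch below evaluates such an energy on a chain and minimizes it by brute-force enumeration purely for illustration; the paper's contribution is optimizing it efficiently with extended α-expansion, which this enumeration is not.

```python
import itertools

def energy(labeling, unary, pair_w, label_cost):
    """Unary terms + Potts pairwise on a chain + a flat cost per label used anywhere."""
    e = sum(unary[i][l] for i, l in enumerate(labeling))
    e += sum(pair_w for a, b in zip(labeling, labeling[1:]) if a != b)
    e += label_cost * len(set(labeling))   # the label-cost term
    return e

def brute_force_min(unary, pair_w, label_cost):
    """Exhaustive minimizer over all labelings (toy sizes only)."""
    n, L = len(unary), len(unary[0])
    return min(itertools.product(range(L), repeat=n),
               key=lambda lab: energy(lab, unary, pair_w, label_cost))
```

On three sites whose data terms mildly prefer two different labels, raising the label cost makes the minimizer collapse to a single label, which is exactly the sparsity effect used in multi-model fitting.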