## A Fast Local Descriptor for Dense Matching (2008)

Citations: | 68 - 2 self |

### BibTeX

@MISC{Tola08afast,
  author = {Engin Tola and Vincent Lepetit and Pascal Fua},
  title = {A Fast Local Descriptor for Dense Matching},
  year = {2008}
}

### Abstract

We introduce a novel local image descriptor designed for dense wide-baseline matching. We feed our descriptors to a graph-cuts based dense depth map estimation algorithm, and this yields better wide-baseline performance than the commonly used correlation windows, whose size is hard to tune. As a result, unlike competing techniques that require many high-resolution images to produce good reconstructions, our descriptor can compute them from pairs of low-quality images such as the ones captured by video streams. Our descriptor is inspired by earlier ones such as SIFT and GLOH but can be computed much faster for our purposes. Unlike SURF, which can also be computed efficiently at every pixel, it does not introduce artifacts that degrade the matching performance. Our approach was tested with ground-truth laser-scanned depth maps as well as on a wide variety of image pairs of different resolutions, and we show that good reconstructions are achieved even with only two low-quality images.

### Citations

5488 | Distinctive Image Features from Scale-Invariant Keypoints submitted to IJCV
- Lowe
- 2004
Citation Context ...ows with local region descriptors, which lets us take advantage of powerful global optimization schemes such as graph-cuts to force spatial consistency. Existing local region descriptors such as SIFT =-=[17]-=- or GLOH [19] have been designed for robustness to perspective and lighting changes and have proved successful for sparse wide-baseline matching. However, they are much more computationally demanding ... |

1458 | Fast approximate energy minimization via graph cuts
- Boykov, Veksler, et al.
Citation Context ...rst using local measures to estimate the similarity of pixels across images and then on imposing global shape constraints using dynamic programming [3], level sets [9], space carving [15], graph-cuts =-=[21, 6, 14]-=-, PDE [1, 25], or EM [24]. In this paper, we do not focus on the method used to impose the global constraints and use a standard one [6]. Instead, we concentrate on the similarity measure all these al... |

1221 | A performance evaluation of local descriptors
- Mikolajczyk, Schmid
- 2005
Citation Context ...l region descriptors, which lets us take advantage of powerful global optimization schemes such as graph-cuts to force spatial consistency. Existing local region descriptors such as SIFT [17] or GLOH =-=[19]-=- have been designed for robustness to perspective and lighting changes and have proved successful for sparse wide-baseline matching. However, they are much more computationally demanding than simple c... |

1096 | A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
- Scharstein, Szeliski
- 2002
Citation Context ... pairs of different resolutions and we show that good reconstructions are achieved even with only two low quality images. 1. Introduction Though dense short-baseline stereo matching is well understood =-=[7, 22]-=-, its wide baseline counterpart is, by contrast, much more challenging due to large perspective distortions and increased occluded areas. It is nevertheless worth addressing because it can yield more ... |

1028 | Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
- Lazebnik, Schmid, et al.
Citation Context ... and the additional dimension to the image gradient direction. They are computed over local regions, usually centered on feature points but sometimes also densely sampled for object recognition tasks =-=[10, 16]-=-. Each pixel belonging to the local region contributes to the histogram depending on its location in the local region, and on the orientation and the norm of the image gradient at its location: As dep... |

578 | A bayesian hierarchical model for learning natural scene categories
- Fei-Fei, Perona
- 2005
Citation Context ... and the additional dimension to the image gradient direction. They are computed over local regions, usually centered on feature points but sometimes also densely sampled for object recognition tasks =-=[10, 16]-=-. Each pixel belonging to the local region contributes to the histogram depending on its location in the local region, and on the orientation and the norm of the image gradient at its location: As dep... |

472 | Surf: Speeded up robust features
- Bay, Tuytelaars, et al.
- 2006
Citation Context ...ata as a reference. To be specific, SIFT and GLOH owe much of their strength to the use of gradient orientation histograms, which are relatively robust to distortions. The more recent SURF descriptor =-=[4]-=- approximates them by using integral images to compute the histogram bins. This method is computationally effective with respect to computing the descriptor’s value at every pixel but does away with ... |
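
The integral-image idea mentioned in this context can be illustrated with a short sketch. It is a generic summed-area table in NumPy, not the SURF implementation; the function names `integral_image` and `box_sum` are hypothetical and chosen only for this example. Once the table is built, the sum over any axis-aligned box costs four lookups, and every pixel in the box contributes with the same weight.

```python
import numpy as np

def integral_image(image):
    """Summed-area table with a leading row and column of zeros (illustrative helper)."""
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=float)
    ii[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of image[top:bottom, left:right] using four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

if __name__ == "__main__":
    img = np.arange(20, dtype=float).reshape(4, 5)
    ii = integral_image(img)
    # The box sum should agree with summing the slice directly.
    print(box_sum(ii, 1, 2, 3, 5), img[1:3, 2:5].sum())
```

The constant cost per box is what makes this approach attractive for per-pixel computation, at the price of uniform weighting inside each box.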

471 | A theory of shape by space carving
- Kutulakos, Seitz
- 2000
Citation Context ...ethods rely on first using local measures to estimate the similarity of pixels across images and then on imposing global shape constraints using dynamic programming [3], level sets [9], space carving =-=[15]-=-, graph-cuts [21, 6, 14], PDE [1, 25], or EM [24]. In this paper, we do not focus on the method used to impose the global constraints and use a standard one [6]. Instead, we concentrate on the similar... |

314 | SURF: Speeded up robust features
- Bay, Tuytelaars, et al.
- 2006
Citation Context ...ata as a reference. To be specific, SIFT and GLOH owe much of their strength to the use of gradient orientation histograms, which are relatively robust to distortions. The more recent SURF descriptor =-=[4]-=- approximates them by using integral images to compute the histogram bins. This method is computationally effective with respect to computing the descriptor’s value at every pixel but does away with ... |

275 | A stereo matching algorithm with an adaptive window: Theory and experiment
- Kanade, Okutomi
- 1994
Citation Context ...dle occlusion boundaries properly though and we address this issue by using different masks at each location and select the best one by using an EM framework. This is inspired by the earlier works of =-=[11, 13, 12]-=- where multiple or adaptive correlation windows are used. After discussing related work in Sec. 2, we introduce our new local descriptor and present an efficient way to compute it in Sec. 3. In Sec. 4... |

265 | Multi-camera Scene Reconstruction via Graph Cuts
- Kolmogorov, Zabih
- 2002
Citation Context ...debaseline matching because they are not robust to perspective distortions and tend to straddle areas of different depths or partial occlusions. Thus, most researchers favor simple pixel differencing =-=[21, 5, 14]-=- or correlation over very small windows [24]. They then rely on optimization techniques such as graph-cuts [14] or PDE based diffusion op∗ This work was supported in part by funds of the European Comm... |

238 | A maximum-flow formulation of the n-camera stereo correspondence problem
- Roy, Cox
- 1998
Citation Context ...debaseline matching because they are not robust to perspective distortions and tend to straddle areas of different depths or partial occlusions. Thus, most researchers favor simple pixel differencing =-=[21, 5, 14]-=- or correlation over very small windows [24]. They then rely on optimization techniques such as graph-cuts [14] or PDE based diffusion op∗ This work was supported in part by funds of the European Comm... |

189 | Human detection based on a probabilistic assembly of robust part detectors - Mikolajczyk, Schmid, et al. |

177 | A pixel dissimilarity measure that is insensitive to image sampling
- Birchfield, Tomasi
- 1998
Citation Context ...debaseline matching because they are not robust to perspective distortions and tend to straddle areas of different depths or partial occlusions. Thus, most researchers favor simple pixel differencing =-=[21, 5, 14]-=- or correlation over very small windows [24]. They then rely on optimization techniques such as graph-cuts [14] or PDE based diffusion op∗ This work was supported in part by funds of the European Comm... |

177 | Wide baseline stereo matching based on local affinely invariant regions
- Tuytelaars, Gool
- 2000
Citation Context ...hod does not require an initial reconstruction. Local image descriptors have already been used in dense matching, though in a more traditional way, to match only sparse pixels that are feature points =-=[27, 17]-=-. In [25, 29], these matched points are used as anchors for computing the full reconstruction. [29] propagates the disparities of the matched feature points to their neighbors, while [25] uses them to... |

146 | Integral histogram: A fast way to extract histograms in Cartesian spaces
- Porikli
Citation Context ...k: The Gaussian convolution simultaneously removes some noise, and gives some invariance to translation to the computed values. This is also better than integral image-like computations of histograms =-=[20]-=- in which all the gradient vectors contribute the same: We can very efficiently reduce the influence of gradient norms from distant locations. Our primary motivation here is to reduce the computationa... |
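
The Gaussian-convolved orientation maps described in this context can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the helper names (`gaussian_kernel`, `smooth`, `orientation_maps`, `convolved_orientation_maps`), the choice of eight orientations, the 3-sigma kernel truncation, and the use of the positive part of the directional derivative are all choices made for this example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def smooth(image, sigma):
    """Separable Gaussian convolution with edge-replication padding."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    # Filter along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")

def orientation_maps(image, n_orient=8):
    """One map per direction: positive part of the directional derivative (assumed form)."""
    gy, gx = np.gradient(image.astype(float))
    maps = []
    for i in range(n_orient):
        theta = 2.0 * np.pi * i / n_orient
        maps.append(np.maximum(gx * np.cos(theta) + gy * np.sin(theta), 0.0))
    return np.stack(maps)

def convolved_orientation_maps(image, sigma=2.0, n_orient=8):
    """Gaussian-weighted aggregation of gradient norms, per orientation and pixel."""
    return np.stack([smooth(m, sigma) for m in orientation_maps(image, n_orient)])

if __name__ == "__main__":
    ramp = np.tile(np.arange(16, dtype=float), (12, 1))  # intensity grows along x
    maps = convolved_orientation_maps(ramp)
    print(maps.shape)
```

Because the Gaussian weights fall off with distance, gradients far from a pixel contribute less to its histogram value, which is the contrast with equal-weight box sums drawn in the excerpt above.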

128 | Depth from Edge and Intensity Based Stereo
- Baker, Binford
- 1981
Citation Context ...less areas. Most state-of-the-art methods rely on first using local measures to estimate the similarity of pixels across images and then on imposing global shape constraints using dynamic programming =-=[3]-=-, level sets [9], space carving [15], graph-cuts [21, 6, 14], PDE [1, 25], or EM [24]. In this paper, we do not focus on the method used to impose the global constraints and use a standard one [6]. In... |

128 | Occlusions and Binocular Stereo
- Geiger, Ladendorf, et al.
- 1992
Citation Context ...dle occlusion boundaries properly though and we address this issue by using different masks at each location and select the best one by using an EM framework. This is inspired by the earlier works of =-=[11, 13, 12]-=- where multiple or adaptive correlation windows are used. After discussing related work in Sec. 2, we introduce our new local descriptor and present an efficient way to compute it in Sec. 3. In Sec. 4... |

106 | Complete dense stereovision using level set methods
- Faugeras, Keriven
- 1998
Citation Context ... state-of-the-art methods rely on first using local measures to estimate the similarity of pixels across images and then on imposing global shape constraints using dynamic programming [3], level sets =-=[9]-=-, space carving [15], graph-cuts [21, 6, 14], PDE [1, 25], or EM [24]. In this paper, we do not focus on the method used to impose the global constraints and use a standard one [6]. Instead, we concen... |

101 | Learning local image descriptors
- Winder, Brown
- 2007
Citation Context ...struction [25]. We, therefore, introduce a new descriptor that retains the robustness of SIFT and GLOH and can be computed quickly at every single image pixel. Its shape is closely related to that of =-=[28]-=-, which has been shown to be optimal for sparse matching but is not designed for efficiency. We use our descriptor for dense matching and view-based synthesis using stereo-pairs which have too large a... |

84 | Disparity-space images and large occlusion stereo
- Intille, Bobick
- 1994
Citation Context ...dle occlusion boundaries properly though and we address this issue by using different masks at each location and select the best one by using an EM framework. This is inspired by the earlier works of =-=[11, 13, 12]-=- where multiple or adaptive correlation windows are used. After discussing related work in Sec. 2, we introduce our new local descriptor and present an efficient way to compute it in Sec. 3. In Sec. 4... |

74 | On benchmarking camera calibration and multi-view stereo for high resolution imagery
- Strecha, Hansen, et al.
Citation Context ... $(x)$ ... $D_i^{[k]}$ ... $\sum_{q=1}^{25} M^{[q]}$ (7) where $M^{[k]}$ is the $k$th element of $M$, and $D_i^{[k]}(M)$ the $k$th histogram $\tilde{h}$ in $D_i(M)$. 5. Results To compare DAISY against that of other descriptors, we used the images =-=[26, 23]-=- of Fig. 5 and an associated depth map obtained using a laser scanner, which we treat as a ... Figure 6. Low resolution and slightly blurry images: Top: Two input 640×480 images taken by a webcam. B... |

72 | Computing differential properties of 3-D shapes from stereoscopic images without 3-D models
- Devernay, Faugeras
- 1994
Citation Context ...ewer number of images. It does so by considering large image patches while remaining stable under perspective distortions. Earlier approaches to this problem relied on warping the correlation windows =-=[8]-=-. However the warps were estimated from a first reconstruction obtained using classical windows, which is usually not practical in wide baseline situations. By contrast, our method does not require an... |

64 | Dense matching of multiple wide-baseline views. ICCV
- Strecha, Tuytelaars, et al.
- 2003
Citation Context ...g weighted sums used by the earlier descriptors by sums of convolutions, which can be computed very quickly. (a) (b) (c) (d) (e) (f) Figure 8. Results on low-resolution versions of the Rathaus images =-=[25]-=-. (a,b,c) Three input images of size 768 × 512 instead of the 3072 × 2048 versions that were used in [24]. (d) Depth map computed using all three images (e) A fourth image not used for reconstruction.... |

55 | Wide-baseline stereo from multiple views: a probabilistic account. CVPR - Strecha, Fransens, et al. - 2004 |

39 | Fast and reliable passive trinocular stereovision. ICCV
- Ayache, Lustman
- 1987
Citation Context ...o rely on very small correlation windows or revert to point-wise similarity measures, which lose the discriminative power larger windows could provide. This loss can be compensated by using multiple =-=[2, 25]-=- or high-resolution [25] images. The latter is particularly effective because areas that appear uniform at a small scale are often quite textured when imaged at a larger one. However, even then, lighti... |

5 | Tensor voting: Theory and applications
- Medioni, Tang
- 2000
Citation Context ...htly different: We use a Gaussian kernel whereas the weighting scheme of SIFT and GLOH corresponds to a triangular shaped kernel since the weights are linear. It is also related to tensor voting in =-=[18]-=- if we think of each location in our orientation maps as a voting component and our aggregation kernel as the voting weights. The final values in these descriptors and ours will therefore not be exact... |

5 | 3D modeling and rendering from multiple wide-baseline images by match propagation
- Yao, Cham
Citation Context ..., they are much more computationally demanding than simple correlation. Thus, for dense wide-baseline matching purposes, local region descriptors have so far only been used to match a few seed points =-=[29]-=- or to provide constraints on the reconstruction [25]. We, therefore, introduce a new descriptor that retains the robustness of SIFT and GLOH and can be computed quickly at every single image pixel. I... |

2 | Dense Disparity Map Estimation Respecting Image Discontinuities: A PDE and Scale-Space Based Approach
- Alvarez, Deriche, et al.
- 2002
Citation Context ...easures to estimate the similarity of pixels across images and then on imposing global shape constraints using dynamic programming [3], level sets [9], space carving [15], graph-cuts [21, 6, 14], PDE =-=[1, 25]-=-, or EM [24]. In this paper, we do not focus on the method used to impose the global constraints and use a standard one [6]. Instead, we concentrate on the similarity measure all these algorithms rely... |

2 | Multi-view evaluation - http://cvlab.epfl.ch/data
- Strecha
- 2008
Citation Context ... $(x)$ ... $D_i^{[k]}$ ... $\sum_{q=1}^{25} M^{[q]}$ (7) where $M^{[k]}$ is the $k$th element of $M$, and $D_i^{[k]}(M)$ the $k$th histogram $\tilde{h}$ in $D_i(M)$. 5. Results To compare DAISY against that of other descriptors, we used the images =-=[26, 23]-=- of Fig. 5 and an associated depth map obtained using a laser scanner, which we treat as a ... Figure 6. Low resolution and slightly blurry images: Top: Two input 640×480 images taken by a webcam. B... |