Object recognition from local scale-invariant features (1999)

by David G Lowe

Results 1 - 10 of 2,739

Distinctive Image Features from Scale-Invariant Keypoints

by David G. Lowe, 2003
Abstract - Cited by 8955 (21 self)
This paper presents a method for extracting distinctive invariant features from images, which can be used to perform reliable matching between different images of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, addition of noise, change in 3D viewpoint, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.

Citation Context

...rs of matched features. The Harris corner detector is very sensitive to changes in image scale, so it does not provide a good basis for matching images of different sizes. Earlier work by the author (Lowe, 1999) extended the local feature approach to achieve scale invariance. This work also described a new local descriptor that provided more distinctive features while being less sensitive to local image d...

Shape Matching and Object Recognition Using Shape Contexts

by Serge Belongie, Jitendra Malik, Jan Puzicha - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Abstract - Cited by 1809 (21 self)
We present a novel approach to measuring similarity between shapes and exploit it for object recognition. In our framework, the measurement of similarity is preceded by (1) solving for correspondences between points on the two shapes, (2) using the correspondences to estimate an aligning transform. In order to solve the correspondence problem, we attach a descriptor, the shape context, to each point. The shape context at a reference point captures the distribution of the remaining points relative to it, thus offering a globally discriminative characterization. Corresponding points on two similar shapes will have similar shape contexts, enabling us to solve for correspondences as an optimal assignment problem. Given the point correspondences, we estimate the transformation that best aligns the two shapes; regularized thin plate splines provide a flexible class of transformation maps for this purpose. The dissimilarity between the two shapes is computed as a sum of matching errors between corresponding points, together with a term measuring the magnitude of the aligning transform. We treat recognition in a nearest-neighbor classification framework as the problem of finding the stored prototype shape that is maximally similar to that in the image. Results are presented for silhouettes, trademarks, handwritten digits and the COIL dataset.

Citation Context

...tly solving for correspondences. Amit et al. [1] train decision trees for recognition by learning discriminative spatial configurations of keypoints. Leung et al. [34], Schmid and Mohr [48], and Lowe [35] additionally use gray level information at the keypoints to provide greater discriminative power. It should be noted that not all objects have distinguished key points (think of a circle for instance...
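
The shape context described in the abstract above (a histogram of where the remaining points lie relative to a reference point) can be illustrated with a small sketch. This is not the authors' implementation: the function name `shape_context`, the bin counts (5 radial, 12 angular), the radial range, and the normalisation by mean pairwise distance are all assumptions chosen for illustration.

```python
import math

def shape_context(points, index, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram of the other points relative to points[index].

    All parameter defaults are illustrative assumptions, not values
    taken from the listing above.
    """
    n = len(points)
    # Mean distance over all point pairs, used to make radii scale-free.
    pair_dists = [
        math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        for i in range(n) for j in range(i + 1, n)
    ]
    mean_d = sum(pair_dists) / len(pair_dists)

    hist = [[0] * n_theta for _ in range(n_r)]
    px, py = points[index]
    log_span = math.log(r_max / r_min)
    for j, (qx, qy) in enumerate(points):
        if j == index:
            continue
        d = math.hypot(qx - px, qy - py) / mean_d
        if d == 0.0:
            continue  # coincident point: no direction is defined
        # Radial bin on a log scale, clamped to the first/last ring.
        r_bin = int(math.log(d / r_min) / log_span * n_r)
        r_bin = min(max(r_bin, 0), n_r - 1)
        # Angular bin over [0, 2*pi).
        theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[r_bin][t_bin] += 1
    return hist

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
hist = shape_context(square, 0)
```

Because distances are divided by the mean pairwise distance and only relative offsets are used, the histogram in this sketch is unchanged when the whole point set is translated or uniformly scaled. Matching two shapes would then compare such histograms point by point, which the sketch does not attempt.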

A Performance Evaluation of Local Descriptors

by Krystian Mikolajczyk, Cordelia Schmid, 2005
Abstract - Cited by 1783 (51 self)
In this paper we compare the performance of descriptors computed for local interest regions, as for example extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor, and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.

Video Google: A text retrieval approach to object matching in videos

by Josef Sivic, Andrew Zisserman - In ICCV, 2003
Abstract - Cited by 1636 (42 self)
We describe an approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video. The object is represented by a set of viewpoint invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion. The temporal continuity of the video within a shot is used to track the regions in order to reject unstable regions and reduce the effects of noise in the descriptors. The analogy with text retrieval is in the implementation where matches on descriptors are pre-computed (using vector quantization), and inverted file systems and document rankings are used. The result is that retrieval is immediate, returning a ranked list of key frames/shots in the manner of Google. The method is illustrated for matching on two full length feature films.

Citation Context

...al relationships (such as epipolar geometry). Examples include [5, 6, 8, 11, 13, 12, 14, 16, 17]. We explore whether this type of approach to recognition can be recast as text retrieval. In essence this requires a visual analogy of a word, and here we provide this by vector quantizing the descri...
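
The text-retrieval analogy in this entry (vector-quantize descriptors into "visual words", then look frames up through an inverted file) can be sketched roughly as follows. This is a toy illustration, not the paper's system: the names `quantize`, `build_inverted_index` and `rank_frames`, the hand-picked 2-D codebook, and the histogram-intersection score are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def quantize(descriptor, codebook):
    """Return the index of the nearest codebook centre (the 'visual word')."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)),
               key=lambda i: sq_dist(descriptor, codebook[i]))

def build_inverted_index(frames, codebook):
    """Map each visual word to the frames that contain it, with counts."""
    index = defaultdict(Counter)
    for frame_id, descriptors in frames.items():
        for d in descriptors:
            index[quantize(d, codebook)][frame_id] += 1
    return index

def rank_frames(query_descriptors, index, codebook):
    """Score frames by histogram intersection with the query's word counts.

    The scoring rule is a simplification assumed for this sketch.
    """
    query_words = Counter(quantize(d, codebook) for d in query_descriptors)
    scores = Counter()
    for word, q_count in query_words.items():
        for frame_id, f_count in index.get(word, {}).items():
            scores[frame_id] += min(q_count, f_count)
    # Highest score first; ties broken by frame id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy 2-D "descriptors"; real region descriptors are much higher-dimensional.
codebook = [(0, 0), (10, 0), (0, 10)]
frames = {
    "f1": [(1, 1), (9, 1)],
    "f2": [(0, 9), (1, 11)],
    "f3": [(10, 1), (11, 0), (0, 1)],
}
index = build_inverted_index(frames, codebook)
ranking = rank_frames([(9, 0), (10, 2)], index, codebook)
```

Only frames that share at least one visual word with the query are ever touched at query time, which is the property that makes an inverted file attractive for large collections.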

An affine invariant interest point detector

by Krystian Mikolajczyk, Cordelia Schmid - In Proceedings of the 7th European Conference on Computer Vision, 2002
Abstract - Cited by 1467 (55 self)
This paper presents a novel approach for detecting affine invariant interest points. Our method can deal with significant affine transformations including large scale changes. Such transformations introduce significant changes in the point location as well as in the scale and the shape of the neighbourhood of an interest point. Our approach allows to solve for these problems simultaneously. It is based on three key ideas: 1) The second moment matrix computed in a point can be used to normalize a region in an affine invariant way (skew and stretch). 2) The scale of the local structure is indicated by local extrema of normalized derivatives over scale. 3) An affine-adapted Harris detector determines the location of interest points. A multi-scale version of this detector is used for initialization. An iterative algorithm then modifies location, scale and neighbourhood of each point and converges to affine invariant points. For matching and recognition, the image is characterized by a set of affine invariant points; the affine transformation associated with each point allows the computation of an affine invariant descriptor which is also invariant to affine illumination changes. A quantitative comparison of our detector with existing ones shows a significant improvement in the presence of large affine deformations. Experimental results for wide baseline matching show an excellent performance in the presence of large perspective transformations including significant scale changes. Results for recognition are very good for a database with more than 5000 images.

Citation Context

...tures. The proposed improvements result in better repeatability and accuracy of interest points. Moreover, the scale invariant Harris-Laplace approach detects different regions than the DoG detector (Lowe, 1999). The latter one detects mainly blobs, whereas the Harris detector responds to corners and highly textured points, hence these detectors extract complementary features in images. If the scale change ...

Robust wide baseline stereo from maximally stable extremal regions

by J. Matas, O. Chum, M. Urban, T. Pajdla - In Proc. BMVC, 2002
Abstract - Cited by 1016 (35 self)
The wide-baseline stereo problem, i.e. the problem of establishing correspondences between a pair of images taken from different viewpoints is studied. A new set of image elements that are put into correspondence, the so called extremal regions, is introduced. Extremal regions possess highly desirable properties: the set is closed under 1. continuous (and thus projective) transformation of image coordinates and 2. monotonic transformation of image intensities. An efficient (near linear complexity) and practically fast detection algorithm (near frame rate) is presented for an affinely-invariant stable subset of extremal regions, the maximally stable extremal regions (MSER). A new robust similarity measure for establishing tentative correspondences is proposed. The robustness ensures that invariants from multiple measurement regions (regions obtained by invariant constructions from extremal regions), some that are significantly larger (and hence discriminative) than the MSERs, may be used to establish tentative correspondences. The high utility of MSERs, multiple measurement regions and the robust metric is demonstrated in wide-baseline experiments on image pairs from both indoor and outdoor scenes. Significant change of scale (3.5×), illumination conditions, out-of-plane rotation, occlusion, locally anisotropic scale change and 3D translation of the viewpoint are all present in the test problems. Good estimates of epipolar geometry (average distance from corresponding points to the epipolar line below 0.09 of the inter-pixel distance) are obtained.

Citation Context

...nsistent subset of correspondences is clearly out of question for computational reasons. Recently, a whole class of stereo matching and object recognition algorithms with common structure has emerged [9, 15, 1, 16, 2, 13, 7, 6]. These methods exploit local invariant descriptors to limit the number of tentative correspondences. Important design decisions at this stage include: 1. the choice of measurement regions, i.e. the p...

Visual categorization with bags of keypoints

by Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray - In Workshop on Statistical Learning in Computer Vision, ECCV, 2004
Abstract - Cited by 1005 (14 self)
We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naïve Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for simultaneously classifying seven semantic visual categories. These results clearly demonstrate that the method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information.

Citation Context

...onverges within a fixed number of iterations. The affine region is then mapped to a circular region, so normalizing it for affine transformations. Scale Invariant Feature Transform (SIFT) descriptors [18] are computed on that region. SIFT descriptors are multi-image representations of an image neighborhood. They are Gaussian derivatives computed at 8 orientation planes over a 4x4 grid of spatial locat...

A Bayesian hierarchical model for learning natural scene categories

by Li Fei-fei - In CVPR, 2005
Abstract - Cited by 948 (15 self)
We propose a novel approach to learn and recognize natural scene categories. Unlike previous work [9, 17], it does not require experts to annotate the training set. We represent the image of a scene by a collection of local regions, denoted as codewords obtained by unsupervised learning. Each region is represented as part of a “theme”. In previous work, such themes were learnt from hand-annotations of experts, while our method learns the theme distributions as well as the codewords distribution over the themes without supervision. We report satisfactory categorization performances on a large set of 13 categories of complex scenes.

Citation Context

...ach interest point are between 10 to 30 pixels. 4. Lowe’s DoG Detector. Roughly 100 ∼ 500 regions that are stable and rotationally invariant over different scales are extracted using the DoG detector [7]. Scales of each interest point vary between 20 to 120 pixels. We have used two different representations for describing a patch: normalized 11 × 11 pixel gray values or a 128−dim SIFT vector [7]. Tab...

SURF: Speeded Up Robust Features

by Herbert Bay, Tinne Tuytelaars, Luc Van Gool - ECCV
Abstract - Cited by 897 (12 self)
In this paper, we present a novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper presents experimental results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object recognition application. Both show SURF’s strong performance.

Citation Context

...sed a (scale-adapted) Harris measure or the determinant of the Hessian matrix to select the location, and the Laplacian to select the scale. Focusing on speed, Lowe [12] approximated the Laplacian of Gaussian (LoG) by a Difference of Gaussians (DoG) filter. Several other scale-invariant interest point detectors have been proposed. Examples are the salient region dete...
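
The LoG-by-DoG approximation mentioned in this citation context can be checked numerically. The sketch below assumes an isotropic 2-D Gaussian G(r, sigma) and uses the standard relation G(r, k*sigma) - G(r, sigma) ≈ (k - 1) * sigma^2 * LoG(r, sigma), which follows from dG/dsigma = sigma * LoG. The function names and the values sigma = 1.0 and k = 1.01 are assumptions chosen for illustration.

```python
import math

def gaussian(r, sigma):
    """Isotropic 2-D Gaussian evaluated at radius r."""
    return math.exp(-r * r / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)

def laplacian_of_gaussian(r, sigma):
    """Analytic Laplacian of the 2-D Gaussian at radius r."""
    return (r * r - 2 * sigma ** 2) / sigma ** 4 * gaussian(r, sigma)

def difference_of_gaussians(r, sigma, k):
    """Gaussian at scale k*sigma minus Gaussian at scale sigma."""
    return gaussian(r, k * sigma) - gaussian(r, sigma)

sigma, k = 1.0, 1.01  # illustrative values; k close to 1 tightens the match
for r in (0.0, 0.5, 3.0):
    dog = difference_of_gaussians(r, sigma, k)
    approx = (k - 1) * sigma ** 2 * laplacian_of_gaussian(r, sigma)
    print(f"r={r}: DoG={dog:.6e}  (k-1)*sigma^2*LoG={approx:.6e}")
```

The two columns agree to within a few percent at these radii. In practice the appeal of the DoG is that it needs only a subtraction of two already-smoothed images rather than a separate second-derivative filter.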

Space-time Interest Points

by Ivan Laptev, Tony Lindeberg - In ICCV, 2003
Abstract - Cited by 819 (21 self)
Local image features or interest points provide compact and abstract representations of patterns in an image. In this paper, we propose to extend the notion of spatial interest points into the spatio-temporal domain and show how the resulting features often reflect interesting events that can be used for a compact representation of video data as well as for its interpretation. To detect ...