A PERFORMANCE EVALUATION OF LOCAL DESCRIPTORS
, 2005
Cited by 775 (24 self)
In this paper we compare the performance of descriptors computed for local interest regions, as for example extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor, and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
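The abstract names recall with respect to precision as the evaluation criterion but does not spell out the formulas. A minimal sketch, assuming the usual definitions for descriptor matching (recall is the share of ground-truth correspondences that were matched correctly; 1 - precision is the share of returned matches that are false). The function and parameter names are hypothetical and are not taken from the paper.

```python
def recall_and_one_minus_precision(num_correct, num_false, num_correspondences):
    """Return (recall, 1 - precision) for one matching threshold.

    num_correct: matches that agree with the ground truth
    num_false: matches that do not agree with the ground truth
    num_correspondences: ground-truth correspondences between the two images
    These definitions are an assumption; the abstract only names the criterion.
    """
    total_matches = num_correct + num_false
    recall = num_correct / num_correspondences if num_correspondences else 0.0
    one_minus_precision = num_false / total_matches if total_matches else 0.0
    return recall, one_minus_precision


if __name__ == "__main__":
    # 30 correct and 10 false matches out of 60 possible correspondences.
    print(recall_and_one_minus_precision(30, 10, 60))
```

Sweeping the matching threshold and plotting the resulting pairs would give one curve per descriptor, which is how a ranking such as the one reported above could be read off.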
Using spin images for efficient object recognition in cluttered 3D scenes
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1999
Cited by 220 (9 self)
We present a 3-D shape-based object recognition system for simultaneous recognition of multiple objects in scenes containing clutter and occlusion. Recognition is based on matching surfaces by matching points using the spin-image representation. The spin-image is a data level shape descriptor that is used to match surfaces represented as surface meshes. We present a compression scheme for spin-images that results in efficient multiple object recognition, which we verify with results showing the simultaneous recognition of multiple objects from a library of 20 models. Furthermore, we demonstrate the robust performance of recognition in the presence of clutter and occlusion through analysis of recognition trials on 100 scenes. This research was performed at Carnegie Mellon University and was supported by the US Department ... Surface matching is a technique from 3-D computer vision that has many applications in the area of robotics and automation. Through surface matching, an object can be recognized in a scene by comparing a sensed surface to an object surface stored in memory. When the object surface is matched to the scene surface, an association is made between something known (the object) and ...
Surface Matching for Object Recognition in Complex 3-D Scenes
- Image and Vision Computing
, 1998
Cited by 33 (1 self)
We present an approach to recognition of complex objects in cluttered 3-D scenes that does not require feature extraction or segmentation. Our object representation comprises descriptive images associated with oriented points on the surface of an object. Using a single point basis constructed from an oriented point, the position of other points on the surface of the object can be described by two parameters. The accumulation of these parameters for many points on the surface of the object results in an image at each oriented point. These images, localized descriptions of the global shape of the object, are invariant to rigid transformations. Through correlation of images, point correspondences between a model and scene data are established. Geometric consistency is used to group the correspondences from which plausible rigid transformations that align the model with the scene are calculated. The transformations are then refined and verified using a modified iterative closest point algo...
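The construction described in this abstract can be sketched as follows. The two parameters are taken here to be the standard spin-image coordinates: alpha, the radial distance from the line through the oriented point along its normal, and beta, the signed height above the tangent plane. The abstract does not name them, so this reading is an assumption. The bin size, image width, and function names are likewise hypothetical choices made for illustration.

```python
import math


def spin_coordinates(p, n, x):
    """Return (alpha, beta) of point x relative to oriented point (p, n).

    n is assumed to be a unit normal. beta is the signed distance of x from
    the tangent plane at p; alpha is the distance of x from the normal line.
    """
    d = [xi - pi for xi, pi in zip(x, p)]
    beta = sum(di * ni for di, ni in zip(d, n))
    squared = sum(di * di for di in d) - beta * beta
    alpha = math.sqrt(max(squared, 0.0))  # clamp small negative rounding error
    return alpha, beta


def spin_image(p, n, points, bin_size=0.5, width=4):
    """Accumulate the (alpha, beta) pairs of all points into a 2-D histogram.

    Rows index beta (top row = largest beta), columns index alpha.
    bin_size and width are illustrative values, not taken from the paper.
    """
    image = [[0] * width for _ in range(width)]
    half_height = width * bin_size / 2.0
    for x in points:
        alpha, beta = spin_coordinates(p, n, x)
        col = int(alpha // bin_size)
        row = int((half_height - beta) // bin_size)
        if 0 <= row < width and 0 <= col < width:
            image[row][col] += 1
    return image


if __name__ == "__main__":
    surface = [(1.2, 0.0, 0.3), (0.0, 0.7, -0.6), (0.2, 0.1, 0.8)]
    for row in spin_image((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), surface):
        print(row)
```

Because alpha and beta depend only on distances measured relative to the oriented point, applying the same rigid transformation to the points, the basis point, and the normal leaves the histogram unchanged, which is the invariance the abstract refers to.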
Efficient Multiple Model Recognition in Cluttered 3-D Scenes
- Proc. IEEE Conference on Computer Vision and Pattern
, 1998
Cited by 12 (2 self)
We present a 3-D shape-based object recognition system for simultaneous recognition of multiple objects in scenes containing clutter and occlusion. Recognition is based on matching surfaces by matching points using the spin-image representation. The spin-image is a data level shape descriptor that is used to match surfaces represented as surface meshes. We present a compression scheme for spin-images that results in efficient multiple object recognition, which we verify with results showing the simultaneous recognition of multiple objects from a library of 20 models. Furthermore, we demonstrate the robust performance of recognition in the presence of clutter and occlusion through analysis of recognition trials on 100 scenes.
, 1999
Cited by 9 (1 self)
In this paper, we report on recent extensions to a surface matching algorithm based on local 3-D signatures. This algorithm was previously shown to be effective in view registration of general surfaces and in object recognition from 3-D model data bases. We describe extensions to the basic matching algorithm which will enable it to address several challenging, and often overlooked, problems encountered with real data. First, we describe extensions that allow us to deal with data sets with large variations in resolution and with large data sets for which computational efficiency is a major issue. The applicability of the enhanced matching algorithm is illustrated by an example application: the construction of large terrain maps and the construction
Unconstrained registration of large 3D point sets for complex model building
- IEEE/RSJ International Conference On Intelligent Robotic
, 1998
Cited by 7 (2 self)
We present a method for building models of complex environments from range data gathered at multiple viewpoints. Our approach is unique in that no prior knowledge of the relative positions of the viewpoints is needed in order to register data from them. Furthermore, we present a technique for specification and utilization of so-called “common-sense” constraints on the transformations between views to improve the accuracy and speed of the registration process. Results are shown from our effort to map a 60 m by 20 m multiple-room storage area containing a cluttered array of objects. The problem of building models from multiple views is critical in various applications, including remote operation, ...
MIKOLAJCZYK AND SCHMID: A PERFORMANCE EVALUATION OF LOCAL DESCRIPTORS
, 2005
Cited by 1 (0 self)
In this paper we compare the performance of descriptors computed for local interest regions, as for example extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor, and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT based descriptors perform best. Moments and steerable filters show the best performance among the low dimensional descriptors.
Geometric Alignment Of Two Overlapping Range Images
In this paper, we propose a novel geometric method for the alignment of two overlapping range images. The method first employs the traditional ICP criterion to establish a set of possible correspondences and then refines these correspondences using geometric constraints derived from properties of reflected correspondence vectors. In this way, the method overcomes a major limitation of the traditional ICP criterion, which is the introduction of false matches in almost every iteration of the alignment. For an accurate estimation of the geometric parameters of interest, the Monte Carlo method is used in conjunction with a median filter. Finally, the quaternion method is used to estimate the motion parameters based on the refined correspondences. Experimental results based on both synthetic data and real images show that the proposed method can effectively align two overlapping range images with a small motion.
JOINT APPEARANCE AND LOCALITY IMAGE REPRESENTATION BY GAUSSIANIZATION
, 2010
A novel image representation is proposed in this thesis to capture both the appearance and locality information for image classification applications. First, we model the feature vectors, from various granularity levels including the corpus level, the image level and the image patch level, in a hierarchical Bayesian framework using mixtures of Gaussians. After such a hierarchical Gaussianization, each image is represented as a Gaussian mixture model (GMM) for its appearance, and several Gaussian maps for its spatial layout. Then we extract the appearance information from the GMM parameters, and the locality information from the global and the local statistics over Gaussian maps. Finally, we employ a supervised dimension reduction technique called DAP (discriminant adaptive projection) to remove noise directions and to further enhance the discriminating power of our representation. To validate the argument that the new representation is a general representation for images and video frames, we evaluate the representation on several important applications. Firstly, we apply the new representation to classification and regression tasks taking whole images as inputs. These ...

