A Performance Evaluation of Local Descriptors
, 2005
Cited by 775 (24 self)
In this paper we compare the performance of descriptors computed for local interest regions, such as those extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses recall with respect to precision as its criterion and is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor, and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low-dimensional descriptors.
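The recall-versus-precision criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `recall_precision_curve`, its inputs (one descriptor distance and one ground-truth flag per candidate match, plus the total number of ground-truth correspondences), and the rule "accept a match if its distance is at most the threshold" are all assumptions made for illustration.

```python
def recall_precision_curve(distances, is_correct, num_correspondences):
    """Sweep a distance threshold over candidate matches (hypothetical helper).

    distances[i]        -- descriptor distance of candidate match i
    is_correct[i]       -- True if match i agrees with the ground truth
    num_correspondences -- number of ground-truth correspondences
    Returns a list of (threshold, recall, precision) triples.
    """
    pairs = sorted(zip(distances, is_correct))
    curve = []
    correct = 0
    for accepted, (dist, ok) in enumerate(pairs, start=1):
        if ok:
            correct += 1
        # For tied distances, emit a point only after the last tied match.
        if accepted < len(pairs) and pairs[accepted][0] == dist:
            continue
        recall = correct / num_correspondences  # correct matches found / all correspondences
        precision = correct / accepted          # correct matches / all accepted matches
        curve.append((dist, recall, precision))
    return curve


curve = recall_precision_curve(
    distances=[0.1, 0.4, 0.2, 0.9],
    is_correct=[True, False, True, True],
    num_correspondences=4,
)
for threshold, recall, precision in curve:
    print(f"threshold={threshold:.1f} recall={recall:.2f} precision={precision:.2f}")
```

Loosening the threshold accepts more matches, so recall can only rise while precision typically falls; comparing descriptors by these curves avoids committing to a single threshold.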
Wide Baseline Stereo Matching
- In Proc. ICCV
, 1998
Cited by 113 (15 self)
The objective of this work is to enlarge the class of camera motions for which epipolar geometry and image correspondences can be computed automatically. This facilitates matching between quite disparate views: wide baseline stereo. Two extensions are made to the current small baseline algorithms: first, and most importantly, a viewpoint invariant measure is developed for assessing the affinity of corner neighbourhoods over image pairs; second, algorithms are given for generating putative corner matches between image pairs using local homographies. Two novel infrastructure developments are also described: the automatic generation of local homographies, and the combination of possibly conflicting sets of matches prior to RANSAC estimation. The wide baseline matching algorithm is demonstrated on a number of image pairs with varying relative motion, and for different scene types. All processing is automatic.
Weak hypotheses and boosting for generic object detection and recognition
- In Proc. ECCV
, 2004
Cited by 107 (7 self)
In this paper we describe the first stage of a new learning system for object detection and recognition. For our system we propose Boosting [5] as the underlying learning technique. This allows the use of very diverse sets of visual features in the learning process within a common framework: Boosting, together with a weak hypotheses finder, may choose very inhomogeneous features as most relevant for combination into a final hypothesis. As another advantage the weak hypotheses finder may search the weak hypotheses space without explicit calculation of all available hypotheses, reducing computation time. This contrasts with the related work of Agarwal and Roth [1] where Winnow was used as the learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones consisting of a set of grayvalues and intensity moments and two high level descriptors: moment invariants [8] and SIFTs [12]. The descriptors are calculated from local patches detected by an interest point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition where simple rectangular hypotheses [22] or complex classifiers [20] have been used. In relatively simple images, where the objects are prominent, our approach yields results comparable to the state-of-the-art [3]. But we also obtain very good results on more complex images, where the objects are located in arbitrary positions, poses, and scales in the images. These results indicate that our flexible approach, which also allows the inclusion of features from segmented regions and even spatial relationships, leads us a significant step towards generic object recognition.
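The search for a discriminative similarity threshold described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' weak hypotheses finder: it assumes that each training image has already been reduced to one number (the smallest distance between a candidate descriptor and any descriptor in that image), that labels are +1 or -1, and that `weights` are the current boosting weights. The function name `best_threshold` is hypothetical.

```python
def best_threshold(distances, labels, weights):
    """Find theta minimizing the weighted error of 'predict +1 if distance <= theta'.

    distances[i] -- distance from the candidate descriptor to training image i (assumed given)
    labels[i]    -- +1 if image i contains the object, -1 otherwise
    weights[i]   -- boosting weight of image i
    Returns (theta, weighted_error).
    """
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    # With theta below every distance, all images are predicted -1,
    # so the error is the total weight of the positive images.
    error = sum(w for w, y in zip(weights, labels) if y == 1)
    best_error, best_theta = error, float("-inf")
    for rank, i in enumerate(order):
        # Raising theta past image i flips its prediction to +1.
        error += -weights[i] if labels[i] == 1 else weights[i]
        # Only thresholds that separate distinct distance values are valid.
        last_of_tie = rank + 1 == len(order) or distances[order[rank + 1]] != distances[i]
        if last_of_tie and error < best_error:
            best_error, best_theta = error, distances[i]
    return best_theta, best_error


theta, err = best_threshold(
    distances=[0.2, 0.3, 0.8, 0.9],
    labels=[1, 1, -1, -1],
    weights=[0.25, 0.25, 0.25, 0.25],
)
print(theta, err)
```

Sorting once and updating the error incrementally makes the scan over all candidate thresholds cost O(n log n) per descriptor instead of re-evaluating every threshold from scratch.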
Generic Object Recognition with Boosting
- IEEE Trans. PAMI
, 2006
Cited by 76 (4 self)
This paper presents a powerful framework for generic object recognition. Boosting is used as an underlying learning technique. For the first time a combination of various weak classifiers of different types of descriptors is used, which slightly increases the classification result but dramatically improves the stability of a classifier. Besides applying well-known techniques to extract salient regions, we also present a new segmentation method, “Similarity-Measure-Segmentation”. This approach delivers segments that can consist of several disconnected parts, which turns out to be a powerful description of local similarity. With regard to the task of object categorization, Similarity-Measure-Segmentation performs as well as or better than current state-of-the-art segmentation techniques. In contrast to previous solutions we aim at handling complex objects appearing in highly cluttered images. Therefore we have set up a database containing images with the required complexity. On these images we obtain very good classification results of up to 87% ROC-equal error rate. On common databases for object recognition, our approach outperforms all comparable solutions.
Recognizing Color Patterns Irrespective of Viewpoint and Illumination
, 1999
Cited by 40 (2 self)
New invariant features are presented that can be used for the recognition of planar color patterns such as labels, logos, signs, pictograms, etc., irrespective of the viewpoint or the illumination conditions, and without the need for error-prone contour extraction. The new features are based on moments of powers of the intensities in the individual color bands and combinations thereof. These moments implicitly characterize the shape, the intensity and the color distribution of the pattern in a uniform manner. The paper gives a classification of all functions of such moments which are invariant under both affine deformations of the pattern (thus achieving viewpoint invariance) as well as linear changes of the intensity values of the color bands (hence, coping with changes in the irradiance pattern due to different lighting conditions and/or viewpoints). The discriminant power and classification performance of the new invariants for color pattern recognition are tested on a data set of im...
Local features for object class recognition
- In Proceedings of the 10th IEEE International Conference on Computer Vision
, 2005
Cited by 31 (4 self)
In this paper we compare the performance of local detectors and descriptors in the context of object class recognition. Recently, many detectors / descriptors have been evaluated in the context of matching as well as invariance to viewpoint changes [20]. However, it is unclear if these results can be generalized to categorization problems, which require different properties of features. We evaluate 5 state-of-the-art scale invariant region detectors and 5 descriptors. Local features are computed for 20 object classes and clustered using hierarchical agglomerative clustering. We measure the quality of appearance clusters and location distributions using entropy as well as precision. We also measure how the clusters generalize from training set to novel test data. Our results indicate that extended SIFT descriptors [22] computed on Hessian-Laplace [20] regions perform best. The second-best score is obtained by Salient regions [11]. The results also show that these two detectors provide complementary features. The new detectors/descriptors significantly improve the performance of a state-of-the-art recognition approach [16] in a pedestrian detection task.
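One common way to score a cluster by entropy, as this abstract mentions, is the Shannon entropy of the object-class labels of the features it contains. The sketch below shows that generic measure only; the paper's exact definition is not given in the abstract, and the function name `cluster_entropy` and the example labels are assumptions.

```python
from collections import Counter
from math import log2


def cluster_entropy(class_labels):
    """Shannon entropy (in bits) of the class labels inside one cluster.

    0.0 means every feature in the cluster comes from the same class;
    larger values mean the cluster mixes features from several classes.
    """
    counts = Counter(class_labels)
    total = len(class_labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


print(cluster_entropy(["car", "car", "car", "car"]))    # pure cluster
print(cluster_entropy(["car", "car", "face", "face"]))  # evenly mixed cluster
```

Under this measure a low-entropy cluster is class-specific and therefore more useful for categorization than a cluster that draws features evenly from many classes.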
Deformation invariant image matching
- In ICCV
, 2005
Cited by 30 (1 self)
We propose a novel framework to build descriptors of local intensity that are invariant to general deformations. In this framework, an image is embedded as a 2D surface in 3D space, with intensity weighted relative to distance in x-y. We show that as this weight increases, geodesic distances on the embedded surface are less affected by image deformations. In the limit, distances are deformation invariant. We use geodesic sampling to get neighborhood samples for interest points, then use a geodesic-intensity histogram (GIH) as a deformation invariant local descriptor. In addition to its invariance, the new descriptor automatically finds its support region. This means it can safely gather information from a large neighborhood to improve discriminability. Furthermore, we propose a matching method for this descriptor that is invariant to affine lighting changes. We have tested this new descriptor on interest point matching for two data sets, one with synthetic deformation and lighting change, another with real non-affine deformations. Our method shows promising matching results compared to several other approaches.
Geometric Grouping of Repeated Elements within Images
- In Proc. BMVC
, 1998
Cited by 25 (3 self)
The objective of this work is the automatic detection and grouping of imaged elements which repeat in a scene. We show that structures that repeat in the world (for example wall paper patterns) are related by particular parametrized transformations in perspective images. These image transformations provide powerful grouping constraints, and can be used at the heart of hypothesize and verify grouping algorithms. Parametrized
Registration of Challenging Image Pairs: Initialization, Estimation, and Decision
, 2007
Cited by 15 (4 self)
Our goal is an automated 2D-image-pair registration algorithm capable of aligning images taken of a wide variety of natural and man-made scenes as well as many medical images. The algorithm should handle low overlap, substantial orientation and scale differences, large illumination variations, and physical changes in the scene. An important component of this is the ability to automatically reject pairs that have no overlap or have too many differences to be aligned well. We propose a complete algorithm including techniques for initialization, for estimating transformation parameters, and for automatically deciding if an estimate is correct. Keypoints extracted and matched between images are used to generate initial similarity transform estimates, each accurate over a small region. These initial estimates are rank-ordered and tested individually in succession. Each estimate is refined using the Dual-Bootstrap ICP algorithm, driven by matching of multiscale features. A three-part decision criterion, combining measurements of alignment accuracy, stability in the estimate, and consistency in the constraints, determines whether the refined transformation estimate is accepted as correct. Experimental results on a data set of 22 challenging image pairs show that the algorithm effectively aligns 19 of the 22 pairs and rejects 99.8 percent of the misalignments that occur when all possible pairs are tried. The algorithm substantially outperforms algorithms based on keypoint matching alone.
Performance characterisation in computer vision: The role of statistics in testing and design
- Imaging and Vision Systems: Theory, Assessment and Applications. NOVA Science Books
, 1993
Cited by 12 (3 self)
We consider the relationship between the performance characteristics of vision algorithms and algorithm design. In the first part we discuss the issues involved in testing. A description of good practice is given covering test objectives, test data, test metrics and the test protocol. In the second part we discuss aspects of good algorithmic design including understanding of the statistical properties of data and common algorithmic operations, and suggest how some common problems may be overcome.

