Selection of scale-invariant parts for object class recognition
- In ICCV, 2003. Cited by 92 (10 self)
This paper introduces a novel method for constructing and selecting scale-invariant object parts. Scale-invariant local descriptors are first grouped into basic parts. A classifier is then learned for each of these parts, and feature selection is used to determine the most discriminative ones. This approach allows robust part detection, and it is invariant under scale changes; that is, neither the training images nor the test images have to be normalized. The proposed method is evaluated in car detection tasks with significant variations in viewing conditions, and promising results are demonstrated. Different local regions, classifiers and feature selection methods are quantitatively compared. Our evaluation shows that local invariant descriptors are an appropriate representation for object classes such as cars, and it underlines the importance of feature selection.
... marked in black in the figure. The corresponding patterns are very close, but one of the patches lies on a car, while the other lies in the background. This shows that the corresponding part is not discriminative for cars (in this environment at least). To demonstrate the effect of the proposed feature selection method, Fig. 1(b) shows the initially detected features (white) and discriminative descriptors determined by feature selection (black). These are the ones which should be used in a final, robust detection system.
Learning the discriminative power-invariance trade-off
- In ICCV, 2007. Cited by 80 (3 self)
We investigate the problem of learning optimal descriptors for a given classification task. Many hand-crafted descriptors have been proposed in the literature for measuring visual similarity. Looking past initial differences, what really distinguishes one descriptor from another is the trade-off that it achieves between discriminative power and invariance. Since this trade-off must vary from task to task, no single descriptor can be optimal in all situations. Our focus, in this paper, is on learning the optimal trade-off for classification given a particular training set and prior constraints. The problem is posed in the kernel learning framework. We learn the optimal, domain-specific kernel as a combination of base kernels corresponding to base features which achieve different levels of trade-off (such as no invariance, rotation invariance, scale invariance, affine invariance, etc.). This leads to a convex optimisation problem with a unique global optimum, which can be solved efficiently. The method is shown to achieve state-of-the-art performance on the UIUC textures, Oxford flowers and Caltech 101 datasets.
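The kernel combination mentioned in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian RBF base kernel per descriptor and takes the combination weights as given; in the paper the weights are learned by convex optimisation, which is not reproduced here. The names `rbf_kernel` and `combined_gram` are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def combined_gram(features_per_descriptor, weights, gamma=1.0):
    """Gram matrix of a non-negative weighted sum of base kernels.

    features_per_descriptor[k][i] is the feature vector of sample i under
    base descriptor k (for example, one descriptor per invariance level).
    The weights are assumed to be supplied by the caller (not learned here).
    """
    if len(features_per_descriptor) != len(weights):
        raise ValueError("need exactly one weight per base descriptor")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep a valid kernel")
    n = len(features_per_descriptor[0])
    gram = [[0.0] * n for _ in range(n)]
    for feats, w in zip(features_per_descriptor, weights):
        for i in range(n):
            for j in range(n):
                gram[i][j] += w * rbf_kernel(feats[i], feats[j], gamma)
    return gram
```

A non-negative weighted sum of positive semi-definite kernels is itself positive semi-definite, so the resulting Gram matrix could be passed to any kernel classifier that accepts a precomputed kernel.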
Object class recognition using discriminative local features
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
3D Object modeling and recognition using local affine-invariant image descriptors and multi-view spatial constraints
- International Journal of Computer Vision, 2006. Cited by 58 (11 self)
This article introduces a novel representation for three-dimensional (3D) objects in terms of local affine-invariant descriptors of their images and the spatial relationships between the corresponding surface patches. Geometric constraints associated with different views of the same patches under affine projection are combined with a normalized representation of their appearance to guide matching and reconstruction, allowing the acquisition of true 3D affine and Euclidean models from multiple unregistered images, as well as their recognition in photographs taken from arbitrary viewpoints. The proposed approach does not require a separate segmentation stage, and it is applicable to highly cluttered scenes. Modeling and recognition results are presented.
Object-Specific Figure-Ground Segregation
- In Proc. IEEE Comput. Soc. Conf. Comput. Vision and Pattern Recogn., 2003. Cited by 33 (0 self)
We consider the problem of segmenting an image into foreground and background, with foreground containing solely objects of interest known a priori. We propose an integration model that incorporates both edge detection and object part detection results. It consists of two parallel processes: low-level pixel grouping and high-level patch grouping. We seek a solution that optimizes a joint grouping criterion in a reduced space enforced by grouping correspondence between pixels and patches. Using spectral graph partitioning, we show that a near global optimum can be found by solving a constrained eigenvalue problem. We report promising experimental results on a dataset of 15 objects under clutter and occlusion.
A visual vocabulary for flower classification
- In CVPR, 2006. Cited by 31 (1 self)
We investigate to what extent ‘bag of visual words’ models can be used to distinguish categories which have significant visual similarity. To this end we develop and optimize a nearest neighbour classifier architecture, which is evaluated on a very challenging database of flower images. The flower categories are chosen to be indistinguishable on colour alone (for example), and have considerable variation in shape, scale, and viewpoint. We demonstrate that by developing a visual vocabulary that explicitly represents the various aspects (colour, shape, and texture) that distinguish one flower from another, we can overcome the ambiguities that exist between flower categories. The novelty lies in the vocabulary used for each aspect, and how these vocabularies are combined into a final classifier. The various stages of the classifier (vocabulary selection and combination) are each optimized on a validation set. Results are presented on a dataset of 1360 images consisting of 17 flower species. It is shown that excellent performance can be achieved, far surpassing standard baseline algorithms using (for example) colour cues alone.
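The general pattern described in this abstract, one vocabulary per aspect with the aspects combined in a nearest neighbour classifier, might look roughly like the sketch below. The chi-squared distance, the weighted-sum combination and all function names are assumptions made for illustration; the abstract does not state which distance or combination rule the paper uses.

```python
def bow_histogram(descriptors, vocabulary):
    """Normalised histogram of nearest visual words for one image."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, vocabulary[k])),
        )
        counts[nearest] += 1
    total = sum(counts)
    return [n / total for n in counts] if total else [0.0] * len(counts)


def chi2_distance(p, q):
    """Chi-squared distance between two histograms (assumed distance)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def classify(query, training, weights):
    """Nearest neighbour over a weighted sum of per-aspect distances.

    query:    dict mapping aspect name -> histogram
    training: list of (label, dict mapping aspect name -> histogram)
    weights:  dict mapping aspect name -> non-negative weight (assumed given,
              e.g. tuned on a validation set)
    """
    def distance(example):
        return sum(w * chi2_distance(query[a], example[a]) for a, w in weights.items())

    label, _ = min(training, key=lambda item: distance(item[1]))
    return label
```

Keeping one histogram per aspect, instead of a single joint vocabulary, lets the weights express how much each cue (colour, shape, texture) should count for a given dataset.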
Local distance functions: A taxonomy, new algorithms, and an evaluation
- In Proc. ICCV, 2009. Cited by 8 (0 self)
We present a taxonomy for local distance functions where most existing algorithms can be regarded as approximations of the geodesic distance defined by a metric tensor. We categorize existing algorithms by how, where and when they estimate the metric tensor. We also extend the taxonomy along each axis. How: We introduce hybrid algorithms that use a combination of dimensionality reduction and metric learning to ameliorate over-fitting. Where: We present an exact polynomial-time algorithm to integrate the metric tensor along the lines between the test and training points under the assumption that the metric tensor is piecewise constant. When: We propose an interpolation algorithm where the metric tensor is sampled at a number of reference points during the offline phase, which are then interpolated during online classification. We also present a comprehensive evaluation of all the algorithms on tasks in face recognition, object recognition, and digit recognition.
A Deformable Local Image Descriptor
- Cited by 6 (0 self)
This paper presents a novel local image descriptor that is robust to general image deformations. A limitation of traditional image descriptors is that they use a single support region for each interest point. For general image deformations, the amount of deformation varies from location to location and is unpredictable, so it is difficult to choose the best scale of the support region. To overcome this difficulty, we propose to use multiple support regions of different sizes surrounding an interest point. A feature vector is computed for each support region, and the concatenation of these feature vectors forms the descriptor for this interest point. Furthermore, we propose a new similarity measure, the Local-to-Global Similarity (LGS) model, for point matching that takes advantage of the multi-size support regions. Each support region acts as a 'weak' classifier, and the weights of these classifiers are learned in an unsupervised manner. The proposed approach is evaluated on a number of images with real and synthetic deformations. The experimental results show that our method outperforms existing techniques under different deformations.
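The construction described above, one feature vector per support region concatenated into a single descriptor, can be sketched as follows. The per-region feature used here (a normalised intensity histogram over a square patch) is an assumption chosen for brevity; the abstract does not say which feature the paper computes, and the LGS matching model is not reproduced. Function names and default radii are hypothetical.

```python
def patch_histogram(image, row, col, radius, bins=4, max_value=256):
    """Normalised intensity histogram of the square patch centred on (row, col).

    image is a list of rows of integer intensities in [0, max_value).
    The patch is clipped at the image border.
    """
    height, width = len(image), len(image[0])
    counts = [0] * bins
    for r in range(max(0, row - radius), min(height, row + radius + 1)):
        for c in range(max(0, col - radius), min(width, col + radius + 1)):
            counts[image[r][c] * bins // max_value] += 1
    total = sum(counts)
    return [n / total for n in counts]


def multi_region_descriptor(image, row, col, radii=(1, 2, 4), bins=4):
    """Concatenate one feature vector per support-region size."""
    descriptor = []
    for radius in radii:
        descriptor.extend(patch_histogram(image, row, col, radius, bins))
    return descriptor
```

Because each support region keeps its own block in the concatenated vector, a matcher can later weight the blocks individually, which is the role the abstract assigns to the per-region 'weak' classifiers.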
Enhancing Binary Feature Vector Similarity Measures
- 2006. Cited by 5 (2 self)
Similarity and dissimilarity measures play an important role in pattern classification and clustering. For a century, researchers have searched for a good measure. Here, we review, categorize, and evaluate various binary vector similarity/dissimilarity measures. One of the most contentious disputes in the similarity measure selection problem is whether the measure includes or excludes negative matches. While inner-product-based similarity measures consider only positive matches, other conventional measures credit both positive and negative matches equally. Hence, we propose an enhanced similarity measure that gives variable credits and show that it is superior to conventional measures in iris biometric authentication and offline handwritten character recognition applications. Finally, the proposed similarity measure can be further boosted by applying weights, and we demonstrate that it outperforms the weighted Hamming distance.
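The distinction between positive and negative matches discussed above can be made concrete with a small sketch. The `variable_credit` function below is a hypothetical illustration of giving negative matches a tunable credit; it is not the formula proposed in the paper, which the abstract does not state.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 vectors.

    a: positive matches (1, 1); b: (1, 0); c: (0, 1); d: negative matches (0, 0).
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def inner_product(x, y):
    """Counts positive matches only."""
    return match_counts(x, y)[0]


def simple_matching(x, y):
    """Credits positive and negative matches equally."""
    a, b, c, d = match_counts(x, y)
    n = a + b + c + d
    return (a + d) / n if n else 0.0


def variable_credit(x, y, neg_credit):
    """Illustrative measure: negative matches earn `neg_credit` in [0, 1].

    neg_credit = 0 ignores negative matches (Jaccard coefficient);
    neg_credit = 1 reduces to simple matching.
    """
    a, b, c, d = match_counts(x, y)
    denom = a + b + c + neg_credit * d
    return (a + neg_credit * d) / denom if denom else 0.0
```

With sparse binary features, most positions are negative matches, so a measure that credits them fully tends to rate almost any pair as similar; a partial credit is one way to temper that effect.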
Discriminative Techniques for the Recognition of Complex-Shaped Objects
- 2003. Cited by 4 (1 self)
This thesis presents new techniques which enable the automatic recognition of everyday objects like chairs and ladders in images of highly cluttered scenes. Given an image, we extract information about the shape and texture properties present in small patches of the image and use that information to identify parts of the objects we are interested in. We then assemble those parts into overall hypotheses about what objects are present in the image, and where they are. Solving this problem in a general setting is one of the central problems in computer vision, as doing so would have an immediate impact on a far-reaching set of applications in medicine, surveillance, manufacturing, robotics, and other areas.

