Results 1 - 10
of
11
Visual categorization with bags of keypoints
- In Workshop on Statistical Learning in Computer Vision, ECCV
, 2004
"... Abstract. We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of im ..."
Abstract
-
Cited by 357 (7 self)
- Add to MetaCart
Abstract. We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naïve Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for simultaneously classifying seven semantic visual categories. These results clearly demonstrate that the method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information. 1.
Categorizing Nine Visual Classes Using Local Appearance Descriptors
- In ICPR Workshop on Learning for Adaptable Visual Systems
, 2004
"... We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patche ..."
Abstract
-
Cited by 50 (0 self)
- Add to MetaCart
We present a novel method for generic visual categorization: the problem of identifying the object content of natural images while generalizing across variations inherent to the object class. This bag of keypoints method is based on vector quantization of affine invariant descriptors of image patches. We propose and compare two alternative implementations using different classifiers: Naïve Bayes and SVM. The main advantages of the method are that it is simple, computationally efficient and intrinsically invariant. We present results for classifying nine semantic visual categories and comment on results obtained by Fergus et al using a different method on the same data set. We obtain excellent results as well for multi class categorization as for object detection. A thorough evaluation clearly demonstrates that our method is robust to background clutter and produces good categorization accuracy even without exploiting geometric information. 1.
Names and faces in the news
- In Proc. CVPR
, 2004
"... We show quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition d ..."
Abstract
-
Cited by 43 (1 self)
- Add to MetaCart
We show quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured “in the wild ” in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Each face image is associated with a set of names, automatically extracted from the associated caption. Many, but not all such sets contain the correct name. We cluster face images in appropriate discriminant coordinates. We use a clustering procedure to break ambiguities in labelling and identify incorrectly labelled faces. A merging procedure then identifies variants of names that refer to the same individual. The resulting representation can be used to label faces in news images or to organize news pictures by individuals present. An alternative view of our procedure is as a process that cleans up noisy supervised data. We demonstrate how to use entropy measures to evaluate such procedures. 1.
Discriminative Training for Object Recognition using Image Patches
- In CVPR 05
"... We present a method for automatically learning discriminative image patches for the recognition of given object classes. The approach applies discriminative training of log-linear models to image patch histograms. We show that it works well on three tasks and performs significantly better than other ..."
Abstract
-
Cited by 40 (20 self)
- Add to MetaCart
We present a method for automatically learning discriminative image patches for the recognition of given object classes. The approach applies discriminative training of log-linear models to image patch histograms. We show that it works well on three tasks and performs significantly better than other methods using the same features. For example, the method decides that patches containing an eye are most important for distinguishing face from background images. The recognition performance is very competitive with error rates presented in other publications. In particular, a new best error rate for the Caltech motorbikes data of 1.5 % is achieved. 1.
Learning Visual Attributes
"... We present a probabilistic generative model of visual attributes, together with an efficient learning algorithm. Attributes are visual qualities of objects, such as ‘red’, ‘striped’, or ‘spotted’. The model sees attributes as patterns of image segments, repeatedly sharing some characteristic propert ..."
Abstract
-
Cited by 27 (0 self)
- Add to MetaCart
We present a probabilistic generative model of visual attributes, together with an efficient learning algorithm. Attributes are visual qualities of objects, such as ‘red’, ‘striped’, or ‘spotted’. The model sees attributes as patterns of image segments, repeatedly sharing some characteristic properties. These can be any combination of appearance, shape, or the layout of segments within the pattern. Moreover, attributes with general appearance are taken into account, such as the pattern of alternation of any two colors which is characteristic for stripes. To enable learning from unsegmented training images, the model is learnt discriminatively, by optimizing a likelihood ratio. As demonstrated in the experimental evaluation, our model can learn in a weakly supervised setting and encompasses a broad range of attributes. We show that attributes can be learnt starting from a text query to Google image search, and can then be used to recognize the attribute and determine its spatial extent in novel real-world images. 1
Proximity Distribution Kernels for Geometric Context in Category Recognition
"... haibin.ling @ siemens.com We propose using the proximity distribution of vectorquantized ..."
Abstract
-
Cited by 15 (3 self)
- Add to MetaCart
haibin.ling @ siemens.com We propose using the proximity distribution of vectorquantized
A “Shape Aware ” Model for semi-supervised Learning of Objects and its Context
"... We present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context. We argue that while object recognition requires modeling relative spatial locations of image features within ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
We present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context. We argue that while object recognition requires modeling relative spatial locations of image features within the object, a bag-of-word is sufficient for representing context. Learning such a model from weakly labeled data involves labeling of features into two classes: foreground(object) or “informative” background(context). We present a “shape-aware ” model which utilizes contour information for efficient and accurate labeling of features in the image. Our approach iterates between an MCMC-based labeling and contour based labeling of features to integrate co-occurrence of features and shape similarity. 1
Thème COG — Systèmes cognitifs
"... apport de recherche ISSN 0249-6399 ISRN INRIA/RR--6600--FR+ENGFrom images to shape models for object detection ∗ ..."
Abstract
- Add to MetaCart
apport de recherche ISSN 0249-6399 ISRN INRIA/RR--6600--FR+ENGFrom images to shape models for object detection ∗
unknown title
"... IJCV manuscript No. (will be inserted by the editor) From images to shape models for object detection ..."
Abstract
- Add to MetaCart
IJCV manuscript No. (will be inserted by the editor) From images to shape models for object detection
DOI: 10.1109/CVPR.2006.311 Towards Multi-View Object Class Detection
, 2010
"... We present a novel system for generic object class detection. In contrast to most existing systems which focus on a single viewpoint or aspect, our approach can detect object instances from arbitrary viewpoints. This is achieved by combining the Implicit Shape Model for object class detection propos ..."
Abstract
- Add to MetaCart
We present a novel system for generic object class detection. In contrast to most existing systems which focus on a single viewpoint or aspect, our approach can detect object instances from arbitrary viewpoints. This is achieved by combining the Implicit Shape Model for object class detection proposed by Leibe and Schiele with the multi-view specific object recognition system of Ferrari et al. After learning single-view codebooks, these are interconnected by so-called activation links, obtained through multi-view region tracks across different training views of individual object instances. During recognition, these integrated codebooks work together to determine the location and pose of the object. Experimental results demonstrate the viability of the approach and compare it to a bank of independent single-view detectors. 1.

