Image retrieval and classification using local distance functions. NIPS (2006)

by A Frome, Y Singer, J Malik
Results 1 - 10 of 107

Learning globally-consistent local distance functions for shape-based image retrieval and classification

by Andrea Frome, Fei Sha, Yoram Singer, Jitendra Malik - In ICCV, 2007
Abstract - Cited by 149 (3 self)
We address the problem of visual category recognition by learning an image-to-image distance function that attempts to satisfy the following property: the distance between images from the same category should be less than the distance between images from different categories. We use patch-based feature vectors common in object recognition work as a basis for our image-to-image distance functions. Our large-margin formulation for learning the distance functions is similar to formulations used in the machine learning literature on distance metric learning; however, we differ in that we learn local distance functions (a different parameterized function for every image of our training set), whereas typically a single global distance function is learned. This was a novel approach first introduced in Frome, Singer, & Malik, NIPS 2006. In that work we learned the local distance functions independently, and the outputs of these functions could not be compared at test time without the use of additional heuristics or training. Here we introduce a different approach that has the advantage that it learns distance functions that are globally consistent, in that they can be directly compared for purposes of retrieval and classification. The output of the learning algorithm is a set of weights assigned to the image features, which is intuitively appealing in the computer vision setting: some features are more salient than others, and which are more salient depends on the category, or image, being considered. We train and test using the Caltech 101 object recognition benchmark. Using fifteen training images per category, we achieved a mean recognition rate of 63.2% and ...
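The per-exemplar weighting idea described in this abstract can be sketched with a small projected-subgradient learner under a triplet hinge loss. This is a minimal illustration, not the paper's formulation: the function name, input shapes, and hyperparameters are all our assumptions.

```python
import numpy as np

def learn_exemplar_weights(d_same, d_diff, lr=0.01, margin=1.0, epochs=100):
    """Learn nonnegative per-feature weights w for ONE exemplar image so that
    the weighted distance w.d to same-category images is smaller, by a margin,
    than to different-category images (triplet hinge loss, projected SGD).
    d_same, d_diff: elementary feature-distance vectors from the exemplar to
    same/different-category images, shape (n_images, n_features)."""
    n_feat = d_same.shape[1]
    w = np.ones(n_feat)
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        i = rng.integers(len(d_same))
        j = rng.integers(len(d_diff))
        # hinge constraint: want w.d_diff[j] - w.d_same[i] >= margin
        if w @ d_diff[j] - w @ d_same[i] < margin:
            w -= lr * (d_same[i] - d_diff[j])   # subgradient step
        w = np.maximum(w, 0.0)                  # project back to w >= 0
    return w
```

Because a separate `w` is learned per exemplar, the resulting distances are local; making them comparable across exemplars is exactly the global-consistency issue this paper addresses.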

Citation Context

...formalize the problem within a large-margin learning framework. At the general algorithmic level, we follow the formulation in [18], and in the context of image recognition, that in our earlier work, [6]. Both this work and [6] differ from [18] in that we both learn a parameterization (1) for every exemplar (image) and neither our input distances nor our final distances are metrics. This is a departu...

Recognition using Regions

by Chunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik
Abstract - Cited by 106 (5 self)
This paper presents a unified framework for object detection, segmentation, and classification using regions. Region features are appealing in this context because: (1) they encode shape and scale information of objects naturally; (2) they are only mildly affected by background clutter. Regions have not been popular as features due to their sensitivity to segmentation errors. In this paper, we start by producing a robust bag of overlaid regions for each image using Arbeláez et al., CVPR 2009. Each region is represented by a rich set of image cues (shape, color and texture). We then learn region weights using a max-margin framework. In detection and segmentation, we apply a generalized Hough voting scheme to generate hypotheses of object locations, scales and support, followed by a verification classifier and a constrained segmenter on each hypothesis. The proposed approach significantly outperforms the state of the art on the ETHZ shape database (87.1% average detection rate compared to Ferrari et al.’s 67.2%), and achieves competitive performance on the Caltech 101 database.
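The generalized Hough voting step this abstract mentions can be illustrated with a toy accumulator: each matched region casts a weighted vote for an object-center hypothesis given by its position plus a stored offset, and peaks in the accumulator become location hypotheses. The offsets and weights here are hypothetical placeholders for what learned region matches would provide.

```python
import numpy as np

def hough_votes(matches, grid_shape):
    """Toy generalized Hough voting. Each match is ((x, y), (dx, dy), w):
    a region at (x, y) votes with weight w for an object center at
    (x + dx, y + dy); votes falling outside the grid are discarded."""
    acc = np.zeros(grid_shape)
    for (x, y), (dx, dy), w in matches:
        cx, cy = x + dx, y + dy
        if 0 <= cx < grid_shape[0] and 0 <= cy < grid_shape[1]:
            acc[cx, cy] += w
    return acc

def top_hypothesis(acc):
    """Strongest object-center hypothesis = accumulator peak."""
    return np.unravel_index(np.argmax(acc), acc.shape)
```

In the paper's pipeline, each such hypothesis would then be passed to a verification classifier and a constrained segmenter; the sketch stops at the voting stage.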

Citation Context

... equally significant for discriminating an object from another. For example, wheel regions are more important than uniform patches to distinguish a bicycle from a mug. Here, we adapt the framework of [13] for learning region weights. Given an exemplar I containing one object instance and a query J, denote f_i^I, i = 1, 2, ..., M and f_j^J, j = 1, 2, ..., N their bags of region features. The d...

Learning visual similarity measures for comparing never seen objects

by Eric Nowak - Proc. IEEE CVPR, 2007
Abstract - Cited by 82 (0 self)
In this paper we propose and evaluate an algorithm that learns a similarity measure for comparing never seen objects. The measure is learned from pairs of training images labeled “same” or “different”. This is far less informative than the commonly used individual image labels (e.g. “car model X”), but it is cheaper to obtain. The proposed algorithm learns the characteristic differences between local descriptors sampled from pairs of “same” and “different” images. These differences are vector quantized by an ensemble of extremely randomized binary trees, and the similarity measure is computed from the quantized differences. The extremely randomized trees are fast to learn, robust due to the redundant information they carry, and they have been proved to be very good clusterers. Furthermore, the trees efficiently combine different feature types (SIFT and geometry). We evaluate our innovative similarity measure on four very different datasets and consistently outperform the state-of-the-art competitive approaches.

Randomized clustering forests for image classification

by Frank Moosmann, Frederic Jurie - Pattern Analysis and Machine Intelligence
Abstract - Cited by 82 (3 self)
Abstract—This paper introduces three new contributions to the problems of image classification and image search. First, we propose a new image patch quantization algorithm. Other competitive approaches require a large code book and the sampling of many local regions for accurate image description, at the expense of a prohibitive processing time. We introduce Extremely Randomized Clustering Forests—ensembles of randomly created clustering trees—that are more accurate, much faster to train and test, and more robust to background clutter compared to state-of-the-art methods. Second, we propose an efficient image classification method that combines ERC-Forests and saliency maps very closely with image information sampling. For a given image, a classifier builds a saliency map online, which it uses for classification. We demonstrate speed and accuracy improvement in several state-of-the-art image classification tasks. Finally, we show that our ERC-Forests are used very successfully for learning distances between images of never-seen objects. Our algorithm learns the characteristic differences between local descriptors sampled from pairs of the “same” or “different” objects, quantizes these differences with ERC-Forests, and computes the similarity from this quantization. We show significant improvement over state-of-the-art competitive approaches. Index Terms—Randomized trees, image classification, object recognition, similarity measure.
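A single extremely randomized clustering tree of the kind this abstract describes can be sketched in a few lines: each internal node picks a random feature and a random threshold inside the node's data range, so training needs no optimization, and each leaf index serves as a codeword. This is an illustrative sketch of the idea, not the paper's implementation; an ERC-Forest would concatenate the leaf codes of many such trees.

```python
import numpy as np

def build_tree(X, depth, rng, counter):
    """Grow one extremely randomized clustering tree over descriptor rows X.
    Splits are fully random (feature and threshold); leaves receive
    consecutive integer codes via the mutable one-element list `counter`."""
    if depth == 0 or len(X) <= 1:
        idx = counter[0]; counter[0] += 1
        return ("leaf", idx)
    f = int(rng.integers(X.shape[1]))       # random split feature
    lo, hi = X[:, f].min(), X[:, f].max()
    if lo == hi:                            # constant feature at this node: stop
        idx = counter[0]; counter[0] += 1
        return ("leaf", idx)
    t = rng.uniform(lo, hi)                 # random split threshold in data range
    left, right = X[X[:, f] <= t], X[X[:, f] > t]
    return ("node", f, t,
            build_tree(left, depth - 1, rng, counter),
            build_tree(right, depth - 1, rng, counter))

def quantize(tree, x):
    """Descend to a leaf; the leaf's integer code is x's codeword."""
    while tree[0] == "node":
        _, f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree[1]
```

For the similarity-learning use described in the abstract, `X` would hold differences between local descriptors sampled from "same"/"different" image pairs.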

Citation Context

...sed on low-level visual features of the images. Recent machine learning techniques have demonstrated their capability of identifying image categories from image features, like [11], [15], [13], [14], [19], [51], [57], [61], to mention a few. Even if it has been shown that, to some extent, categories can be discovered without any supervision, the best results are obtained when knowledge about categorie...

Nonparametric Scene Parsing via Label Transfer

by Ce Liu, Jenny Yuen, Antonio Torralba , 2011
Abstract - Cited by 66 (3 self)
While there has been a lot of recent work on object recognition and image understanding, the focus has been on carefully establishing mathematical models for images, scenes, and objects. In this paper, we propose a novel, nonparametric approach for object recognition and scene parsing using a new technology we name label transfer. For an input image, our system first retrieves its nearest neighbors from a large database containing fully annotated images. Then, the system establishes dense correspondences between the input image and each of the nearest neighbors using the dense SIFT flow algorithm [28], which aligns two images based on local image structures. Finally, based on the dense scene correspondences obtained from SIFT flow, our system warps the existing annotations and integrates multiple cues in a Markov random field framework to segment and recognize the query image. Promising experimental results have been achieved by our nonparametric scene parsing system on challenging databases. Compared to existing object recognition approaches that require training classifiers or appearance models for each object category, our system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
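Stripped of the dense SIFT-flow alignment and the MRF integration, the retrieve-then-transfer skeleton of the abstract above looks like this. It is a toy sketch: plain Euclidean distance on a global feature vector stands in for the paper's retrieval stage, and per-pixel majority voting stands in for the warping and MRF steps.

```python
import numpy as np

def label_transfer(query_feat, db_feats, db_labelmaps, k=3):
    """Toy nonparametric label transfer: retrieve the k nearest annotated
    database images by a global feature, then label each pixel of the query
    by majority vote over the retrieved annotations.
    db_feats: (n, d) global features; db_labelmaps: list of (H, W) int maps."""
    d = np.linalg.norm(db_feats - query_feat, axis=1)
    nn = np.argsort(d)[:k]                              # k nearest neighbors
    votes = np.stack([db_labelmaps[i] for i in nn])     # (k, H, W)
    # per-pixel majority vote over the retrieved annotations
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```

The paper's point is that no per-category classifier is trained: all category knowledge lives in the annotated database and the retrieval/alignment procedure.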

Citation Context

...er used to prune out object detectors of classes that are unlikely to take place in the image. Nonparametric methods have also been widely used in web data to retrieve similar images. For example, in [17], a customized distance function is used at a retrieval stage to compute the distance between a query image and images in the training set, which subsequently cast votes to infer the object class of t...

Fast similarity search for learned metrics.

by B Kulis, P Jain, K Grauman - IEEE Trans. Pattern Anal. Mach. Intell., 2009
Abstract - Cited by 60 (6 self)
Abstract not found

Citation Context

...ecent advances in metric learning make it possible to learn distance or kernel functions that are more effective for a given problem, provided some partially labeled data or constraints are available [29, 28, 13, 3, 7, 11]. By taking advantage of the prior information, these techniques offer improved accuracy when indexing or classifying examples. Analyzing large volumes of data is of great interest in computational bi...

Online Metric Learning and Fast Similarity Search

by Prateek Jain, Brian Kulis, Inderjit S. Dhillon, Kristen Grauman
Abstract - Cited by 58 (4 self)
Metric learning algorithms can provide useful distance functions for a variety of domains, and recent work has shown good accuracy for problems where the learner can access all distance constraints at once. However, in many real applications, constraints are only available incrementally, thus necessitating methods that can perform online updates to the learned metric. Existing online algorithms offer bounds on worst-case performance, but typically do not perform well in practice as compared to their offline counterparts. We present a new online metric learning algorithm that updates a learned Mahalanobis metric based on LogDet regularization and gradient descent. We prove theoretical worst-case performance bounds, and empirically compare the proposed method against existing online metric learning algorithms. To further boost the practicality of our approach, we develop an online locality-sensitive hashing scheme which leads to efficient updates to data structures used for fast approximate similarity search. We demonstrate our algorithm on multiple datasets and show that it outperforms relevant baselines.
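A rank-one online update of a Mahalanobis matrix can be sketched as follows. This is a simplified update in the spirit of the LogDet-regularized approach, not the paper's exact algorithm; the learning rate, target encoding, and safeguard are our choices.

```python
import numpy as np

def online_update(A, x1, x2, target, eta=0.1):
    """One online update of a Mahalanobis matrix A given a pair (x1, x2)
    whose squared distance should move toward `target`. The rank-one form
        A <- A - beta * (A z)(A z)^T / (1 + beta * z^T A z)
    equals inv(inv(A) + beta * z z^T) by Sherman-Morrison, so A stays
    symmetric positive definite whenever 1 + beta * z^T A z > 0."""
    z = x1 - x2
    Az = A @ z
    d = float(z @ Az)               # current squared distance under A
    if d == 0.0:
        return A
    beta = eta * (d - target) / d   # shrink if too far apart, grow if too close
    beta = max(beta, -0.9 / d)      # keep 1 + beta*d > 0 (PSD safeguard)
    return A - beta * np.outer(Az, Az) / (1.0 + beta * d)
```

One iteration maps the pair's squared distance d to d / (1 + beta*d), which moves it monotonically toward the target, and the Sherman-Morrison form is what makes cheap rank-one maintenance (and hence online hashing on top of it) attractive.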

Active Learning for Large Multi-class Problems

by Prateek Jain, Ashish Kapoor
Abstract - Cited by 53 (1 self)
Scarcity and infeasibility of human supervision for large scale multi-class classification problems necessitate active learning. Unfortunately, existing active learning methods for multi-class problems are inherently binary methods and do not scale up to a large number of classes. In this paper, we introduce a probabilistic variant of the K-Nearest Neighbor method for classification that can be seamlessly used for active learning in multi-class scenarios. Given some labeled training data, our method learns an accurate metric/kernel function over the input space that can be used for classification and similarity search. Unlike existing metric/kernel learning methods, our scheme is highly scalable for classification problems and provides a natural notion of uncertainty over class labels. Further, we use this measure of uncertainty to actively sample training examples that maximize discriminating capabilities of the model. Experiments on benchmark datasets show that the proposed method learns appropriate distance metrics that lead to state-of-the-art performance for object categorization problems. Furthermore, our active learning method effectively samples training examples, resulting in significant accuracy gains over random sampling for multi-class problems involving a large number of classes.
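The uncertainty-driven sampling loop can be sketched with a toy probabilistic nearest-neighbor model: distances to labeled examples are turned into class probabilities with an exponential kernel (a stand-in for the paper's model), and the pool point with the highest predictive entropy is queried next. All names and the kernel choice are our assumptions.

```python
import numpy as np

def knn_class_probs(x, X_lab, y_lab, classes, gamma=1.0):
    """Toy probabilistic kNN: weight each labeled example by an exponential
    kernel on squared distance, then sum weights per class and normalize."""
    w = np.exp(-gamma * np.linalg.norm(X_lab - x, axis=1) ** 2)
    p = np.array([w[y_lab == c].sum() for c in classes])
    return p / p.sum()

def most_uncertain(X_pool, X_lab, y_lab, classes):
    """Active-learning query step: return the index of the pool point whose
    predictive class distribution has maximum entropy."""
    ent = []
    for x in X_pool:
        p = knn_class_probs(x, X_lab, y_lab, classes)
        ent.append(-(p * np.log(p + 1e-12)).sum())
    return int(np.argmax(ent))
```

Because the uncertainty comes directly from the classifier's class distribution, the same machinery handles any number of classes, which is the scalability point the abstract emphasizes.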

Citation Context

...d distance or similarity measure to compare two data points. Recent methods in metric learning learn a distance metric parameterized by a Mahalanobis matrix that is consistent with the training data [1, 5, 20, 8]. Although these methods provide a principled way for handling multi-class problems, for large scale problems these methods are prohibitively expensive. Furthermore, there have been only a few attempt...

Sparse Distance Learning for Object Recognition Combining RGB and Depth Information

by Kevin Lai, Liefeng Bo, Xiaofeng Ren, Dieter Fox
Abstract - Cited by 47 (1 self)
... instance recognition in the context of RGB-D (depth) cameras. Motivated by local distance learning, where a novel view of an object is compared to individual views of previously seen objects, we define a view-to-object distance where a novel view is compared simultaneously to all views of a previous object. This novel distance is based on a weighted combination of feature differences between views. We show, through jointly learning per-view weights, that this measure leads to superior classification
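One simplified reading of such a view-to-object distance: match each query feature against its best counterpart across all stored views of the object, then combine the residuals with learned nonnegative feature weights. The shapes and names below are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def view_to_object_distance(query_feats, object_views, feature_weights):
    """Toy view-to-object distance. The query view is compared to ALL views
    of one object at once: per feature, take the smallest absolute difference
    across views, then weight and sum.
    query_feats: (F,), object_views: (V, F), feature_weights: (F,) >= 0."""
    diffs = np.abs(object_views - query_feats)   # (V, F) elementary differences
    best = diffs.min(axis=0)                     # best match over views, per feature
    return float(feature_weights @ best)
```

Compared to per-view local distances, this lets different features of the query be explained by different stored views of the same object, which is the "simultaneously to all views" idea in the abstract.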

Citation Context

...is distance learning (e.g. [27], [26]), in particular local distance learning [23]. Local distance learning has been extensively studied and demonstrated for object recognition, both for color images [9], [10], [18] and 3D shapes [15]. A key property of these approaches is that they can model complex decision boundaries by combining elementary distances. Local distance learning, however, is not witho...

Accurate image search using the contextual dissimilarity measure

by Hervé Jégou, Cordelia Schmid, Hedi Harzallah, Jakob Verbeek - IEEE Trans. on Pattern Analysis and Machine Intell., 2010
Abstract - Cited by 42 (5 self)
This paper introduces the contextual dissimilarity measure which significantly improves the accuracy of bag-of-features based image search. Our measure takes into account the local distribution of the vectors and iteratively estimates distance update terms in the spirit of Sinkhorn’s scaling algorithm, thereby modifying the neighborhood structure. Experimental results show that our approach gives significantly better results than a standard distance and outperforms the state-of-the-art in terms of accuracy on the Nistér-Stewénius and Lola datasets. This paper also evaluates the impact of a large number of parameters, including the number of descriptors, the clustering method, the visual vocabulary size and the distance measure. The optimal parameter choice is shown to be quite context-dependent. In particular, using a large number of descriptors is interesting only when using our dissimilarity measure. We have also evaluated two novel variants, multiple assignment and rank aggregation. They are shown to further improve accuracy, at the cost of higher memory usage and lower efficiency.
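The iterative distance-update idea can be sketched as a Sinkhorn-like normalization: each image's mean distance to its k nearest neighbors is pushed toward the global mean by symmetric rescaling. This is a simplified reading of the contextual dissimilarity measure, not the paper's exact scheme; parameter names and the symmetric sqrt form are ours, and the distances are assumed distinct.

```python
import numpy as np

def cdm(D, k=3, iters=10, alpha=0.5):
    """Sinkhorn-like regularization of a symmetric distance matrix D.
    Each iteration computes, per image, the mean distance to its k nearest
    neighbors, derives an update term r_i = (global mean / own mean)^alpha,
    and rescales d(i, j) by sqrt(r_i * r_j), equalizing neighborhoods."""
    D = D.astype(float).copy()
    n = D.shape[0]
    for _ in range(iters):
        big = D.max() + 1.0                      # mask the zero diagonal
        nn_mean = np.sort(D + np.eye(n) * big, axis=1)[:, :k].mean(axis=1)
        r = (nn_mean.mean() / nn_mean) ** alpha  # > 1 for tight neighborhoods
        D = D * np.sqrt(np.outer(r, r))          # symmetric update
        np.fill_diagonal(D, 0.0)
    return D
```

The effect is that images sitting in unusually dense neighborhoods have their distances inflated, and sparse ones deflated, which is the neighborhood-symmetrization behavior the abstract describes.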

Citation Context

... the neighborhood can be seen as a local variance. Our CDM is learned in an unsupervised manner, in contrast with a large number of works which learn the distance measure from a set of training images [6, 7, 8, 9]. In contrast to category classification where class members are clearly defined and represented by a sufficiently large set, this does in general not hold for an image search system. Our approach is ...

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University