Results 1 - 10
of
31
Hamming embedding and weak geometric consistency for large scale image search
- In ECCV
, 2008
"... Abstract. This paper improves recent methods for large scale image search. State-of-the-art methods build on the bag-of-features image representation. We, first, analyze bag-of-features in the framework of approximate nearest neighbor search. This shows the suboptimality of such a representation for ..."
Abstract
-
Cited by 89 (12 self)
- Add to MetaCart
Abstract. This paper improves recent methods for large scale image search. State-of-the-art methods build on the bag-of-features image representation. We, first, analyze bag-of-features in the framework of approximate nearest neighbor search. This shows the suboptimality of such a representation for matching descriptors and leads us to derive a more precise representation based on 1) Hamming embedding (HE) and 2) weak geometric consistency constraints (WGC). HE provides binary signatures that refine the matching based on visual words. WGC filters matching descriptors that are not consistent in terms of angle and scale. HE and WGC are integrated within the inverted file and are efficiently exploited for all images, even in the case of very large datasets. Experiments performed on a dataset of one million of images show a significant improvement due to the binary signature and the weak geometric consistency constraints as well as their efficiency. Estimation of the full geometric transformation, i.e., a re-ranking step on a short list of images, is complementary to our weak geometric consistency constraints and allows to further improve the accuracy. 1
Lost in quantization: Improving particular object retrieval in large scale image databases
- In CVPR
, 2008
"... The state of the art in visual object retrieval from large databases is achieved by systems that are inspired by text retrieval. A key component of these approaches is that local regions of images are characterized using high-dimensional descriptors which are then mapped to “visual words ” selected ..."
Abstract
-
Cited by 70 (2 self)
- Add to MetaCart
The state of the art in visual object retrieval from large databases is achieved by systems that are inspired by text retrieval. A key component of these approaches is that local regions of images are characterized using high-dimensional descriptors which are then mapped to “visual words ” selected from a discrete vocabulary. This paper explores techniques to map each visual region to a weighted set of words, allowing the inclusion of features which were lost in the quantization stage of previous systems. The set of visual words is obtained by selecting words based on proximity in descriptor space. We describe how this representation may be incorporated into a standard tf-idf architecture, and how spatial verification is modified in the case of this soft-assignment. We evaluate our method on the standard Oxford Buildings dataset, and introduce a new dataset for evaluation. Our results exceed the current state of the art retrieval performance on these datasets, particularly on queries with poor initial recall where techniques like query expansion suffer. Overall we show that soft-assignment is always beneficial for retrieval with large vocabularies, at a cost of increased storage requirements for the index. 1.
Near Duplicate Image Detection: min-Hash and tf-idf Weighting
"... This paper proposes two novel image similarity measures for fast indexing via locality sensitive hashing. The similarity measures are applied and evaluated in the context of near duplicate image detection. The proposed method uses a visual vocabulary of vector quantized local feature descriptors (SI ..."
Abstract
-
Cited by 32 (1 self)
- Add to MetaCart
This paper proposes two novel image similarity measures for fast indexing via locality sensitive hashing. The similarity measures are applied and evaluated in the context of near duplicate image detection. The proposed method uses a visual vocabulary of vector quantized local feature descriptors (SIFT) and for retrieval exploits enhanced min-Hash techniques. Standard min-Hash uses an approximate set intersection between document descriptors was used as a similarity measure. We propose an efficient way of exploiting more sophisticated similarity measures that have proven to be essential in image / particular object retrieval. The proposed similarity measures do not require extra computational effort compared to the original measure. We focus primarily on scalability to very large image and video databases, where fast query processing is necessary. The method requires only a small amount of data need be stored for each image. We demonstrate our method on the TrecVid 2006 data set which contains approximately 146K key frames, and also on challenging the University of Kentucky image retrieval database. 1
Improving bag-of-features for large scale image search
- International Journal of Computer Vision
"... This article improves recent methods for large scale image search. We first analyze the bag-of-features approach in the framework of approximate nearest neighbor search. This leads us to derive a more precise representation based on 1) Hamming embedding (HE) and 2) weak geometric consistency constra ..."
Abstract
-
Cited by 25 (8 self)
- Add to MetaCart
This article improves recent methods for large scale image search. We first analyze the bag-of-features approach in the framework of approximate nearest neighbor search. This leads us to derive a more precise representation based on 1) Hamming embedding (HE) and 2) weak geometric consistency constraints (WGC). HE provides binary signatures that refine the matching based on visual words. WGC filters matching descriptors that are not consistent in terms of angle and scale. HE and WGC are integrated within an inverted file and are efficiently exploited for all images in the dataset. We then introduce a graph-structured quantizer which significantly speeds up the assignment of the descriptors to visual words. A comparison with the state of the art shows the interest of our approach when high accuracy is needed. Experiments performed on three reference datasets and a dataset of one million of images show a significant improvement due to the binary signature and the weak geometric consistency constraints, as well as their efficiency. Estimation of the full geometric transformation, i.e., a reranking step on a short-list of images, is shown to be complementary to our weak geometric consistency constraints. Our approach is shown to outperform the state-of-the-art on the three datasets. 1
Product quantization for nearest neighbor search
, 2010
"... This paper introduces a product quantization based approach for approximate nearest neighbor search. The idea is to decomposes the space into a Cartesian product of low dimensional subspaces and to quantize each subspace separately. A vector is represented by a short code composed of its subspace q ..."
Abstract
-
Cited by 20 (7 self)
- Add to MetaCart
This paper introduces a product quantization based approach for approximate nearest neighbor search. The idea is to decomposes the space into a Cartesian product of low dimensional subspaces and to quantize each subspace separately. A vector is represented by a short code composed of its subspace quantization indices. The Euclidean distance between two vectors can be efficiently estimated from their codes. An asymmetric version increases precision, as it computes the approximate distance between a vector and a code. Experimental results show that our approach searches for nearest neighbors efficiently, in particular in combination with an inverted file system. Results for SIFT and GIST image descriptors show excellent search accuracy outperforming three state-of-the-art approaches. The scalability of our approach is validated on a dataset of two billion vectors.
Bundling features for large scale partial-duplicate web image search
, 2009
"... In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, ..."
Abstract
-
Cited by 14 (1 self)
- Add to MetaCart
In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has been shown to greatly improve retrieval precision but at high computational cost. In this paper we present a novel scheme where image features are bundled into local groups. Each group of bundled features becomes much more discriminative than a single feature, and within each group simple and robust geometric constraints can be efficiently enforced. Experiments in web image search, with a database of more than one million images, show that our scheme achieves a 49 % improvement in average precision over the baseline bag-of-words approach. Retrieval performance is comparable to existing full geometric verification approaches while being much less computationally expensive. When combined with full geometric verification we achieve a 77 % precision improvement over the baseline bag-of-words approach, and a 24 % improvement over full geometric verification alone. 1.
INRIA-LEARs video copy detection system
- in: Proceedings of the TREC Video Retrieval Evaluation (TRECVID
, 2008
"... 1 Copyright detection task 2 High-level feature extraction task ..."
Abstract
-
Cited by 12 (1 self)
- Add to MetaCart
1 Copyright detection task 2 High-level feature extraction task
Accurate image search using the contextual dissimilarity measure
- IEEE Trans. on Pattern Analysis and Machine Intell
, 2010
"... This paper introduces the contextual dissimilarity measure which significantly improves the accuracy of bag-of-features based image search. Our measure takes into account the local distribution of the vectors and iteratively estimates distance update terms in the spirit of Sinkhorn’s scaling algorit ..."
Abstract
-
Cited by 10 (2 self)
- Add to MetaCart
This paper introduces the contextual dissimilarity measure which significantly improves the accuracy of bag-of-features based image search. Our measure takes into account the local distribution of the vectors and iteratively estimates distance update terms in the spirit of Sinkhorn’s scaling algorithm, thereby modifying the neighborhood structure. Experimental results show that our approach gives significantly better results than a standard distance and outperforms the state-of-the-art in terms of accuracy on the Nistér-Stewénius and Lola datasets. This paper also evaluates the impact of a large number of parameters, including the number of descriptors, the clustering method, the visual vocabulary size and the distance measure. The optimal parameter choice is shown to be quite context-dependent. In particular using a large number of descriptors is interesting only when using our dissimilarity measure. We have also evaluated two novel variants, multiple assignment and rank aggregation. They are shown to further improve accuracy, at the cost of higher memory usage and lower efficiency. 1
On the burstiness of visual elements
- in CVPR
"... Figure 1. Illustration of burstiness. Features assigned to the most “bursty ” visual word of each image are displayed. Burstiness, a phenomenon initially observed in text retrieval, is the property that a given visual element appears more times in an image than a statistically independent model woul ..."
Abstract
-
Cited by 9 (4 self)
- Add to MetaCart
Figure 1. Illustration of burstiness. Features assigned to the most “bursty ” visual word of each image are displayed. Burstiness, a phenomenon initially observed in text retrieval, is the property that a given visual element appears more times in an image than a statistically independent model would predict. In the context of image search, burstiness corrupts the visual similarity measure, i.e., the scores used to rank the images. In this paper, we propose a strategy to handle visual bursts for bag-of-features based image search systems. Experimental results on three reference datasets show that our method significantly and consistently outperforms the state of the art. 1.

