Index-driven similarity search in metric spaces
 ACM Transactions on Database Systems
, 2003
Cited by 183 (7 self)
Similarity search is a very important operation in multimedia databases and other database applications involving complex objects, and involves finding objects in a data set S similar to a query object q, based on some similarity measure. In this article, we focus on methods for similarity search that make the general assumption that similarity is represented with a distance metric d. Existing methods for handling similarity search in this setting typically fall into one of two classes. The first directly indexes the objects based on distances (distance-based indexing), while the second is based on mapping to a vector space (mapping-based approach). The main part of this article is dedicated to a survey of distance-based indexing methods, but we also briefly outline how search occurs in mapping-based methods. We also present a general framework for performing search based on distances, and present algorithms for common types of queries that operate on an arbitrary “search hierarchy.” These algorithms can be applied on each of the methods presented, provided a suitable search hierarchy is defined.
Searching in Metric Spaces by Spatial Approximation
, 1999
Cited by 76 (21 self)
We propose a new data structure to search in metric spaces. A metric space is formed by a collection of objects and a distance function defined among them, which satisfies the triangle inequality. The goal is, given a set of objects and a query, to retrieve those objects close enough to the query. The complexity measure is the number of distances computed to achieve this goal. Our data structure, called sa-tree ("spatial approximation tree"), is based on approaching the searched objects spatially, that is, getting closer and closer to them, rather than the classical divide-and-conquer approach of other data structures. We analyze our method and show that the number of distance evaluations to search among n objects is sublinear. We show experimentally that the sa-tree is the best existing technique when the metric space is hard to search or the query has low selectivity. These are the most important unsolved cases in real applications. As a practical advantage, our data structure is one of the few that do not need to tune parameters, which makes it appealing for use by non-experts.
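The following is a minimal Python sketch of the spatial approximation idea as it is commonly described for the sa-tree: each node keeps a set of "neighbors" that are closer to it than to any other neighbor, and a range query only descends into neighbors that are not much farther from the query than the closest one. The names SaNode, build_sa_tree and sa_range_search are illustrative assumptions, not the authors' code, and the pruning rule shown is the simplified variant that uses only the current node and its neighbors.

```python
from dataclasses import dataclass, field


@dataclass
class SaNode:
    obj: object                      # object stored at this node
    radius: float = 0.0              # covering radius of the subtree
    children: list = field(default_factory=list)


def build_sa_tree(root_obj, objs, dist):
    """Build a tree rooted at root_obj over objs (hypothetical helper)."""
    node = SaNode(root_obj)
    if not objs:
        return node
    by_dist = sorted(objs, key=lambda o: dist(root_obj, o))
    node.radius = dist(root_obj, by_dist[-1])
    neighbors, rest = [], []
    for v in by_dist:
        # v becomes a neighbor if it is closer to the root than to
        # every neighbor chosen so far.
        if all(dist(v, root_obj) < dist(v, b) for b in neighbors):
            neighbors.append(v)
        else:
            rest.append(v)
    # Every remaining object goes to the subtree of its closest neighbor.
    bags = [[] for _ in neighbors]
    for v in rest:
        i = min(range(len(neighbors)), key=lambda k: dist(v, neighbors[k]))
        bags[i].append(v)
    node.children = [build_sa_tree(b, bag, dist)
                     for b, bag in zip(neighbors, bags)]
    return node


def sa_range_search(node, q, r, dist, d_node=None, out=None):
    """Return all stored objects within distance r of q."""
    if out is None:
        out = []
    if d_node is None:
        d_node = dist(q, node.obj)
    if d_node <= r:
        out.append(node.obj)
    # Covering-radius test: nothing below this node can be within r of q.
    if d_node > node.radius + r:
        return out
    d_children = [dist(q, c.obj) for c in node.children]
    d_min = min([d_node] + d_children)
    for child, d in zip(node.children, d_children):
        # Spatial approximation test, justified by the triangle inequality.
        if d <= d_min + 2 * r:
            sa_range_search(child, q, r, dist, d, out)
    return out
```

The sketch has no tuning parameters, which matches the practical advantage claimed in the abstract; it is static, so adding an object means rebuilding the tree.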
Dynamic Spatial Approximation Trees for Massive Data
Cited by 5 (2 self)
Metric space searching is an emerging technique to address the problem of efficient similarity searching in many applications, including multimedia databases and other repositories handling complex objects. Although promising, the metric space approach is still immature in several aspects that are well established in traditional databases. In particular, most indexing schemes are not dynamic, that is, few of them tolerate insertion of elements at reasonable cost over an existing index, and only a few work efficiently in secondary memory. In this paper we introduce a secondary-memory variant of the Dynamic Spatial Approximation Tree, which has been shown to be competitive in main memory. The resulting index handles the secondary-memory scenario well and is competitive with the state of the art, becoming a useful alternative in a wide range of database applications. Moreover, our ideas are applicable to other secondary-memory trees where there is little control over the tree shape.
An index data structure for searching in metric space databases
 In Proc. of International Conference on Computational Science 2006 (ICC 2006
, 2006
Cited by 3 (2 self)
This paper presents the Evolutionary Geometric Near-neighbor Access Tree (EGNAT), which is a new data structure devised for searching in metric space databases. The EGNAT is fully dynamic, i.e., it allows combinations of insert and delete operations, and has been optimized for secondary memory. Empirical results on different databases show that this tree achieves good performance for high-dimensional metric spaces. We also show that this data structure allows efficient parallelization on distributed memory parallel architectures. All this indicates that the EGNAT is suitable for conducting similarity searches on very large metric space databases.
Recursive Lists of Clusters: A Dynamic Data Structure for Range Queries in Metric Spaces
Cited by 2 (0 self)
We introduce a novel data structure for solving the range query problem in generic metric spaces. It can be seen as a dynamic version of the List of Clusters data structure of Chávez and Navarro. Experimental results show that, with respect to range queries, it outperforms the original data structure when the database dimension is below 12. Moreover, the building process is much more efficient, for any size and any dimension of the database.
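For context, here is a hedged Python sketch of the original, static List of Clusters that the abstract refers to, not of the recursive dynamic variant the paper introduces. It assumes fixed-size buckets and takes the first remaining object as each center; both choices, along with the names build_list_of_clusters and loc_range_search, are illustrative assumptions.

```python
def build_list_of_clusters(objs, dist, bucket_size):
    """Return a list of (center, covering_radius, bucket) triples."""
    remaining = list(objs)
    clusters = []
    while remaining:
        # Assumption: the center is simply the first remaining object.
        center = remaining.pop(0)
        remaining.sort(key=lambda o: dist(center, o))
        # The bucket holds the bucket_size objects closest to the center.
        bucket = remaining[:bucket_size]
        remaining = remaining[bucket_size:]
        radius = dist(center, bucket[-1]) if bucket else 0.0
        clusters.append((center, radius, bucket))
    return clusters


def loc_range_search(clusters, q, r, dist):
    """Return all stored objects within distance r of q."""
    out = []
    for center, radius, bucket in clusters:
        d = dist(q, center)
        if d <= r:
            out.append(center)
        # The query ball intersects the cluster ball: scan the bucket.
        if d <= radius + r:
            out.extend(o for o in bucket if dist(q, o) <= r)
        # The query ball lies strictly inside the cluster ball. Every
        # object in later clusters is at distance >= radius from this
        # center, so none of them can match and the scan can stop.
        if d + r < radius:
            break
    return out
```

Because each cluster is carved out of whatever the earlier clusters left behind, inserting an object into a built list is not straightforward, which is the limitation a dynamic version has to address.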
Impact of the initialization in tree-based fast similarity search techniques
 SIMBAD. Volume 7005 of Lecture Notes in Computer Science
, 2011
Cited by 1 (1 self)
Many fast similarity search techniques rely on the use of pivots (specially selected points in the data set). Using these points, specific structures (indexes) are built that speed up the search at query time. Usually, pivot selection techniques are incremental, with the first pivot chosen at random. This article explores several techniques to choose the first pivot in a tree-based fast similarity search technique. We provide experimental results showing that an adequate choice of this pivot leads to significant reductions in distance computations and time complexity. Moreover, most pivot tree-based indexes emphasize building balanced trees. We provide experimental and theoretical support that very unbalanced trees can be a better choice than balanced ones.
Memory-Adaptative Dynamic Spatial Approximation Trees
, 2003
Cited by 1 (1 self)
Dynamic spatial approximation trees (dsa-trees for short) have been shown to be suitable data structures for searching in high-dimensional metric spaces. However, if sufficient storage space is available, pivoting schemes beat dsa-trees in any metric space. In this paper we present new data structures for proximity searching in metric spaces, based on combining the concepts of spatial approximation and pivot-based algorithms.
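To illustrate what a pivoting scheme is and why it depends on storage space, here is a small Python sketch of generic pivot-based filtering. It is not the hybrid structure proposed in the paper; the names build_pivot_table and pivot_range_search are assumptions for illustration. The table stores one distance per object and pivot, so memory grows with the number of pivots.

```python
def build_pivot_table(objs, pivots, dist):
    """Precompute d(o, p) for every object o and pivot p."""
    return [[dist(o, p) for p in pivots] for o in objs]


def pivot_range_search(objs, table, pivots, q, r, dist):
    """Return all objects within distance r of q."""
    # One distance evaluation per pivot, paid once per query.
    dq = [dist(q, p) for p in pivots]
    out = []
    for o, row in zip(objs, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o).
        # If that lower bound exceeds r for any pivot, o cannot match
        # and is discarded without computing d(q, o).
        if any(abs(a - b) > r for a, b in zip(dq, row)):
            continue
        if dist(q, o) <= r:
            out.append(o)
    return out
```

More pivots usually discard more objects before the real distance is computed, at the price of a larger table, which is the trade-off the abstract alludes to.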
Improved Dynamic Spatial Approximation Trees
Cited by 1 (1 self)
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. The main drawback of the sa-tree was that it was a static data structure, that is, once built, it was difficult to add new elements to it. This ruled it out for many interesting applications.
Fully Dynamic and Memory-Adaptative Spatial Approximation Trees
Cited by 1 (0 self)
Hybrid dynamic spatial approximation trees are recently proposed data structures for searching in metric spaces, based on combining the concepts of spatial approximation and pivot-based algorithms. These data structures are hybrid schemes, with the full features of dynamic spatial approximation trees and the ability to use the available memory to improve the query time. It has been shown that they compare favorably against alternative data structures in spaces of medium difficulty. In this paper we complete and improve hybrid dynamic spatial approximation trees by presenting a new search alternative, an algorithm to remove objects from the tree, and an improved way of managing the available memory. The result is a fully dynamic and optimized data structure for similarity searching in metric spaces.
Enlarging Nodes to Improve Dynamic Spatial Approximation Trees
The metric space model abstracts many similarity search problems. Similarity search has multiple applications, especially in the area of multimedia databases. The idea is to index the database so as to accelerate similarity queries. Although there are several promising indices, few of them are dynamic; that is, once created, very few allow insertions and deletions of elements at a reasonable cost. Dynamic Spatial Approximation Trees (DSA-trees) have been shown to be a suitable data structure for searching high-dimensional metric spaces or answering queries with low selectivity (i.e., large radius), and they are also completely dynamic. The performance of DSA-trees is directly related to the amount of backtracking at search time. A sufficient condition for boosting the performance of this data structure is to keep elements that are close to each other in the same node. In this work we propose a new data structure for searching in metric spaces, based on DSA-trees, which retains their virtues, takes advantage of the element clusters present in many metric spaces, and can also make better use of available memory to improve searches. In fact, we use these element clusters to improve the spatial approximation.
Categories and Subject Descriptors: H.3.1 [Information storage and retrieval]: Content analysis