Fully Dynamic Spatial Approximation Trees
In Proceedings of the 9th International Symposium on String Processing and Information Retrieval (SPIRE 2002), LNCS 2476, 2002
Cited by 22 (12 self)
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown to compare favorably against alternative data structures in spaces of high dimension or for queries with low selectivity. Its main drawbacks are costly construction time, poor performance in low-dimensional spaces or for queries with high selectivity, and the fact that it is a static data structure: once built, elements cannot be added or deleted.
Effective Proximity Retrieval by Ordering Permutations
2007
Cited by 21 (4 self)
We introduce a new probabilistic proximity search algorithm for range and K-nearest neighbor (KNN) searching in both coordinate and metric spaces. Although there exist solutions for these problems, they boil down to a linear scan when the space is intrinsically high-dimensional, as is the case in many pattern recognition tasks. This, for example, renders the KNN approach to classification rather slow in large databases. Our novel idea is to predict closeness between elements according to how they order their distances towards a distinguished set of anchor objects. Each element in the space sorts the anchor objects from closest to farthest to it, and the similarity between orders turns out to be an excellent predictor of the closeness between the corresponding elements. We present extensive experiments comparing our method against state-of-the-art exact and approximate techniques, both in synthetic and real, metric and non-metric databases, measuring both CPU time and distance computations. The experiments demonstrate that our technique almost always improves upon the performance of alternative techniques, in some cases by a wide margin.
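The ordering idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the `fraction` parameter, and the choice of the Spearman footrule as the measure of similarity between orders are all assumptions made here for illustration. Each object is represented by the order in which it sees the anchors, and database objects are ranked by how similar their order is to the query's.

```python
import math

def permutation(obj, anchors, dist):
    """Return the anchor indices sorted from closest to farthest from obj."""
    return sorted(range(len(anchors)), key=lambda i: dist(obj, anchors[i]))

def spearman_footrule(perm_a, perm_b):
    """Sum, over all anchors, of the difference in rank between two orders.

    Assumption: the abstract only says "similarity between orders";
    the footrule is one common way to compare permutations.
    """
    pos_b = {anchor: rank for rank, anchor in enumerate(perm_b)}
    return sum(abs(rank - pos_b[anchor]) for rank, anchor in enumerate(perm_a))

def candidates_by_permutation(query, database, anchors, dist, fraction=0.5):
    """Rank the database by permutation similarity to the query and keep
    only the most promising fraction (hypothetical parameter)."""
    q_perm = permutation(query, anchors, dist)
    ranked = sorted(
        database,
        key=lambda o: spearman_footrule(permutation(o, anchors, dist), q_perm),
    )
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]

# Example with 2-D points and Euclidean distance (math.dist, Python 3.8+).
anchors = [(0, 0), (10, 0), (0, 10)]
database = [(9, 1), (1, 2)]
nearest_candidates = candidates_by_permutation((2, 3), database, anchors, math.dist)
```

In a real index the permutations of the database objects would be computed once at build time; here they are recomputed per query to keep the sketch short. The surviving candidates would then be compared against the query with the actual distance function.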
An Effective Clustering Algorithm to Index High Dimensional Metric Spaces
Cited by 19 (8 self)
A metric space consists of a collection of objects and a distance function defined among them that satisfies the triangle inequality. The goal is to preprocess the set so that, given a set of objects and a query, the objects close enough to the query can be retrieved. The number of distances computed to achieve this goal is the complexity measure. The problem is very difficult in the so-called high-dimensional metric spaces, where the histogram of distances has a large mean and a small variance. A recent survey on methods to index metric spaces has shown that the so-called clustering algorithms are better suited than their competitors, pivot-based algorithms, to cope with high-dimensional metric spaces. In this paper we present a new clustering method that achieves much better performance than all the existing data structures. We present analytical and experimental results that support our claims and that give the users the tuning parameters to make optimal use of this data structure.
A New Data Structure For Fast Approximate Matching
1994
Cited by 8 (3 self)
Given a set of objects S and a metric D, we describe how to represent S as a new data structure, the triangulation trie. This data structure can be used to search through S quickly to find approximate matches to a given object. Using the triangle inequality, the search tree is repeatedly pruned to reduce the number of object comparisons required. Much of the work is done within the tree using integer comparisons. This method can result in very fast database searches in applications where object comparisons are traditionally costly. Furthermore, the data structure seems to be applicable to a very wide variety of object types. The trie is unusual in its construction in that objects are partitioned according to their respective distances from a common set of "key" objects.
1 Introduction
When searching through databases, people are often interested in close but not necessarily exact matches. This problem of finding an "approximate" match is a common one, occurring in fields as varied as ...
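The pruning step mentioned in this abstract rests on the triangle inequality: for any key object k, |D(q, k) - D(o, k)| is a lower bound on D(q, o). The sketch below shows that filter on a flat table of precomputed distances. It is deliberately not the triangulation trie itself, and the function names and table layout are assumptions made for illustration.

```python
def build_key_table(objects, keys, dist):
    """Precompute the distance from every object to every key object."""
    return [[dist(obj, key) for key in keys] for obj in objects]

def range_search(query, radius, objects, keys, table, dist):
    """Return (matches, compared): the objects within radius of the query
    and how many objects needed a real distance computation."""
    q_to_keys = [dist(query, key) for key in keys]
    matches = []
    compared = 0
    for obj, row in zip(objects, table):
        # Triangle inequality: |D(q,k) - D(o,k)| <= D(q,o) for every key k.
        # If any key gives a lower bound above the radius, obj cannot match.
        if any(abs(qk - ok) > radius for qk, ok in zip(q_to_keys, row)):
            continue
        compared += 1
        if dist(query, obj) <= radius:
            matches.append(obj)
    return matches, compared

# Example on integers with absolute difference as the metric.
def abs_diff(x, y):
    return abs(x - y)

objects = [1, 5, 9, 20]
keys = [0]
table = build_key_table(objects, keys, abs_diff)
matches, compared = range_search(6, 2, objects, keys, table, abs_diff)
```

The table is built once; each query then pays one distance computation per key plus one per object that survives the filter.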
Spatial selection of sparse pivots for similarity search in metric spaces
In: SOFSEM 2007: 33rd Conference on Current Trends in Theory and Practice of Computer Science, LNCS 4362, 2007
Cited by 8 (2 self)
Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, called Sparse Spatial Selection (SSS). The main characteristic of this method is that it guarantees a good pivot selection more efficiently than other methods previously proposed. In addition, SSS adapts itself to the dimensionality of the metric space we are working with, without requiring the number of pivots to be specified in advance. Furthermore, SSS is dynamic, that is, it supports object insertions in the database efficiently; it can work with both continuous and discrete distance functions; and it is suitable for secondary memory storage. In this work we provide experimental results that confirm the advantages of the method with several vector and metric spaces. We also show that the efficiency of our proposal is similar to that of other existing ones over vector spaces, although it is better over general metric spaces.
Similarity search using sparse pivots for efficient multimedia information retrieval
In: Proc. of the 8th IEEE International Symposium on Multimedia (ISM’06), 2006
Cited by 8 (2 self)
Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, called Sparse Spatial Selection (SSS). This method guarantees a good pivot selection more efficiently than other methods previously proposed. In addition, SSS adapts itself to the dimensionality of the metric space we are working with, and it is not necessary to specify in advance the number of pivots to extract. Furthermore, SSS is dynamic: it supports object insertions in the database efficiently, it can work with both continuous and discrete distance functions, and it is suitable for secondary memory storage. In this work we provide experimental results that confirm the advantages of the method with several vector and metric spaces.
Efficient Content-Based Retrieval: Experimental Results
In CBA, 1999
Cited by 7 (0 self)
In this paper, we briefly summarize those aspects and present a set of experiments that thoroughly evaluates the FIDS techniques and system.
Antipole Tree indexing to support range search and K-nearest neighbor search in metric spaces
IEEE/TKDE, 2005
Cited by 7 (0 self)
Range and k-nearest neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, in a range searching problem the target is to find the objects of S within some threshold distance to q, whereas in a k-nearest neighbor searching problem, the k elements of S closest to q must be produced. These problems can obviously be solved with a linear number of distance calculations, by comparing the query object against every object in the database. However, the goal is to solve such problems much faster. We combine and extend ideas from the M-Tree, the Multi-Vantage Point structure, and the FQ-Tree to create a new structure in the “bisector tree” class, called the Antipole Tree. Bisection is based on the proximity to an “Antipole” pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament are far enough apart to approximate the diameter of the splitting set. If dist(a, b) is larger than the chosen cluster diameter threshold, then the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best match searching on generic metric spaces. The Antipole Tree compares very well with existing structures such as List of Clusters, M-Trees and others, and in many cases it achieves better results.
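The tournament and split rule described in this abstract can be sketched as follows. The abstract does not give the tournament's details, so this is a simplified version under stated assumptions: survivors are shuffled, cut into small groups, and each group keeps only its farthest pair until few enough elements remain for an exact comparison. The names `antipole_tournament`, `split_cluster`, and `group_size` are hypothetical.

```python
import itertools
import random

def farthest_pair(items, dist):
    """Exact farthest pair by brute force; only used on small groups."""
    return max(itertools.combinations(items, 2), key=lambda pair: dist(*pair))

def antipole_tournament(items, dist, group_size=3, rng=None):
    """Approximate the farthest pair with a randomized tournament.

    Assumed scheme: each round keeps the farthest pair of every group,
    so the number of survivors shrinks until one small group is left.
    """
    if len(items) < 2 or group_size < 3:
        raise ValueError("need at least 2 items and group_size >= 3")
    rng = rng or random.Random(0)
    survivors = list(items)
    while len(survivors) > group_size:
        rng.shuffle(survivors)
        next_round = []
        for start in range(0, len(survivors), group_size):
            group = survivors[start:start + group_size]
            if len(group) <= 2:
                next_round.extend(group)  # too small to eliminate anyone
            else:
                next_round.extend(farthest_pair(group, dist))
        survivors = next_round
    return farthest_pair(survivors, dist)

def split_cluster(items, dist, diameter_threshold, rng=None):
    """Split around the Antipole pair if it is farther apart than the
    threshold; otherwise return None and leave the cluster whole."""
    a, b = antipole_tournament(items, dist, rng=rng)
    if dist(a, b) <= diameter_threshold:
        return None
    near_a = [obj for obj in items if dist(obj, a) <= dist(obj, b)]
    near_b = [obj for obj in items if dist(obj, a) > dist(obj, b)]
    return (a, near_a), (b, near_b)
```

Applying `split_cluster` recursively to each half until it returns `None` would yield a bisector-style tree whose leaves are clusters of bounded approximate diameter.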
An index data structure for searching in metric space databases
In Proc. of International Conference on Computational Science 2006 (ICC 2006), 2006
Cited by 3 (2 self)
This paper presents the Evolutionary Geometric Near-neighbor Access Tree (EGNAT), which is a new data structure devised for searching in metric space databases. The EGNAT is fully dynamic, i.e., it allows combinations of insert and delete operations, and has been optimized for secondary memory. Empirical results on different databases show that this tree achieves good performance for high-dimensional metric spaces. We also show that this data structure allows efficient parallelization on distributed memory parallel architectures. All this indicates that the EGNAT is suitable for conducting similarity searches on very large metric space databases.
Faster proximity searching in metric data
In Proc. of the Mexican International Conference in Artificial Intelligence (MICAI), Lecture
Cited by 3 (2 self)
A number of problems in computer science can be solved efficiently with the so-called memory-based or kernel methods. Among these problems (relevant to the AI community) are multimedia indexing, clustering, unsupervised learning and recommendation systems. The common ground of these problems is satisfying proximity queries with an abstract metric database. In this paper we introduce a new technique for making practical indexes for metric range queries. This technique improves existing algorithms based on pivots and signatures, and introduces a new data structure, the Fixed Queries Trie, to speed up metric range queries. The result is an O(n) construction time index, with query complexity O(n^α), α ≤ 1. The indexing algorithm uses only a few bits of storage for each database element.
1 Introduction and Related Work
Proximity queries are those extensions of the exact searching where we want to