Results 11-20 of 60
Fully Dynamic Spatial Approximation Trees
In Proceedings of the 9th International Symposium on String Processing and Information Retrieval (SPIRE 2002), LNCS 2476, 2002
Cited by 26 (13 self)
The Spatial Approximation Tree (sa-tree) is a recently proposed data structure for searching in metric spaces. It has been shown that it compares favorably against alternative data structures in spaces of high dimension or queries with low selectivity. Its main drawbacks are: costly construction time, poor performance in low-dimensional spaces or queries with high selectivity, and the fact of being a static data structure, that is, once built, one cannot add or delete elements.
Incremental Similarity Search in Multimedia Databases
2000
Cited by 25 (2 self)
Similarity search is a very important operation in multimedia databases and other database applications involving complex objects, and involves finding objects in a data set S similar to a query object q, based on some distance measure d, usually a distance metric. Existing methods for handling similarity search in this setting fall into one of two classes. The first is based on mapping to a low-dimensional vector space (making use of data structures such as the R-tree), while the second directly indexes the objects based on distances (making use of data structures such as the M-tree). We introduce a general framework for performing search based on distances, and present an incremental nearest neighbor algorithm that operates on an arbitrary "search hierarchy". We show how this framework can be applied in both classes of similarity search methods, by defining a suitable search hierarchy for a number of different indexing structures. Armed with an appropriate search hierarchy, our algorithm thus performs incremental similarity search, wherein the result objects are reported one by one in order of similarity to a query object, with as little effort as possible expended to produce each new result object. This is especially important in interactive database applications, as it makes it possible to display partial query results early. The incremental aspect also provides significant benefits in situations when the number of desired neighbors is unknown in advance. Furthermore, our algorithm is at least as efficient as existing k-nearest neighbor algorithms, in terms of the number of distance computations and index node accesses. In fact, provided that the search hierarchy is properly defined, our algorithm can be shown to be optimal in the sense of performing as few distance ...
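The incremental search described in this abstract can be illustrated with a best-first traversal driven by a priority queue. The Python below is a minimal sketch, not the authors' implementation: the `Node` class, its ball-shaped regions (`center`, `radius`), and the name `incremental_nn` are assumptions made for the example. Any hierarchy that supplies a valid lower bound on the distance from the query to everything beneath an element could take their place.

```python
import heapq
import itertools


class Node:
    """Hypothetical hierarchy element: a ball covering everything beneath it."""

    def __init__(self, center, radius, children=(), objects=()):
        self.center = center
        self.radius = radius
        self.children = list(children)  # child nodes
        self.objects = list(objects)    # data objects stored directly in this node


def incremental_nn(root, q, dist):
    """Yield (object, distance) pairs in non-decreasing order of distance to q."""
    tie = itertools.count()  # unique tie-breaker so the heap never compares items
    heap = [(0.0, next(tie), False, root)]
    while heap:
        key, _, is_object, item = heapq.heappop(heap)
        if is_object:
            # Every entry still queued has a key >= this distance, and node keys
            # are lower bounds, so no unseen object can be closer.
            yield item, key
            continue
        for obj in item.objects:
            heapq.heappush(heap, (dist(q, obj), next(tie), True, obj))
        for child in item.children:
            # Triangle inequality: d(q, o) >= d(q, center) - radius for o in child.
            lower_bound = max(0.0, dist(q, child.center) - child.radius)
            heapq.heappush(heap, (lower_bound, next(tie), False, child))
```

Because the function is a generator, a caller can stop after any number of results (for example with `itertools.islice`) without deciding k in advance, and only the nodes needed to certify those results are expanded.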
An Effective Clustering Algorithm to Index High Dimensional Metric Spaces
Cited by 20 (8 self)
A metric space consists of a collection of objects and a distance function defined among them, which satisfies the triangular inequality. The goal is to preprocess the set so that, given a set of objects and a query, retrieve those objects close enough to the query. The number of distances computed to achieve this goal is the complexity measure. The problem is very difficult in the so-called high-dimensional metric spaces, where the histogram of distances has a large mean and a small variance. A recent survey on methods to index metric spaces has shown that the so-called clustering algorithms are better suited than their competitors, pivot-based algorithms, to cope with high-dimensional metric spaces. In this paper we present a new clustering method that achieves much better performance than all the existing data structures. We present analytical and experimental results that support our claims and that give the users the tuning parameters to make optimal use of this data structure.
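The "large mean, small variance" description of the distance histogram can be made concrete by computing the two statistics directly. The sketch below does so over all pairs of a small sample. The helper names are hypothetical, and the ratio mean^2 / (2 * variance) is one measure of intrinsic dimensionality used in the metric-space indexing literature; it is included here as an assumption, since the abstract itself does not state a formula.

```python
import itertools
import statistics


def distance_stats(objects, dist):
    """Mean and population variance of the pairwise distance histogram."""
    distances = [dist(a, b) for a, b in itertools.combinations(objects, 2)]
    mean = statistics.fmean(distances)
    variance = statistics.pvariance(distances, mean)
    return mean, variance


def intrinsic_dimensionality(mean, variance):
    """Assumed measure: grows when the mean is large relative to the variance."""
    if variance == 0:
        return float("inf")  # all distances equal: the hardest case to index
    return mean * mean / (2 * variance)
```

On a large collection one would likely sample pairs rather than enumerate all of them; the all-pairs loop is kept only to keep the example short.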
Spatial selection of sparse pivots for similarity search in metric spaces
In: SOFSEM 2007: 33rd Conference on Current Trends in Theory and Practice of Computer Science. LNCS (4362), 2007
Cited by 15 (4 self)
Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, called Sparse Spatial Selection (SSS). The main characteristic of this method is that it guarantees a good pivot selection more efficiently than other methods previously proposed. In addition, SSS adapts itself to the dimensionality of the metric space we are working with, without the need to specify in advance the number of pivots to use. Furthermore, SSS is dynamic, that is, it can support object insertions in the database efficiently; it can work with both continuous and discrete distance functions; and it is suitable for secondary memory storage. In this work we provide experimental results that confirm the advantages of the method with several vector and metric spaces. We also show that the efficiency of our proposal is similar to that of other existing ones over vector spaces, although it is better over general metric spaces.
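The abstract does not spell out the selection rule, so the following is only a sketch under stated assumptions. It assumes the rule as SSS is usually described: an object becomes a pivot when its distance to every pivot chosen so far is at least `alpha * max_dist`, where `max_dist` is the maximum distance between objects. It then applies the standard triangle-inequality filter of pivot-based methods. The function names, the parameter `alpha`, and its default value are illustrative, not taken from the paper.

```python
def select_pivots(objects, dist, max_dist, alpha=0.4):
    """Assumed SSS-style rule: keep objects that are far from all current pivots."""
    pivots = []
    for obj in objects:
        # all() over an empty list is True, so the first object is always a pivot.
        if all(dist(obj, p) >= alpha * max_dist for p in pivots):
            pivots.append(obj)
    return pivots


def build_table(objects, pivots, dist):
    """Precompute the distance from every object to every pivot."""
    return [[dist(obj, p) for p in pivots] for obj in objects]


def range_search(objects, table, pivots, dist, q, r):
    """Return the objects within distance r of q, filtering with the pivot table."""
    q_to_pivots = [dist(q, p) for p in pivots]
    result = []
    for obj, row in zip(objects, table):
        # Triangle inequality: |d(q, p) - d(obj, p)| > r implies d(q, obj) > r.
        if any(abs(qp - op) > r for qp, op in zip(q_to_pivots, row)):
            continue  # discarded without computing d(q, obj)
        if dist(q, obj) <= r:
            result.append(obj)
    return result
```

Under this rule the number of pivots is not fixed by the caller: it follows from how the data is spread, and a newly inserted object can be tested against the existing pivots in the same way.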
Similarity search using sparse pivots for efficient multimedia information retrieval
In: Proc. of the 8th IEEE International Symposium on Multimedia (ISM’06), 2006
Cited by 10 (3 self)
Similarity search is a fundamental operation for applications that deal with unstructured data sources. In this paper we propose a new pivot-based method for similarity search, called Sparse Spatial Selection (SSS). This method guarantees a good pivot selection more efficiently than other methods previously proposed. In addition, SSS adapts itself to the dimensionality of the metric space we are working with, and it is not necessary to specify in advance the number of pivots to extract. Furthermore, SSS is dynamic: it supports object insertions in the database efficiently, it can work with both continuous and discrete distance functions, and it is suitable for secondary memory storage. In this work we provide experimental results that confirm the advantages of the method with several vector and metric spaces.
Antipole Tree indexing to support range search and K-nearest neighbor search in metric spaces
IEEE/TKDE, 2005
Cited by 10 (0 self)
Range and k-nearest neighbor searching are core problems in pattern recognition. Given a database S of objects in a metric space M and a query object q in M, in a range searching problem the target is to find the objects of S within some threshold distance to q, whereas in a k-nearest neighbor searching problem, the k elements of S closest to q must be produced. These problems can obviously be solved with a linear number of distance calculations, by comparing the query object against every object in the database. However, the goal is to solve such problems much faster. We combine and extend ideas from the M-Tree, the Multi-Vantage Point structure, and the FQ-Tree to create a new structure in the “bisector tree” class, called the Antipole Tree. Bisection is based on the proximity to an “Antipole” pair of elements generated by a suitable linear randomized tournament. The final winners a, b of such a tournament are far enough apart to approximate the diameter of the splitting set. If dist(a, b) is larger than the chosen cluster diameter threshold, then the cluster is split. The proposed data structure is an indexing scheme suitable for (exact and approximate) best match searching on generic metric spaces. The Antipole Tree compares very well with existing structures such as List of Clusters, M-Trees and others, and in many cases it achieves better results.
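The randomized tournament and the split test can be sketched as follows. This is an illustration under assumptions rather than the paper's exact procedure: the group size `tau`, the shuffling strategy, and the names `approx_antipole` and `needs_split` are hypothetical. Each round partitions the surviving elements into small groups, keeps the farthest pair of each group, and repeats until a single small group remains.

```python
import itertools
import random


def farthest_pair(items, dist):
    """Exact farthest pair of a small group, by checking every pair."""
    return max(itertools.combinations(items, 2), key=lambda pair: dist(*pair))


def approx_antipole(points, dist, tau=3, rng=None):
    """Approximate the farthest pair of `points` with a randomized tournament."""
    if tau < 3:
        raise ValueError("tau must be at least 3 so that every round shrinks the pool")
    pool = list(points)
    if len(pool) < 2:
        raise ValueError("need at least two points")
    rng = rng or random.Random()
    while len(pool) > tau:
        rng.shuffle(pool)
        winners = []
        for start in range(0, len(pool), tau):
            group = pool[start:start + tau]
            # A full group of tau elements sends only its two extremes onward;
            # a leftover group of one or two elements passes through unchanged.
            winners.extend(farthest_pair(group, dist) if len(group) > 2 else group)
        pool = winners
    return farthest_pair(pool, dist)


def needs_split(a, b, dist, threshold):
    """Split the cluster when the approximate diameter exceeds the threshold."""
    return dist(a, b) > threshold
```

Each round keeps at most two elements out of every full group of `tau`, so the pool shrinks geometrically and the total number of distance computations stays linear in the number of points for a fixed `tau`.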
A New Data Structure For Fast Approximate Matching
1994
Cited by 8 (3 self)
Given a set of objects S and a metric D, we describe how to represent S as a new data structure, the triangulation trie. This data structure can be used to search through S quickly to find approximate matches to a given object. Using the triangle inequality, the search tree is repeatedly pruned to reduce the number of object comparisons required. Much of the work is done within the tree using integer comparisons. This method can result in very fast database searches in applications where object comparisons are traditionally costly. Furthermore, the data structure seems to be applicable to a very wide variety of object types. The trie is unusual in its construction in that objects are partitioned according to their respective distances from a common set of "key" objects. 1 Introduction When searching through databases, people are often interested in close but not necessarily exact matches. This problem of finding an "approximate" match is a common one, occurring in fields as varied as ...
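The partitioning by distances to shared "key" objects, and the pruning by the triangle inequality, can be sketched with a nested dictionary. This is a simplified illustration of the general idea and not the paper's triangulation trie: it assumes an integer-valued metric (such as Hamming or edit distance) and at least one key object, and the names `build_trie` and `range_search` are hypothetical. Level i of the trie is labelled by the distance from an object to the i-th key.

```python
def build_trie(objects, keys, dist):
    """Index objects by their vector of integer distances to the key objects."""
    root = {}
    for obj in objects:
        node = root
        for key in keys[:-1]:
            node = node.setdefault(dist(obj, key), {})
        # The last level maps a distance label to a bucket of objects.
        node.setdefault(dist(obj, keys[-1]), []).append(obj)
    return root


def range_search(trie, keys, dist, q, r):
    """Return all indexed objects within distance r of q."""
    q_to_keys = [dist(q, key) for key in keys]
    results = []

    def visit(node, depth):
        if depth == len(keys):
            # `node` is a bucket; verify the survivors with real comparisons.
            results.extend(obj for obj in node if dist(q, obj) <= r)
            return
        for label, child in node.items():
            # Triangle inequality: |d(q, key) - d(obj, key)| > r rules out obj.
            # Inside the trie this is a plain integer comparison.
            if abs(label - q_to_keys[depth]) <= r:
                visit(child, depth + 1)

    visit(trie, 0)
    return results
```

The query is compared against the key objects once, after which whole subtrees are discarded by comparing integers, so costly object comparisons are needed only for the buckets that survive.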
Efficient Content-Based Retrieval: Experimental Results
in CBA, 1999
Cited by 7 (0 self)
In this paper, we briefly summarize those aspects and present a set of experiments that thoroughly evaluates the FIDS techniques and system.
Similarity-based Search Over Time Series and Trajectory Data
University of Waterloo
Cited by 6 (1 self)
I hereby declare that I am the sole author of this thesis. I authorize the University of Waterloo to lend this thesis to other institutions or individuals for the purpose of scholarly research.
DBM-Tree: A Dynamic Metric Access Method Sensitive to Local Density Data
In SBBD, 2004
Cited by 5 (3 self)
Metric Access Methods (MAM) are employed to accelerate the processing of similarity queries, such as the range and the k-nearest neighbor queries. Current methods improve the query performance by minimizing the number of disk accesses, keeping a constant height of the structures stored on disks (height-balanced trees). The Slim-tree and the M-tree are the most efficient dynamic MAM so far. However, the overlapping between their nodes has a very high influence on their performance. This paper presents a new dynamic MAM called the DBM-tree (Density-Based Metric tree), which can minimize the overlap between high-density nodes by relaxing the height-balancing of the structure. Thus, the height of the tree is larger in denser regions, in order to keep a tradeoff between breadth-searching and depth-searching. Moreover, an optimization algorithm called Shrink is also presented, which improves the performance of an already built DBM-tree by reorganizing the elements among their nodes. Experiments performed over both synthetic and real datasets showed that the DBM-tree is, on average, 50% faster than traditional MAM and reduces the number of distance calculations by up to 72% and disk accesses by up to 54%. After performing the Shrink algorithm, the performance improves up to 30% regarding the number of disk accesses for range and k-nearest neighbor queries. In addition, the DBM-tree scales up well, exhibiting sublinear performance with growing number of elements in the database.