Results 1 - 4 of 4
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality
1998
Cited by 533 (28 self)
The nearest neighbor problem is the following: Given a set of n points $P = \{p_1, \ldots, p_n\}$ in some metric space $X$, preprocess $P$ so as to efficiently answer queries which require finding the point in $P$ closest to a query point $q \in X$. We focus on the particularly interesting case of the d-dimensional Euclidean space where $X = \mathbb{R}^d$ under some $l_p$ norm. Despite decades of effort, the current solutions are far from satisfactory; in fact, for large d, in theory or in practice, they provide little improvement over the brute-force algorithm which compares the query point to each data point. Of late, there has been some interest in the approximate nearest neighbors problem, which is: Find a point $p \in P$ that is an $\epsilon$-approximate nearest neighbor of the query $q$ in that for all $p' \in P$, $d(p, q) \le (1 + \epsilon)\, d(p', q)$. We present two algorithmic results for the approximate version that significantly improve the known bounds: (a) preprocessing cost polynomial in n and d, and a trul...
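To make the two definitions in this abstract concrete, here is a minimal Python sketch of the brute-force baseline and of the approximation condition. It is not the paper's algorithm. The function names are hypothetical, and the Euclidean (l_2) norm is assumed as the distance.

```python
import math

def brute_force_nn(points, q):
    """Exact nearest neighbor: compare the query q to every data point."""
    return min(points, key=lambda p: math.dist(p, q))

def is_eps_approx_nn(p, points, q, eps):
    """Check d(p, q) <= (1 + eps) * d(p', q) for all p' in points.

    It suffices to compare against the smallest distance d(p', q).
    """
    nearest = min(math.dist(x, q) for x in points)
    return math.dist(p, q) <= (1 + eps) * nearest

# Illustrative data only.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
q = (0.9, 0.9)
exact = brute_force_nn(points, q)
```

The brute-force scan costs one distance computation per data point, which is the baseline the abstract says existing methods barely improve on for large d.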
Distance-based indexing for high-dimensional metric spaces
In Proc. ACM SIGMOD International Conference on Management of Data, 1997
Cited by 110 (3 self)
In many database applications, one of the common queries is to find approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance-based index structures are proposed for applications where the data domain is high dimensional, or the distance function used to compute distances between data objects is non-Euclidean. In this paper, we introduce a distance-based index structure called multi-vantage point (mvp) tree for similarity queries on high-dimensional metric spaces. The mvp-tree uses more than one vantage point to partition the space into spherical cuts at each level. It also utilizes the pre-computed (at construction time) distances between the data points and the vantage points. We have done experiments to compare mvp-trees with vp-trees which have a similar partitioning strategy, but use only one vantage point at each level, and do not make use of the pre-computed distances. Empirical studies show that the mvp-tree outperforms the vp-tree by 20% to 80% for varying query ranges and different distance distributions.
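The partitioning idea can be illustrated with the simpler single-vantage-point structure that the abstract uses as its baseline. The sketch below is a rough vp-tree, not the mvp-tree itself: it uses one vantage point per node and stores no pre-computed distances. Picking the first point as the vantage point, splitting at the median distance, and all names (`VPNode`, `build_vp_tree`, `range_search`) are assumptions made for illustration.

```python
import math
import statistics

class VPNode:
    def __init__(self, vantage, radius, inside, outside):
        self.vantage = vantage    # reference point for this node
        self.radius = radius      # boundary of the spherical cut
        self.inside = inside      # subtree with d(vantage, p) <= radius
        self.outside = outside    # subtree with d(vantage, p) > radius

def build_vp_tree(points, dist=math.dist):
    if not points:
        return None
    vantage, rest = points[0], points[1:]   # arbitrary vantage choice
    if not rest:
        return VPNode(vantage, 0.0, None, None)
    dists = [dist(vantage, p) for p in rest]
    radius = statistics.median(dists)
    inside = [p for p, d in zip(rest, dists) if d <= radius]
    outside = [p for p, d in zip(rest, dists) if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, dist), build_vp_tree(outside, dist))

def range_search(node, q, r, dist=math.dist, out=None):
    """Collect all points within distance r of q."""
    if out is None:
        out = []
    if node is None:
        return out
    d = dist(node.vantage, q)
    if d <= r:
        out.append(node.vantage)
    # Triangle inequality: skip a side the query ball cannot reach.
    if d - r <= node.radius:
        range_search(node.inside, q, r, dist, out)
    if d + r > node.radius:
        range_search(node.outside, q, r, dist, out)
    return out

# Illustrative data: a 5 x 5 grid of points.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
tree = build_vp_tree(grid)
hits = sorted(range_search(tree, (2.0, 2.0), 1.0))
```

Only the triangle inequality is used for pruning, so any metric distance function can be passed as `dist`, which is why this family of structures suits non-Euclidean data.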
Indexing Large Metric Spaces for Similarity Search Queries
1999
Cited by 57 (0 self)
In many database applications, one of the common queries is to find approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance-based index structures are proposed for applications where the distance computations between objects of the data domain are expensive (such as high dimensional data), and the distance function used is metric. In this paper, we consider using distance-based index structures for similarity queries on large metric spaces. We elaborate on the approach of using reference points (vantage points) to partition the data space into spherical shell-like regions in a hierarchical manner. We introduce the multi-vantage point tree structure (mvp-tree) that uses more than one vantage point to partition the space into spherical cuts at each level. In answering similarity-based queries, the mvp-tree also utilizes the pre-computed (at con...
Index Structures For Temporal And Multimedia Databases
1998
Cited by 1 (0 self)
by Tolga Bozkaya
This thesis proposes index structures for efficient evaluation of temporal queries in temporal databases and similarity based queries in multimedia databases. To support temporal operators and to increase the efficiency of temporal queries, indexing based on temporal attributes is required. A temporal database can support two notions of time. Valid time is the time when a data entity is valid in reality, and the transaction time is the time when a data entity is recorded in the database. In this thesis, methods for indexing time intervals in transaction time and valid time databases are proposed. Transaction time databases are append only databases. Data is never deleted from the database, and data versions that were deleted or modified are stored as historical versions. This thesis proposes indexing current and historical versions of temporal entities separately to exploit the behavior of transaction time data. Two structures, namely IB tree and AD*-tree, are proposed...
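The append-only behavior of transaction time data, with current and historical versions kept apart, can be sketched in Python as follows. This is a toy illustration only and does not implement the IB tree or AD*-tree named in the abstract. The class names, the integer logical clock, and the plain dict and list used as "indexes" are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Version:
    key: str
    value: Any
    start: int                  # transaction time when recorded
    end: Optional[int] = None   # None while the version is current

class TransactionTimeStore:
    def __init__(self):
        self.current = {}   # key -> current Version
        self.history = []   # closed versions, never removed
        self.clock = 0      # hypothetical logical transaction clock

    def _tick(self):
        self.clock += 1
        return self.clock

    def _close(self, version, now):
        # A superseded version is moved to history, not deleted.
        version.end = now
        self.history.append(version)

    def put(self, key, value):
        now = self._tick()
        if key in self.current:
            self._close(self.current[key], now)
        self.current[key] = Version(key, value, now)
        return now

    def delete(self, key):
        now = self._tick()
        self._close(self.current.pop(key), now)
        return now

    def as_of(self, key, t):
        """Value recorded for key at transaction time t, or None."""
        cur = self.current.get(key)
        if cur is not None and cur.start <= t:
            return cur.value
        for v in self.history:
            if v.key == key and v.start <= t < v.end:
                return v.value
        return None

store = TransactionTimeStore()
t1 = store.put("a", 1)
t2 = store.put("a", 2)
t3 = store.delete("a")
```

Queries about the present touch only `current`, while queries about the past scan `history`; a real index would replace the linear scan with an interval structure.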

