## Distance-based indexing for high-dimensional metric spaces (1997)


Venue: Proc. ACM SIGMOD International Conference on Management of Data

Citations: 115 (3 self)

### BibTeX

@INPROCEEDINGS{Bozkaya97distance-basedindexing,

author = {Tolga Bozkaya},

title = {Distance-based indexing for high-dimensional metric spaces},

booktitle = {In Proc. ACM SIGMOD International Conference on Management of Data},

year = {1997},

pages = {357--368}

}

### Abstract

In many database applications, one of the common queries is to find approximate matches to a given query item from a collection of data items. For example, given an image database, one may want to retrieve all images that are similar to a given query image. Distance-based index structures are proposed for applications where the data domain is high-dimensional, or the distance function used to compute distances between data objects is non-Euclidean. In this paper, we introduce a distance-based index structure called the multi-vantage point (mvp) tree for similarity queries on high-dimensional metric spaces. The mvp-tree uses more than one vantage point to partition the space into spherical cuts at each level. It also utilizes the distances between the data points and the vantage points that are pre-computed at construction time. We ran experiments to compare mvp-trees with vp-trees, which have a similar partitioning strategy but use only one vantage point at each level and do not make use of the pre-computed distances. Empirical studies show that the mvp-tree outperforms the vp-tree by 20% to 80% for varying query ranges and different distance distributions.
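The partitioning idea the abstract refers to can be illustrated with a minimal sketch of the single-vantage-point baseline (a vp-tree), not the mvp-tree itself. The names `VPNode`, `build_vp_tree`, and `range_search` are hypothetical, the vantage point is picked at random, and the cut is placed at the median distance; all of these are assumptions made for illustration, not details taken from the paper. An mvp-tree would additionally use more than one vantage point per node and keep the pre-computed distances for filtering.

```python
import math
import random
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence

Metric = Callable[[Any, Any], float]


@dataclass
class VPNode:
    vantage: Any                         # vantage point stored at this node
    radius: float = 0.0                  # median distance: the spherical cut
    inside: Optional["VPNode"] = None    # points with d(vantage, p) <= radius
    outside: Optional["VPNode"] = None   # points with d(vantage, p) >= radius


def build_vp_tree(points: Sequence[Any], dist: Metric,
                  rng: Optional[random.Random] = None) -> Optional[VPNode]:
    """Recursively split the points around a randomly chosen vantage point."""
    if not points:
        return None
    rng = rng or random.Random(0)
    rest = list(points)
    vantage = rest.pop(rng.randrange(len(rest)))
    node = VPNode(vantage)
    if not rest:
        return node
    # Sort the remaining points by their distance to the vantage point.
    ranked = sorted(((dist(vantage, p), p) for p in rest), key=lambda t: t[0])
    mid = len(ranked) // 2
    node.radius = ranked[mid][0]
    node.inside = build_vp_tree([p for _, p in ranked[:mid]], dist, rng)
    node.outside = build_vp_tree([p for _, p in ranked[mid:]], dist, rng)
    return node


def range_search(root: Optional[VPNode], query: Any, r: float,
                 dist: Metric) -> List[Any]:
    """Return every indexed point within distance r of the query."""
    found: List[Any] = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        d = dist(query, node.vantage)
        if d <= r:
            found.append(node.vantage)
        # Triangle inequality: a match p satisfies d - r <= d(vantage, p) <= d + r,
        # so a side is skipped when that interval cannot reach it.
        if d - r <= node.radius:
            stack.append(node.inside)
        if d + r >= node.radius:
            stack.append(node.outside)
    return found


if __name__ == "__main__":
    gen = random.Random(42)
    data = [(gen.random(), gen.random()) for _ in range(1000)]
    tree = build_vp_tree(data, math.dist)
    print(len(range_search(tree, (0.5, 0.5), 0.05, math.dist)))
```

Any function satisfying the metric axioms can be passed as `dist`; `math.dist` (Euclidean distance, Python 3.8+) is used here only because it is convenient. The pruning step relies solely on the triangle inequality, which is why the same structure applies to non-Euclidean distance functions.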

### Citations

1175 | The Design and Analysis of Spatial Data Structures - Samet - 1990 |
Citation Context: ...lem for high-dimensional metric spaces, and review previous approaches to the problem. 3. Indexing in High-Dimensional Spaces For low-dimensional Euclidean domains, the conventional index structures ([Sam89]) such as R-trees (and its variations) [Gut84, SRF87, BKSS90] can be used effectively to answer similarity queries. In such cases, a near neighbor search query would ask for all the objects in (or tha...

982 | The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles - Beckmann, Kriegel, et al. - 1990 |

484 | Nearest neighbor queries - Roussopoulos, Kelley, et al. - 1995 |
Citation Context: ...ere the center is the query object and the radius is the tolerance factor r. There are some special techniques for other forms of similarity queries, such as nearest neighbor queries. For example, in [RKV95], some heuristics are introduced to efficiently search the R-tree structure to answer nearest neighbor queries. However, the conventional spatial structures stop being efficient if the dimensionality ...

426 | Fast subsequence matching in time-series databases - Faloutsos, Ranganathan, et al. - 1994 |

262 | The R+-Tree: A Dynamic Index for Multi-Dimensional Objects - Sellis, Roussopoulos, et al. - 1987 |

178 | Near neighbor search in large metric spaces - Brin - 1995 |
Citation Context: ...o. If the two pivot points are well-selected at every level, the gh-tree tends to be a well-balanced structure. More recently, Brin introduced the GNAT (Geometric Near-Neighbor Access Tree) structure [Bri95]. A k number of split points are chosen at the top level. Each one of the remaining points are associated with one of the k datasets (one for each split point), depending on which split point they are...

165 | Satisfying General Proximity/Similarity Queries with Metric Trees - Uhlmann - 1991 |
Citation Context: ...tage point tree) as a general solution to the problem of answering similarity based queries efficiently for high-dimensional metric spaces. The mvp-tree is similar to the vp-tree (vantage point tree) [Uhl91] in the sense that both structures use relative distances from a vantage point to partition the domain space. In vp-trees, at every node of the tree, a vantage point is chosen among the data points, a...

129 | Some approaches to best-match file searching - Burkhard, Keller - 1973 |
Citation Context: ...f the distance based indexing techniques below. 3.2 Distance-Based Index Structures There are a number of research results on efficiently answering similarity search queries in different contexts. In [BK73], Burkhard & Keller suggested the use of three different techniques for the problem of finding best matching (closest) key words in a file to a given query key. They employ a metric distance function ...

115 | Fast Similarity Search - Agrawal, Lin, et al. - 1995 |

71 | Content-based image indexing - Chiueh - 1994 |
Citation Context: ... tree, it is also possible to generalize it to a multi-way tree for larger fanouts. In [Yia93], the vp-tree structure was enhanced by an algorithm to pick vantage-points for better decompositions. In [Chi94] the vp-tree structure is modified to answer nearest neighbor queries. We talk about the vp-trees in detail in section 3.3. The gh-tree (generalized hyperplane tree) structure was also introduced in [...

53 | New Techniques for Best-Match Retrieval - Shasha, Wang - 1990 |
Citation Context: .... Note that keys may appear in more than one clique, so the aim is to select the representative keys to be the ones that appear in as many cliques as possible. In another approach, such as the one in [SW90], precomputed distances between the data elements are used to efficiently answer similarity search queries. The aim is to minimize the number of distance computations as much as possible, as they are ...

51 | Efficient and effective querying by image content - Faloutsos, et al. - 1993 |

15 | Approximate Matching with High Dimensionality R-trees (M.Sc. scholarly paper) - Otterman - 1992 |
Citation Context: ...ed to efficiently search the R-tree structure to answer nearest neighbor queries. However, the conventional spatial structures stop being efficient if the dimensionality is high. Experimental results [Ott92] show that R-trees become inefficient for n-dimensional spaces where n is greater than 20. The problem of indexing high-dimensional spaces can be approached in different ways. One approach is to use d...

14 | Efficient and effective querying by image content - Faloutsos, et al. - 1994 |

6 | Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces - Yianilos - 1993 |
Citation Context: ...low that node, which are constructed in the same way recursively. Although the vp-tree was introduced as a binary tree, it is also possible to generalize it to a multi-way tree for larger fanouts. In [Yia93], the vp-tree structure was enhanced by an algorithm to pick vantage-points for better decompositions. In [Chi94] the vp-tree structure is modified to answer nearest neighbor queries. We talk about th...

1 | R-Trees: A Dynamic Index Structure for Spatial Searching - Guttman - 1984 |