## Similarity Indexing: Algorithms and Performance (1996)

@INPROCEEDINGS{White96similarityindexing:,
    author = {David A. White and Ramesh Jain},
    title = {Similarity Indexing: Algorithms and Performance},
    booktitle = {In Storage and Retrieval for Image and Video Databases (SPIE},
    year = {1996},
    pages = {62--73}
}

Efficient indexing support is essential to allow content-based image and video databases using similarity-based retrieval to scale to large databases (tens of thousands up to millions of images). In this paper, we take an in-depth look at this problem. One of the major difficulties in solving it is the high dimension (6-100) of the feature vectors that are used to represent objects. We provide an overview of the work in computational geometry on this problem and highlight the results we found most useful in practice, including the use of approximate nearest neighbor algorithms. We also present a variant of the optimized k-d tree that we call the VAM k-d tree, and provide algorithms to create an optimized R-tree that we call the VAMSplit R-tree. We found that the VAMSplit R-tree provided better overall performance than all competing structures we tested for main memory and secondary memory applications. We observed large improvements in performance relative to the R*-tree and SS-tree...
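As a rough illustration of the ideas the abstract names, the sketch below builds a generic k-d tree that splits on the dimension of maximum variance at the median point, then answers nearest neighbor queries with an optional approximation factor. It is not the paper's VAM k-d tree or VAMSplit R-tree algorithm, whose details are not given here; the function names `build` and `nearest`, the `leaf_size` and `eps` parameters, and the dictionary node layout are all assumptions made for this example.

```python
import math

def build(points, leaf_size=4):
    """Build a k-d tree over a list of equal-length tuples (illustrative only)."""
    if len(points) <= leaf_size:
        return {"leaf": True, "points": points}

    def variance(d):
        vals = [p[d] for p in points]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    # Assumption: split on the dimension with the largest variance, at the median.
    dim = max(range(len(points[0])), key=variance)
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    return {
        "leaf": False,
        "dim": dim,
        "split": pts[mid][dim],
        "left": build(pts[:mid], leaf_size),   # coordinates <= split
        "right": build(pts[mid:], leaf_size),  # coordinates >= split
    }

def nearest(node, query, eps=0.0, best=None):
    """Return (distance, point). With eps > 0 the returned distance is at most
    (1 + eps) times the true nearest neighbor distance."""
    if best is None:
        best = (math.inf, None)
    if node["leaf"]:
        for p in node["points"]:
            dist = math.dist(p, query)
            if dist < best[0]:
                best = (dist, p)
        return best
    diff = query[node["dim"]] - node["split"]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, eps, best)
    # Every point on the far side is at least |diff| away from the query, so the
    # far subtree is visited only if it could beat best / (1 + eps).
    if abs(diff) * (1 + eps) < best[0]:
        best = nearest(far, query, eps, best)
    return best
```

With `eps=0.0` the search is exact; a larger `eps` prunes more subtrees in exchange for a bounded loss in accuracy, which is the kind of trade-off that becomes attractive as the feature dimension grows.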

