Results 1 - 10
of
59
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality
, 1998
"... The nearest neighbor problem is the following: Given a set of n points P = fp 1 ; : : : ; png in some metric space X, preprocess P so as to efficiently answer queries which require finding the point in P closest to a query point q 2 X. We focus on the particularly interesting case of the d-dimens ..."
Abstract
-
Cited by 533 (28 self)
- Add to MetaCart
The nearest neighbor problem is the following: Given a set of n points P = fp 1 ; : : : ; png in some metric space X, preprocess P so as to efficiently answer queries which require finding the point in P closest to a query point q 2 X. We focus on the particularly interesting case of the d-dimensional Euclidean space where X = ! d under some l p norm. Despite decades of effort, the current solutions are far from satisfactory; in fact, for large d, in theory or in practice, they provide little improvement over the brute-force algorithm which compares the query point to each data point. Of late, there has been some interest in the approximate nearest neighbors problem, which is: Find a point p 2 P that is an ffl-approximate nearest neighbor of the query q in that for all p 0 2 P , d(p; q) (1 + ffl)d(p 0 ; q). We present two algorithmic results for the approximate version that significantly improve the known bounds: (a) preprocessing cost polynomial in n and d, and a trul...
Voronoi diagrams -- a survey of a fundamental geometric data structure
- ACM COMPUTING SURVEYS
, 1991
"... This paper presents a survey of the Voronoi diagram, one of the most fundamental data structures in computational geometry. It demonstrates the importance and usefulness of the Voronoi diagram in a wide variety of fields inside and outside computer science and surveys the history of its development. ..."
Abstract
-
Cited by 472 (5 self)
- Add to MetaCart
This paper presents a survey of the Voronoi diagram, one of the most fundamental data structures in computational geometry. It demonstrates the importance and usefulness of the Voronoi diagram in a wide variety of fields inside and outside computer science and surveys the history of its development. The paper puts particular emphasis on the unified exposition of its mathematical and algorithmic properties. Finally, the paper provides the first comprehensive bibliography on Voronoi diagrams and related structures.
A Separator Theorem for Planar Graphs f
, 1977
"... Let G be any n-vertex planar graph. We prove that the vertices of G can be partitioned into three sets A, B, C such that no edge joins a vertex in A with a vertex in B, neither A nor B contains more than 2n/3 vertices, and C contains no more than 2& & vertices. We exhibit an algorithm which finds su ..."
Abstract
-
Cited by 338 (1 self)
- Add to MetaCart
Let G be any n-vertex planar graph. We prove that the vertices of G can be partitioned into three sets A, B, C such that no edge joins a vertex in A with a vertex in B, neither A nor B contains more than 2n/3 vertices, and C contains no more than 2& & vertices. We exhibit an algorithm which finds such a partition A, B, C in O(n) time.
Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces
, 1993
"... We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are high-dim ..."
Abstract
-
Cited by 225 (4 self)
- Add to MetaCart
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are high-dimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kd-tree performance is compared.
Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces
, 1998
"... We address the problem of designing data structures that allow efficient search for approximate nearest neighbors. More specifically, given a database consisting of a set of vectors in some high dimensional Euclidean space, we want to construct a space-efficient data structure that would allow us to ..."
Abstract
-
Cited by 173 (9 self)
- Add to MetaCart
We address the problem of designing data structures that allow efficient search for approximate nearest neighbors. More specifically, given a database consisting of a set of vectors in some high dimensional Euclidean space, we want to construct a space-efficient data structure that would allow us to search, given a query vector, for the closest or nearly closest vector in the database. We also address this problem when distances are measured by the L 1 norm, and in the Hamming cube. Significantly improving and extending recent results of Kleinberg, we construct data structures whose size is polynomial in the size of the database, and search algorithms that run in time nearly linear or nearly quadratic in the dimension (depending on the case; the extra factors are polylogarithmic in the size of the database). Computer Science Department, Technion --- IIT, Haifa 32000, Israel. Email: eyalk@cs.technion.ac.il y Bell Communications Research, MCC-1C365B, 445 South Street, Morristown, NJ ...
Two Algorithms for Nearest-Neighbor Search in High Dimensions
, 1997
"... Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as information retrieval, pattern recognition, and statistical data analysis; many of the problems aris ..."
Abstract
-
Cited by 150 (0 self)
- Add to MetaCart
Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as information retrieval, pattern recognition, and statistical data analysis; many of the problems arising in these applications can involve several hundred or several thousand dimensions. We consider the nearest-neighbor problem for d-dimensional Euclidean space: we wish to pre-process a database of n points so that given a query point, one can efficiently determine its nearest neighbors in the database. There is a large literature on algorithms for this problem, in both the exact and approximate cases. The more sophisticated algorithms typically achieve a query time that is logarithmic in n at the expense of an exponential dependence on the dimension d; indeed, even the averagecase analysis of heuristics such as k-d trees reveals an exponential dependence on d in the query time. In this wor...
An optimal algorithm for intersecting line segments in the plane
- J. ACM
, 1992
"... Abstract. Themain contribution ofthiswork is an O(nlogr ~ +k)-timeal gorithmfo rcomputingall k intersections among n line segments in the plane, This time complexity IS easdy shown to be optimal. Within thesame asymptotic cost, ouralgorithm canalso construct thesubdiwslon of theplancdefmed by the se ..."
Abstract
-
Cited by 148 (2 self)
- Add to MetaCart
Abstract. Themain contribution ofthiswork is an O(nlogr ~ +k)-timeal gorithmfo rcomputingall k intersections among n line segments in the plane, This time complexity IS easdy shown to be optimal. Within thesame asymptotic cost, ouralgorithm canalso construct thesubdiwslon of theplancdefmed by the segments and compute which segment (if any) lies right above (or below) each intersection and each endpoint. The algorithm has been implemented and performs very well. The storage requirement is on the order of n + k in the worst case, but it is considerably lower in practice. To analyze the complexity of the algorithm, an amortization argument based on a new combinatorial theorem on line arrangements is used.
A Simple Algorithm for Nearest Neighbor Search in High Dimensions
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1997
"... Abstract—The problem of finding the closest point in high-dimensional spaces is common in pattern recognition. Unfortunately, the complexity of most existing search algorithms, such as k-d tree and R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In ne ..."
Abstract
-
Cited by 111 (1 self)
- Add to MetaCart
Abstract—The problem of finding the closest point in high-dimensional spaces is common in pattern recognition. Unfortunately, the complexity of most existing search algorithms, such as k-d tree and R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In nearly all applications, the closest point is of interest only if it lies within a user-specified distance e. We present a simple and practical algorithm to efficiently search for the nearest neighbor within Euclidean distance e. The use of projection search combined with a novel data structure dramatically improves performance in high dimensions. A complexity analysis is presented which helps to automatically determine e in structured problems. A comprehensive set of benchmarks clearly shows the superiority of the proposed algorithm for a variety of structured and unstructured search problems. Object recognition is demonstrated as an example application. The simplicity of the algorithm makes it possible to construct an inexpensive hardware search engine which can be 100 times faster than its software equivalent. A C++ implementation of our algorithm is available upon request to search@cs.columbia.edu/CAVE/.
Lower bounds for high dimensional nearest neighbor search and related problems
, 1999
"... In spite of extensive and continuing research, for various geometric search problems (such as nearest neighbor search), the best algorithms known have performance that degrades exponentially in the dimension. This phenomenon is sometimes called the curse of dimensionality. Recent results [38, 37, 40 ..."
Abstract
-
Cited by 47 (2 self)
- Add to MetaCart
In spite of extensive and continuing research, for various geometric search problems (such as nearest neighbor search), the best algorithms known have performance that degrades exponentially in the dimension. This phenomenon is sometimes called the curse of dimensionality. Recent results [38, 37, 40] show that in some sense it is possible to avoid the curse of dimensionality for the approximate nearest neighbor search problem. But must the exact nearest neighbor search problem suffer this curse? We provide some evidence in support of the curse. Specifically we investigate the exact nearest neighbor search problem and the related problem of exact partial match within the asymmetric communication model first used by Miltersen [43] to study data structure problems. We derive non-trivial asymptotic lower bounds for the exact problem that stand in contrast to known algorithms for approximate nearest neighbor search. 1
Locality-Preserving Hashing in Multidimensional Spaces
- In Proceedings of the 29th ACM Symposium on Theory of Computing
, 1997
"... this paper was published in Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 618--625, 1997 ..."
Abstract
-
Cited by 46 (2 self)
- Add to MetaCart
this paper was published in Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pages 618--625, 1997

