An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions
 ACM-SIAM Symposium on Discrete Algorithms
, 1994
Cited by 790 (31 self)
Consider a set $S$ of $n$ data points in real $d$-dimensional space, $\mathbb{R}^d$, where distances are measured using any Minkowski metric. In nearest neighbor searching we preprocess $S$ into a data structure, so that given any query point $q \in \mathbb{R}^d$, the closest point of $S$ to $q$ can be reported quickly. Given any positive real $\epsilon$, a data point $p$ is a $(1 + \epsilon)$-approximate nearest neighbor of $q$ if its distance from $q$ is within a factor of $(1 + \epsilon)$ of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of $n$ points in $\mathbb{R}^d$ in $O(dn \log n)$ time and $O(dn)$ space, so that given a query point $q \in \mathbb{R}^d$ and $\epsilon > 0$, a $(1 + \epsilon)$-approximate nearest neighbor of $q$ can be computed in $O(c_{d,\epsilon} \log n)$ time, where $c_{d,\epsilon} \le d \lceil 1 + 6d/\epsilon \rceil^d$ is a factor depending only on dimension and $\epsilon$. In general, we show that given an integer $k \ge 1$, $(1 + \epsilon)$-approximations to the $k$ nearest neighbors of $q$ can be computed in additional $O(kd \log n)$ time.
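As an illustration of the definition in this abstract, the following minimal sketch checks by brute force whether a candidate point is a (1 + ε)-approximate nearest neighbor under a Minkowski metric. It is not the paper's data structure; the function names and the sample points are hypothetical, and the linear scan is used only to make the definition concrete.

```python
import math

def minkowski(p, q, r=2.0):
    """Minkowski (L_r) distance between two points; r = inf gives the max metric."""
    if math.isinf(r):
        return max(abs(a - b) for a, b in zip(p, q))
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)

def nearest_distance(points, q, r=2.0):
    """Distance from q to its true nearest neighbor, found by a linear scan."""
    return min(minkowski(p, q, r) for p in points)

def is_approx_nearest(points, p, q, eps, r=2.0):
    """True if p is within a factor (1 + eps) of the true nearest-neighbor distance."""
    return minkowski(p, q, r) <= (1.0 + eps) * nearest_distance(points, q, r)

# Hypothetical data: the true nearest neighbor of q is (0, 0) at distance 0.9.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
q = (0.9, 0.0)

print(is_approx_nearest(points, (1.0, 1.0), q, eps=0.2))  # distance ~1.005 vs. bound 1.08
print(is_approx_nearest(points, (1.0, 1.0), q, eps=0.1))  # distance ~1.005 vs. bound 0.99
```

With ε = 0 the check reduces to exact nearest neighbor searching, so only (0, 0) passes for this query.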
Epsilon-Nets and Simplex Range Queries
, 1986
Cited by 261 (7 self)
We present a new technique for halfspace and simplex range query using $O(n)$ space and $O(n^a)$ query time, where $a < \frac{d(d-1)}{d(d-1)+1} + \gamma$ for all dimensions $d \ge 2$ and $\gamma > 0$. These bounds are better than those previously published for all $d \ge 2$. The technique uses random sampling to build a partition-tree structure. We introduce the concept of an $\epsilon$-net for an abstract set of ranges to describe the desired result of this random sampling and give necessary and sufficient conditions that a random sample is an $\epsilon$-net with high probability. We illustrate the application of these ideas to other range query problems.
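The ε-net notion can be made concrete in a much simpler setting than the one treated in the paper. The sketch below assumes the ranges are intervals on the real line (an assumption chosen for illustration, not the halfspace or simplex ranges of the abstract) and uses the usual definition: a subset N of S is an ε-net if every range containing at least ε|S| points of S also contains a point of N. For intervals this holds exactly when every gap between consecutive net points holds fewer than ε|S| points. The function name is hypothetical.

```python
from bisect import bisect_left, bisect_right

def is_eps_net_for_intervals(points, sample, eps):
    """Check whether `sample` is an eps-net of `points` for interval ranges.

    An interval that misses every sample point lies strictly inside one gap
    between consecutive sample points (or before the first / after the last),
    so it suffices to verify that each gap holds fewer than eps * n points.
    """
    pts = sorted(points)
    net = sorted(set(sample))
    threshold = eps * len(pts)
    start = 0  # index in pts of the first point after the previous net point
    for x in net:
        end = bisect_left(pts, x)       # points strictly below x end here
        if end - start >= threshold:    # a heavy interval avoids the sample
            return False
        start = bisect_right(pts, x)    # skip points equal to x
    return len(pts) - start < threshold

# Hypothetical data: 100 points on a line, every tenth point taken as the sample.
points = [float(i) for i in range(100)]
sample = [float(i) for i in range(9, 100, 10)]

print(is_eps_net_for_intervals(points, sample, eps=0.1))   # gaps hold 9 points, bound is 10
print(is_eps_net_for_intervals(points, sample, eps=0.05))  # gaps hold 9 points, bound is 5
```

A random sample can be passed in place of the regular one to see how often the condition holds for a given sample size.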
Two Algorithms for Nearest-Neighbor Search in High Dimensions
, 1997
Cited by 172 (0 self)
Representing data as points in a high-dimensional space, so as to use geometric methods for indexing, is an algorithmic technique with a wide array of uses. It is central to a number of areas such as information retrieval, pattern recognition, and statistical data analysis; many of the problems arising in these applications can involve several hundred or several thousand dimensions. We consider the nearest-neighbor problem for $d$-dimensional Euclidean space: we wish to preprocess a database of $n$ points so that given a query point, one can efficiently determine its nearest neighbors in the database. There is a large literature on algorithms for this problem, in both the exact and approximate cases. The more sophisticated algorithms typically achieve a query time that is logarithmic in $n$ at the expense of an exponential dependence on the dimension $d$; indeed, even the average-case analysis of heuristics such as k-d trees reveals an exponential dependence on $d$ in the query time. In this wor...
Active Storage for Large-Scale Data Mining and Multimedia
, 1998
Cited by 133 (15 self)
The increasing performance and decreasing cost of processors and memory are causing system intelligence to move into peripherals from the CPU. Storage system designers are using this trend toward "excess" compute power to perform more complex processing and optimizations inside storage devices. To date, such optimizations have been at relatively low levels of the storage protocol. At the same time, trends in storage density, mechanics, and electronics are eliminating the bottleneck in moving data off the media and putting pressure on interconnects and host processors to move data more efficiently. We propose a system called Active Disks that takes advantage of processing power on individual disk drives to run application-level code. Moving portions of an application's processing to execute directly at disk drives can dramatically reduce data traffic and take advantage of the storage parallelism already present in large systems today. We discuss several types of appl...
Approximate Nearest Neighbor Queries in Fixed Dimensions
, 1993
Cited by 104 (10 self)
Given a set of $n$ points in $d$-dimensional Euclidean space, $S \subset E^d$, and a query point $q \in E^d$, we wish to determine the nearest neighbor of $q$, that is, the point of $S$ whose Euclidean distance to $q$ is minimum. The goal is to preprocess the point set $S$, such that queries can be answered as efficiently as possible. We assume that the dimension $d$ is a constant independent of $n$. Although reasonably good solutions to this problem exist when $d$ is small, as $d$ increases the performance of these algorithms degrades rapidly. We present a randomized algorithm for approximate nearest neighbor searching. Given any set of $n$ points $S \subset E^d$, and a constant $\epsilon > 0$, we produce a data structure, such that given any query point, a point of $S$ will be reported whose distance from the query point is at most a factor of $(1 + \epsilon)$ from that of the true nearest neighbor. Our algorithm runs in $O(\log^3 n)$ expected time and requires $O(n \log n)$ space. The data structure can be built in $O(n^2)$ expe...
Approximating Extent Measures of Points
 Journal of the ACM
Cited by 98 (29 self)
We present a general technique for approximating various descriptors of the extent of a set $P$ of $n$ points in $\mathbb{R}^d$ when the dimension $d$ is an arbitrary fixed constant. For a given extent measure $\mu$ and a parameter $\epsilon > 0$, it computes in time $O(n + 1/\epsilon^{O(1)})$ a subset $Q \subseteq P$ of size $1/\epsilon^{O(1)}$, with the property that $(1 - \epsilon)\mu(P) \le \mu(Q) \le \mu(P)$. The specific applications of our technique include $\epsilon$-approximation algorithms for (i) computing diameter, width, and smallest bounding box, ball, and cylinder of $P$, (ii) maintaining all the previous measures for a set of moving points, and (iii) fitting spheres and cylinders through a point set $P$. Our algorithms are considerably simpler, and faster in many cases, than previously known algorithms.
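The idea of replacing a point set by a small subset with nearly the same extent can be sketched for one measure, the diameter, in the plane. The code below is not the technique of this paper: it uses a simple direction-sampling heuristic, stated here as an assumption, that keeps one extreme point for each of k evenly spaced directions (k even). Because the chosen subset contains extreme points in a direction within angle π/k of the true diameter direction and in its opposite, its diameter is at least cos(π/k) times the true diameter, and never larger. All function names are hypothetical.

```python
import math
import random
from itertools import combinations

def diameter(points):
    """Exact diameter by checking all pairs (quadratic time)."""
    return max((math.dist(a, b) for a, b in combinations(points, 2)), default=0.0)

def extreme_subset(points, k):
    """Keep one farthest point of `points` in each of k evenly spaced directions.

    k is assumed even so that every sampled direction has its opposite sampled too.
    """
    chosen = set()
    for i in range(k):
        theta = 2.0 * math.pi * i / k
        ux, uy = math.cos(theta), math.sin(theta)
        chosen.add(max(points, key=lambda p: p[0] * ux + p[1] * uy))
    return sorted(chosen)

# Hypothetical data: 500 random points in the unit square.
rng = random.Random(1)
P = [(rng.random(), rng.random()) for _ in range(500)]
k = 16
Q = extreme_subset(P, k)

print(len(Q), diameter(Q) / diameter(P))  # subset size and achieved approximation ratio
```

Choosing k on the order of 1/sqrt(ε) makes cos(π/k) at least roughly 1 − ε, which is why the subset size depends only on ε and not on the number of input points.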
Arrangements and Their Applications
 Handbook of Computational Geometry
, 1998
Cited by 78 (20 self)
The arrangement of a finite collection of geometric objects is the decomposition of the space into connected cells induced by them. We survey combinatorial and algorithmic properties of arrangements of arcs in the plane and of surface patches in higher dimensions. We present many applications of arrangements to problems in motion planning, visualization, range searching, molecular modeling, and geometric optimization. Some results involving planar arrangements of arcs have been presented in a companion chapter in this book, and are extended in this chapter to higher dimensions. Work by P.A. was supported by Army Research Office MURI grant DAAH04-96-1-0013, by a Sloan fellowship, by an NYI award, and by a grant from the U.S.-Israeli Binational Science Foundation. Work by M.S. was supported by NSF Grants CCR-91-22103 and CCR-93-11127, by a Max-Planck Research Award, and by grants from the U.S.-Israeli Binational Science Foundation, the Israel Science Fund administered by the Israeli Ac...
Range Searching
, 1996
Cited by 70 (1 self)
Range searching is one of the central problems in computational geometry, because it arises in many applications and a wide variety of geometric problems can be formulated as a range-searching problem. A typical range-searching problem has the following form. Let $S$ be a set of $n$ points in $\mathbb{R}^d$, and let $\mathcal{R}$ be a family of subsets; elements of $\mathcal{R}$ are called ranges. We wish to preprocess $S$ into a data structure so that for a query range $R$, the points in $S \cap R$ can be reported or counted efficiently. Typical examples of ranges include rectangles, halfspaces, simplices, and balls. If we are only interested in answering a single query, it can be done in linear time, using linear space, by simply checking for each point $p \in S$ whether $p$ lies in the query range.
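The single-query baseline described at the end of this abstract, checking every point against the query range, can be sketched as follows. The range predicates and function names are hypothetical, chosen to cover three of the range types the abstract lists (rectangles, balls, and halfspaces).

```python
import math

def in_box(lo, hi):
    """Axis-parallel rectangle [lo_1, hi_1] x ... x [lo_d, hi_d]."""
    return lambda p: all(l <= x <= h for x, l, h in zip(p, lo, hi))

def in_ball(center, radius):
    """Closed Euclidean ball."""
    return lambda p: math.dist(p, center) <= radius

def in_halfspace(normal, offset):
    """Closed halfspace {x : normal . x <= offset}."""
    return lambda p: sum(a * x for a, x in zip(normal, p)) <= offset

def report(points, contains):
    """Range reporting by linear scan: all points of S that lie in the range."""
    return [p for p in points if contains(p)]

def count(points, contains):
    """Range counting by linear scan."""
    return sum(1 for p in points if contains(p))

# Hypothetical point set in the plane.
S = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0), (4.0, 4.0)]

print(report(S, in_box((0.0, 0.0), (3.0, 2.0))))  # the three points inside the rectangle
print(count(S, in_ball((0.0, 0.0), 2.5)))         # 2
print(count(S, in_halfspace((1.0, 1.0), 3.0)))    # 2
```

Each query costs time linear in the number of points, which is the cost that preprocessing into a data structure aims to beat when many queries are asked.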
Algorithms for Fast Vector Quantization
 Proc. of DCC '93: Data Compression Conference
, 1993
Cited by 65 (12 self)
Nearest neighbor searching is an important geometric subproblem in vector quantization.
Diamond: A storage architecture for early discard in interactive search
, 2004
Cited by 62 (19 self)