Results 1  10
of
71
An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions
 ACMSIAM SYMPOSIUM ON DISCRETE ALGORITHMS
, 1994
"... Consider a set S of n data points in real ddimensional space, R d , where distances are measured using any Minkowski metric. In nearest neighbor searching we preprocess S into a data structure, so that given any query point q 2 R d , the closest point of S to q can be reported quickly. Given any po ..."
Abstract

Cited by 786 (31 self)
 Add to MetaCart
Consider a set S of n data points in real ddimensional space, R d , where distances are measured using any Minkowski metric. In nearest neighbor searching we preprocess S into a data structure, so that given any query point q 2 R d , the closest point of S to q can be reported quickly. Given any positive real ffl, a data point p is a (1 + ffl)approximate nearest neighbor of q if its distance from q is within a factor of (1 + ffl) of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of n points in R d in O(dn log n) time and O(dn) space, so that given a query point q 2 R d , and ffl ? 0, a (1 + ffl)approximate nearest neighbor of q can be computed in O(c d;ffl log n) time, where c d;ffl d d1 + 6d=ffle d is a factor depending only on dimension and ffl. In general, we show that given an integer k 1, (1 + ffl)approximations to the k nearest neighbors of q can be computed in additional O(kd log n) time.
Data Structures and Algorithms for Nearest Neighbor Search in General Metric Spaces
, 1993
"... We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are highdim ..."
Abstract

Cited by 273 (4 self)
 Add to MetaCart
We consider the computational problem of finding nearest neighbors in general metric spaces. Of particular interest are spaces that may not be conveniently embedded or approximated in Euclidian space, or where the dimensionality of a Euclidian representation is very high. Also relevant are highdimensional Euclidian settings in which the distribution of data is in some sense of lower dimension and embedded in the space. The vptree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems. Tree construction executes in O(n log(n)) time, and search is under certain circumstances and in the limit, O(log(n)) expected time. The theoretical basis for this approach is developed and the results of several experiments are reported. In Euclidian cases, kdtree performance is compared.
A Decomposition of MultiDimensional Point Sets with Applications to kNearestNeighbors and nBody Potential Fields
 J. ACM
, 1992
"... We define the notion of a wellseparated pair decomposition of points in ddimensional space. We then develop efficient sequential and parallel algorithms for computing such a decomposition. We apply the resulting decomposition to the efficient computation of knearest neighbors and nbody potential ..."
Abstract

Cited by 244 (4 self)
 Add to MetaCart
We define the notion of a wellseparated pair decomposition of points in ddimensional space. We then develop efficient sequential and parallel algorithms for computing such a decomposition. We apply the resulting decomposition to the efficient computation of knearest neighbors and nbody potential fields.
Locality Preserving Projections
, 2002
"... Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data s ..."
Abstract

Cited by 209 (15 self)
 Add to MetaCart
Many problems in information processing involve some form of dimensionality reduction. In this paper, we introduce Locality Preserving Projections (LPP). These are linear projective maps that arise by solving a variational problem that optimally preserves the neighborhood structure of the data set. LPP should be seen as an alternative to Principal Component Analysis (PCA)  a classical linear technique that projects the data along the directions of maximal variance. When the high dimensional data lies on a low dimensional manifold embedded in the ambient space, the Locality Preserving Projections are obtained by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold. As a result, LPP shares many of the data representation properties of nonlinear techniques such as Laplacian Eigenmaps or Locally Linear Embedding. Yet LPP is linear and more crucially is defined everywhere in ambient space rather than just on the training data points. This is borne out by illustrative examples on some high dimensional data sets.
Nearest neighbor queries in metric spaces
 Discrete Comput. Geom
, 1997
"... Given a set S of n sites (points), and a distance measure d, the nearest neighbor searching problem is to build a data structure so that given a query point q, the site nearest to q can be found quickly. This paper gives data structures for this problem when the sites and queries are in a metric spa ..."
Abstract

Cited by 115 (1 self)
 Add to MetaCart
Given a set S of n sites (points), and a distance measure d, the nearest neighbor searching problem is to build a data structure so that given a query point q, the site nearest to q can be found quickly. This paper gives data structures for this problem when the sites and queries are in a metric space. One data structure, D(S), uses a divideandconquer recursion. The other data structure, M(S, Q), is somewhat like a skiplist. Both are simple and implementable. The data structures are analyzed when the metric space obeys a certain spherepacking bound, and when the sites and query points are random and have distributions with an exchangeability property. This property implies, for example, that query point q is a random element of S ∪ {q}. Under these conditions, the preprocessing and space bounds for the algorithms are close to linear in n. They depend also on the spherepacking bound, and on the logarithm of the distance ratio Υ(S) of S, the ratio of the distance between the farthest pair of points in S to the distance between the closest pair. The data structure M(S, Q) requires as input data an additional set Q, taken to be representative of the query points. The resource bounds of M(S, Q) have a dependence on the distance ratio of S ∪ Q. While M(S, Q) can return wrong answers, its failure probability can be bounded, and is decreasing in a parameter K. Here K ≤ Q/n is chosen when building M(S, Q). The expected query time for M(S, Q) is O(K log n) log Υ(S ∪ Q), and the resource bounds increase linearly in K. The data structure D(S) has expected O(log n) O(1) query time, for fixed distance ratio. The preprocessing algorithm for M(S, Q) can be used to solve the allnearestneighbor problem for S in O(n(log n) 2 (log Υ(S)) 2) expected time. 1
Incremental Distance Join Algorithms for Spatial Databases
, 1998
"... Two new spatial join operations, distance join and distance semijoin, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples. Incremental algorithms are presented for computing these operations, which can be used in a pipelined fashi ..."
Abstract

Cited by 111 (10 self)
 Add to MetaCart
Two new spatial join operations, distance join and distance semijoin, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples. Incremental algorithms are presented for computing these operations, which can be used in a pipelined fashion, thereby obviating the need to wait for their completion when only a few tuples are needed. The algorithms can be used with a large class of hierarchical spatial data structures and arbitrary spatial data types in any dimensions. In addition, any distance metric may be employed. A performance study using Rtrees shows that the incremental algorithms outperform nonincremental approaches by an order of magnitude if only a small part of the result is needed, while the penalty, if any, for the incremental processing is modest if the entire join result is required.
Fast construction of nets in lowdimensional metrics and their applications
 SIAM Journal on Computing
, 2006
"... We present a near linear time algorithm for constructing hierarchical nets in finite metric spaces with constant doubling dimension. This datastructure is then applied to obtain improved algorithms for the following problems: approximate nearest neighbor search, wellseparated pair decomposition, s ..."
Abstract

Cited by 98 (10 self)
 Add to MetaCart
We present a near linear time algorithm for constructing hierarchical nets in finite metric spaces with constant doubling dimension. This datastructure is then applied to obtain improved algorithms for the following problems: approximate nearest neighbor search, wellseparated pair decomposition, spanner construction, compact representation scheme, doubling measure, and computation of the (approximate) Lipschitz constant of a function. In all cases, the running (preprocessing) time is near linear and the space being used is linear. 1
Approximate Range Searching
 in Proc. 11th Annu. ACM Sympos. Comput. Geom
, 1995
"... The range searching problem is a fundamental problem in computational geometry, with numerous important applications. Most research has focused on solving this problem exactly, but lower bounds show that if linear space is assumed, the problem cannot be solved in polylogarithmic time, except for the ..."
Abstract

Cited by 86 (20 self)
 Add to MetaCart
The range searching problem is a fundamental problem in computational geometry, with numerous important applications. Most research has focused on solving this problem exactly, but lower bounds show that if linear space is assumed, the problem cannot be solved in polylogarithmic time, except for the case of orthogonal ranges. In this paper we show that if one is willing to allow approximate ranges, then it is possible to do much better. In particular, given a bounded range Q of diameter w and ffl ? 0, an approximate range query treats the range as a fuzzy object, meaning that points lying within distance fflw of the boundary of Q either may or may not be counted. We show that in any fixed dimension d, a set of n points in R d can be preprocessed in O(n log n) time and O(n) space, such that approximate queries can be answered in O(logn + (1=ffl) d ) time. The only assumption we make about ranges is that the intersection of a range and a ddimensional cube can be answered in const...
ClosestPoint Problems in Computational Geometry
, 1997
"... This is the preliminary version of a chapter that will appear in the Handbook on Computational Geometry, edited by J.R. Sack and J. Urrutia. A comprehensive overview is given of algorithms and data structures for proximity problems on point sets in IR D . In particular, the closest pair problem, th ..."
Abstract

Cited by 65 (14 self)
 Add to MetaCart
This is the preliminary version of a chapter that will appear in the Handbook on Computational Geometry, edited by J.R. Sack and J. Urrutia. A comprehensive overview is given of algorithms and data structures for proximity problems on point sets in IR D . In particular, the closest pair problem, the exact and approximate postoffice problem, and the problem of constructing spanners are discussed in detail. Contents 1 Introduction 1 2 The static closest pair problem 4 2.1 Preliminary remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 2.2 Algorithms that are optimal in the algebraic computation tree model . 5 2.2.1 An algorithm based on the Voronoi diagram . . . . . . . . . . . 5 2.2.2 A divideandconquer algorithm . . . . . . . . . . . . . . . . . . 5 2.2.3 A plane sweep algorithm . . . . . . . . . . . . . . . . . . . . . . 6 2.3 A deterministic algorithm that uses indirect addressing . . . . . . . . . 7 2.3.1 The degraded grid . . . . . . . . . . . . . . . . . . ...