Results 1  10
of
136
An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions
 ACMSIAM SYMPOSIUM ON DISCRETE ALGORITHMS
, 1994
"... Consider a set S of n data points in real ddimensional space, R d , where distances are measured using any Minkowski metric. In nearest neighbor searching we preprocess S into a data structure, so that given any query point q 2 R d , the closest point of S to q can be reported quickly. Given any po ..."
Abstract

Cited by 786 (31 self)
 Add to MetaCart
Consider a set S of n data points in real ddimensional space, R d , where distances are measured using any Minkowski metric. In nearest neighbor searching we preprocess S into a data structure, so that given any query point q 2 R d , the closest point of S to q can be reported quickly. Given any positive real ffl, a data point p is a (1 + ffl)approximate nearest neighbor of q if its distance from q is within a factor of (1 + ffl) of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of n points in R d in O(dn log n) time and O(dn) space, so that given a query point q 2 R d , and ffl ? 0, a (1 + ffl)approximate nearest neighbor of q can be computed in O(c d;ffl log n) time, where c d;ffl d d1 + 6d=ffle d is a factor depending only on dimension and ffl. In general, we show that given an integer k 1, (1 + ffl)approximations to the k nearest neighbors of q can be computed in additional O(kd log n) time.
Nearoptimal hashing algorithms for approximate nearest neighbor in high dimensions
, 2008
"... In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given a new query object, one can quickly return the dataset object that is most similar to the query. The ..."
Abstract

Cited by 237 (4 self)
 Add to MetaCart
In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given a new query object, one can quickly return the dataset object that is most similar to the query. The problem is of significant interest in a wide variety of areas.
Navigating nets: Simple algorithms for proximity search (Extended Abstract)
, 2004
"... Robert Krauthgamer # James R. Lee + Abstract We present a simple deterministic data structure for maintaining a set S of points in a general metric space, while supporting proximity search (nearest neighbor and range queries) and updates to S (insertions and deletions). Our data structure consists ..."
Abstract

Cited by 126 (12 self)
 Add to MetaCart
Robert Krauthgamer # James R. Lee + Abstract We present a simple deterministic data structure for maintaining a set S of points in a general metric space, while supporting proximity search (nearest neighbor and range queries) and updates to S (insertions and deletions). Our data structure consists of a sequence of progressively finer #nets of S, with pointers that allow us to navigate easily from one scale to the next.
Efficient Similarity Search and Classification Via Rank Aggregation
 In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data
, 2003
"... We propose a novel approach to performing efficient similarity search and classification in high dimensional data. In this framework, the database elements are vectors in a Euclidean space. Given a query vector in the same space, the goal is to find elements of the database that are similar to the ..."
Abstract

Cited by 122 (5 self)
 Add to MetaCart
We propose a novel approach to performing efficient similarity search and classification in high dimensional data. In this framework, the database elements are vectors in a Euclidean space. Given a query vector in the same space, the goal is to find elements of the database that are similar to the query. In our approach, a small number of independent "voters" rank the database elements based on similarity to the query. These rankings are then combined by a highly efficient aggregation algorithm. Our methodology leads both to techniques for computing approximate nearest neighbors and to a conceptually rich alternative to nearest neighbors.
Secure multiparty computation of approximations
, 2001
"... Approximation algorithms can sometimes provide efficient solutions when no efficient exact computation is known. In particular, approximations are often useful in a distributed setting where the inputs are held by different parties and may be extremely large. Furthermore, for some applications, the ..."
Abstract

Cited by 98 (24 self)
 Add to MetaCart
Approximation algorithms can sometimes provide efficient solutions when no efficient exact computation is known. In particular, approximations are often useful in a distributed setting where the inputs are held by different parties and may be extremely large. Furthermore, for some applications, the parties want to compute a function of their inputs securely, without revealing more information than necessary. In this work we study the question of simultaneously addressing the above efficiency and security concerns via what we call secure approximations. We start by extending standard definitions of secure (exact) computation to the setting of secure approximations. Our definitions guarantee that no additional information is revealed by the approximation beyond what follows from the output of the function being approximated. We then study the complexity of specific secure approximation problems. In particular, we obtain a sublinearcommunication protocol for securely approximating the Hamming distance and a polynomialtime protocol for securely approximating the permanent and related #Phard problems. 1
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example
, 2003
"... Feature space analysis is the main module in many computer vision tasks. The most popular technique, kmeans clustering, however, has two inherent limitations: the clusters are constrained to be spherically symmetric and their number has to be known a priori. In nonparametric clustering methods, lik ..."
Abstract

Cited by 95 (2 self)
 Add to MetaCart
Feature space analysis is the main module in many computer vision tasks. The most popular technique, kmeans clustering, however, has two inherent limitations: the clusters are constrained to be spherically symmetric and their number has to be known a priori. In nonparametric clustering methods, like the one based on mean shift, these limitations are eliminated but the amount of computation becomes prohibitively large as the dimension of the space increases. We exploit a recently proposed approximation technique, localitysensitive hashing (LSH), to reduce the computational complexity of adaptive mean shift. In our implementation of LSH the optimal parameters of the data structure are determined by a pilot learning procedure, and the partitions are data driven. As an application, the performance of mode and kmeans based textons are compared in a texture classification study.
A Replacement for Voronoi Diagrams of Near Linear Size
 In Proc. 42nd Annu. IEEE Sympos. Found. Comput. Sci
, 2001
"... For a set P of n points in R^d, we define a new type of space decomposition. The new diagram provides an εapproximation to the distance function associated with the Voronoi diagram of P, while being of near linear size, for d ≥ 2. This contrasts with the standard Voronoi diagram that has ..."
Abstract

Cited by 87 (6 self)
 Add to MetaCart
For a set P of n points in R^d, we define a new type of space decomposition. The new diagram provides an εapproximation to the distance function associated with the Voronoi diagram of P, while being of near linear size, for d ≥ 2. This contrasts with the standard Voronoi diagram that has complexity Ω(n^⌈d/2⌉) in the worst case.
Sublinear Time Algorithms for Metric Space Problems
"... In this paper we give approximation algorithms for the following problems on metric spaces: Furthest Pair, k median, Minimum Routing Cost Spanning Tree, Multiple Sequence Alignment, Maximum Traveling Salesman Problem, Maximum Spanning Tree and Average Distance. The key property of our algorithms i ..."
Abstract

Cited by 80 (2 self)
 Add to MetaCart
In this paper we give approximation algorithms for the following problems on metric spaces: Furthest Pair, k median, Minimum Routing Cost Spanning Tree, Multiple Sequence Alignment, Maximum Traveling Salesman Problem, Maximum Spanning Tree and Average Distance. The key property of our algorithms is that their running time is linear in the number of metric space points. As the full specification o`f an npoint metric space is of size \Theta(n 2 ), the complexity of our algorithms is sublinear with respect to the input size. All previous algorithms (exact or approximate) for the problems we consider have running time\Omega\Gamma n 2 ). We believe that our techniques can be applied to get similar bounds for other problems. 1 Introduction In recent years there has been a dramatic growth of interest in algorithms operating on massive data sets. This poses new challenges for algorithm design, as algorithms quite efficient on small inputs (for example, having quadratic running time) ...
An investigation of practical approximate nearest neighbor algorithms
, 2004
"... This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate neares ..."
Abstract

Cited by 79 (2 self)
 Add to MetaCart
This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both the children of a parent. We also introduce new approximate kNN search algorithms on this structure. We show why these structures should be able to exploit the same randomprojectionbased approximations that LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high dimensional datasets which show up to 31fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels.
Approximation of functions over redundant dictionaries using coherence
 Proc. of SODA
, 2003
"... ..."