Results 1 - 2 of 2
When Is "Nearest Neighbor" Meaningful?
In Int. Conf. on Database Theory, 1999
Abstract

Cited by 292 (1 self)
We explore the effect of dimensionality on the "nearest neighbor" problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple...
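The effect described in the abstract can be observed with a small experiment. The following is a minimal sketch, not the authors' experimental setup: it assumes data and query points drawn i.i.d. uniformly from the unit hypercube and a Euclidean metric, and the function name `distance_contrast` and its parameters are hypothetical. It reports the ratio of the farthest to the nearest distance for a single random query.

```python
import math
import random


def distance_contrast(dim, n_points=500, seed=0):
    """Return (d_min, d_max, d_max / d_min) for one random query.

    Assumption: data and query are i.i.d. uniform on [0, 1]^dim and the
    metric is Euclidean. This is only one workload among those the
    abstract refers to.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return d_min, d_max, d_max / d_min


if __name__ == "__main__":
    # A ratio close to 1 means the nearest and farthest points are
    # almost equally far from the query.
    for dim in (2, 10, 15, 100, 1000):
        d_min, d_max, ratio = distance_contrast(dim)
        print(f"dim={dim:5d}  nearest={d_min:.3f}  "
              f"farthest={d_max:.3f}  ratio={ratio:.2f}")
```

Under these assumptions the ratio should shrink toward 1 as `dim` grows. Workloads with strong cluster structure or low intrinsic dimensionality may behave differently, which is consistent with the abstract's caveat that the effect does not occur for every high-dimensional workload.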
When Is "Nearest Neighbor" Meaningful?
In Int. Conf. on Database Theory, 1999
Abstract

Cited by 2 (0 self)
In this paper, we study the nearest neighbor problem and make the following contributions: We show that under certain conditions (in terms of data and query distributions, or workload), as dimensionality increases, the distance to the nearest neighbor approaches the distance to the farthest neighbor. In other words, virtually every data point is as good as any other, and slight perturbations to the query point would result in another data point being chosen as the nearest neighbor. Our result characterizes the problem itself, rather than specific algorithms that address the problem. This observation places some fundamental limits upon current approaches to multimedia similarity search based upon high-dimensional feature vector representations. In addition, our observations apply equally to the k-nearest neighbor variant of the problem.
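The claim that "virtually every data point is as good as any other" can be illustrated by counting how many points lie within a small factor of the nearest-neighbor distance. This is a rough sketch under assumed conditions (i.i.d. uniform data, Euclidean metric). The name `near_tie_fraction` and the tolerance `eps` are introduced here for illustration and are not taken from the listing.

```python
import math
import random


def near_tie_fraction(dim, eps=0.5, n_points=500, seed=0):
    """Fraction of data points within (1 + eps) times the nearest distance.

    Assumption: i.i.d. uniform data on [0, 1]^dim, Euclidean metric.
    A value near 1.0 means almost every point is nearly as close to the
    query as the reported nearest neighbor.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    threshold = (1.0 + eps) * min(dists)
    return sum(d <= threshold for d in dists) / n_points


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(f"dim={dim:5d}  near-tie fraction={near_tie_fraction(dim):.3f}")
```

When the fraction approaches 1, a slight perturbation of the query can plausibly change which point is reported as nearest, and the same reasoning carries over to the k nearest points.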