## PK-TREE: A SPATIAL INDEX STRUCTURE FOR HIGH DIMENSIONAL POINT DATA

Citations: 16 (1 self)

### BibTeX

@MISC{Wang_pk-tree:a,

author = {Wei Wang and Jiong Yang and Richard Muntz},

title = {PK-TREE: A SPATIAL INDEX STRUCTURE FOR HIGH DIMENSIONAL POINT DATA},

year = {}

}

### Abstract

In this chapter we present the PK-tree, an index structure for high dimensional point data. The proposed structure can be viewed as combining aspects of the PR-quad tree and the K-D tree, with unnecessary nodes eliminated. These unnecessary nodes are typically the result of skew in the point distribution, and we show that eliminating them makes the performance of the resulting index robust to skewed data distributions. The index structure is formally defined and efficiently updatable, and bounds on the number of nodes and the mean height of the tree can be proved. Bounds on the expected height of the tree can be given under certain mild constraints on the spatial distribution of points. Empirical evidence on both real and generated data sets shows that the PK-tree outperforms recently proposed spatial indexes based on the R-tree, such as the SR-tree and X-tree, by a wide margin. Notably, the relative performance advantage of the PK-tree grows with the dimensionality of the data set.
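To make the idea of skew-induced nodes concrete, here is a minimal Python sketch of a plain 2-D PR quadtree. It is not the PK-tree itself, whose definition is not given in this abstract. It reports the tree height and the number of internal nodes with only one non-empty child. Treating such nodes as the "unnecessary" ones is an assumption made for illustration. The function name `pr_quadtree_stats`, the unit-square domain, and the one-point-per-leaf rule are likewise hypothetical choices.

```python
def pr_quadtree_stats(points, lo=(0.0, 0.0), size=1.0, depth=0, max_depth=64):
    """Return (height, single_child_nodes) for a PR quadtree over distinct
    2-D points inside the square [lo, lo + size).

    Illustrative assumptions: a leaf holds at most one point, and a node is
    counted as "single-child" when all of its points fall into one quadrant.
    """
    # A cell with zero or one point is a leaf; max_depth guards against
    # duplicate points, which could never be separated.
    if len(points) <= 1 or depth == max_depth:
        return depth, 0

    half = size / 2.0
    quadrants = {}
    for x, y in points:
        # Key is (is_right_half, is_upper_half) relative to the cell centre.
        key = (x >= lo[0] + half, y >= lo[1] + half)
        quadrants.setdefault(key, []).append((x, y))

    height, chain = depth, 0
    for (qx, qy), pts in quadrants.items():
        # Booleans act as 0/1 here, selecting the child's lower-left corner.
        child_lo = (lo[0] + half * qx, lo[1] + half * qy)
        h, c = pr_quadtree_stats(pts, child_lo, half, depth + 1, max_depth)
        height = max(height, h)
        chain += c

    # This split did not separate anything: the node only lengthens the path.
    if len(quadrants) == 1:
        chain += 1
    return height, chain


# Two well-separated points are split immediately.
print(pr_quadtree_stats([(0.1, 0.1), (0.9, 0.9)]))         # (1, 0)
# Two points only 2**-10 apart force a long chain of single-child nodes.
print(pr_quadtree_stats([(0.0, 0.0), (2.0 ** -10, 0.0)]))  # (10, 9)
```

In this sketch the height is driven by how close the two points are, not by how many points there are, which is the kind of skew sensitivity the abstract says the PK-tree is designed to avoid.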

### Citations

2381 | R-trees: A dynamic index structure for spatial searching - Guttman - 1984 |

1929 | Randomized Algorithms - Motwani, Raghavan - 1995 |

1246 | Design and Analysis of Spatial Data Structures - Samet - 1990 |

1210 | Multidimensional binary search trees used for associative searching - Bentley - 1975 |

Citation context: ...eparating two data points rather than the total number of data points. Depending on the application this can lead to inefficient storage/search performance and severely unbalanced trees. The K-D tree [Ben75] removes some of the unnecessary nodes from PR quad tree. However, since data can be highly skew distributed over a space, the height of the K-D tree can be very large. To construct a height balanced ...

1066 | The R*-tree: An efficient and robust access method for points and rectangles - Beckmann, Kriegel, et al. - 1990 |

Citation context: ...it has to store both the minimum bounding boxes and minimum bounding spheres. For the same reason, the structure creation time and update time are impacted. The X-tree [Ber96] is based on the R*-tree [Bec90]. The major difference between these two index structures is that when a node needs to be split, the R*-tree always splits the node according to some heuristics. However, in the X-tree, if no good sp...

556 | M-tree: An efficient access method for similarity search in metric spaces - Ciaccia, Patella, et al. - 1997 |

544 | The X-Tree: An index structure for high-dimensional data - Berchtold, Keim, et al. - 1996 |

407 | The K-D-B-Tree: A Search Structure for Large Multidimensional Dynamic Indexes - Robinson - 1981 |

401 | The SR-tree: An index structure for high-dimensional nearest neighbor queries - Katayama, Satoh - 1997 |

290 | The R+-tree: A dynamic index for multi-dimensional objects - Sellis, Roussopoulos, et al. - 1987 |

208 | The TV-Tree: An Index Structure for High-Dimensional Data - Lin, Jagadish, et al. - 1994 |

191 | The hB-tree: A multiattribute indexing method with good guaranteed performance - Lomet, Salzberg - 1990 |

191 | Hilbert R-tree: An Improved R-tree using Fractals - Kamel, Faloutsos - 1994 |

179 | Spatial Query Processing in an Object-Oriented Database System - Orenstein - 1986 |

153 | Fractals for secondary key retrieval - Faloutsos, S - 1989 |

109 | The Sequoia 2000 storage benchmark - Stonebraker, Frew, et al. - 1993 |

108 | The LSD Tree: Spatial Access to Multidimensional Point- and Non-Point-Objects - Henrich, Six, et al. - 1989 |

103 | Dimensionality reduction for similarity searching in dynamic databases - Kanth, Agrawal, et al. - 1998 |

36 | Efficient processing of window queries in the pyramid data structure - Aref, Samet - 1990 |

23 | Gray codes for partial match and range queries - Faloutsos - 1988 |

20 | Multidimensional access methods: Trees have grown everywhere - Sellis, Roussopoulos, et al. - 1997 |

7 | Improved concurrency control techniques for multi-dimensional index structures - Kanth, Serena, et al. - 1998 |

2 | Multidimensional data management structure with Efficient Dynamic Characteristics - Ohsawa, Sakauchi - 1983 |

2 | PK-tree: a dynamic spatial index structure for large data sets - Wang, Yang, et al. |

2 | Yet another spatial indexing structure - Yang, Wang, et al. - 1997 |

1 | The X-tree: an index height size KNN - Berchtold, Keim, et al. |

Citation context: ...ons, e.g., in content based access in image databases, high dimensional spaces are common. Therefore, a variety of dynamic spatial indexing structures have been proposed in the past few years [Sam90] [Ber96] [Hen89] [Kat97]. Most index structures become less efficient with increasing dimensionality and this is where the challenge lies. The 5th International Conference on Foundations of Data Organization (...