## A Unified Approach For Indexed and Non-Indexed Spatial Joins (2000)

Citations: 20 (6 self)

### BibTeX

@MISC{Arge00aunified,
author = {Lars Arge and Octavian Procopiuc and Sridhar Ramaswamy and Torsten Suel and Jan Vahrenhold and Jeffrey Scott Vitter},
title = {A Unified Approach For Indexed and Non-Indexed Spatial Joins},
year = {2000}
}

### Abstract

Most spatial join algorithms either assume the existence of a spatial index structure that is traversed during the join process, or solve the problem by sorting, partitioning, or on-the-fly index construction. In this paper, we develop a simple plane-sweeping algorithm that unifies the index-based and non-index based approaches. This algorithm processes indexed as well as non-indexed inputs, extends naturally to multi-way joins, and can be built easily from a few standard operations. We present the results of a comparative study of the new algorithm with several index-based and non-index based spatial join algorithms. We consider a number of factors, including the relative performance of CPU and disk, the quality of the spatial indexes, and the sizes of the input relations. An important conclusion from our work is that using an index-based approach whenever indexes are available does not always lead to the best execution time, and hence we propose the use of a simple cost...
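The abstract is truncated and does not spell out the algorithm, so the following is only a minimal in-memory sketch of a generic plane-sweep rectangle join, not the paper's implementation. The names `Rect` and `sweep_join` and the `(xlo, ylo, xhi, yhi)` layout are assumptions made for illustration; rectangles are treated as closed, so touching boundaries count as intersecting.

```python
from typing import List, Tuple

# Assumed layout: (xlo, ylo, xhi, yhi), closed on all sides.
Rect = Tuple[float, float, float, float]


def sweep_join(r: List[Rect], s: List[Rect]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) such that r[i] intersects s[j]."""
    # Tag each rectangle with its relation (0 = r, 1 = s) and its position,
    # then visit all rectangles in order of their lower y-coordinate.
    events = [(rect[1], 0, i, rect) for i, rect in enumerate(r)]
    events += [(rect[1], 1, j, rect) for j, rect in enumerate(s)]
    events.sort(key=lambda e: e[0])

    # Rectangles currently cut by the horizontal sweep line, per relation.
    active = ([], [])
    out = []
    for y, side, idx, rect in events:
        other = 1 - side
        # Discard rectangles of the other relation that end below the line.
        active[other][:] = [(k, q) for k, q in active[other] if q[3] >= y]
        # Survivors overlap rect in y, so only the x-intervals need testing.
        for k, q in active[other]:
            if q[0] <= rect[2] and rect[0] <= q[2]:
                out.append((idx, k) if side == 0 else (k, idx))
        active[side].append((idx, rect))
    return out
```

The active lists here are plain Python lists scanned linearly; a real implementation would likely use an interval structure, and would need a fallback when the active sets outgrow memory.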

### Citations

2384 | R-trees: A dynamic index structure for spatial searching - Guttman - 1984 |

1857 | Computational Geometry: An Introduction - Preparata, Shamos - 1985 |

1248 |
The Design and Analysis of Spatial Data Structures
- Samet
- 1989
Citation Context ...[37] and a grid file. Spatial Index-Based Approaches. Several join algorithms have been proposed that use spatial index structures such as the R-tree [14], R+-tree [34], R*-tree [7], or PMR quad-tree **[33]**. Brinkhoff, Kriegel, and Seeger [8] propose an algorithm based on R*-trees that performs a carefully synchronized depth-first traversal of the two trees to be joined. An optimized version of this al... |

498 |
The R* tree: An efficient and robust access method for points and rectangles
- Beckmann, Kriegel, et al.
- 1990
Citation Context ...in index of Valduriez [37] and a grid file. Spatial Index-Based Approaches. Several join algorithms have been proposed that use spatial index structures such as the R-tree [14], R+-tree [34], R*-tree **[7]**, or PMR quad-tree [33]. Brinkhoff, Kriegel, and Seeger [8] propose an algorithm based on R*-trees that performs a carefully synchronized depth-first traversal of the two trees to be joined. An optim... |

398 | The grid file: an adaptable, symmetric multikey file structure
- Nievergelt, Hinterberger, et al.
- 1984
Citation Context ...formational approach [6], the MBRs of two-dimensional spatial objects are transformed into points in four dimensions. These points are stored in a multi-attribute data structure such as the grid file **[27]**, which is then used to perform the join. An efficient algorithm for the rectangle intersection problem based on plane-sweeping was proposed by Güting and Schilling [13], who observed that real data s... |

338 | Efficient processing of spatial joins using R-trees
- Brinkhoff, Kriegel, et al.
- 1994
Citation Context ...ased Approaches. Several join algorithms have been proposed that use spatial index structures such as the R-tree [14], R+-tree [34], R*-tree [7], or PMR quad-tree [33]. Brinkhoff, Kriegel, and Seeger **[8]** propose an algorithm based on R*-trees that performs a carefully synchronized depth-first traversal of the two trees to be joined. An optimized version of this algorithm was described in [16]. Günth... |

304 | The R+ tree: A dynamic index for multi-dimensional objects - Sellis, Roussopoulos, et al. - 1987 |

291 | The R+-tree: A Dynamic Index for Multi-Dimensional Objects
- Sellis, Roussopoulos, et al.
- 1987
Citation Context ...ased on the join index of Valduriez [37] and a grid file. Spatial Index-Based Approaches. Several join algorithms have been proposed that use spatial index structures such as the R-tree [14], R+-tree **[34]**, R*-tree [7], or PMR quad-tree [33]. Brinkhoff, Kriegel, and Seeger [8] propose an algorithm based on R*-trees that performs a carefully synchronized depth-first traversal of the two trees to be joi... |

232 | On packing R-trees - Kamel, Faloutsos - 1993 |

213 | Join indices
- Valduriez
- 1987
Citation Context ... rule, i.e., in a set of N rectangles there are only O(√N) rectangles that intersect a given vertical or horizontal line. Rotem [32] proposes a spatial join algorithm based on the join index of Valduriez **[37]** and a grid file. Spatial Index-Based Approaches. Several join algorithms have been proposed that use spatial index structures such as the R-tree [14], R+-tree [34], R*-tree [7], or PMR quad-tree [33]... |

176 | Partition Based Spatial-Merge Join
- Patel, DeWitt
- 1996
Citation Context ... using spatial sampling techniques, and then use the tree join algorithm of [8] to compute the join. Another recent paper [20] proposes an algorithm based on a filter tree structure. Patel and DeWitt **[30]** and Lo and Ravishankar [23] both propose hash-based algorithms that use a spatial partitioning function to subdivide the input such that each partition fits in memory. Patel and DeWitt then use a sta... |

158 | The buffer tree: A new technique for optimal I/O-algorithms
- Arge
- 1995
Citation Context ...ssumes that the priority queue never grows larger than the amount of internal memory available. Note, however, that PQ can be modified to handle overflow gracefully by using an external priority queue **[2, 9]**, and that it can also be combined with the partitioning step along one dimension that SSSJ performs in the case of an overflow of the interval data structure. We omit these details here since they ar... |

125 | UNIX Internals — The New Frontiers - Vahalia - 1996 |

124 |
STL Tutorial and Reference Guide: C++ Programming with the Standard Template Library. Addison-Wesley
- Musser, Saini
- 1996
Citation Context ...mplementation. PQ uses the same internal memory components as SSSJ (see Section 3.1). For the priority queue, we chose the heap-based implementation provided by the C++ Standard Template Library (STL) **[26]**. To optimize the performance and reduce the space requirements of the priority queue, we actually maintained two priority queues: one for the bounding rectangles of the internal nodes and one for the... |

123 | External-memory computational geometry
- Goodrich, Tsay, et al.
- 1993
Citation Context ...s will be relatively small. To handle cases where the structures do not fit in memory, SSSJ combines the plane-sweep approach with an I/O-optimal algorithm based on the distribution-sweeping technique **[5, 11]**. In all experiments performed for this study the data structures were always significantly smaller than the available internal memory, and thus SSSJ essentially consists of a sorting step followed by... |

107 |
PROBE spatial data modeling and query processing in an image database application
- Orenstein, Manola
- 1988
Citation Context ...ction 5 describes our experimental platform, and in Section 6 we present and discuss the experimental results. Finally, Section 7 offers some concluding remarks. 2 Previous Work Early Work. Orenstein **[29]** uses a transformational approach based on space-filling curves, and then performs a sort-merge join along the curve to solve the join problem. In another transformational approach [6], the MBRs of tw... |

105 |
Spatial Joins Using Seeded Trees
- Lo, Ravishankar
Citation Context ...gher join selectivities, spatial indexes perform better. Hoel and Samet [15] propose to use PMR quad-trees for the spatial join and compare it against members of the R-tree family. Lo and Ravishankar **[21]** discuss the case where only one of the relations has an index. They construct an index for the other relation on the fly, by using the existing index as a starting point (or seed). Afterwards, the tr... |

99 | Spatial Hash-Joins
- Lo, Ravishankar
- 1996
Citation Context ...niques, and then use the tree join algorithm of [8] to compute the join. Another recent paper [20] proposes an algorithm based on a filter tree structure. Patel and DeWitt [30] and Lo and Ravishankar **[23]** both propose hash-based algorithms that use a spatial partitioning function to subdivide the input such that each partition fits in memory. Patel and DeWitt then use a standard plane-sweeping techniq... |

94 | Efficient computation of spatial joins
- Günther
- 1993
Citation Context ...opose an algorithm based on R*-trees that performs a carefully synchronized depth-first traversal of the two trees to be joined. An optimized version of this algorithm was described in [16]. Günther **[12]** studies the tradeoffs between using join indexes and spatial indexes for the spatial join. He concludes that a join index approach is better for low join selectivities, while for higher join selectiv... |

88 | Spatial Joins using R-trees: Breadth-First Traversal with Global Optimizations
- Huang, Jing, et al.
- 1997
Citation Context ... Seeger [8] propose an algorithm based on R*-trees that performs a carefully synchronized depth-first traversal of the two trees to be joined. An optimized version of this algorithm was described in **[16]**. Günther [12] studies the tradeoffs between using join indexes and spatial indexes for the spatial join. He concludes that a join index approach is better for low join selectivities, while for higher... |

75 | Selectivity Estimation in Spatial Databases
- Acharya, Poosala, et al.
Citation Context ...figuration, it is advantageous to use the index only when the join involves less than 60% of the leaf nodes. An estimate of this number can be obtained using, e.g., the spatial histograms developed in **[1]**. Using such a cost-based approach to choose between the index-based and non-index based algorithms, PQ should have the best overall execution time in most cases. We also comment on the relationship b... |
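The decision rule quoted in this context can be written down as a small function. This is a sketch only: the 60% figure is reported for the authors' configuration and is exposed here as a tunable `threshold`, and the function name, parameter names, and return labels are hypothetical. How `leaves_touched` is estimated (for example from spatial histograms) is outside the sketch.

```python
def choose_join_strategy(leaves_touched: int, total_leaves: int,
                         threshold: float = 0.6) -> str:
    """Pick a join strategy from the estimated share of leaf nodes involved.

    threshold defaults to the 60% figure quoted above; it is an assumption
    that it would need re-tuning for a different CPU/disk configuration.
    """
    if total_leaves <= 0:
        raise ValueError("total_leaves must be positive")
    fraction = leaves_touched / total_leaves
    # Use the index only when the join is expected to touch few leaves.
    return "index" if fraction < threshold else "non-index"
```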

74 |
Spatial Join Indices
- Rotem
- 1991
Citation Context ...t real data sets from VLSI applications tend to obey a so-called square-root rule, i.e., in a set of N rectangles there are only O(√N) rectangles that intersect a given vertical or horizontal line. Rotem **[32]** proposes a spatial join algorithm based on the join index of Valduriez [37] and a grid file. Spatial Index-Based Approaches. Several join algorithms have been proposed that use spatial index structur... |

65 | Scalable sweeping-based spatial join
- Arge, Procopiuc, et al.
- 1998
Citation Context ...h partition fits in memory. Patel and DeWitt then use a standard plane-sweeping technique to perform the join for each partition, while Lo and Ravishankar use an indexed nested loop join. Arge et al. **[4]** propose an algorithm based on plane-sweeping and partitioning along a single axis that guarantees an asymptotically optimal number of disk accesses in the worst case. The algorithm is essentially an ... |

63 | Size Separation Spatial Join
- Koudas, Sevcik
- 1997
Citation Context ...relations has an index. Lo and Ravishankar [22] propose to first build indexes using spatial sampling techniques, and then use the tree join algorithm of [8] to compute the join. Another recent paper **[20]** proposes an algorithm based on a filter tree structure. Patel and DeWitt [30] and Lo and Ravishankar [23] both propose hash-based algorithms that use a spatial partitioning function to subdivide the ... |

60 |
A comparison of spatial query processing techniques for native and parameter spaces
- Orenstein
- 1990
Citation Context ...while a Visiting Scholar at Duke University. axis-parallel rectangle that completely contains it, called the minimal bounding rectangle (MBR). Spatial overlay joins can then be performed in two steps **[28]**: – Filter Step: The spatial operation is first performed on the MBR representation, i.e., the first step is to identify all intersecting pairs of MBRs. – Refinement Step: The exact representations of... |
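The two-step scheme in this context can be sketched as follows. The description of the refinement step is cut off in the excerpt, so the exact geometric test is left as a caller-supplied predicate; `mbr`, `mbrs_intersect`, `two_step_join`, and the representation of an object as a list of vertices are all assumptions made for illustration. The filter step uses a nested loop purely for brevity.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi)


def mbr(points: Sequence[Point]) -> Rect:
    """Minimal axis-parallel bounding rectangle of a vertex list."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a: Rect, b: Rect) -> bool:
    """Closed-rectangle intersection test on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def two_step_join(
    r: List[Sequence[Point]],
    s: List[Sequence[Point]],
    exact: Callable[[Sequence[Point], Sequence[Point]], bool],
) -> List[Tuple[int, int]]:
    r_boxes = [mbr(o) for o in r]
    s_boxes = [mbr(o) for o in s]
    # Filter step: keep only pairs whose MBRs intersect.
    candidates = [
        (i, j)
        for i, a in enumerate(r_boxes)
        for j, b in enumerate(s_boxes)
        if mbrs_intersect(a, b)
    ]
    # Refinement step: run the exact predicate on the surviving pairs only.
    return [(i, j) for i, j in candidates if exact(r[i], s[j])]
```

Because the MBR of an object contains the object, the filter step can produce false positives but no false negatives, which is why the exact test only needs to see the candidate pairs.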

49 | A qualitative comparison study of data structures for large linear segment databases
- Hoel, Samet
- 1992
Citation Context ...spatial indexes for the spatial join. He concludes that a join index approach is better for low join selectivities, while for higher join selectivities, spatial indexes perform better. Hoel and Samet **[15]** propose to use PMR quad-trees for the spatial join and compare it against members of the R-tree family. Lo and Ravishankar [21] discuss the case where only one of the relations has an index. They con... |

37 | Worst-case efficient external-memory priority queues
- Brodal, Katajainen
- 1998
Citation Context ...ssumes that the priority queue never grows larger than the amount of internal memory available. Note, however, that PQ can be modified to handle overflow gracefully by using an external priority queue **[2, 9]**, and that it can also be combined with the partitioning step along one dimension that SSSJ performs in the case of an overflow of the interval data structure. We omit these details here since they ar... |

32 | Processing and Optimization of Multi-way Spatial Joins Using R-trees
- Papadias, Mamoulis, et al.
- 1999
Citation Context ...ime per priority queue operation. 1 However, for more complicated multi-way joins which do not correspond to n-way intersections, it is not clear how to extend the algorithm in an elegant fashion; see **[25]** for a discussion of such cases. 5 Experimental Platforms In this section, we describe the experimental set-up for our studies, providing detailed information on the hardware, software, and data sets ... |

28 |
A New Algorithm for Computing Joins with Grid Files
- Becker, Hinrichs, et al.
- 1993
Citation Context ...ork. Orenstein [29] uses a transformational approach based on space-filling curves, and then performs a sort-merge join along the curve to solve the join problem. In another transformational approach **[6]**, the MBRs of two-dimensional spatial objects are transformed into points in four dimensions. These points are stored in a multi-attribute data structure such as the grid file [27], which is then used... |
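The transformation mentioned in this context, from a two-dimensional MBR to a point in four dimensions, turns a rectangle-intersection test into a range query. The sketch below illustrates only that idea under an assumed `(xlo, ylo, xhi, yhi)` coordinate order; `query_box` and `in_box` are hypothetical helper names, and the grid file that would store and search the points is not modeled.

```python
import math
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi)
Box4 = Tuple[Rect, Rect]                  # (lower corner, upper corner) in 4-D


def query_box(q: Rect) -> Box4:
    """4-D range containing exactly the points of rectangles that meet q.

    A rectangle r meets q iff r.xlo <= q.xhi, r.ylo <= q.yhi,
    r.xhi >= q.xlo and r.yhi >= q.ylo, which is one interval per coordinate.
    """
    lo = (-math.inf, -math.inf, q[0], q[1])
    hi = (q[2], q[3], math.inf, math.inf)
    return lo, hi


def in_box(point: Rect, box: Box4) -> bool:
    """Closed containment test of a 4-D point in a 4-D box."""
    lo, hi = box
    return all(l <= c <= h for l, c, h in zip(lo, point, hi))
```

With this mapping a rectangle is its own 4-D point, so `in_box(r, query_box(q))` gives the same answer as a direct closed-rectangle intersection test of `r` against `q`.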

24 |
Generating seeded trees from data sets
- Lo, Ravishankar
- 1995
Citation Context ...tiple joins occurring in more complex spatial queries. Non Index-Based Approaches. Recently a lot of work has focused on the case where neither of the input relations has an index. Lo and Ravishankar **[22]** propose to first build indexes using spatial sampling techniques, and then use the tree join algorithm of [8] to compute the join. Another recent paper [20] proposes an algorithm based on a filter tr... |

21 | Theory and practice of I/O-efficient algorithms for multidimensional batched searching problems - Arge, Procopiuc, et al. - 1998 |

16 | Technical Documentation - Files - 1997 |

11 |
TPIE User Manual and Reference (edition 0.9.01a)
- Arge, Barve, et al.
- 1999
Citation Context ...s. Thus, on Machine 1 we always requested two blocks per I/O-operation. 5.2 Software Environment We implemented the algorithms in C++ using the Transparent Parallel I/O Programming Environment (TPIE) **[3]**, a templated library that supports high-level, yet efficient implementations of external memory algorithms. In TPIE, the actual page transfers between disk and internal memory are performed by a so-ca... |

10 |
Join strategies on kd-tree indexed relations
- Kitsuregawa, Harada, et al.
- 1989
Citation Context ...llowing we describe how this extraction is performed by means of a tree traversal. A somewhat similar way of traversing the indexed data in sorted order was proposed by Kitsuregawa, Harada, and Takagi **[19]** in the context of joining two relations indexed by a k-d-tree. Here we present a conceptually simpler algorithm based on a priority queue. The main idea in our traversal algorithm is to run a horizont... |
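The excerpt breaks off before the traversal is described, so the following is a hedged sketch of one way a priority queue can emit the leaf rectangles of a bounding-box tree in sorted order, not a reconstruction of the paper's algorithm. The dictionary node format (`"mbr"`, optional `"children"`) and the name `sorted_leaves` are assumptions. The sketch relies on each node's MBR enclosing its children, so a child's lower y-coordinate is never smaller than its parent's.

```python
import heapq
from itertools import count
from typing import Dict, Iterator, Tuple

Rect = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi)


def sorted_leaves(root: Dict) -> Iterator[Rect]:
    """Yield leaf MBRs in non-decreasing order of their lower y-coordinate."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(root["mbr"][1], next(tie), root)]
    while heap:
        _, _, node = heapq.heappop(heap)
        children = node.get("children")
        if children is None:
            # A data rectangle: everything still queued starts at or above it.
            yield node["mbr"]
        else:
            # An internal node: replace it by its children in the queue.
            for child in children:
                heapq.heappush(heap, (child["mbr"][1], next(tie), child))
```

Running one such generator per indexed input and merging the streams would feed a sweep with rectangles in the order it expects, while a non-indexed input could be sorted instead.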

9 |
Sibling clustering of tree-based spatial indexes for efficient spatial query processing
- Kim, Cha
- 1998
Citation Context ...e been no possibility of improving over the sorting based SSSJ, unless the cost of building or periodic rebuilding is amortized over several spatial join operations. 2 Note, however, that Kim and Cha **[18]** have recently described how to locally reorganize the tree during updates to maintain a good layout of sibling nodes. 7 Conclusions and Open Problems In this paper, we presented a simple algorithm th... |

9 | Integration of Spatial Join Algorithms for Joining Multiple Inputs
- Mamoulis, Papadias
- 1999
Citation Context ...eed). Afterwards, the tree join algorithm of [8] is used to perform the actual join. Another algorithm for the case where only one relation has an index was recently proposed by Mamoulis and Papadias **[24]**, who also discuss how to perform multiple joins occurring in more complex spatial queries. Non Index-Based Approaches. Recently a lot of work has focused on the case where neither of the input relati... |

5 |
A practical divide-and-conquer algorithm for the rectangle intersection problem
- Güting, Schilling
- 1984
Citation Context ...tructure such as the grid file [27], which is then used to perform the join. An efficient algorithm for the rectangle intersection problem based on plane-sweeping was proposed by Güting and Schilling **[13]**, who observed that real data sets from VLSI applications tend to obey a so-called square-root rule, i.e., in a set of N rectangles there are only O(√N) rectangles that intersect a given vertical or horiz... |

3 | Technical Documentation - TIGERLine - 1992 |

1 |
Technical Documentation
- TIGER/Line™ Files
- 1997
Citation Context ...tabase systems [36, p. 535]). We compiled all programs using the GNU C++ compiler (version 2.8), with -O2 level of optimization. 5.3 Data Sets The TIGER/Line data set from the US Bureau of the Census **[35]** is one of the standard benchmarks for spatial databases. Its current distribution consists of six CD-ROMs of data. We extracted the hydrographic and road features of the entire United States and crea... |