## Partition Based Spatial-Merge Join (1996)

Citations: 177 (11 self)

### BibTeX

@INPROCEEDINGS{Patel96partitionbased,
    author = {Jignesh M. Patel and David J. DeWitt},
    title = {Partition Based Spatial-Merge Join},
    booktitle = {},
    year = {1996},
    pages = {259--270}
}

### Abstract

This paper describes PBSM (Partition Based Spatial-Merge), a new algorithm for performing the spatial join operation. This algorithm is especially effective when neither of the inputs to the join has an index on the joining attribute. Such a situation could arise if both inputs to the join are intermediate results in a complex query, or in a parallel environment where the inputs must be dynamically redistributed. The PBSM algorithm partitions the inputs into manageable chunks, and joins them using a computational-geometry-based plane-sweeping technique. This paper also presents a performance study comparing the traditional indexed nested loops join algorithm, a spatial join algorithm based on joining spatial indices, and the PBSM algorithm. These comparisons are based on complete implementations of these algorithms in Paradise, a database system for handling GIS applications. Using real data sets, the performance study examines the behavior of these spatial join algorithms in a vari...
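The partition-then-sweep idea in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: it covers only a filter step over minimum bounding rectangles, and the function names (`tiles_of`, `partition`, `sweep_join`, `pbsm_filter`), the `(xmin, ymin, xmax, ymax)` tuple layout, the square tile grid, and the round-robin tile-to-partition mapping are all assumptions made for illustration. A rectangle that spans several tiles is replicated into every partition it touches, so duplicate candidate pairs are removed with a set at the end.

```python
import math
from collections import defaultdict

# Rectangles are (xmin, ymin, xmax, ymax) tuples; an object's id is its list index.

def tiles_of(rect, universe, grid):
    """Yield the ids of all tiles of a grid x grid decomposition that rect touches."""
    ux0, uy0, ux1, uy1 = universe
    w = (ux1 - ux0) / grid
    h = (uy1 - uy0) / grid

    def cell(v, origin, size):
        # Clamp so rectangles outside the universe fall into the border tiles.
        return min(grid - 1, max(0, math.floor((v - origin) / size)))

    cx0, cx1 = cell(rect[0], ux0, w), cell(rect[2], ux0, w)
    cy0, cy1 = cell(rect[1], uy0, h), cell(rect[3], uy0, h)
    for cy in range(cy0, cy1 + 1):
        for cx in range(cx0, cx1 + 1):
            yield cy * grid + cx

def partition(rects, universe, grid, num_parts):
    """Assign each rectangle to every partition that owns one of its tiles
    (tiles are mapped to partitions round-robin; an assumed mapping)."""
    parts = defaultdict(list)
    for rid, rect in enumerate(rects):
        for p in {t % num_parts for t in tiles_of(rect, universe, grid)}:
            parts[p].append((rid, rect))
    return parts

def scan(probe, others, start, emit):
    """Report entries of others[start:] whose xmin lies within probe's x-extent
    and whose y-extent overlaps probe's."""
    k = start
    while k < len(others) and others[k][1][0] <= probe[1][2]:
        o = others[k]
        if probe[1][1] <= o[1][3] and o[1][1] <= probe[1][3]:
            emit(probe, o)
        k += 1

def sweep_join(left, right):
    """Plane sweep along x over two lists of (id, rect); returns (left_id, right_id) pairs."""
    left = sorted(left, key=lambda e: e[1][0])
    right = sorted(right, key=lambda e: e[1][0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][1][0] <= right[j][1][0]:
            scan(left[i], right, j, lambda l, r: out.append((l[0], r[0])))
            i += 1
        else:
            scan(right[j], left, i, lambda r, l: out.append((l[0], r[0])))
            j += 1
    return out

def pbsm_filter(r_rects, s_rects, universe, grid=4, num_parts=3):
    """Candidate pairs (r_id, s_id) whose bounding rectangles intersect."""
    parts_r = partition(r_rects, universe, grid, num_parts)
    parts_s = partition(s_rects, universe, grid, num_parts)
    pairs = set()  # the set removes duplicates caused by replication
    for p in range(num_parts):
        pairs.update(sweep_join(parts_r[p], parts_s[p]))
    return sorted(pairs)
```

Calling `pbsm_filter(R, S, universe)` with two lists of rectangles returns the sorted candidate pairs. A full spatial join would then check the exact geometries of each candidate pair in a refinement step, which this sketch omits.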

### Citations

1854 | of the Census - Bureau - 1990 |
Citation Context: ...e PBSM algorithm. The performance study is based on actual implementations of the three algorithms in Paradise [DKL+94], which is an experimental GIS database system. Using real data from the TIGER [Tig] and the Sequoia [SFGM93] data sets, the study examines the behavior of the algorithms in a variety of situations, including the cases when none, one, or both the inputs to the join have a suitable in...

1228 | Multidimensional binary search trees used for associative searching - Bentley - 1975 |

1083 | The R*-tree: an efficient and robust access method for points and rectangles - Beckmann, Kriegel, et al. - 1990 |

450 | Introduction to VLSI systems - Mead, Conway - 1980 |
Citation Context: ..., and the seeded trees are joined using the tree join algorithm of [BKS93]. The problem of finding pairwise intersection between two sets of rectangles has been extensively studied in the VLSI domain [MC80], and numerous solutions exist for the case when both the input set of rectangles fit in memory [PS88]. In [GS87], Guting and Shilling examine the rectangle intersection problem when the inputs are to...

404 | The grid file: an adaptable, symmetric multikey file structure - Nievergelt, Hinterberger, et al. - 1984 |
Citation Context: ...results of a spatial join. The algorithm for building the spatial join index requires grid files for indexing the spatial data, and uses these grid files to compute the spatial join index. Grid files [NHS84] and kd-trees [Ben75, Ben79] have also been employed for evaluating multi-attribute joins in the relational domain [KHT89, HNKT90, BHF93]. These methods can also be used for evaluating the filter st...

341 | Parallel processing of spatial joins using R-trees - Brinkhoff, Kriegel, et al. - 1996 |
Citation Context: ...lectivities generalization trees are more efficient. The proposed join algorithm using the generalization trees, is similar to the join algorithm on R-trees proposed by Brinkhoff, Kriegel and Seeger [BKS93]. This algorithm can be used only if an R-tree index exists on both the join inputs, and can be described as a synchronized depth-first search of both indices, with the two depth-first searches bei...

286 | Principles of Geographical Information Systems for Land Resources Assessment - Burrough - 1986 |

216 | Join Indices - Valduriez - 1987 |
Citation Context: ...id increases the efficiency of the filtering technique, but it also increases the space requirement since a larger number of z-values are required to approximate an object. In the relational domain, [Val87] proposed the use of join indices to improve the performance of the relational join operator. Drawing an analogy from this, Rotem [Rot91] proposed a spatial join index that partially precomputes the r...

181 | Spatial query processing in an object-oriented database system - Orenstein - 1986 |

161 | The Case for Shared Nothing - Stonebraker - 1986 |
Citation Context: ... large inputs, can also be used for declustering spatial data. We are currently examining these issues in the broader context of extending Paradise [DKL+94] to run on shared-nothing architectures [Sto86]. Parallel spatial databases are emerging as an attractive solution for storing and manipulating large volumes of spatial data [DLPY93], and some techniques for declustering spatial data have recently...

153 | A class of data structures for associative searching - Orenstein, Merrett - 1984 |
Citation Context: ...-values, are then used in a spatial join algorithm that merges two sequences of z-values. The z-values, being 1-dimensional values, can be stored in traditional indexing structures like a B-tree [OM84]. The performance of the spatial join algorithm using z-values was found to be sensitive to the choice of the grid [Ore89]. Choosing a fine grid increases the efficiency of the filtering technique, b...

143 | Multi-step processing of spatial joins - Brinkhoff, Kriegel, et al. - 1994 |
Citation Context: ...gure 14: Comparison of the Join Algorithms with indices, TIGER Data (Join Road with Hydrography). Figure 15: Comparison of the Join Algorithms with indices, TIGER Data (Join Road with Rail). the join [BKSS94] (by an order of magnitude in many cases). These techniques rely on using as a filter in the refinement step, extra information that is precomputed and stored along with each spatial feature. As an ex...

109 | PROBE spatial data modeling and query processing in an image database application - Orenstein, Manola - 1988 |

109 | The sequoia 2000 storage benchmark - Stonebraker, Frew, et al. - 1993 |
Citation Context: ...performance study is based on actual implementations of the three algorithms in Paradise [DKL+94], which is an experimental GIS database system. Using real data from the TIGER [Tig] and the Sequoia [SFGM93] data sets, the study examines the behavior of the algorithms in a variety of situations, including the cases when none, one, or both the inputs to the join have a suitable index. The study also inves...

106 | Practical skew handling in parallel joins - DeWitt, Naughton, et al. - 1992 |
Citation Context: ...Partitioning Function using Tiles. The spatial partitioning function just described is the spatial analog of virtual processor round robin partitioning for handling skews in parallel relational joins [DNSS92]. A similar partitioning function has been independently proposed for redundancy-based declustering of spatial objects in a parallel spatial database [TY95], but in that proposal the number of tiles a...

106 | Spatial joins using seeded trees - Lo, Ravishankar - 1994 |
Citation Context: ...uad tree, and compare the efficiency of variants of the PMR quad tree with variants of the R-tree [HS95]. When one of the inputs to the spatial join does not have a spatial index, Lo and Ravishankar [LR94] propose building a seeded tree index on that input. A seeded tree is a R-tree that is allowed to be height unbalanced. The algorithm for constructing the seeded tree uses the existing index on one o...

101 | Spatial hash-joins - Lo, Ravishankar - 1996 |
Citation Context: ...ynchronized Tree • External VLSI algo [GS87] directly in the Traversal [BKS93, Gun93, HS95] • PBSM two dimensional space • Build 1 or 2 indices before joining [LR94, LR95] • Spatial Hash Join [LR96] Table 1: Classification of Various Spatial Join Algorithms data structures. In [HS95], Hoel and Samet propose a tree join algorithm for the PMR quad tree, and compare the efficiency of variants of th...

95 | Efficient computation of spatial joins - Günther - 1993 |
Citation Context: ...l models, Gunther compares join algorithms that use generalization trees (which is a class of tree structures that includes the R-tree, R*-tree and R+tree) with the nested loops join and join indices [Gun93]. This study concludes that for low join selectivities, join indices usually provide the best join performance, but for higher join selectivities generalization trees are more efficient. The proposed ...

76 | Redundancy in spatial databases - Orenstein - 1989 |
Citation Context: ...ional values, can be stored in traditional indexing structures like a B-tree [OM84]. The performance of the spatial join algorithm using z-values was found to be sensitive to the choice of the grid [Ore89]. Choosing a fine grid increases the efficiency of the filtering technique, but it also increases the space requirement since a larger number of z-values are required to approximate an object. In the...

75 | Spatial join indices - Rotem - 1991 |
Citation Context: ...e required to approximate an object. In the relational domain, [Val87] proposed the use of join indices to improve the performance of the relational join operator. Drawing an analogy from this, Rotem [Rot91] proposed a spatial join index that partially precomputes the results of a spatial join. The algorithm for building the spatial join index requires grid files for indexing the spatial data, and uses t...

72 | Analysis of object oriented spatial access methods - Faloutsos, Sellis, et al. - 1987 |
Citation Context: ... be used for evaluating the filter step by storing the bounding box of the spatial objects as points in a higher dimension [BHF93]. Recently, spatial index structures like R-trees [Gut84], R+-trees [CFR87], R*-trees [BKSS90], and PMR quad trees [NS86] have been used to speed up the evaluation of the spatial join. Using analytical models, Gunther compares join algorithms that use generalization trees (...

62 | A consistent hierarchical representation for vector data - Nelson, Samet - 1986 |
Citation Context: ...ng the bounding box of the spatial objects as points in a higher dimension [BHF93]. Recently, spatial index structures like R-trees [Gut84], R+-trees [CFR87], R*-trees [BKSS90], and PMR quad trees [NS86] have been used to speed up the evaluation of the spatial join. Using analytical models, Gunther compares join algorithms that use generalization trees (which is a class of tree structures that includ...

60 | A comparison of spatial query processing techniques for native and parameter spaces - Orenstein - 1990 |
Citation Context: ...object representing a swiss-cheese-polygon might require thousands of points to represent the exact geometric shape), spatial operations, including the spatial join, typically operate in two steps [Ore90]: • Filter Step: In this step, an approximation of each spatial object, such as its minimum bounding rectangle, is used to eliminate tuples that cannot be part of the result. This step produces cand...

44 | An Adaptive Hash Join Algorithm for Multiuser Environments - Zeller, Gray - 1990 |
Citation Context: ...cally repartition the overflown partition pair. Another alternative is to increase the number of partitions (limited to M) and using schemes similar to those used by the Adaptive Hash join algorithm [ZG90]. However, the current implementation of PBSM does not incorporate any of these techniques. 4 Performance Evaluation In this section, we compare the PBSM join algorithm with two other spatial join alg...

33 | R-trees: A dynamic index structure for spatial searching, SIGMOD - Guttman - 1984 |
Citation Context: ...se methods can also be used for evaluating the filter step by storing the bounding box of the spatial objects as points in a higher dimension [BHF93]. Recently, spatial index structures like R-trees [Gut84], R+-trees [CFR87], R*-trees [BKSS90], and PMR quad trees [NS86] have been used to speed up the evaluation of the spatial join. Using analytical models, Gunther compares join algorithms that use gen...

31 | Benchmarking spatial join operations with spatial output - Hoel, Samet - 1995 |
Citation Context: ..., HS95] • PBSM two dimensional space • Build 1 or 2 indices before joining [LR94, LR95] • Spatial Hash Join [LR96] Table 1: Classification of Various Spatial Join Algorithms data structures. In [HS95], Hoel and Samet propose a tree join algorithm for the PMR quad tree, and compare the efficiency of variants of the PMR quad tree with variants of the R-tree [HS95]. When one of the inputs to the spa...

29 | A new algorithm for computing joins with grid files - Becker, Hinrichs, et al. - 1993 |
Citation Context: ...joins in the relational domain [KHT89, HNKT90, BHF93]. These methods can also be used for evaluating the filter step by storing the bounding box of the spatial objects as points in a higher dimension [BHF93]. Recently, spatial index structures like R-trees [Gut84], R+-trees [CFR87], R*-trees [BKSS90], and PMR quad trees [NS86] have been used to speed up the evaluation of the spatial join. Using analyt...

29 | The Montage Extensible DataBlade Architecture - Ubell - 1994 |
Citation Context: ...se system has been employed to meet these requirements. Examples of commercial database systems that have been used for these applications are ARC/INFO [Arc95], Intergraph's MGE [Cor95], and Illustra [Ube94]). Data stored in these spatial database systems includes simple geometric types like points, lines, polygons, and surfaces, This work was partially supported by NASA Contracts #USRA-555517, #NAGW-3...

25 | Generating seeded trees from data sets - Lo, Ravishankar - 1995 |
Citation Context: ...s have been dynamically redistributed. A spatial DBMS must evaluate these joins efficiently. One solution to this problem is to build a spatial index on both inputs and then use a tree join algorithm [LR95]. Another solution to this problem comes from the VLSI domain where one needs to compute the pairwise intersection between two potentially large sets of rectangles that don't fit entirely in main memo...

13 | Query Processing Method for MultiAttribute Clustered Relations - Harada - 1990 |

13 | Spatial Query Processing in an Object Oriented Database System - Orenstein - 1986 |

12 | Join strategies on KD-tree indexed relations - Kitsuregawa, Harada, et al. - 1989 |

6 | A Performance Study of Declustering Strategies for Parallel Spatial - Tan, Yu - 1995 |
Citation Context: ...ndling skews in parallel relational joins [DNSS92]. A similar partitioning function has been independently proposed for redundancy-based declustering of spatial objects in a parallel spatial database [TY95], but in that proposal the number of tiles always equals the number of partitions. The design space for choosing the spatial partitioning function has two axes: the number of tiles used in the partiti...

5 | A practical divide-and-conquer algorithm for the rectangle intersection problem - Güting, Schilling - 1984 |
Citation Context: ...nother solution to this problem comes from the VLSI domain where one needs to compute the pairwise intersection between two potentially large sets of rectangles that don't fit entirely in main memory [GS87]. However, the VLSI algorithms are generally not very efficient with respect to the number of disk I/Os. This paper makes two contributions. First, it presents a new spatial join algorithm, called the...

5 | Computational Geometry - Preparata, Shamos - 1988 |
Citation Context: ...rwise intersection between two sets of rectangles has been extensively studied in the VLSI domain [MC80], and numerous solutions exist for the case when both the input set of rectangles fit in memory [PS88]. In [GS87], Guting and Shilling examine the rectangle intersection problem when the inputs are too large to fit in memory, and analyze the time and space complexity of two algorithms that are based o...

4 | Paradise - A Parallel Geographic Information System - DeWitt, Luo, et al. - 1993 |
Citation Context: ...ding Paradise [DKL+94] to run on shared-nothing architectures [Sto86]. Parallel spatial databases are emerging as an attractive solution for storing and manipulating large volumes of spatial data [DLPY93], and some techniques for declustering spatial data have recently been proposed [TY95]. However, unless the spatial data is uniformly distributed, these techniques can result in unbalanced partitions....

4 | Efficient computation of spatial joins - Günther - 1993 |

2 | ARC/INFO: The World's GIS. An ESRI White Paper - ESRI, CA - 1995 |
Citation Context: ... manipulate spatial data. Increasingly, a database system has been employed to meet these requirements. Examples of commercial database systems that have been used for these applications are ARC/INFO [Arc95], Intergraph's MGE [Cor95], and Illustra [Ube94]). Data stored in these spatial database systems includes simple geometric types like points, lines, polygons, and surfaces, This work was partially sup...

2 | Geographic Information Systems, volume 1 - Maguire, Goodchild, et al. - 1991 |

2 | A Practical Divide-and-Conquer Algorithm for the Rectangle Intersection Problem - Güting, Schilling - 1987 |
Citation Context: ...nother solution to this problem comes from the VLSI domain where one needs to compute the pairwise intersection between two potentially large sets of rectangles that don't fit entirely in main memory [GS87]. However, the VLSI algorithms are generally not very efficient with respect to the number of disk I/Os. This paper makes two contributions. First, it presents a new spatial join algorithm, called the...

1 | GIS/AM/FM Information. http://www.intergraph.com/utilmap.shtml - Corporation - 1995 |
Citation Context: ...Increasingly, a database system has been employed to meet these requirements. Examples of commercial database systems that have been used for these applications are ARC/INFO [Arc95], Intergraph's MGE [Cor95], and Illustra [Ube94]). Data stored in these spatial database systems includes simple geometric types like points, lines, polygons, and surfaces, This work was partially supported by NASA Contracts #...

1 | Partition Based Spatial-Merge Join. http://www.cs.wisc.edu/paradise/paradise.papers.html - Patel, DeWitt |
Citation Context: ... number of tiles on the execution time of PBSM, but found that changing the number of tiles had a very small effect on the overall execution time (less than 5%). The full length version of this paper [PD] presents this result. The performance study was carried out in two parts. The first part examined the performance of the three algorithms when neither join input had a pre-existing index, and the se...