Results 1 - 9 of 9
Efficient Processing of Spatial Joins Using R-Trees
, 1993
Cited by 286 (12 self)
Abstract: In this paper, we show that spatial joins are very suitable to be processed on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execution. In order to reduce CPU- and I/O-cost, the three phases are processed in a fashion that preserves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance comparison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large. Topics: spatial database systems, parallel database systems
Distance Browsing in Spatial Databases
, 1999
Cited by 240 (17 self)
Two different techniques of browsing through a collection of spatial objects stored in an R-tree spatial data structure on the basis of their distances from an arbitrary spatial query object are compared. The conventional approach is one that makes use of a k-nearest neighbor algorithm where k is known prior to the invocation of the algorithm. Thus if m > k neighbors are needed, the k-nearest neighbor algorithm needs to be reinvoked for m neighbors, thereby possibly performing some redundant computations. The second approach is incremental in the sense that having obtained the k nearest neighbors, the (k+1)st neighbor can be obtained without having to calculate the k+1 nearest neighbors from scratch. The incremental approach finds use when processing complex queries where one of the conditions involves spatial proximity (e.g., the nearest city to Chicago with population greater than a million), in which case a query engine can make use of a pipelined strategy. A general incremental nearest neighbor algorithm is presented that is applicable to a large class of hierarchical spatial data structures. This algorithm is adapted to the R-tree and its performance is compared to an existing k-nearest neighbor algorithm for R-trees [45]. Experiments show that the incremental nearest neighbor algorithm significantly outperforms the k-nearest neighbor algorithm for distance browsing queries in a spatial database that uses the R-tree as a spatial index. Moreover, the incremental nearest neighbor algorithm also usually outperforms the k-nearest neighbor algorithm when applied to the k-nearest neighbor problem for the R-tree, although the improvement is not nearly as large as for distance browsing queries. In fact, we prove informally that, at any step in its execution, the incremental...
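The incremental idea in the abstract above can be illustrated with a best-first traversal driven by a priority queue. The following is only a minimal sketch under stated assumptions, not the paper's implementation: the `Node` class, the `min_dist` helper, and the use of 2-D points as data objects are all hypothetical choices made for illustration.

```python
import heapq
import itertools
import math

class Node:
    """Hypothetical tree node holding either child nodes or 2-D points."""
    def __init__(self, children=None, points=None):
        self.children = children or []
        self.points = points or []
        xs, ys = [], []
        for c in self.children:
            xs += [c.box[0], c.box[2]]
            ys += [c.box[1], c.box[3]]
        for (x, y) in self.points:
            xs.append(x)
            ys.append(y)
        # Bounding box stored as (xmin, ymin, xmax, ymax).
        self.box = (min(xs), min(ys), max(xs), max(ys))

def min_dist(q, box):
    """Smallest Euclidean distance from point q to any location in box."""
    dx = max(box[0] - q[0], 0.0, q[0] - box[2])
    dy = max(box[1] - q[1], 0.0, q[1] - box[3])
    return math.hypot(dx, dy)

def incremental_nn(root, q):
    """Yield (distance, point) pairs in nondecreasing distance from q.

    Nodes and points share one priority queue. A node's key is a lower
    bound on the distance to everything below it, so a point popped from
    the queue is guaranteed to be the next nearest neighbor.
    """
    tie = itertools.count()  # tie-breaker so heap never compares items
    heap = [(min_dist(q, root.box), next(tie), root)]
    while heap:
        d, _, item = heapq.heappop(heap)
        if isinstance(item, Node):
            for c in item.children:
                heapq.heappush(heap, (min_dist(q, c.box), next(tie), c))
            for p in item.points:
                dist = math.hypot(p[0] - q[0], p[1] - q[1])
                heapq.heappush(heap, (dist, next(tie), p))
        else:
            yield d, item
```

Because `incremental_nn` is a generator, a caller can stop after any number of neighbors (for example with `itertools.islice`) and later resume from the same generator to obtain the next one without restarting the search.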
Incremental Distance Join Algorithms for Spatial Databases
, 1998
Cited by 97 (9 self)
Two new spatial join operations, distance join and distance semijoin, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples. Incremental algorithms are presented for computing these operations, which can be used in a pipelined fashion, thereby obviating the need to wait for their completion when only a few tuples are needed. The algorithms can be used with a large class of hierarchical spatial data structures and arbitrary spatial data types in any dimensions. In addition, any distance metric may be employed. A performance study using R-trees shows that the incremental algorithms outperform non-incremental approaches by an order of magnitude if only a small part of the result is needed, while the penalty, if any, for the incremental processing is modest if the entire join result is required.
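A distance join that emits pairs in distance order can be sketched with a priority queue of item pairs. This is a hedged illustration only: the dictionary-based node layout, the helpers `make_node`, `box_of`, and `box_dist`, and the rule of always expanding the first item of a pair are assumptions for this example, not details taken from the paper.

```python
import heapq
import itertools
import math

def box_of(item):
    """Bounding box (xmin, ymin, xmax, ymax) of a node dict or a point tuple."""
    if isinstance(item, dict):
        return item["box"]
    return (item[0], item[1], item[0], item[1])

def make_node(entries):
    """Hypothetical node: a dict with child entries and their enclosing box."""
    boxes = [box_of(e) for e in entries]
    box = (min(b[0] for b in boxes), min(b[1] for b in boxes),
           max(b[2] for b in boxes), max(b[3] for b in boxes))
    return {"box": box, "entries": entries}

def box_dist(a, b):
    """Minimum Euclidean distance between two axis-aligned boxes."""
    dx = max(a[0] - b[2], 0.0, b[0] - a[2])
    dy = max(a[1] - b[3], 0.0, b[1] - a[3])
    return math.hypot(dx, dy)

def distance_join(root_a, root_b):
    """Yield (distance, a, b) for all point pairs, closest pairs first."""
    tie = itertools.count()  # tie-breaker so heap never compares items
    start = box_dist(box_of(root_a), box_of(root_b))
    heap = [(start, next(tie), root_a, root_b)]
    while heap:
        d, _, a, b = heapq.heappop(heap)
        if isinstance(a, dict):      # expand the first item if it is a node
            for e in a["entries"]:
                key = box_dist(box_of(e), box_of(b))
                heapq.heappush(heap, (key, next(tie), e, b))
        elif isinstance(b, dict):    # otherwise expand the second item
            for e in b["entries"]:
                key = box_dist(box_of(a), box_of(e))
                heapq.heappush(heap, (key, next(tie), a, e))
        else:                        # both are data points: report the pair
            yield d, a, b
```

Since the box distance never exceeds the distance between any two points inside the boxes, a point pair reaches the front of the queue only after every closer pair has been reported, which is what allows a consumer to stop after the first few results.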
Spatial Joins Using R-trees: Breadth-First Traversal with Global Optimizations
- Proc. of VLDB
, 1997
Cited by 82 (4 self)
R-tree based spatial join is useful because of both its superior performance and the widespread implementation of R-trees. We present a new R-tree join method called BFRJ (Breadth-First R-tree Join). BFRJ synchronously traverses both R-trees in breadth-first order while processing join computation one level at a time. At each level, BFRJ creates an intermediate join index and deploys global optimization strategies (ordering, memory management, buffer management) to improve the join computation at the next level. We also present an experimental evaluation of the proposed optimizations as well as a performance comparison between BFRJ and the state-of-the-art approach. Our experimental results indicate that BFRJ with global optimizations can outperform the competitor by a significant margin (up to 50%). This work was supported in part by the University of Michigan ITS Research Center of Excellence grant (DTFH61-93-X00017 -Sub) sponsored by the U.S. Dept. of Transportation and by...
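The level-at-a-time traversal described above can be sketched as follows. This is a simplified illustration rather than BFRJ itself: the `RNode` class and `intersects` helper are hypothetical, both trees are assumed to have the same height, and the only "global optimization" shown is a plain sort of the intermediate join index, with memory and buffer management omitted.

```python
def intersects(a, b):
    """True if boxes a and b, given as (xmin, ymin, xmax, ymax), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class RNode:
    """Hypothetical R-tree entry; data entries have no children."""
    def __init__(self, box, children=None, obj=None):
        self.box = box
        self.children = children or []
        self.obj = obj

def breadth_first_join(root_r, root_s):
    """Join two equal-height trees one level at a time.

    `level` plays the role of the intermediate join index: the list of
    node pairs whose boxes intersect at the current level.
    """
    level = [(root_r, root_s)] if intersects(root_r.box, root_s.box) else []
    while level and level[0][0].children:
        next_level = []
        for r, s in level:
            for cr in r.children:
                for cs in s.children:
                    if intersects(cr.box, cs.box):
                        next_level.append((cr, cs))
        # Assumed ordering step: sort the whole index before descending,
        # something a depth-first join cannot do across sub-trees.
        next_level.sort(key=lambda p: (p[0].box[0], p[1].box[0]))
        level = next_level
    return [(r.obj, s.obj) for r, s in level]
```

Holding all candidate pairs of a level in one list is the point of the design: any reordering or page-scheduling decision can consider every pair at that level at once, at the cost of materializing the intermediate index.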
Integration of Spatial Join Algorithms for Processing Multiple Inputs
, 1999
Cited by 37 (12 self)
Several techniques that compute the join between two spatial datasets have been proposed during the last decade. Among these methods, some consider existing indices for the joined inputs, while others treat datasets with no index, providing solutions for the case where at least one input comes as an intermediate result of another database operator. In this paper we analyze previous work on spatial joins and propose a novel algorithm, called slot index spatial join (SISJ), that efficiently computes the spatial join between two inputs, only one of which is indexed by an R-tree. Going one step further, we show how SISJ and other spatial join algorithms can be implemented as operators in a database environment that joins more than two spatial datasets. We study the differences between relational and spatial multiway joins, and propose a dynamic programming algorithm that optimizes the execution of complex spatial queries. Keywords: Spatial Joins, Spatial Query Processing, Query Optimizati...
Multiway Spatial Joins
- ACM Transactions on Database Systems (TODS)
, 2001
Cited by 28 (6 self)
Due to the evolution of Geographical Information Systems, large collections of spatial data having various thematic contents are currently available. As a result, the interest of users is not limited to simple spatial selections and joins, but complex query types that implicate numerous spatial inputs become more common. Although several algorithms have been proposed for computing the result of pairwise spatial joins, limited work exists on processing and optimization of multiway spatial joins. In this article, we review pairwise spatial join algorithms and show how they can be combined for multiple inputs. In addition, we explore the application of synchronous traversal (ST), a methodology that processes synchronously all inputs without producing intermediate results. Then, we integrate the two approaches in an engine that includes ST and pairwise algorithms, using dynamic programming to determine the optimal execution plan. The results show that, in most cases, multiway spatial joins are best processed by combining ST with pairwise methods. Finally, we study the optimization of very large queries by employing randomized search algorithms.
The Design and Implementation of Seeded Trees: An Efficient Method for Spatial Joins
, 1998
Cited by 11 (1 self)
Existing methods for spatial joins require pre-existing spatial indices or other precomputation, but such approaches are inefficient and limited in generality. Operand data sets of spatial joins may not all have precomputed indices, particularly when they are dynamically generated by other selection or join operations. Also, existing spatial indices are mostly designed for spatial selections, and are not always efficient for joins. This paper explores the design and implementation of seeded trees [1], which are effective for spatial joins and efficient to construct at join time. Seeded trees are R-tree-like structures, but divided into seed levels and grown levels. This structure facilitates using information regarding the join to accelerate the join process, and allows efficient buffer management. In addition to the basic structure and behavior of seeded trees, we present techniques for efficient seeded tree construction, a new buffer management strategy to lower I/O costs, and theoretical analysis for choosing algorithmic parameters. We also present methods for reducing space requirements and improving the stability of seeded tree performance with no additional I/O costs. Our performance studies show that the seeded tree method outperforms other tree-based methods by far, both in terms of the number of disk pages accessed and weighted I/O costs. Further, its performance gain is stable across different input data, and its incurred CPU penalties are also lower.
BFRJ: Global Optimization of Spatial Joins Using R-trees
, 1997
Cited by 1 (1 self)
This paper describes a new spatial join method called BFRJ (Breadth-First R-tree Join). BFRJ synchronously traverses both R-trees in breadth-first order, processing the join computation one level at a time. This way an intermediate join index can be created at each level to guide the join process at the next lower level. Unlike the limitation of the state-of-the-art depth-first R-tree join method, which can only optimize I/O within local sub-trees, the breadth-first ordering allows BFRJ to deploy global optimization strategies among all nodes at the next lower level. In particular, BFRJ optimization strategies include index ordering, memory management, and buffer management of the intermediate join indices. This paper also presents an experimental evaluation of the effect of the proposed optimizations as well as a performance comparison between BFRJ and the state-of-the-art approach. Our experimental results indicate that BFRJ with global optimizations can outperform the competitor by a significant margin (up to 50%).
Speeding up Bulk-Loading of Quadtrees
- ACM-GIS 50–53
, 1997
Spatial indexes, such as the PMR quadtree, are important in spatial databases for efficient execution of queries involving spatial constraints, especially when the queries involve spatial joins. We investigate the issue of speeding up building PMR quadtrees for a set of objects and develop two approaches to achieve this goal. In an empirical study, we find that the better method of the two offers significant improvements in execution time, and present evidence of the usefulness of spatial indexing for executing spatial join queries. The support of the National Science Foundation under Grant IRI-92-16970 and the Department of Energy under Contract DEFG0295ER25237 is gratefully acknowledged.

