Results 1 - 10 of 74
Multidimensional Access Methods, 1998
Cited by 508 (3 self)
Search operations in databases require special support at the physical level. This is true for conventional databases as well as spatial databases, where typical search operations include the point query (find all objects that contain a given search point) and the region query (find all objects that overlap a given search region).
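The two query types named in this abstract can be illustrated with a minimal sketch. It assumes the objects are axis-aligned rectangles and answers each query with a plain linear scan; the names `Rect`, `point_query`, and `region_query` are hypothetical and chosen only for this illustration. An access method of the kind the survey covers exists to avoid exactly this full scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (an assumed object representation)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains_point(self, x: float, y: float) -> bool:
        # Boundary points count as contained.
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def overlaps(self, other: "Rect") -> bool:
        # Two rectangles overlap when their extents intersect on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def point_query(objects: dict, x: float, y: float) -> list:
    """Names of all objects that contain the search point (linear scan)."""
    return [name for name, r in objects.items() if r.contains_point(x, y)]


def region_query(objects: dict, region: Rect) -> list:
    """Names of all objects that overlap the search region (linear scan)."""
    return [name for name, r in objects.items() if r.overlaps(region)]


objects = {
    "a": Rect(0, 0, 2, 2),
    "b": Rect(1, 1, 4, 3),
    "c": Rect(5, 5, 6, 6),
}
print(point_query(objects, 1.5, 1.5))
print(region_query(objects, Rect(3, 2, 5.5, 5.5)))
```

Both functions touch every object, so their cost grows linearly with the size of the data set.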
Hilbert R-tree: An improved R-tree using fractals, 1994
Cited by 170 (9 self)
We propose a new R-tree structure that outperforms all the older ones. The heart of the idea is to facilitate the deferred splitting approach in R-trees. This is done by proposing an ordering on the R-tree nodes. This ordering has to be 'good', in the sense that it should group 'similar' data rectangles together, to minimize the area and perimeter of the resulting minimum bounding rectangles (MBRs). Following [19] we have chosen the so-called '2D-c' method, which sorts rectangles according to the Hilbert value of the center of the rectangles. Given the ordering, every node has a well-defined set of sibling nodes; thus, we can use deferred splitting. By adjusting the split policy, the Hilbert R-tree can achieve as high utilization as desired. By contrast, the R*-tree has no control over the space utilization, typically achieving up to 70%. We designed the manipulation algorithms in detail, and we did a full implementation of the Hilbert R-tree. Our experiments show that the '2-to-...
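The ordering step described in this abstract (sort rectangles by the Hilbert value of their centers) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: rectangles are `(xmin, ymin, xmax, ymax)` tuples with integer coordinates on a `2**order` by `2**order` grid, centers are rounded down to grid cells, and the function names `hilbert_index` and `sort_by_hilbert_center` are hypothetical. The curve computation is the standard iterative cell-to-distance conversion.

```python
def hilbert_index(order: int, x: int, y: int) -> int:
    """Distance of grid cell (x, y) along a Hilbert curve covering a
    2**order x 2**order grid (standard iterative conversion)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def sort_by_hilbert_center(rects, order: int):
    """Sort rectangles by the Hilbert value of their (integer) centers."""
    def key(rect):
        xmin, ymin, xmax, ymax = rect
        # Assumption: centers are rounded down to the containing grid cell.
        return hilbert_index(order, (xmin + xmax) // 2, (ymin + ymax) // 2)
    return sorted(rects, key=key)


rects = [(3, 0, 3, 0), (0, 0, 0, 0), (2, 2, 2, 2)]
print(sort_by_hilbert_center(rects, order=2))
```

Because consecutive positions on the curve are adjacent grid cells, rectangles that end up next to each other in the sorted list tend to be close in space, which is what makes the ordering useful for grouping them into nodes.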
The hB-tree: A multiattribute indexing method with good guaranteed performance - ACM Transactions on Database Systems, 1990
Cited by 167 (8 self)
A new multiattribute index structure called the hB-tree is introduced. It is derived from the K-D-B-tree of Robinson [15] but has additional desirable properties. The hB-tree internode search and growth processes are precisely analogous to the corresponding processes in B-trees [1]. The intranode processes are unique. A k-d tree is used as the structure within nodes for very efficient searching. Node splitting requires that this k-d tree be split. This produces nodes which no longer represent brick-like regions in k-space, but that can be characterized as holey bricks, bricks in which subregions have been extracted. We present results that guarantee hB-tree users decent storage utilization, reasonable size index terms, and good search and insert performance. These results guarantee that the hB-tree copes well with arbitrary distributions of keys.
A Model for the Prediction of R-tree Performance, 1996
Cited by 138 (19 self)
In this paper we present an analytical model that predicts the performance of R-trees (and their variants) when a range query needs to be answered. The cost model uses knowledge of the dataset only, i.e., the proposed formula that estimates the number of disk accesses is a function of data properties, namely, the amount of data and their density in the work space. In other words, the proposed model is applicable even before the construction of the R-tree index, a fact that makes it a useful tool for dynamic spatial databases. Several experiments on synthetic and real datasets show that the proposed analytical model is very accurate, the relative error being usually around 10%-15%, for uniform and non-uniform distributions. We believe that this error is related to the gap between efficient R-tree variants, like the R*-tree, and an optimal method that has not yet been implemented. Our work extends previous research concerning R-tree analysis and constitutes a useful tool for spatial query optimiz...
A Survey of Research on Deductive Database Systems - Journal of Logic Programming, 1993
Cited by 90 (4 self)
The area of deductive databases has matured in recent years, and it now seems appropriate to reflect upon what has been achieved and what the future holds. In this paper, we provide an overview of the area and briefly describe a number of projects that have led to implemented systems.
Dimensionality Reduction for Similarity Searching in Dynamic Databases, 1998
Cited by 88 (5 self)
Databases are increasingly being used to store multi-media objects such as maps, images, audio and video. Storage and retrieval of these objects is accomplished using multi-dimensional index structures such as R*-trees and SS-trees. As dimensionality increases, query performance in these index structures degrades. This phenomenon, generally referred to as the dimensionality curse, can be circumvented by reducing the dimensionality of the data. Such a reduction is however accompanied by a loss of precision of query results. Current techniques such as QBIC use SVD transform-based dimensionality reduction to ensure high query precision. The drawback of this approach is that SVD is expensive to compute, and therefore not readily applicable to dynamic databases. In this paper, we propose novel techniques for performing SVD-based dimensionality reduction in dynamic databases. When the data distribution changes considerably so as to degrade query precision, we recompute the SVD transform a...
Efficient cost models for spatial queries using R-trees - IEEE Transactions on Knowledge and Data Engineering, 2000
Cited by 44 (4 self)
Selection and join queries are fundamental operations in Data Base Management Systems (DBMS). Support for nontraditional data, including spatial objects, in an efficient manner is of ongoing interest in database research. Toward this goal, access methods and cost models for spatial queries are necessary tools for spatial query processing and optimization. In this paper, we present analytical models that estimate the cost (in terms of node and disk accesses) of selection and join queries using R-tree-based structures. The proposed formulae need no knowledge of the underlying R-tree structure(s) and are applicable to uniform-like and nonuniform data distributions. In addition, experimental results are presented which show the accuracy of the analytical estimations when compared to actual runs on both synthetic and real data sets. Index Terms: Spatial databases, access methods, query optimization, cost models, R-trees.
Design and Implementation of the Glue-Nail Database System, 1993
Cited by 34 (1 self)
We describe the design and implementation of the Glue-Nail database system. The Nail language is a purely declarative query language; Glue is a procedural language used for non-query activities. The two languages combined are sufficient to write a complete application. Nail and Glue code both compile into the target language IGlue. The Nail compiler uses variants of the magic sets algorithm, and supports well-founded models. Static optimization is performed by the Glue compiler using techniques that include peephole methods and data flow analysis. The IGlue code is executed by the IGlue interpreter, which features a run-time adaptive optimizer. The three optimizers each deal with separate optimization domains, and experiments indicate that an effective synergism is achieved. The Glue-Nail system is largely complete and has been tested using a suite of representative applications.
Efficient Processing Of Spatial Queries In Line Segment Databases, 1991
Cited by 31 (11 self)
A study is performed of the issues arising in the efficient processing of spatial queries in large spatial databases. The domain is restricted to line segment databases such as those found in transportation networks and polygonal maps. Three classes of queries are identified: those that deal with the line segments themselves, those that involve both the line segments and the space from which they are drawn (e.g., proximity queries), and those that involve attributes of the line segments. Handling the three types of queries requires that the line segments be stored implicitly using a bucketing approach on the space from which they are drawn. A number of bucketing approaches are examined and the PMR quadtree is chosen as the most suitable representation. Its storage and execution time requirements are evaluated in the context of finding the nearest line segment to a given point. This operation is shown to take time proportional to the splitting threshold (similar to the bucket capacity) ...
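The benchmark operation in this abstract, finding the nearest line segment to a given point, can be made concrete with a brute-force baseline. This sketch is not the PMR quadtree approach the paper evaluates; it simply computes the point-to-segment distance for every segment and keeps the minimum. Segments are assumed to be `(ax, ay, bx, by)` tuples, and the names `point_segment_distance` and `nearest_segment` are hypothetical.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point (px, py) to segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the supporting line, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def nearest_segment(point, segments):
    """Brute-force scan: the segment with the smallest distance to point."""
    px, py = point
    return min(segments, key=lambda s: point_segment_distance(px, py, *s))


segments = [(0, 0, 4, 0), (0, 3, 4, 3)]
print(nearest_segment((1, 1), segments))
```

The scan examines every segment, so its cost is linear in the database size; a bucketing structure restricts the distance computations to the segments stored in the buckets near the query point.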
Declustering Spatial Databases on a Multi-Computer Architecture, 1996
Cited by 27 (2 self)
We present a technique to decluster a spatial access method on a shared-nothing multi-computer architecture [DGS+90]. We propose a software architecture with the R-tree as the underlying spatial access method, with its non-leaf levels on the 'master-server' and its leaf nodes distributed across the servers. The major contribution of our work is the study of the optimal capacity of leaf nodes, or 'chunk size' (or 'striping unit'): we express the response time on range queries as a function of the 'chunk size', and we show how to optimize it. We implemented our method on a network of workstations, using a real dataset, and we compared the experimental and the theoretical results. The conclusion is that our formula for the response time is very accurate (the maximum relative error was 29%; the typical error was in the vicinity of 10-15%). We illustrate one of the possible ways to exploit such an accurate formula, by examining several 'what-if' scenarios. One major, practical conclus...

