Results 1-10 of 184
Multidimensional Access Methods
, 1998
Cited by 689 (3 self)
Search operations in databases require special support at the physical level. This is true for conventional databases as well as spatial databases, where typical search operations include the point query (find all objects that contain a given search point) and the region query (find all objects that overlap a given search region).
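The two query types named above can be illustrated with a minimal sketch over axis-aligned rectangles. The `Rect` type and the function names are hypothetical, and the linear scan stands in for the index structure a real access method would provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle with inclusive bounds (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains_point(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def overlaps(self, other):
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def point_query(objects, x, y):
    """Find all objects that contain the search point (x, y)."""
    return [o for o in objects if o.contains_point(x, y)]


def region_query(objects, region):
    """Find all objects that overlap the search region."""
    return [o for o in objects if o.overlaps(region)]
```

Both functions visit every object; the access methods surveyed in the paper exist to avoid exactly that scan.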
Fast subsequence matching in time-series databases
 Proceedings of the 1994 ACM SIGMOD International Conference on Management of data
, 1994
Cited by 529 (24 self)
We present an efficient indexing method to locate 1-dimensional subsequences within a collection of sequences, such that the subsequences match a given (query) pattern within a specified tolerance. The idea is to map each data sequence into a small set of multidimensional rectangles in feature space. Then, these rectangles can be readily indexed using traditional spatial access methods, like the R*-tree [9]. In more detail, we use a sliding window over the data sequence and extract its features; the result is a trail in feature space. We propose an efficient and effective algorithm to divide such trails into sub-trails, which are subsequently represented by their Minimum Bounding Rectangles (MBRs). We also examine queries of varying lengths, and we show how to handle each case efficiently. We implemented our method and carried out experiments on synthetic and real data (stock price movements). We compared the method to sequential scanning, which is the only obvious competitor. The results were excellent: our method accelerated the search time from 3 times up to 100 times.
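The sliding-window pipeline in this abstract can be sketched roughly as follows. The feature choice (real and imaginary parts of the first few DFT coefficients) and all function names are assumptions made for illustration, and the fixed-size chunking in `mbrs` is a simple stand-in, not the paper's algorithm for dividing a trail into sub-trails.

```python
import cmath


def dft_features(window, n_coeffs=2):
    """Real and imaginary parts of the first n_coeffs DFT coefficients."""
    n = len(window)
    feats = []
    for k in range(n_coeffs):
        c = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(window))
        c /= n ** 0.5  # unitary scaling (an assumption of this sketch)
        feats.extend((c.real, c.imag))
    return tuple(feats)


def trail(sequence, w, n_coeffs=2):
    """Slide a window of length w over the sequence; one feature point per offset."""
    return [dft_features(sequence[i:i + w], n_coeffs)
            for i in range(len(sequence) - w + 1)]


def mbrs(points, chunk):
    """Cut the trail into consecutive sub-trails of `chunk` points and
    return the (low, high) corners of each sub-trail's bounding rectangle."""
    boxes = []
    for start in range(0, len(points), chunk):
        sub = points[start:start + chunk]
        dims = range(len(sub[0]))
        low = tuple(min(p[d] for p in sub) for d in dims)
        high = tuple(max(p[d] for p in sub) for d in dims)
        boxes.append((low, high))
    return boxes
```

The resulting rectangles are what would be handed to a spatial access method; a query window is mapped to feature space the same way and used as the search region.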
FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets
, 1995
Cited by 499 (23 self)
A very promising idea for fast searching in traditional and multimedia databases is to map objects into points in k-d space, using k feature-extraction functions, provided by a domain expert [25]. Thus, we can subsequently use highly fine-tuned spatial access methods (SAMs), to answer several types of queries, including the 'Query By Example' type (which translates to a range query); the 'all pairs' query (which translates to a spatial join [8]); the nearest-neighbor or best-match query, etc. However, designing feature extraction functions can be hard. It is relatively easier for a domain expert to assess the similarity/distance of two objects. Given only the distance information though, it is not obvious how to map objects into points. This is exactly the topic of this paper. We describe a fast algorithm to map objects into points in some k-dimensional space (k is user-defined), such that the dissimilarities are preserved. There are two benefits from this mapping: (a) efficient ret...
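A minimal sketch of a distance-only mapping in the spirit of this abstract is given below. It projects each object onto the line through two pivot objects using the law of cosines, then repeats on the residual distances. The function name `fastmap`, the pivot heuristic (farthest from object 0, then farthest from that), and the integer object ids are assumptions of this sketch and may differ from the paper's details.

```python
import math


def fastmap(n, dist, k):
    """Map objects 0..n-1 to k-d points, given only a distance function dist(i, j)."""
    coords = [[0.0] * k for _ in range(n)]

    def d2(i, j, level):
        # Squared distance left over after removing the first `level` axes.
        base = dist(i, j) ** 2
        for h in range(level):
            base -= (coords[i][h] - coords[j][h]) ** 2
        return max(base, 0.0)

    for level in range(k):
        # Pivot heuristic (assumed): farthest from object 0, then farthest from that.
        a = max(range(n), key=lambda i: d2(0, i, level))
        b = max(range(n), key=lambda i: d2(a, i, level))
        dab2 = d2(a, b, level)
        if dab2 == 0.0:
            break  # every residual distance is zero; remaining axes stay 0
        dab = math.sqrt(dab2)
        for i in range(n):
            # Law of cosines: projection of object i on the line through a and b.
            coords[i][level] = (d2(a, i, level) + dab2 - d2(b, i, level)) / (2 * dab)
    return coords
```

When the input distances are Euclidean and k matches the true dimensionality, the mapped points reproduce the distances up to rotation; for general dissimilarities the result is only an approximation.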
The R+-Tree: A Dynamic Index for Multi-Dimensional Objects
, 1987
Cited by 343 (19 self)
The problem of indexing multidimensional objects is considered. First, a classification of existing methods is given along with a discussion of the major issues involved in multidimensional data indexing. Second, a variation to Guttman's R-trees (R+-trees) that avoids overlapping rectangles in intermediate nodes of the tree is introduced. Algorithms for searching, updating, initial packing and reorganization of the structure are discussed in detail. Finally, we provide analytical results indicating that R+-trees achieve up to 50% savings in disk accesses compared to an R-tree when searching files of thousands of rectangles. 1 Also with University of Maryland Systems Research Center. 2 Also with University of Maryland Institute for Advanced Computer Studies (UMIACS). This research was sponsored partially by the National Science Foundation under Grant CDR8500108. 1. Introduction It has been recognized in the past that existing Database Management Systems (DBMSs) do not ...
The R+-tree: A dynamic index for multi-dimensional objects
 Proc. 13th VLDB Conf
, 1987
Cited by 308 (36 self)
The problem of indexing multidimensional objects is considered. First, a classification of existing methods is given along with a discussion of the major issues involved in multidimensional data indexing. Second, a variation to Guttman's R-trees (R+-trees) that avoids overlapping rectangles in intermediate nodes of the tree is introduced. Algorithms for searching, updating, initial packing and reorganization of the structure are discussed in detail. Finally, we provide analytical results indicating that R+-trees achieve up to 50% savings in disk accesses compared to an R-tree when searching files of thousands of rectangles.
Hilbert R-tree: An Improved R-tree Using Fractals
 Proceedings 20th VLDB Conference
, 1994
Cited by 222 (12 self)
We propose a new R-tree structure that outperforms all the older ones. The heart of the idea is to facilitate the deferred splitting approach in R-trees. This is done by proposing an ordering on the R-tree nodes. This ordering has to be 'good', in the sense that it should group 'similar' data rectangles together, to minimize the area and perimeter of the resulting minimum bounding rectangles (MBRs). Following [19] we have chosen the so-called '2D-c' method, which sorts rectangles according to the Hilbert value of the center of the rectangles. Given the ordering, every node has a well-defined set of sibling nodes; thus, we can use deferred splitting. By adjusting the split policy, the Hilbert R-tree can achieve as high utilization as desired. To the contrary, the R-tree has no control over the space utilization, typically achieving up to 70%. We designed the manipulation algorithms in detail, and we did a full implementation of the Hilbert R-tree. Our experiments show that the '2-to-3' split policy provides a compromise between the insertion complexity and the search cost, giving up to 28% savings over the R*-tree [3] on real data.
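The ordering step described here, sorting rectangles by the Hilbert value of their centers, can be sketched as below. `hilbert_value` is the widely used iterative conversion from grid cell to curve position; `pack_by_hilbert` is a hypothetical helper that only groups the sorted rectangles into fixed-capacity leaves. It is a static packing illustration under the assumption of integer coordinates on an n x n grid (n a power of two), and it does not implement the dynamic insertion or deferred splitting the abstract refers to.

```python
def hilbert_value(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n x n grid
    (n must be a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level sees a canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def pack_by_hilbert(rects, n, capacity):
    """Sort (xmin, ymin, xmax, ymax) rectangles by the Hilbert value of their
    centers and group them into leaves of at most `capacity` entries.
    Returns a list of (leaf_mbr, entries) pairs."""
    def key(r):
        return hilbert_value(n, (r[0] + r[2]) // 2, (r[1] + r[3]) // 2)

    ordered = sorted(rects, key=key)
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]
        mbr = (min(r[0] for r in group), min(r[1] for r in group),
               max(r[2] for r in group), max(r[3] for r in group))
        leaves.append((mbr, group))
    return leaves
```

Because neighbouring curve positions are neighbouring cells, rectangles that end up in the same leaf tend to be close together, which keeps the leaf MBRs small.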
An Introduction to Spatial Database Systems
 The VLDB Journal
, 1994
Cited by 216 (9 self)
We propose a definition of a spatial database system as a database system that offers spatial data types in its data model and query language, and supports
Analysis of the clustering properties of the Hilbert space-filling curve
 IEEE Transactions on Knowledge and Data Engineering
, 2001
Cited by 189 (12 self)
Several schemes for the linear mapping of a multidimensional space have been proposed for various applications, such as access methods for spatio-temporal databases and image compression. In these applications, one of the most desired properties from such linear mappings is clustering, which means that the locality between objects in the multidimensional space is preserved in the linear space. It is widely believed that the Hilbert space-filling curve achieves the best clustering [1], [14]. In this paper, we analyze the clustering property of the Hilbert space-filling curve by deriving closed-form formulas for the number of clusters in a given query region of an arbitrary shape (e.g., polygons and polyhedra). Both the asymptotic solution for the general case and the exact solution for a special case generalize previous work [14]. They agree with the empirical results that the number of clusters depends on the hypersurface area of the query region and not on its hypervolume. We also show that the Hilbert curve achieves better clustering than the z-curve. From a practical point of view, the formulas given in this paper provide a simple measure that can be used to predict the required disk access behaviors and, hence, the total access time.
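The quantity analyzed in this abstract, the number of clusters in a query region, can be measured empirically with a short sketch. A cluster is taken here to be a maximal run of consecutive curve positions whose cells all lie inside the query rectangle. The function names are hypothetical; `count_clusters` accepts any cell-to-position mapping, and the z-curve (bit interleaving) is used as the example because it is the comparison curve the abstract names. This counts clusters by enumeration and is not the paper's closed-form formula.

```python
def z_value(x, y, bits=16):
    """Z-curve position: interleave the bits of x (even positions) and y (odd)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def count_clusters(xmin, ymin, xmax, ymax, curve):
    """Number of maximal runs of consecutive curve positions covered by the
    inclusive query rectangle [xmin, xmax] x [ymin, ymax]."""
    values = sorted(curve(x, y)
                    for x in range(xmin, xmax + 1)
                    for y in range(ymin, ymax + 1))
    if not values:
        return 0
    clusters = 1
    for prev, cur in zip(values, values[1:]):
        if cur != prev + 1:
            clusters += 1
    return clusters
```

On a 4 x 4 grid, a 2 x 2 query aligned with a quadrant forms a single run on the z-curve, while the same query shifted by one cell in each direction breaks into four runs. Each run corresponds to one contiguous range of disk positions to read.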
Partition Based Spatial-Merge Join
, 1996
Cited by 185 (12 self)
This paper describes PBSM (Partition Based Spatial-Merge), a new algorithm for performing a spatial join operation. This algorithm is especially effective when neither of the inputs to the join has an index on the joining attribute. Such a situation could arise if both inputs to the join are intermediate results in a complex query, or in a parallel environment where the inputs must be dynamically redistributed. The PBSM algorithm partitions the inputs into manageable chunks, and joins them using a computational geometry based plane-sweeping technique. This paper also presents a performance study comparing the traditional indexed nested loops join algorithm, a spatial join algorithm based on joining spatial indices, and the PBSM algorithm. These comparisons are based on complete implementations of these algorithms in Paradise, a database system for handling GIS applications. Using real data sets, the performance study examines the behavior of these spatial join algorithms in a vari...
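The partition-then-sweep idea in this abstract can be sketched in memory as follows. The uniform grid of side `tile`, the `(id, (xmin, ymin, xmax, ymax))` item layout, and all function names are assumptions of this sketch. It covers only the bounding-rectangle join; how the actual system maps tiles to partitions, spills them to disk, or refines candidates against exact geometry is not modeled.

```python
from collections import defaultdict


def tiles_of(rect, tile):
    """Grid tiles (ix, iy) of side `tile` that the rectangle touches."""
    x0, y0, x1, y1 = rect
    return [(ix, iy)
            for ix in range(int(x0 // tile), int(x1 // tile) + 1)
            for iy in range(int(y0 // tile), int(y1 // tile) + 1)]


def sweep_join(rs, ss):
    """Plane sweep along x over two lists of (id, rect); returns overlapping id pairs."""
    rs = sorted(rs, key=lambda item: item[1][0])
    ss = sorted(ss, key=lambda item: item[1][0])
    out = set()
    i = j = 0
    while i < len(rs) and j < len(ss):
        if rs[i][1][0] <= ss[j][1][0]:
            rid, r = rs[i]
            k = j
            # Every s scanned here starts inside r's x-extent; check y overlap.
            while k < len(ss) and ss[k][1][0] <= r[2]:
                if r[1] <= ss[k][1][3] and ss[k][1][1] <= r[3]:
                    out.add((rid, ss[k][0]))
                k += 1
            i += 1
        else:
            sid, s = ss[j]
            k = i
            while k < len(rs) and rs[k][1][0] <= s[2]:
                if s[1] <= rs[k][1][3] and rs[k][1][1] <= s[3]:
                    out.add((rs[k][0], sid))
                k += 1
            j += 1
    return out


def pbsm_join(rs, ss, tile):
    """Partition both inputs by grid tile, sweep each tile, and merge the results."""
    parts_r, parts_s = defaultdict(list), defaultdict(list)
    for item in rs:
        for t in tiles_of(item[1], tile):
            parts_r[t].append(item)
    for item in ss:
        for t in tiles_of(item[1], tile):
            parts_s[t].append(item)
    result = set()
    for t in parts_r.keys() & parts_s.keys():
        result |= sweep_join(parts_r[t], parts_s[t])  # the set drops duplicate pairs
    return result
```

A rectangle that spans several tiles is replicated into each of them, so the same pair can be found more than once; collecting results in a set is the simplest way to remove those duplicates in a sketch like this.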
Beyond uniformity and independence: Analysis of R-trees using the concept of fractal dimension
 In Proc. PODS
, 1994
Cited by 167 (20 self)
We propose the concept of fractal dimension of a set of points, in order to quantify the deviation from the uniformity distribution. Using measurements on real data sets (road intersections of U.S. counties, star coordinates from NASA's Infrared-Ultraviolet Explorer etc.) we provide evidence that real data indeed are skewed, and, moreover, we show that they behave as mathematical fractals, with a measurable, non-integer fractal dimension. Armed with this tool, we then show its practical use in predicting the performance of spatial access methods, and specifically of the R-trees. We provide the first analysis of R-trees for skewed distributions of points: We develop a formula that estimates the number of disk accesses for range queries, given only the fractal dimension of the point set, and its count. Experiments on real data sets show that the formula is very accurate: the relative error is usually below 5%, and it rarely exceeds 10%. We believe that the fractal dimension will help replace the uniformity and independence assumptions, allowing more accurate analysis for any spatial access method, as well as better estimates for query optimization on multi-attribute queries.
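One common way to measure a fractal dimension of a 2-d point set is box counting: overlay grids of decreasing cell side r, count the occupied cells N(r), and take the slope of log N(r) against log(1/r). The sketch below assumes that approach; the function names and the choice of cell sides are hypothetical, and it does not reproduce the paper's disk-access formula.

```python
import math


def box_count(points, r):
    """Number of grid cells of side r that contain at least one point."""
    return len({(math.floor(x / r), math.floor(y / r)) for x, y in points})


def box_counting_dimension(points, sides):
    """Least-squares slope of log N(r) against log(1/r) over the given cell sides."""
    xs = [math.log(1.0 / r) for r in sides]
    ys = [math.log(box_count(points, r)) for r in sides]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den
```

Points spread evenly over the unit square give a slope near 2, points along a line give a slope near 1, and skewed real data sets of the kind the abstract mentions would be expected to fall somewhere in between.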