Results 1 - 4 of 4
Bulk loading a Data Warehouse built upon a UB-Tree
2000
Cited by 4 (1 self)
This paper considers the issue of bulk loading large data sets for the UB-Tree, a multidimensional index structure. Efficient bulk loading techniques are especially necessary in data warehousing (DW), data mining, and OLAP, because loading occurs not continuously but only from time to time, usually with large data sets. We propose two techniques: one for initial loading, which creates a new UB-Tree, and one for incremental loading, which adds data to an existing UB-Tree. Both techniques try to minimize I/O and CPU cost. Measurements with artificial data and with data from a commercial data warehouse demonstrate that our algorithms are efficient and able to handle large data sets. Like the UB-Tree itself, they are easily integrated into an RDBMS.
Keywords: bulk loading, UB-Tree, multidimensional index, data warehousing, data mining, OLAP
1 Introduction
When loading a huge amount of data into an indexed database table, it is usually not feasible to use the standard insert opera...
Improved Bulk-Loading Algorithms for Quadtrees
In Proceedings of the 7th ACM International Symposium on Advances in Geographic Information Systems, 1999
Spatial indexes, such as the PMR quadtree, are important in spatial databases for efficient execution of queries involving spatial constraints, especially when the queries involve spatial joins. In this paper we report recent improvements in bulk-loading PMR quadtrees, which index arbitrary spatial objects, and present a new algorithm for bulk-loading PR quadtrees, which index point data. Our algorithms assume that the quadtree is implemented using a linear quadtree, a disk-resident representation that stores objects contained in the leaf nodes of the quadtree in a linear index (e.g., a B-tree) ordered on the basis of a space-filling curve. We show with experiments that our algorithms yield significant performance improvements for bulk-loading quadtrees.
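As a rough illustration of the "ordered on the basis of a space-filling curve" idea in the abstract above, the sketch below sorts 2D integer points by their Morton (Z-order) key, which is one common space-filling curve. The abstract does not say which curve the authors use, so the choice of Morton order is an assumption, and the function names `interleave_bits` and `z_order_sort` are hypothetical. This is not the paper's bulk-loading algorithm, only the sort step that a linear index could build on.

```python
from typing import List, Tuple


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key.

    Assumption: coordinates are non-negative integers that fit in `bits` bits.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key


def z_order_sort(points: List[Tuple[int, int]], bits: int = 16) -> List[Tuple[int, int]]:
    """Return the points sorted by Morton key, i.e. in Z-curve order."""
    return sorted(points, key=lambda p: interleave_bits(p[0], p[1], bits))


if __name__ == "__main__":
    pts = [(1, 1), (0, 0), (1, 0), (0, 1)]
    print(z_order_sort(pts))
```

Once the points are in this order, a one-dimensional index such as a B-tree can be filled sequentially, which is the general reason sorted input helps bulk loading.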
A Distance-Based Packing Method for High Dimensional Data
2003
The Minkowski-sum cost model indicates that balanced data partitioning is not beneficial for high-dimensional data. We therefore study several unbalanced partitioning methods and propose cost models for them based on the Minkowski-sum cost model. Our cost models indicate that the distance to one of the two ends of the data space dominates the expected value under a uniform data distribution. We generalize the studied methods to adapt to the data distribution and propose a new partitioning method, called DDCSP (Distance-based Distribution-adaptive Cyclic Sliced Partition), for high-dimensional index structures. At each partition, it splits data from the lower end or the higher end toward the center of the data space based on a distance cost function. Based on this fact, we propose a data structure called DSR (Dimension-independent Single value Representation), which takes a constant amount of storage to represent MBHs (Minimum Bounding Hypercubes) independent of dimension. In our
An Edge Quadtree for External Memory
We consider the problem of building a quadtree subdivision for a set E of n non-intersecting edges in the plane. Our approach is to first build a quadtree on the vertices corresponding to the endpoints of the edges, and then compute the intersections between E and the cells in the subdivision. For any k ≥ 1, we call a K-quadtree a linear compressed quadtree that has O(n/k) cells with O(k) vertices each, where each cell stores the edges intersecting the cell. We show how to build a K-quadtree in O(sort(n + l)) I/Os, where l = O(n^2/k) is the number of such intersections. The value of k can be chosen to trade off between the number of cells and the size of a cell in the quadtree. We give an empirical evaluation in external memory on triangulated terrains and USA TIGER data. As an application, we consider the problem of map overlay, or finding the pairwise intersections between two sets of edges. Our findings confirm that the K-quadtree is viable for these types of data and its construction is scalable to hundreds of millions of edges.