External Memory Data Structures
, 2001
Cited by 79 (36 self)
In many massive dataset applications, the data must be stored in space- and query-efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worst-case efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
Range Searching
, 1996
Cited by 69 (1 self)
Range searching is one of the central problems in computational geometry, because it arises in many applications and a wide variety of geometric problems can be formulated as a range-searching problem. A typical range-searching problem has the following form. Let S be a set of n points in ℝ^d, and let ℛ be a family of subsets; elements of ℛ are called ranges. We wish to preprocess S into a data structure so that for a query range R, the points in S ∩ R can be reported or counted efficiently. Typical examples of ranges include rectangles, halfspaces, simplices, and balls. If we are only interested in answering a single query, it can be done in linear time, using linear space, by simply checking for each point p ∈ S whether p lies in the query range.
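The single-query baseline described in this abstract can be sketched as follows. This is only a minimal illustration for axis-aligned rectangular ranges; the function name `range_query` and the tuple-based point representation are assumptions made for the example, not taken from the cited work.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def range_query(points: Sequence[Point], lo: Point, hi: Point) -> List[Point]:
    """Report every point p with lo[i] <= p[i] <= hi[i] in all dimensions.

    No preprocessing is done: each point is tested once, so the query
    takes linear time and linear space.
    """
    return [
        p for p in points
        if all(l <= x <= h for x, l, h in zip(p, lo, hi))
    ]


# Hypothetical sample data: four points in the plane, one rectangular range.
S = [(1.0, 2.0), (3.0, 4.0), (5.0, 1.0), (2.5, 2.5)]
reported = range_query(S, lo=(2.0, 2.0), hi=(4.0, 5.0))  # reporting variant
count = len(reported)                                    # counting variant
```

A preprocessed structure only pays off when many queries are asked against the same point set, which is the setting the rest of the abstract addresses.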
The Priority R-Tree: A Practically Efficient and Worst-Case Optimal R-Tree
 SIGMOD 2004, June 13-18, 2004, Paris, France
, 2004
Cited by 56 (7 self)
We present the Priority R-tree, or PR-tree, which is the first R-tree variant that always answers a window query using O((N/B)^{1-1/d} + T/B) I/Os, where N is the number of d-dimensional (hyper-)rectangles stored in the R-tree, B is the disk block size, and T is the output size. This is provably asymptotically optimal and significantly better than other R-tree variants, where a query may visit all N/B leaves in the tree even when T = 0. We also present an extensive experimental study of the practical performance of the PR-tree using both real-life and synthetic data. This study shows that the PR-tree performs similarly to the best known R-tree variants on real-life and relatively nicely distributed data, but outperforms them significantly on more extreme data.
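To get a feel for the gap between the two bounds in this abstract, the expressions can be evaluated directly. The sketch below ignores the constants hidden by the O-notation, and the parameter values and function names are hypothetical, chosen only for illustration.

```python
def pr_tree_query_bound(n: int, b: int, d: int, t: int) -> float:
    """Evaluate (N/B)^(1 - 1/d) + T/B, ignoring the constant hidden by O()."""
    return (n / b) ** (1.0 - 1.0 / d) + t / b


def leaf_count(n: int, b: int) -> float:
    """Number of leaves N/B, which a worst-case query on other variants may visit."""
    return n / b


# Hypothetical parameters: 10^8 rectangles, 10^4 rectangles per block,
# two dimensions, and a query with empty output (T = 0).
N, B, d, T = 100_000_000, 10_000, 2, 0
bound = pr_tree_query_bound(N, B, d, T)  # (10^4)^(1/2) block reads
leaves = leaf_count(N, B)                # 10^4 leaves
```

Under these assumed values the worst-case guarantee is about two orders of magnitude below the number of leaves, which is the difference the abstract refers to for queries with T = 0.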
Implementing I/O-Efficient Data Structures Using TPIE
 In Proc. European Symposium on Algorithms
, 2002
Cited by 32 (6 self)
In recent years, many theoretically I/O-efficient algorithms and data structures have been developed. The TPIE project at Duke University was started to investigate the practical importance of these theoretical results. The goal of this ongoing project is to provide a portable, extensible, flexible, and easy-to-use C++ programming environment for efficiently implementing I/O algorithms and data structures. The TPIE library has been developed in two phases. The first phase focused on supporting algorithms with a sequential I/O pattern, while the recently developed second phase has focused on supporting online I/O-efficient data structures, which exhibit a more random I/O pattern. This paper describes the design and implementation of the second phase of TPIE.
Bkd-tree: A dynamic scalable kd-tree
 In Proc. International Symposium on Spatial and Temporal Databases
, 2003
Cache-oblivious data structures for orthogonal range searching
 In Proc. ACM Symposium on Computational Geometry
, 2003
Cited by 22 (6 self)
We develop cache-oblivious data structures for orthogonal range searching, the problem of finding all T points in a set of N points in ℝ^d lying in a query hyper-rectangle. Cache-oblivious data structures are designed to be efficient in arbitrary memory hierarchies. We describe a dynamic linear-size data structure that answers d-dimensional queries in O((N/B)^{1-1/d} + T/B) memory transfers, where B is the block size of any two levels of a multilevel memory hierarchy. A point can be inserted into or deleted from this data structure in O(log_B^2 N) memory transfers. We also develop a static structure for the two-dimensional case that answers queries in O(log_B N + T/B) memory transfers using O(N log_2^2 N) space. The analysis of the latter structure requires that B = 2^{2^c} for some non-negative integer constant c.
iWalk: Interactive out-of-core rendering of large models
, 2002
Cited by 20 (8 self)
We present iWalk, a system for interactive out-of-core rendering of large models on an inexpensive PC. The system uses a new out-of-core preprocessing algorithm and a new multi-threaded out-of-core rendering approach. The out-of-core preprocessing algorithm is incremental and fast, and it builds an on-disk hierarchical representation for a model larger than main memory. The out-of-core rendering approach uses multiple threads to overlap rendering, visibility computation, and disk operations. A rendering thread uses a from-point visibility algorithm to find the nodes of the model hierarchy that the user sees, and sends fetch requests to a geometry cache, which reads nodes from disk into memory. To avoid bursts of disk operations, a look-ahead thread guesses the nodes that the user may see next, and sends prefetch requests to the geometry cache. The system can run in approximate mode for interactive rendering, or in conservative mode for rendering with guaranteed accuracy. On a commodity PC, iWalk can preprocess a 13-million-polygon model in 17 minutes, and then render it in approximate mode with 98% accuracy at 9 frames per second. Thus, iWalk allows us to use an inexpensive PC to visualize models that would typically require expensive high-end graphics workstations or parallel machines.
Cache-Oblivious R-Trees
, 2005
Cited by 11 (3 self)
We develop a cache-oblivious data structure for storing a set S of N axis-aligned rectangles in the plane, such that all rectangles in S intersecting a query rectangle or point can be found efficiently. Our structure is an axis-aligned bounding-box hierarchy and as such it is the first cache-oblivious R-tree with provable performance guarantees. If no point in the plane is contained in B or more rectangles in S, the structure answers a rectangle query using O(\sqrt{N/B} + T/B) memory transfers and a point query using O((N/B)^ε) memory transfers for any ε > 0, where B is the block size of memory transfers between any two levels of a multilevel memory hierarchy. We also develop a variant of our structure that achieves the same performance on input sets with arbitrary overlap among the rectangles. The rectangle query bound matches the bound of the best known linear-space cache-aware structure.
From point cloud to grid DEM: A scalable approach
 In Proc. 12th International Symposium on Spatial Data Handling
, 2006
Cited by 9 (8 self)
Summary. Given a set S of points in ℝ^3 sampled from an elevation function H: ℝ^2 → ℝ, we present a scalable algorithm for constructing a grid digital elevation model (DEM). Our algorithm consists of three stages: First, we construct a quad tree on S to partition the point set into a set of non-overlapping segments. Next, for each segment q, we compute the set of points in q and all segments neighboring q. Finally, we interpolate each segment independently using points within the segment and its neighboring segments. Data sets acquired by LIDAR and other modern mapping technologies consist of hundreds of millions of points and are too large to fit in main memory. When processing such massive data sets, the transfer of data between disk and main memory (also called I/O), rather than the CPU time, becomes the performance bottleneck. We therefore present an I/O-efficient algorithm for constructing a grid DEM. Our experiments show that the algorithm scales to data sets much larger than the size of main memory, while existing algorithms do not scale. For example, using a machine with 1GB RAM, we were able to construct a grid DEM containing 1.3 billion cells (occupying 1.2GB) from a LIDAR data set of over 390 million points (occupying 20GB) in about 53 hours. Neither ArcGIS nor GRASS, two popular GIS products, was able to process this data set.
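The first of the three stages, partitioning the points with a quad tree, might look roughly like the sketch below. It is a plain in-memory recursion, not the I/O-efficient construction the paper describes, and the names `quadtree_segments`, `k_max`, and `max_depth` are assumptions introduced for the example. Points are assumed to lie inside the half-open root box.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]       # (x, y, elevation)
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), half-open


def quadtree_segments(points: List[Point], box: Box, k_max: int,
                      max_depth: int = 32) -> List[Tuple[Box, List[Point]]]:
    """Split box into quadrants until each leaf holds at most k_max points.

    Returns the leaves as (box, points) pairs; the boxes do not overlap.
    max_depth guards against endless splitting on duplicate points.
    """
    if len(points) <= k_max or max_depth == 0:
        return [(box, points)]
    xmin, ymin, xmax, ymax = box
    xmid, ymid = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    quadrants = [
        (xmin, ymin, xmid, ymid), (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax), (xmid, ymid, xmax, ymax),
    ]
    leaves: List[Tuple[Box, List[Point]]] = []
    for q in quadrants:
        # Half-open test, so a point on a split line lands in exactly one quadrant.
        inside = [p for p in points
                  if q[0] <= p[0] < q[2] and q[1] <= p[1] < q[3]]
        leaves.extend(quadtree_segments(inside, q, k_max, max_depth - 1))
    return leaves


# Hypothetical sample: six elevation points in the box [0, 8) x [0, 8).
pts = [(1.0, 1.0, 10.0), (1.5, 1.2, 11.0), (6.0, 1.0, 12.0),
       (2.0, 6.0, 13.0), (7.0, 7.0, 14.0), (6.5, 6.5, 15.0)]
segments = quadtree_segments(pts, (0.0, 0.0, 8.0, 8.0), k_max=2)
```

The later stages would then gather, for each leaf, the points of the leaves whose boxes touch it, and interpolate grid cells inside the leaf from that combined set.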
Hierarchical Graph Indexing
 Proc. of Int’l Conf. on Information and Knowledge Management (CIKM)
, 2003
Cited by 4 (1 self)
Traffic analysis, in the context of Telecommunications or Internet and Web data, is crucial for large network operations. Data in such networks is often provided as large graphs with hundreds of millions of vertices and edges. We propose efficient techniques for managing such graphs at the storage level in order to facilitate their processing at the interface level (visualization). The methods are based on a hierarchical decomposition of the graph edge set that is inherited from a hierarchical decomposition of the vertex set. Real-time navigation is provided by an efficient two-level indexing schema called the gkd*-tree. The first level is a variation of a kd-tree index that partitions the edge set in a way that conforms to the hierarchical decomposition and the data distribution (the gkd-tree). The second level is a redundant R*-tree that indexes the leaf pages of the gkd-tree. We provide computational results that illustrate the superiority of the gkd*-tree over conventional indexes like the kd-tree and the R*-tree in both creation and query response times.