External Memory Data Structures
2001
Cited by 79 (36 self)
In many massive dataset applications the data must be stored in space- and query-efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worst-case efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
I/O-Efficient Algorithms for Problems on Grid-based Terrains (Extended Abstract)
In Proc. Workshop on Algorithm Engineering and Experimentation, 2000
Cited by 31 (14 self)
Lars Arge, Laura Toma, Jeffrey Scott Vitter. Center for Geometric Computing, Department of Computer Science, Duke University, Durham, NC 27708-0129.
Abstract: The potential and use of Geographic Information Systems (GIS) is rapidly increasing due to the increasing availability of massive amounts of geospatial data from projects like NASA's Mission to Planet Earth. However, the use of these massive datasets also exposes scalability problems with existing GIS algorithms. These scalability problems are mainly due to the fact that most GIS algorithms have been designed to minimize internal computation time, while I/O communication is often the bottleneck when processing massive amounts of data.
On External-Memory Planar Depth First Search
Journal of Graph Algorithms and Applications
Cited by 24 (15 self)
Even though a large number of I/O-efficient graph algorithms have been developed, a number of fundamental problems still remain open. For example, no space- and I/O-efficient algorithms are known for depth-first search or breadth-first search in sparse graphs. In this paper we present two new results on I/O-efficient depth-first search in an important class of sparse graphs, namely undirected embedded planar graphs. We develop a new efficient depth-first search algorithm and show how planar depth-first search in general can be reduced to planar breadth-first search. As part of the first result we develop the first I/O-efficient algorithm for finding a simple cycle separator of a biconnected planar graph. Together with other recent reducibility results, the second result provides further evidence that external memory breadth-first search is among the hardest problems on planar graphs.
Finding Shortest Paths in Large Network Systems
In Proceedings of the ninth ACM international symposium on Advances in geographic information systems, 2001
Cited by 13 (2 self)
This paper describes a disk-based algorithm for finding shortest paths in a large network system. It employs a strategy of processing the network piece by piece and is based on new algorithms for graph partitioning and for finding shortest paths that overcome the problems of existing approaches. To show that it is scalable to large network systems and adaptable to different computing environments, seven states in Tiger/Line files are extracted as test cases and run on machines with different configurations. The running time for finding the shortest path depends primarily on the power of the underlying systems. Moreover, running the algorithm optimally does not require a large amount of memory, even for a very large network system such as the road system of several states in Tiger/Line files. To evaluate its performance, the New Mexico state road system is used as the test case and the algorithm is compared with Dijkstra's algorithm. The average running time of the proposed algorithm is, in the worst case, about two and a half times slower than that of Dijkstra's algorithm, provided that for Dijkstra's algorithm the whole graph fits into main memory and is loaded in advance. If the I/O time for loading the whole graph is counted, the proposed algorithm is faster in essentially all cases.
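For orientation, the main-memory baseline mentioned in this abstract can be sketched as follows. This is a minimal textbook version of Dijkstra's algorithm with a binary heap, not the paper's disk-based method; the adjacency-list format and the `toy_roads` graph are assumptions made purely for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest-path distances from source in an in-memory graph.

    graph maps each node to a list of (neighbor, nonnegative_weight) pairs.
    The whole graph is assumed to fit in main memory.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical four-node road network, for illustration only.
toy_roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
print(dijkstra(toy_roads, "A"))
```

The sketch assumes that loading `graph` costs nothing, which is exactly the condition the abstract attaches to its running-time comparison.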
Optimization and evaluation of shortest path queries
VLDB J
Cited by 10 (0 self)
We investigate the problem of how to efficiently evaluate a collection of shortest path queries on massive graphs that are too big to fit in main memory. To evaluate a shortest path query efficiently, we introduce two pruning algorithms. These algorithms differ in the extent to which shortest path costs are materialized and in how the search space is pruned. By grouping shortest path queries properly, batch processing improves the performance of shortest path query evaluation. We also extensively study fragment sizes, cache sizes, and query types, all of which we show affect the performance of a disk-based shortest path algorithm. The performance and scalability of the proposed techniques are evaluated with large road systems in the Eastern United States. To demonstrate that the proposed disk-based algorithms are viable, we show that their search times are significantly better than that of main-memory Dijkstra's algorithm.
The Complexity of Flow on Fat Terrains and its I/O-Efficient Computation
Cited by 2 (1 self)
We study the complexity and the I/O-efficient computation of flow on triangulated terrains. We present an acyclic graph, the descent graph, that enables us to trace flow paths in triangulations I/O-efficiently. We use the descent graph to obtain I/O-efficient algorithms for computing river networks and watershed-area maps in $O(\mathrm{Sort}(d + r))$ I/Os, where $r$ is the complexity of the river network and $d$ that of the descent graph. Furthermore, we describe a data structure based on the subdivision of the terrain induced by the edges of the triangulation and paths of steepest ascent and descent from its vertices. This data structure can be used to report the boundary of the watershed of a query point $q$ or the flow path from $q$ in $O(l(s) + \mathrm{Scan}(k))$ I/Os, where $s$ is the complexity of the subdivision underlying the data structure, $l(s)$ is the number of I/Os used for planar point location in this subdivision, and $k$ is the size of the reported output. On $\alpha$-fat terrains, that is, triangulated terrains where the minimum angle of any triangle is bounded from below by $\alpha$, we show that the worst-case complexity of the descent graph and of any path of steepest descent is $O(n/\alpha^2)$, where $n$ is the number of triangles in the terrain. The worst-case complexity of the river network and the above-mentioned data structure on such terrains is $O(n^2/\alpha^2)$. When $\alpha$ is a positive constant, this improves the corresponding bounds for arbitrary terrains by a linear factor. We prove that similar bounds cannot be proven for Delaunay triangulations: these can have river networks of complexity $\Theta(n^3)$.
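The abstract uses Sort and Scan without defining them. In the standard I/O model, Scan(N) = Θ(N/B) and Sort(N) = Θ((N/B) log_{M/B}(N/B)), where B is the block size and M the main-memory size. The sketch below turns these bounds into rough estimates with the hidden constants dropped; the function names and the sample parameters are assumptions for illustration, not values from the paper.

```python
def scan_ios(n, block_size):
    """Estimate Scan(n): the number of blocks needed to read n items once."""
    return (n + block_size - 1) // block_size

def sort_ios(n, memory_size, block_size):
    """Estimate Sort(n) as (n/B) * ceil(log_{M/B}(n/B)), constants dropped."""
    blocks = scan_ios(n, block_size)
    fan_in = memory_size // block_size  # blocks that fit in memory at once
    if fan_in < 2:
        raise ValueError("memory must hold at least two blocks")
    # Smallest number of passes p such that fan_in**p >= blocks.
    passes, reach = 1, fan_in
    while reach < blocks:
        reach *= fan_in
        passes += 1
    return blocks * passes

# Hypothetical machine: one million items, blocks of 1,000, memory of 100,000.
print(scan_ios(1_000_000, 1_000))           # 1000
print(sort_ios(1_000_000, 100_000, 1_000))  # 2000
```

With these sample parameters sorting costs only about twice as much as scanning, which illustrates why a bound of the form O(Sort(d + r)) is close to the cost of reading the input.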
External Data Structures for Shortest Path Queries on Planar Digraphs
Cited by 1 (0 self)
In this paper we present space-query tradeoffs for external memory data structures that answer shortest path queries on planar directed graphs. For any $S = \Omega(N^{1+\epsilon})$ and $S = O(N^2/B)$, our main result is a family of structures that use $S$ space and answer queries in $O(N^2/(SB))$ I/Os, thus obtaining optimal space-query product $O(N^2/B)$. An $S$ space structure can be constructed in $O(\sqrt{S} \cdot \mathrm{sort}(N))$ I/Os, where $\mathrm{sort}(N)$ is the number of I/Os needed to sort $N$ elements, $B$ is the disk block size, and $N$ is the size of the graph.
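The tradeoff stated in this abstract can be checked with a small worked computation. The sketch below evaluates the query bound N^2/(SB) for a few space budgets S and confirms that the space-query product stays at N^2/B. Constants hidden by the O-notation are dropped, and the values of N, B, and S are assumptions chosen only to make the arithmetic easy to follow.

```python
def query_ios(n, space, block_size):
    """Query bound N^2 / (S * B) from the abstract, constants dropped."""
    return n * n / (space * block_size)

n = 10**6           # hypothetical graph size N
block_size = 10**3  # hypothetical disk block size B

for space in (10**7, 10**8, 10**9):
    q = query_ios(n, space, block_size)
    # The space-query product S * q equals N^2 / B for every admissible S.
    print(space, q, space * q)
```

Each tenfold increase in space cuts the query bound by a factor of ten, down to a single I/O at S = N^2/B, the upper end of the admissible range.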
Cache-Oblivious String Dictionaries
We present static cache-oblivious dictionary structures for strings which provide analogues of tries and suffix trees in the cache-oblivious model. Our construction takes as input either a set of strings to store, a single string for which all suffixes are to be stored, a trie, a compressed trie, or a suffix tree, and creates a cache-oblivious data structure which performs prefix queries in $O(\log_B n + |P|/B)$ I/Os, where $n$ is the number of leaves in the trie, $P$ is the query string, and $B$ is the block size. This query cost is optimal for unbounded alphabets. The data structure uses linear space.
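To make the notion of a prefix query concrete, here is a minimal in-memory trie. It is a plain pointer-based structure, not the cache-oblivious layout the paper constructs, and it makes no attempt to match the stated I/O bound; the class and function names are assumptions for illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps a character to the child node
        self.is_end = False  # True if a stored string ends at this node

def build_trie(strings):
    """Insert every string into a fresh trie and return its root."""
    root = TrieNode()
    for s in strings:
        node = root
        for ch in s:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True
    return root

def prefix_query(root, prefix):
    """Return all stored strings that start with prefix, in sorted order."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []  # no stored string has this prefix
    results, stack = [], [(node, prefix)]
    while stack:
        cur, text = stack.pop()
        if cur.is_end:
            results.append(text)
        for ch, child in cur.children.items():
            stack.append((child, text + ch))
    return sorted(results)

trie = build_trie(["car", "card", "care", "dog"])
print(prefix_query(trie, "car"))  # ['car', 'card', 'care']
```

In this naive layout each child pointer may land in a different disk block, so walking down the query string can cost one I/O per character. That gap is what a blocked, cache-oblivious layout is designed to close.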