Results 1 - 10
of
41
A Functional Approach to External Graph Algorithms
- Algorithmica
, 1998
"... . We present a new approach for designing external graph algorithms and use it to design simple external algorithms for computing connected components, minimum spanning trees, bottleneck minimum spanning trees, and maximal matchings in undirected graphs and multi-graphs. Our I/O bounds compete w ..."
Abstract
-
Cited by 81 (1 self)
- Add to MetaCart
. We present a new approach for designing external graph algorithms and use it to design simple external algorithms for computing connected components, minimum spanning trees, bottleneck minimum spanning trees, and maximal matchings in undirected graphs and multi-graphs. Our I/O bounds compete with those of previous approaches. Unlike previous approaches, ours is purely functional---without side effects---and is thus amenable to standard checkpointing and programming language optimization techniques. This is an important practical consideration for applications that may take hours to run. 1 Introduction We present a divide-and-conquer approach for designing external graph algorithms, i.e., algorithms on graphs that are too large to fit in main memory. Our approach is simple to describe and implement: it builds a succession of graph transformations that reduce to sorting, selection, and a recursive bucketing technique. No sophisticated data structures are needed. We apply our t...
External Memory Data Structures
, 2001
"... In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worst-case efficient external memory dynami ..."
Abstract
-
Cited by 78 (34 self)
- Add to MetaCart
In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worst-case efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
Cache-oblivious priority queue and graph algorithm applications
- In Proc. 34th Annual ACM Symposium on Theory of Computing
, 2002
"... In this paper we develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hi ..."
Abstract
-
Cited by 56 (10 self)
- Add to MetaCart
In this paper we develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cache-oblivious data structure, M and B are not used in the description of the structure. The bounds match the bounds of several previously developed external-memory (cache-aware) priority queue data structures, which all rely crucially on knowledge about M and B. Priority queues are a critical component in many of the best known external-memory graph algorithms, and using our cache-oblivious priority queue we develop several cacheoblivious graph algorithms.
On External Memory Graph Traversal
- IN PROC. ACM-SIAM SYMP. ON DISCRETE ALGORITHMS
, 2000
"... We describe a new external memory data structure, the buffered repository tree, and use it to provide the first non-trivial external memory algorithm for directed breadth-first search (BFS) and an improved external algorithm for directed depth-first search. We also demonstrate the equivalence of var ..."
Abstract
-
Cited by 54 (1 self)
- Add to MetaCart
We describe a new external memory data structure, the buffered repository tree, and use it to provide the first non-trivial external memory algorithm for directed breadth-first search (BFS) and an improved external algorithm for directed depth-first search. We also demonstrate the equivalence of various formulations of external undirected BFS, and we use these to give the first I/O-optimal BFS algorithm for undirected trees.
Efficient External Memory Algorithms by Simulating Coarse-Grained Parallel Algorithms
, 2003
"... External memory (EM) algorithms are designed for large-scale computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Typical EM algorithms are specially crafted for the EM situation. In the past, several attempts have been made to ..."
Abstract
-
Cited by 39 (10 self)
- Add to MetaCart
External memory (EM) algorithms are designed for large-scale computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Typical EM algorithms are specially crafted for the EM situation. In the past, several attempts have been made to relate the large body of work on parallel algorithms to EM, but with limited success. The combination of EM computing, on multiple disks, with multiprocessor parallelism has been posted as a challenge by the ACMWorking Group on Storage I/O for Large-Scale Computing.
External-memory breadth-first search with sublinear I/O
- IN PROCEEDINGS OF THE 10TH ANNUAL EUROPEAN SYMPOSIUM ON ALGORITHMS
, 2002
"... Breadth-first search (BFS) is a basic graph exploration technique. We give the first external memory algorithm for sparse undirected graphs with sublinear I/O. The best previous algorithm requires \Theta (n + n+mD\Delta B \Delta logM=B n+mB) I/Os on a graph with n nodes and m edges and a machine w ..."
Abstract
-
Cited by 38 (11 self)
- Add to MetaCart
Breadth-first search (BFS) is a basic graph exploration technique. We give the first external memory algorithm for sparse undirected graphs with sublinear I/O. The best previous algorithm requires \Theta (n + n+mD\Delta B \Delta logM=B n+mB) I/Os on a graph with n nodes and m edges and a machine with main-memory of size M, D parallel disks, and block size B. We present two versions of a new algorithm which requires only O i (p 1D\Delta B + p nm) \Delta n+mpD\Delta B \Delta logM=B n+mB
On External Memory MST, SSSP and Multi-way Planar Graph Separation (Extended Abstract)
, 2000
"... Recently external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/O-efficient graph algorithms have been developed, a number of fundamental problems still remain ..."
Abstract
-
Cited by 30 (9 self)
- Add to MetaCart
Recently external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/O-efficient graph algorithms have been developed, a number of fundamental problems still remain open. In this paper we develop improved algorithms for the problem of computing a minimum spanning tree of a general graph G = (V; E), as well as new algorithms for the single source shortest paths and the multi-way graph separation problems on planar graphs.
STXXL: Standard template library for XXL data sets
- In: Proc. of ESA 2005. Volume 3669 of LNCS
, 2005
"... for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/O-efficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in ..."
Abstract
-
Cited by 30 (4 self)
- Add to MetaCart
for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/O-efficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in academic and industrial environments for a range of problems including text processing, graph algorithms, computational geometry, gaussian elimination, visualization, and analysis of microscopic images, differential cryptographic analysis, etc. The performance of STXXL and its applications is evaluated on synthetic and real-world inputs. We present the design of the library, how its performance features are supported, and demonstrate how the library integrates with STL. KEY WORDS: very large data sets; software library; C++ standard template library; algorithm engineering 1.
I/O-Efficient Algorithms for Problems on Grid-based Terrains (Extended Abstract)
- In Proc. Workshop on Algorithm Engineering and Experimentation
, 2000
"... Lars Arge Laura Toma Jeffrey Scott Vitter Center for Geometric Computing Department of Computer Science Duke University Durham, NC 27708--0129 Abstract The potential and use of Geographic Information Systems (GIS) is rapidly increasing due to the increasing availability of massive amoun ..."
Abstract
-
Cited by 28 (13 self)
- Add to MetaCart
Lars Arge Laura Toma Jeffrey Scott Vitter Center for Geometric Computing Department of Computer Science Duke University Durham, NC 27708--0129 Abstract The potential and use of Geographic Information Systems (GIS) is rapidly increasing due to the increasing availability of massive amounts of geospatial data from projects like NASA's Mission to Planet Earth. However, the use of these massive datasets also exposes scalability problems with existing GIS algorithms. These scalability problems are mainly due to the fact that most GIS algorithms have been designed to minimize internal computation time, while I/O communication often is the bottleneck when processing massive amounts of data.
Compressing pattern databases
- In Proceedings of the Nineteenth National Conference on Artificial Intelligence (AAAI04
, 2004
"... A pattern database (PDB) is a heuristic function implemented as a lookup table that stores the lengths of optimal solutions for subproblem instances. Standard PDBs have a distinct entry in the table for each subproblem instance. In this paper we investigate compressing PDBs by merging several entrie ..."
Abstract
-
Cited by 23 (12 self)
- Add to MetaCart
A pattern database (PDB) is a heuristic function implemented as a lookup table that stores the lengths of optimal solutions for subproblem instances. Standard PDBs have a distinct entry in the table for each subproblem instance. In this paper we investigate compressing PDBs by merging several entries into one, thereby allowing the use of PDBs that exceed available memory in their uncompressed form. We introduce a number of methods for determining which entries to merge and discuss their relative merits. These vary from domainindependent approaches that allow any set of entries in the PDB to be merged, to more intelligent methods that take into account the structure of the problem. The choice of the best compression method is based on domain-dependent attributes. We present experimental results on a number of combinatorial problems, including the four-peg Towers of Hanoi problem, the sliding-tile puzzles, and the Top-Spin puzzle. For the Towers of Hanoi, we show that the search time can be reduced by up to three orders of magnitude by using compressed PDBs compared to uncompressed PDBs of the same size. More modest improvements were observed for the other domains.

