External-Memory Graph Algorithms
, 1995
Cited by 159 (22 self)
We present a collection of new techniques for designing and analyzing efficient external-memory algorithms for graph problems and illustrate how these techniques can be applied to a wide variety of specific problems. Our results include:
• Proximate-neighboring. We present a simple method for deriving external-memory lower bounds via reductions from a problem we call the "proximate neighbors" problem. We use this technique to derive non-trivial lower bounds for such problems as list ranking, expression tree evaluation, and connected components.
• PRAM simulation. We give methods for efficiently simulating PRAM computations in external memory, even for some cases in which the PRAM algorithm is not work-optimal. We apply this to derive a number of optimal (and simple) external-memory graph algorithms.
• Time-forward processing. We present a general technique for evaluating circuits (or "circuit-like" computations) in external memory. We also use this in a deterministic list rank...
Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems
, 1994
Cited by 59 (19 self)
This paper presents asymptotically equal lower and upper bounds for the number of parallel I/O operations required to perform bit-matrix-multiply/complement (BMMC) permutations on the Parallel Disk Model proposed by Vitter and Shriver. A BMMC permutation maps a source index to a target index by an affine transformation over GF(2), where the source and target indices are treated as bit vectors. The class of BMMC permutations includes many common permutations, such as matrix transposition (when dimensions are powers of 2), bit-reversal permutations, vector-reversal permutations, hypercube permutations, matrix reblocking, Gray-code permutations, and inverse Gray-code permutations. The upper bound improves upon the asymptotic bound in the previous best known BMMC algorithm and upon the constant factor in the previous best known bit-permute/complement (BPC) permutation algorithm. The algorithm achieving the upper bound uses basic linear-algebra techniques to factor the characteristic matrix...
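The definition in the abstract above (target index = affine transformation of the source index over GF(2)) can be illustrated with a small in-memory sketch. This is not the paper's I/O algorithm; it only shows the index mapping y = A·x ⊕ c. The function names and the row-mask encoding of the matrix A are assumptions made for this illustration.

```python
def bmmc_target(x, rows, c, n):
    """Map source index x to target index y = A*x XOR c over GF(2).

    rows[i] is an n-bit mask holding row i of the (hypothetical) matrix A:
    bit j of rows[i] is the entry A[i][j]. Bit i of A*x is the parity of
    the bits selected by rows[i] & x.
    """
    y = 0
    for i in range(n):
        parity = bin(rows[i] & x).count("1") & 1
        y |= parity << i
    return y ^ c


def bit_reversal_rows(n):
    # Row i has a single 1 in column n-1-i, so target bit i = source bit n-1-i.
    return [1 << (n - 1 - i) for i in range(n)]


def identity_rows(n):
    # Row i has a single 1 in column i.
    return [1 << i for i in range(n)]


n = 3
# Bit reversal: A is the anti-diagonal matrix, complement vector c = 0.
bit_reversal = [bmmc_target(x, bit_reversal_rows(n), 0, n) for x in range(2 ** n)]
# Vector reversal: A is the identity, c is all ones (every bit complemented).
vector_reversal = [bmmc_target(x, identity_rows(n), 2 ** n - 1, n) for x in range(2 ** n)]
```

For the mapping to be a permutation, A must be nonsingular over GF(2); both example matrices here are permutation matrices, so that holds.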
Integrating Theory and Practice in Parallel File Systems
- PROCEEDINGS OF THE 1993 DAGS/PC SYMPOSIUM (THE DARTMOUTH INSTITUTE FOR ADVANCED GRADUATE STUDIES
, 1993
Cited by 48 (11 self)
Several algorithms for parallel disk systems have appeared in the literature recently, and they are asymptotically optimal in terms of the number of disk accesses. Scalable systems with parallel disks must be able to run these algorithms. We present for the first time a list of capabilities that must be provided by the system to support these optimal algorithms: control over declustering, querying about the configuration, independent I/O, and turning off parity, file caching, and prefetching. We summarize recent theoretical and empirical work that justifies the need for these capabilities. In addition, we sketch an organization for a parallel file interface with low-level primitives and higher-level operations.
Efficient External-Memory Data Structures and Applications
, 1996
Cited by 38 (12 self)
In this thesis we study the Input/Output (I/O) complexity of large-scale problems arising, e.g., in the areas of database systems, geographic information systems, VLSI design systems and computer graphics, and design I/O-efficient algorithms for them. A general theme in our work is to design I/O-efficient algorithms through the design of I/O-efficient data structures. One of our philosophies is to try to isolate all the I/O-specific parts of an algorithm in the data structures, that is, to try to design I/O algorithms from internal memory algorithms by exchanging the data structures used in internal memory with their external memory counterparts. The results in the thesis include a technique for transforming an internal memory tree data structure into an external data structure which can be used in a batched dynamic setting, that is, a setting where, for example, we do not require that the result of a search operation is returned immediately. Using this technique we develop batched dynamic external versions of the (one-dimensional) range-tree and the segment-tree and we develop an external priority queue. Following our general philosophy we show how these structures can be used in standard internal memory sorting algorithms
ViC*: A preprocessor for virtual-memory C*
, 1994
Cited by 38 (3 self)
This paper describes the functionality of ViC*, a compiler-like preprocessor for out-of-core C*. The input to ViC* is a C* program but with certain shapes declared out-of-core, which means that all parallel variables of these shapes reside on disk. The output is a standard C* program with the appropriate I/O and library calls added for efficient access to out-of-core parallel variables.
I/O-Efficient Scientific Computation Using TPIE
- In Proceedings of the Goddard Conference on Mass Storage Systems and Technologies, NASA Conference Publication 3340, Volume II
, 1995
Cited by 33 (10 self)
In recent years, I/O-efficient algorithms for a wide variety of problems have appeared in the literature. Thus far, however, systems specifically designed to assist programmers in implementing such algorithms have remained scarce. TPIE is a system designed to fill this void. It supports I/O-efficient paradigms for problems from a variety of domains, including computational geometry, graph algorithms, and scientific computation. The TPIE interface frees programmers from having to deal not only with explicit read and write calls, but also with the complex memory management that must be performed for I/O-efficient computation.
Experiments on the Practical I/O Efficiency of Geometric Algorithms: Distribution Sweep vs. Plane Sweep
, 1995
Cited by 25 (7 self)
We present an extensive experimental study comparing the performance of four algorithms for the following orthogonal segment intersection problem: given a set of horizontal and vertical line segments in the plane, report all intersecting horizontal-vertical pairs. The problem has important applications in VLSI layout and graphics, which are large-scale in nature. The algorithms under evaluation are distribution sweep and three variations of plane sweep. Distribution sweep is specifically designed for the situations in which the problem is too large to be solved in internal memory, and theoretically has optimal I/O cost. Plane sweep is a well-known and powerful technique in computational geometry, and is optimal for this particular problem in terms of internal computation. The three variations of plane sweep differ by the sorting methods (external vs. internal sorting) used in the preprocessing phase and the dynamic data structures (B tree vs. 2-3-4 tree) used in the sweeping ...
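The orthogonal segment intersection problem stated in the abstract above can be sketched with a plain internal-memory plane sweep. This is only an illustration of the problem and the sweep idea, not any of the four implementations the paper compares: it keeps the active set in a sorted Python list searched with `bisect` rather than a B tree or 2-3-4 tree, and the tuple layouts and function name are assumptions.

```python
import bisect


def orthogonal_intersections(horizontals, verticals):
    """Report all intersecting (horizontal, vertical) index pairs.

    horizontals: list of (y, x1, x2) with x1 <= x2   (assumed layout)
    verticals:   list of (x, y1, y2) with y1 <= y2   (assumed layout)
    Segments are treated as closed, so touching endpoints count.
    """
    INSERT, QUERY, DELETE = 0, 1, 2  # tie order at equal x: insert, query, delete
    events = []
    for i, (y, x1, x2) in enumerate(horizontals):
        events.append((x1, INSERT, i))
        events.append((x2, DELETE, i))
    for j, (x, y1, y2) in enumerate(verticals):
        events.append((x, QUERY, j))
    events.sort()

    active = []  # sorted list of (y, horizontal_index) crossing the sweep line
    result = []
    for x, kind, idx in events:
        if kind == INSERT:
            bisect.insort(active, (horizontals[idx][0], idx))
        elif kind == DELETE:
            pos = bisect.bisect_left(active, (horizontals[idx][0], idx))
            del active[pos]
        else:
            _, y1, y2 = verticals[idx]
            lo = bisect.bisect_left(active, (y1, -1))
            hi = bisect.bisect_right(active, (y2, len(horizontals)))
            result.extend((h, idx) for _, h in active[lo:hi])
    return result


# Hypothetical input: two horizontal and three vertical segments.
hs = [(1, 0, 4), (3, 2, 6)]
vs = [(3, 0, 5), (5, 2, 4), (7, 0, 5)]
pairs = sorted(orthogonal_intersections(hs, vs))
```

The sweep moves left to right over x; each vertical segment becomes a range query on the y-coordinates of the horizontals currently crossing the sweep line.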
External-Memory Algorithms with Applications in Geographic Information Systems
- Algorithmic Foundations of GIS
, 1997
Cited by 24 (9 self)
In the design of algorithms for large-scale applications it is essential to consider the problem of minimizing Input/Output (I/O) communication. Geographical information systems (GIS) are good examples of such large-scale applications as they frequently handle huge amounts of spatial data. In this note we survey the recent developments in external-memory algorithms with applications in GIS. First we discuss the Aggarwal-Vitter I/O-model and illustrate why normal internal-memory algorithms for even very simple problems can perform terribly in an I/O-environment. Then we describe the fundamental paradigms for designing I/O-efficient algorithms by using them to design efficient sorting algorithms. We then go on to survey external-memory algorithms for computational geometry problems -- with special emphasis on problems with applications in GIS -- and techniques for designing such algorithms: Using the orthogonal line segment intersection problem we illustrate the distribution-sweeping and ...
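As a rough illustration of the I/O model mentioned in the abstract above, the following sketch evaluates the standard asymptotic expressions for scanning, N/B, and sorting, (N/B)·log_{M/B}(N/B), with constant factors dropped. The parameter values and function names are hypothetical, chosen only to make the arithmetic concrete.

```python
import math


def scan_ios(n, b):
    """Block transfers needed to read n items in blocks of b items."""
    return math.ceil(n / b)


def sort_ios(n, m, b):
    """(N/B) * log_{M/B}(N/B), constants dropped; assumes m >= 2 * b."""
    blocks = math.ceil(n / b)
    fan = m // b  # number of blocks that fit in internal memory
    passes = max(1, math.ceil(math.log2(blocks) / math.log2(fan)))
    return blocks * passes


# Hypothetical parameters: N = 2^30 items, M = 2^20 items, B = 2^10 items.
N, M, B = 2 ** 30, 2 ** 20, 2 ** 10
scan_cost = scan_ios(N, B)       # one I/O per block
sort_cost = sort_ios(N, M, B)    # two passes over the blocks here
random_access_cost = N           # one I/O per item in the worst case
```

With these values, touching every item through unstructured random accesses costs about B = 1024 times as many I/Os as a sequential scan, which is the kind of gap the abstract alludes to.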
Early experiences in evaluating the Parallel Disk Model with the ViC* implementation
, 1996
Cited by 19 (6 self)
Although several algorithms have been developed for the Parallel Disk Model (PDM), few have been implemented. Consequently, little has been known about the accuracy of the PDM in measuring I/O time and total running time to perform an out-of-core computation. This paper analyzes timing results on multiple-disk platforms for two PDM algorithms, out-of-core radix sort and BMMC permutations, to determine the strengths and weaknesses of the PDM. The results indicate the following. First, good PDM algorithms are usually not I/O bound. Second, of the four PDM parameters, one (problem size) is a good indicator of I/O time and running time, one (memory size) is a good indicator of I/O time but not necessarily running time, and the other two (block size and number of disks) do not necessarily indicate either I/O or running time. Third, because PDM algorithms tend not to be I/O bound, using asynchronous I/O can reduce I/O wait times significantly. The software interface to the PDM is part of the ViC* run-time library. The interface is a set of wrappers that are designed to be both efficient and portable across several underlying file systems and target machines.
Markov Analysis of Multiple-Disk Prefetching Strategies for External Merging
- Theoretical Computer Science
, 1994
Cited by 19 (7 self)
Multiple-disk organizations can be used to improve the I/O performance of problems like external merging. Concurrency can be introduced by overlapping I/O requests at different disks and by prefetching additional blocks on each I/O operation. To support this prefetching, a memory cache is required. Markov models for two prefetching strategies are developed and analyzed. Closed-form expressions for the average parallelism obtainable for a given cache size and number of disks are derived for both prefetching strategies. These analytic results are confirmed by simulation. Keywords: Parallel I/O, Prefetching, Disk Cache, External Merging, Declustered Disks, Markov Chains. To appear in Theoretical Computer Science. Short version in 1992 Intl. Conf. on Parallel Processing. Partially supported by an NSF Graduate Research Fellowship while at the ECE Department, Rice University. Partially supported by NSF Research Initiation Award CCR 9010534. Partially supported by NSF and D...

