External-Memory Graph Algorithms
, 1995
Cited by 175 (24 self)
We present a collection of new techniques for designing and analyzing efficient external-memory algorithms for graph problems and illustrate how these techniques can be applied to a wide variety of specific problems. Our results include:
• Proximate-neighboring. We present a simple method for deriving external-memory lower bounds via reductions from a problem we call the "proximate neighbors" problem. We use this technique to derive non-trivial lower bounds for such problems as list ranking, expression tree evaluation, and connected components.
• PRAM simulation. We give methods for efficiently simulating PRAM computations in external memory, even for some cases in which the PRAM algorithm is not work-optimal. We apply this to derive a number of optimal (and simple) external-memory graph algorithms.
• Time-forward processing. We present a general technique for evaluating circuits (or "circuit-like" computations) in external memory. We also use this in a deterministic list rank...
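The time-forward processing idea named in this abstract can be illustrated with a minimal sketch. It assumes the circuit is a DAG whose nodes are already numbered in topological order, and it uses Python's in-memory `heapq` as a stand-in for an I/O-efficient priority queue. The function name `evaluate_circuit` and its parameters are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import heapq

def evaluate_circuit(num_nodes, edges, inputs, op):
    """Evaluate a DAG whose nodes 0..num_nodes-1 are in topological order.

    edges:  list of (u, v) pairs with u < v (value of u feeds node v)
    inputs: dict mapping source nodes to their given values
    op:     function combining a list of incoming values into one value

    Assumption: heapq stands in for an external-memory priority queue,
    and the adjacency lists stand in for an edge list sorted by source.
    """
    out_edges = [[] for _ in range(num_nodes)]
    for u, v in edges:
        out_edges[u].append(v)

    pq = []        # entries are (destination node, sequence number, value)
    seq = 0        # tie-breaker so values themselves are never compared
    values = []    # kept only so the result can be returned
    for v in range(num_nodes):
        # Collect every value that was sent "forward in time" to node v.
        incoming = []
        while pq and pq[0][0] == v:
            incoming.append(heapq.heappop(pq)[2])
        value = inputs[v] if v in inputs else op(incoming)
        values.append(value)
        # Forward the new value to each successor, keyed by its node number.
        for w in out_edges[v]:
            heapq.heappush(pq, (w, seq, value))
            seq += 1
    return values

if __name__ == "__main__":
    edges = [(0, 3), (1, 3), (3, 4), (2, 4), (3, 5), (4, 5)]
    print(evaluate_circuit(6, edges, {0: 1, 1: 2, 2: 3}, sum))
```

Because nodes are visited in increasing order and each value is keyed by its destination, the minimum of the queue is always the next value needed, so no random access to earlier results is required.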
External-Memory Algorithms for Processing Line Segments in Geographic Information Systems
, 2007
Cited by 76 (30 self)
In the design of algorithms for large-scale applications it is essential to consider the problem of minimizing I/O communication. Geographical information systems (GIS) are good examples of such large-scale applications as they frequently handle huge amounts of spatial data. In this paper we develop efficient external-memory algorithms for a number of important problems involving line segments in the plane, including trapezoid decomposition, batched planar point location, triangulation, red–blue line segment intersection reporting, and general line segment intersection reporting. In GIS systems the first three problems are useful for rendering and modeling, and the latter two are frequently used for overlaying maps and extracting information from them.
Integrating Theory and Practice in Parallel File Systems
 PROCEEDINGS OF THE 1993 DAGS/PC SYMPOSIUM (THE DARTMOUTH INSTITUTE FOR ADVANCED GRADUATE STUDIES
, 1993
Cited by 51 (11 self)
Several algorithms for parallel disk systems have appeared in the literature recently, and they are asymptotically optimal in terms of the number of disk accesses. Scalable systems with parallel disks must be able to run these algorithms. We present for the first time a list of capabilities that must be provided by the system to support these optimal algorithms: control over declustering, querying about the configuration, independent I/O, and turning off parity, file caching, and prefetching. We summarize recent theoretical and empirical work that justifies the need for these capabilities. In addition, we sketch an organization for a parallel file interface with low-level primitives and higher-level operations.
Towards a theory of cache-efficient algorithms
 PROCEEDINGS OF THE SYMPOSIUM ON DISCRETE
, 2000
Cited by 47 (3 self)
We present a model that enables us to analyze the running time of an algorithm on a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our cache model, an extension of Aggarwal and Vitter’s I/O model, enables us to establish useful relationships between the cache complexity and the I/O complexity of computations. As a corollary, we obtain cache-efficient algorithms in the single-level cache model for fundamental problems like sorting, FFT, and an important subclass of permutations. We also analyze the average-case cache behavior of mergesort, show that ignoring associativity concerns could lead to inferior performance, and present supporting experimental evidence. We further extend our model to multiple levels of cache with limited associativity and present optimal algorithms for matrix transpose and sorting. Our techniques may be used for systematic
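As a rough illustration of why blocking matters for the matrix transpose problem mentioned in this abstract, the sketch below transposes a matrix tile by tile so that each tile of the source and destination is touched together. This is a generic tiled transpose, not the paper's algorithm: it does not model limited associativity, and the name `blocked_transpose` and the `tile` parameter are assumptions made for the example.

```python
def blocked_transpose(a, tile=4):
    """Return the transpose of a rectangular list-of-lists matrix.

    The matrix is processed in tile x tile sub-blocks. In a language with
    contiguous arrays, choosing `tile` so that two tiles fit in cache keeps
    both the reads and the writes local; in pure Python this only shows
    the access pattern, not the performance effect.
    """
    n_rows = len(a)
    n_cols = len(a[0]) if a else 0
    t = [[None] * n_rows for _ in range(n_cols)]
    for i0 in range(0, n_rows, tile):
        for j0 in range(0, n_cols, tile):
            # Copy one tile of `a` into the mirrored tile of `t`.
            for i in range(i0, min(i0 + tile, n_rows)):
                for j in range(j0, min(j0 + tile, n_cols)):
                    t[j][i] = a[i][j]
    return t

if __name__ == "__main__":
    print(blocked_transpose([[1, 2, 3], [4, 5, 6]], tile=2))
```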
ViC*: A preprocessor for virtual-memory C*
, 1994
Cited by 40 (3 self)
This paper describes the functionality of ViC*, a compiler-like preprocessor for out-of-core C*. The input to ViC* is a C* program but with certain shapes declared out-of-core, which means that all parallel variables of these shapes reside on disk. The output is a standard C* program with the appropriate I/O and library calls added for efficient access to out-of-core parallel variables.
Efficient External-Memory Data Structures and Applications
, 1996
Cited by 38 (12 self)
In this thesis we study the Input/Output (I/O) complexity of large-scale problems arising, e.g., in the areas of database systems, geographic information systems, VLSI design systems and computer graphics, and design I/O-efficient algorithms for them. A general theme in our work is to design I/O-efficient algorithms through the design of I/O-efficient data structures. One of our philosophies is to try to isolate all the I/O-specific parts of an algorithm in the data structures, that is, to try to design I/O algorithms from internal-memory algorithms by exchanging the data structures used in internal memory with their external-memory counterparts. The results in the thesis include a technique for transforming an internal-memory tree data structure into an external data structure which can be used in a batched dynamic setting, that is, a setting where, for example, we do not require that the result of a search operation is returned immediately. Using this technique we develop batched dynamic external versions of the (one-dimensional) range tree and the segment tree and we develop an external priority queue. Following our general philosophy we show how these structures can be used in standard internal memory sorting algorithms
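The philosophy of isolating I/O concerns inside the data structure can be sketched as a sorting routine written only against a priority-queue interface. The class `InMemoryPriorityQueue`, the function `priority_queue_sort`, and the `insert`/`delete_min` method names are hypothetical; an external priority queue exposing the same two operations could, under that assumption, be passed in without changing the sorting code.

```python
import heapq

class InMemoryPriorityQueue:
    """Stand-in queue backed by heapq; an external-memory queue with the
    same insert/delete_min interface is assumed to be interchangeable."""

    def __init__(self):
        self._heap = []

    def insert(self, item):
        heapq.heappush(self._heap, item)

    def delete_min(self):
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)

def priority_queue_sort(items, queue_factory=InMemoryPriorityQueue):
    """Sort by inserting everything, then repeatedly extracting the minimum.

    The algorithm never touches storage directly; all data movement is
    delegated to whatever queue `queue_factory` constructs.
    """
    pq = queue_factory()
    for item in items:
        pq.insert(item)
    result = []
    while len(pq) > 0:
        result.append(pq.delete_min())
    return result

if __name__ == "__main__":
    print(priority_queue_sort([5, 1, 4, 1, 3]))
```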
I/O-Efficient Scientific Computation Using TPIE
 In Proceedings of the Goddard Conference on Mass Storage Systems and Technologies, NASA Conference Publication 3340, Volume II
, 1995
Cited by 34 (10 self)
In recent years, I/O-efficient algorithms for a wide variety of problems have appeared in the literature. Thus far, however, systems specifically designed to assist programmers in implementing such algorithms have remained scarce. TPIE is a system designed to fill this void. It supports I/O-efficient paradigms for problems from a variety of domains, including computational geometry, graph algorithms, and scientific computation. The TPIE interface frees programmers from having to deal not only with explicit read and write calls, but also with the complex memory management that must be performed for I/O-efficient computation.
A Transparent Parallel I/O Environment
 In Proc. 1994 DAGS Symposium on Parallel Computation
, 1994
Cited by 34 (2 self)
We describe TPIE, a Transparent Parallel I/O Environment. TPIE is a system designed to bridge the gap between current theoretical knowledge about the construction of I/O-optimal algorithms on parallel disk systems and the design and implementation of parallel I/O systems. We discuss the design of TPIE and its interface, the structure of a typical implementation, applications of the system, our prototype, and future research directions. The initial goal of our work is a prototype system to demonstrate: 1) that optimal algorithms can be made to run efficiently on parallel I/O devices; and 2) that high-level, hardware-independent interfaces to the I/O paradigms required to implement such algorithms can be provided to application programmers. The TPIE interface is designed to be portable across a variety of parallel hardware platforms; thus code that runs efficiently on one machine will run efficiently on others. Longer term goals for TPIE include extending the prototype in ways t...
External-Memory Algorithms with Applications in Geographic Information Systems
 Algorithmic Foundations of GIS
, 1997
Cited by 27 (9 self)
In the design of algorithms for large-scale applications it is essential to consider the problem of minimizing Input/Output (I/O) communication. Geographical information systems (GIS) are good examples of such large-scale applications as they frequently handle huge amounts of spatial data. In this note we survey the recent developments in external-memory algorithms with applications in GIS. First we discuss the Aggarwal-Vitter I/O model and illustrate why normal internal-memory algorithms for even very simple problems can perform terribly in an I/O environment. Then we describe the fundamental paradigms for designing I/O-efficient algorithms by using them to design efficient sorting algorithms. We then go on to survey external-memory algorithms for computational geometry problems (with special emphasis on problems with applications in GIS) and techniques for designing such algorithms: Using the orthogonal line segment intersection problem we illustrate the distribution sweeping and ...
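One common way to illustrate I/O-efficient sorting in the Aggarwal-Vitter model is external merge sort: form sorted runs that each fit in memory, then merge them with a fan-in of roughly M/B. The sketch below is an in-memory simulation under stated assumptions: Python lists stand in for disk-resident runs, `memory_size` and `block_size` play the roles of M and B measured in items, and the function name `external_merge_sort` is hypothetical. It is not claimed to be the presentation used in the survey.

```python
import heapq

def external_merge_sort(data, memory_size, block_size):
    """Simulate external merge sort; returns (sorted_list, merge_passes).

    memory_size: number of items that fit in internal memory (M >= 1)
    block_size:  number of items per disk block (B >= 1)
    Lists stand in for runs stored on disk (an assumption of this sketch).
    """
    # One input buffer per run being merged, plus one output buffer.
    fan_in = max(2, memory_size // block_size - 1)

    # Run formation: sort memory-sized chunks independently.
    runs = [sorted(data[i:i + memory_size])
            for i in range(0, len(data), memory_size)]

    # Merge passes: each pass reduces the number of runs by a factor fan_in.
    passes = 0
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        passes += 1
    return (runs[0] if runs else []), passes

if __name__ == "__main__":
    result, passes = external_merge_sort(
        [9, 3, 7, 1, 8, 2, 6, 4, 5, 0], memory_size=4, block_size=1)
    print(result, passes)
```

Each pass reads and writes every item once, so the number of passes is what the fan-in is chosen to minimize.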
On Sorting Strings in External Memory
, 1997
Cited by 27 (12 self)
Lars Arge, Paolo Ferragina, Roberto Grossi, Jeffrey Scott Vitter
Abstract. In this paper we address for the first time the I/O complexity of the problem of sorting strings in external memory, which is a fundamental component of many large-scale text applications. In the standard unit-cost RAM comparison model, the complexity of sorting K strings of total length N is $\Theta(K \log_2 K + N)$. By analogy, in the external memory (or I/O) model, where the internal memory has size M and the block transfer size is B, it would be natural to guess that the I/O complexity of sorting strings is $\Theta\left(\frac{K}{B} \log_{M/B} \frac{K}{B} + \frac{N}{B}\right)$, but the known algorithms do not come even close to achieving this bound. Our results show, somewhat counterintuitively, that the I/O complexity of string sorting depends upon the length of the strings relative to the block size. We first consider a simple comparison I/O model, where one is not allowed to break the strings into their characters, and we sho...
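To get a feel for the expression the abstract calls the natural guess, the following sketch simply evaluates (K/B) log_{M/B} (K/B) + N/B for concrete parameters, ignoring constant factors. The function name `guessed_string_sort_ios` is hypothetical, and the value it returns is the guessed expression only, not the actual I/O complexity, which the abstract says depends on string length relative to the block size.

```python
import math

def guessed_string_sort_ios(k, n, m, b):
    """Evaluate (K/B) * log_{M/B}(K/B) + N/B, the guess from the abstract.

    k: number of strings, n: total length, m: memory size, b: block size.
    Assumes m > b and k >= b so that both logarithms are well defined and
    non-negative. Constant factors are ignored.
    """
    blocks_of_strings = k / b
    # Change of base: log_{M/B}(x) = log2(x) / log2(M/B).
    log_term = math.log2(blocks_of_strings) / math.log2(m / b)
    return blocks_of_strings * log_term + n / b

if __name__ == "__main__":
    # Illustrative parameters (assumed, not from the paper).
    print(guessed_string_sort_ios(k=2**30, n=2**34, m=2**20, b=2**10))
```

With these illustrative parameters the scan term N/B dominates the sorting term, which shows how total string length can outweigh the number of strings in such a bound.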