Results 11  20
of
177
External Memory Data Structures
, 2001
"... In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worstcase efficient external memory dynami ..."
Abstract

Cited by 81 (36 self)
 Add to MetaCart
In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worstcase efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
CacheOblivious Algorithms
, 1999
"... This thesis presents "cacheoblivious" algorithms that use asymptotically optimal amounts of work, and move data asymptotically optimally among multiple levels of cache. An algorithm is cache oblivious if no program variables dependent on hardware configuration parameters, such as cache size and cac ..."
Abstract

Cited by 79 (1 self)
 Add to MetaCart
This thesis presents "cacheoblivious" algorithms that use asymptotically optimal amounts of work, and move data asymptotically optimally among multiple levels of cache. An algorithm is cache oblivious if no program variables dependent on hardware configuration parameters, such as cache size and cacheline length need to be tuned to minimize the number of cache misses. We show that the ordinary algorithms for matrix transposition, matrix multiplication, sorting, and Jacobistyle multipass filtering are not cache optimal. We present algorithms for rectangular matrix transposition, FFT, sorting, and multipass filters, which are asymptotically optimal on computers with multiple levels of caches. For a cache with size Z and cacheline length L, where Z =# (L 2 ), the number of cache misses for an m × n matrix transpose is #(1 + mn=L). The number of cache misses for either an npoint FFT or the sorting of n numbers is #(1 + (n=L)(1 + log Z n)). The cache complexity of computing n ...
ExternalMemory Algorithms for Processing Line Segments in Geographic Information Systems
, 2007
"... In the design of algorithms for largescale applications it is essential to consider the problem of minimizing I/O communication. Geographical information systems (GIS) are good examples of such largescale applications as they frequently handle huge amounts of spatial data. In this paper we develop ..."
Abstract

Cited by 76 (30 self)
 Add to MetaCart
In the design of algorithms for largescale applications it is essential to consider the problem of minimizing I/O communication. Geographical information systems (GIS) are good examples of such largescale applications as they frequently handle huge amounts of spatial data. In this paper we develop efficient externalmemory algorithms for a number of important problems involving line segments in the plane, including trapezoid decomposition, batched planar point location, triangulation, red–blue line segment intersection reporting, and general line segment intersection reporting. In GIS systems the first three problems are useful for rendering and modeling, and the latter two are frequently used for overlaying maps and extracting information from them.
Improved Algorithms and Data Structures for Solving Graph Problems in External Memory
 In Proc. IEEE Symp. on Parallel and Distributed Processing
, 1996
"... Recently, the study of I/Oefficient algorithms has moved beyond fundamental problems of sorting and permuting and into wider areas such as computational geometry and graph algorithms. With this expansion has come a need for new algorithmic techniques and data structures. In this paper, we present I ..."
Abstract

Cited by 75 (0 self)
 Add to MetaCart
Recently, the study of I/Oefficient algorithms has moved beyond fundamental problems of sorting and permuting and into wider areas such as computational geometry and graph algorithms. With this expansion has come a need for new algorithmic techniques and data structures. In this paper, we present I/Oefficient analogues of wellknown data structures that we show to be useful for obtaining simpler and improved algorithms for several graph problems. Our results include improved algorithms for minimum spanning trees, breadthfirst search, and singlesource shortest paths. The descriptions of these algorithms are greatly simplified by their use of welldefined I/Oefficient data structures with good amortized performance bounds. We expect that I/Oefficient data structures such as these will be a useful tool for the design of I/Oefficient algorithms. 1. Introduction 1.1. Background and model The study of I/Oefficient algorithms has been receiving increased attention as increases in pro...
I/O Optimal Isosurface Extraction
, 1997
"... In this paper we give I/Ooptimal techniques for the extraction of isosurfaces from volumetric data, by a novel application of the I/Ooptimal interval tree of Arge and Vitter. The main idea is to preprocess the dataset once and for all to build an efficient search structure in disk, and then each ti ..."
Abstract

Cited by 73 (17 self)
 Add to MetaCart
In this paper we give I/Ooptimal techniques for the extraction of isosurfaces from volumetric data, by a novel application of the I/Ooptimal interval tree of Arge and Vitter. The main idea is to preprocess the dataset once and for all to build an efficient search structure in disk, and then each time we want to extract an isosurface, we perform an outputsensitive query on the search structure to retrieve only those active cells that are intersected by the isosurface. During the query operation, only two blocks of main memory space are needed, and only those active cells are brought into the main memory, plus some negligible overhead of disk accesses. This implies that we can efficiently visualize very large datasets on workstations with just enough main memory to hold the isosurfaces themselves. The implementation is delicate but not complicated. We give the first implementation of the I/Ooptimal interval tree, and also implement our methods as an I/O filter for Vtk's isosurface ext...
I/OComplexity of Graph Algorithms
, 1999
"... We show lower bounds of \Omega\Gamma E V Sort(V )) for the I/Ocomplexity of graph theoretic problems like connected components, biconnected components, and minimum spanning trees, where E and V are the number of edges and vertices in the input graph, respectively. We also present a deterministic O ..."
Abstract

Cited by 68 (0 self)
 Add to MetaCart
We show lower bounds of \Omega\Gamma E V Sort(V )) for the I/Ocomplexity of graph theoretic problems like connected components, biconnected components, and minimum spanning trees, where E and V are the number of edges and vertices in the input graph, respectively. We also present a deterministic O( E V Sort(V ) \Delta max(1; log log V BD E )) algorithm for the problem of graph connectivity, where B and D denote respectively the block size and number of disks. Our algorithm includes a breadth first search; this maybe of independent interest. 1 Introduction Data sets of many modern applications are too large to fit into main memory, and must reside on disk. To run such applications efficiently, it is often necessary to explicitly manage disk accesses as a part of the algorithm. In other words, the algorithm must be designed for a model that includes disk, rather than the customary RAM model. Recently, this area has received a lot of attention, and algorithms have been developed for...
Algorithms for Parallel Memory II: Hierarchical Multilevel Memories
 ALGORITHMICA
, 1993
"... In this paper we introduce parallel versions of two hierarchical memory models and give optimal algorithms in these models for sorting, FFT, and matrix multiplication. In our parallel models, there are P memory hierarchies operating simultaneously; communication among the hierarchies takes place ..."
Abstract

Cited by 66 (5 self)
 Add to MetaCart
In this paper we introduce parallel versions of two hierarchical memory models and give optimal algorithms in these models for sorting, FFT, and matrix multiplication. In our parallel models, there are P memory hierarchies operating simultaneously; communication among the hierarchies takes place at a base memory level. Our optimal sorting algorithm is randomized and is based upon the probabilistic partitioning technique developed in the companion paper for optimal disk sorting in a twolevel memory with parallel block transfer. The probability of using l times the optimal running time is exponentially small in l(log l) log P.
Simple Randomized Mergesort on Parallel Disks
 PARALLEL COMPUTING
, 1996
"... We consider the problem of sorting a file of N records on the Ddisk model of parallel I/O [VS94] in which there are two sources of parallelism. Records are transferred to and from disk concurrently in blocks of B contiguous records. In each I/O operation, up to one block can be transferred to or fr ..."
Abstract

Cited by 63 (11 self)
 Add to MetaCart
We consider the problem of sorting a file of N records on the Ddisk model of parallel I/O [VS94] in which there are two sources of parallelism. Records are transferred to and from disk concurrently in blocks of B contiguous records. In each I/O operation, up to one block can be transferred to or from each of the D disks in parallel. We propose a simple, efficient, randomized mergesort algorithm called SRM that uses a forecastandflush approach to overcome the inherent difficulties of simple merging on parallel disks. SRM exhibits a limited use of randomization and also has a useful deterministic version. Generalizing the technique of forecasting [Knu73], our algorithm is able to read in, at any time, the "right" block from any disk, and using the technique of flushing, our algorithm evicts, without any I/O overhead, just the "right" blocks from memory to make space for new ones to be read in. The disk layout of SRM is such that it enjoys perfect write parallelism, avoiding fundamenta...
Asymptotically Tight Bounds for Performing BMMC Permutations on Parallel Disk Systems
, 1994
"... This paper presents asymptotically equal lower and upper bounds for the number of parallel I/O operations required to perform bitmatrixmultiply/complement (BMMC) permutations on the Parallel Disk Model proposed by Vitter and Shriver. A BMMC permutation maps a source index to a target index by an a ..."
Abstract

Cited by 61 (19 self)
 Add to MetaCart
This paper presents asymptotically equal lower and upper bounds for the number of parallel I/O operations required to perform bitmatrixmultiply/complement (BMMC) permutations on the Parallel Disk Model proposed by Vitter and Shriver. A BMMC permutation maps a source index to a target index by an affine transformation over GF (2), where the source and target indices are treated as bit vectors. The class of BMMC permutations includes many common permutations, such as matrix transposition (when dimensions are powers of 2), bitreversal permutations, vectorreversal permutations, hypercube permutations, matrix reblocking, Graycode permutations, and inverse Graycode permutations. The upper bound improves upon the asymptotic bound in the previous best known BMMC algorithm and upon the constant factor in the previous best known bitpermute/complement (BPC) permutation algorithm. The algorithm achieving the upper bound uses basic linearalgebra techniques to factor the characteristic matrix...
A Comparison of Sequential Delaunay Triangulation Algorithms
, 1996
"... This paper presents an experimental comparison of a number of different algorithms for computing the Deluanay triangulation. The algorithms examined are: Dwyer’s divide and conquer algorithm, Fortune’s sweepline algorithm, several versions of the incremental algorithm (including one by Ohya, Iri, an ..."
Abstract

Cited by 55 (0 self)
 Add to MetaCart
This paper presents an experimental comparison of a number of different algorithms for computing the Deluanay triangulation. The algorithms examined are: Dwyer’s divide and conquer algorithm, Fortune’s sweepline algorithm, several versions of the incremental algorithm (including one by Ohya, Iri, and Murota, a new bucketingbased algorithm described in this paper, and Devillers’s version of a Delaunaytree based algorithm that appears in LEDA), an algorithm that incrementally adds a correct Delaunay triangle adjacent to a current triangle in a manner similar to gift wrapping algorithms for convex hulls, and Barber’s convex hull based algorithm. Most of the algorithms examined are designed for good performance on uniformly distributed sites. However, we also test implementations of these algorithms on a number of nonuniform distibutions. The experiments go beyond measuring total running time, which tends to be machinedependent. We also analyze the major highlevel primitives that algorithms use and do an experimental analysis of how often implementations of these algorithms perform each operation.