Results 1-10 of 16
External Memory Algorithms and Data Structures
1998
Cited by 320 (24 self)
Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. In this paper, we survey the state of the art in the design and analysis of external memory algorithms and data structures (which are sometimes referred to as "EM" or "I/O" or "out-of-core" algorithms and data structures). EM algorithms and data structures are often designed and analyzed using the parallel disk model (PDM). The three machine-independent measures of performance in PDM are the number of I/O operations, the CPU time, and the amount of disk space. PDM allows for multiple disks (or disk arrays) and parallel CPUs, and it can be generalized to handle tertiary storage and hierarchical memory. We discuss several important paradigms for how to solve batched and online problems efficiently in external memory. Programming tools and environments are available for simplifying the programming task. The TPIE system (Transparent Parallel I/O programming Environment) is both easy to use and efficient in terms of execution speed. We report on some experiments using TPIE in the domain of spatial databases. The newly developed EM algorithms and data structures that incorporate the paradigms we discuss are significantly faster than methods currently used in practice.
Terrain Simplification Simplified: A General Framework for View-Dependent Out-of-Core Visualization
2002
Cited by 81 (2 self)
This paper describes a general framework for out-of-core rendering and management of massive terrain surfaces. The two key components of this framework are: view-dependent refinement of the terrain mesh; and a simple scheme for organizing the terrain data to improve coherence and reduce the number of paging events from external storage to main memory. Similar to several previously proposed methods for view-dependent refinement, we recursively subdivide a triangle mesh defined over regularly gridded data using longest-edge bisection. As part of this single, per-frame refinement pass, we perform triangle stripping, view frustum culling, and smooth blending of geometry using geomorphing. Meanwhile, our refinement framework supports a large class of error metrics, is highly competitive in terms of rendering performance, and is surprisingly simple to implement. Independent
I/O-Complexity of Graph Algorithms
1999
Cited by 68 (0 self)
We show lower bounds of $\Omega\left(\frac{E}{V}\,\mathrm{Sort}(V)\right)$ for the I/O-complexity of graph-theoretic problems like connected components, biconnected components, and minimum spanning trees, where E and V are the number of edges and vertices in the input graph, respectively. We also present a deterministic $O\left(\frac{E}{V}\,\mathrm{Sort}(V)\cdot\max\left(1,\log\log\frac{VBD}{E}\right)\right)$ algorithm for the problem of graph connectivity, where B and D denote respectively the block size and number of disks. Our algorithm includes a breadth-first search; this may be of independent interest.
1 Introduction
Data sets of many modern applications are too large to fit into main memory, and must reside on disk. To run such applications efficiently, it is often necessary to explicitly manage disk accesses as a part of the algorithm. In other words, the algorithm must be designed for a model that includes disk, rather than the customary RAM model. Recently, this area has received a lot of attention, and algorithms have been developed for...
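To get a feel for the size of these bounds, the sketch below evaluates them numerically. It assumes the standard parallel disk model sorting bound Sort(N) = (N/(DB)) log_{M/B}(N/B), which the abstract does not spell out, a memory size M that the abstract does not name, and base-2 logarithms for the log log term. The function names and sample parameters are illustrative only, and constant factors hidden by the asymptotic notation are ignored.

```python
import math

def sort_ios(n, M, B, D=1):
    """Assumed PDM sorting bound: (n / (D*B)) * log_{M/B}(n / B).
    The logarithmic factor is clamped to at least 1 (one pass over the data)."""
    passes = max(1.0, math.log(n / B, M / B))
    return (n / (D * B)) * passes

def connectivity_lower_bound(E, V, M, B, D=1):
    """(E / V) * Sort(V), ignoring constant factors."""
    return (E / V) * sort_ios(V, M, B, D)

def connectivity_upper_bound(E, V, M, B, D=1):
    """(E / V) * Sort(V) * max(1, log log (V*B*D / E)), ignoring constants."""
    ratio = V * B * D / E
    # log log is positive only for ratio > 2 (base 2 assumed); otherwise the
    # max(1, ...) term is simply 1.
    extra = math.log2(math.log2(ratio)) if ratio > 2 else 1.0
    return connectivity_lower_bound(E, V, M, B, D) * max(1.0, extra)

if __name__ == "__main__":
    # Hypothetical machine and graph: 2^30 vertices, 2^32 edges,
    # memory of 2^20 items, blocks of 2^10 items, one disk.
    V, E, M, B = 2**30, 2**32, 2**20, 2**10
    print(connectivity_lower_bound(E, V, M, B))
    print(connectivity_upper_bound(E, V, M, B))
```

With these sample parameters the two bounds differ only by the log log factor, which is why the abstract's upper bound is close to optimal for all but very sparse graphs.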
Visualization of Large Terrains Made Easy
2001
Cited by 64 (4 self)
We present an elegant and simple-to-implement framework for performing out-of-core visualization and view-dependent refinement of large terrain surfaces. Contrary to the recent trend of increasingly elaborate algorithms for large-scale terrain visualization, our algorithms and data structures have been designed with the primary goal of simplicity and efficiency of implementation. Our approach to managing large terrain data also departs from more conventional strategies based on data tiling. Rather than emphasizing how to segment and efficiently bring data in and out of memory, we focus on the manner in which the data is laid out to achieve good memory coherency for data accesses made in a top-down (coarse-to-fine) refinement of the terrain. We present and compare the results of using several different data indexing schemes, and propose a simple-to-compute index that yields substantial improvements in locality and speed over more commonly used data layouts. Our second contribution is a new and simple, yet easy-to-generalize method for view-dependent refinement. Similar to several published methods in this area, we use longest-edge bisection in a top-down traversal of the mesh hierarchy to produce a continuous surface with subdivision connectivity. In tandem with the refinement, we perform view frustum culling and triangle stripping. These three components are done together in a single pass over the mesh. We show how this framework supports virtually any error metric, while still being highly memory and compute efficient.
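The abstract does not say which index the authors propose. As a generic illustration of what a simple-to-compute, locality-preserving data layout can look like, the sketch below uses the well-known Z-order (Morton) index for a 2D grid. This is a stand-in chosen for the example and may differ from the paper's scheme; the function names are hypothetical.

```python
def part1by1(x):
    """Spread the low 16 bits of x so that a zero bit follows each original bit."""
    x &= 0x0000FFFF
    x = (x | (x << 8)) & 0x00FF00FF
    x = (x | (x << 4)) & 0x0F0F0F0F
    x = (x | (x << 2)) & 0x33333333
    x = (x | (x << 1)) & 0x55555555
    return x

def morton2d(row, col):
    """Interleave the bits of row and col (row bits in the odd positions)."""
    return (part1by1(row) << 1) | part1by1(col)

if __name__ == "__main__":
    # Print the storage position of every sample in a 4 x 4 grid.
    for r in range(4):
        print([morton2d(r, c) for c in range(4)])
```

In such a layout every aligned 2 x 2 block of samples occupies four consecutive positions, so neighbours on the grid tend to land in the same disk block or cache line, unlike in a row-major array.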
Scalable sweeping-based spatial join
In Proc. 24th Int. Conf. Very Large Data Bases, VLDB, 1998
Cited by 64 (7 self)
In this paper, we consider the filter step of the spatial join problem, for the case where neither of the inputs is indexed. We present a new algorithm, Scalable Sweeping-Based Spatial Join (SSSJ), that achieves both efficiency on real-life data and robustness against highly skewed and worst-case data sets. The algorithm combines a method with theoretically optimal bounds on I/O transfers based on the recently proposed distribution-sweeping technique with a highly optimized implementation of internal-memory plane-sweeping. We present experimental results based on an efficient implementation of the SSSJ algorithm, and compare it to the state-of-the-art Partition-Based Spatial-Merge (PBSM) algorithm of Patel and DeWitt.
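The internal-memory plane-sweeping component mentioned above can be illustrated with a small sketch. The code below is a generic textbook-style sweep over axis-aligned bounding rectangles, not the SSSJ implementation: the function name `sweep_join` and the `(xmin, ymin, xmax, ymax)` tuple layout are assumptions made for the example, and it ignores all external-memory concerns.

```python
def sweep_join(R, S):
    """Return all pairs (r, s), r in R and s in S, whose rectangles intersect.

    Rectangles are closed and given as (xmin, ymin, xmax, ymax) tuples.
    """
    # One event per rectangle, ordered by its left edge; the tag records
    # which input set the rectangle comes from (0 for R, 1 for S).
    events = sorted([(r[0], 0, r) for r in R] + [(s[0], 1, s) for s in S])
    active = ([], [])  # rectangles currently cut by the sweep line, per set
    result = []
    for x, side, rect in events:
        other = active[1 - side]
        # Discard rectangles of the other set that ended left of the sweep line.
        other[:] = [o for o in other if o[2] >= x]
        # Every survivor overlaps rect in x, so only the y-ranges need checking.
        for o in other:
            if o[1] <= rect[3] and rect[1] <= o[3]:
                result.append((rect, o) if side == 0 else (o, rect))
        active[side].append(rect)
    return result

if __name__ == "__main__":
    R = [(0, 0, 2, 2), (5, 5, 6, 6)]
    S = [(1, 1, 3, 3), (7, 7, 8, 8)]
    print(sweep_join(R, S))
```

The active sets here are plain lists scanned linearly; a tuned implementation would typically keep them in a structure ordered by y so that the overlap test does not touch every active rectangle.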
Global Static Indexing for Real-time Exploration of Very Large Regular Grids
2001
Cited by 44 (7 self)
In this paper we introduce a new indexing scheme for progressive traversal and visualization of large regular grids. We demonstrate the potential of our approach by providing a tool that displays at interactive rates planar slices of scalar field data with very modest computing resources. We obtain unprecedented results both in terms of absolute performance and, more importantly, in terms of scalability. On a laptop computer we provide real-time interaction with a $2048^3$ grid (8 Giganodes) using only 20 MB of memory. On an SGI Onyx we slice interactively an $8192^3$ grid (½ teranodes) using only 60 MB of memory. The scheme relies simply on the determination of an appropriate reordering of the rectilinear grid data and a progressive construction of the output slice. The reordering minimizes the amount of I/O performed during the out-of-core computation. The progressive and asynchronous computation of the output provides flexible quality/speed tradeoffs and a time-critical and interruptible user interface.
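The grid sizes quoted above can be checked with a few lines of arithmetic. The sketch assumes that "Giganodes" and "teranodes" use binary prefixes (2^30 and 2^40), which the abstract does not state; the helper name is hypothetical.

```python
GIGA = 2**30  # binary prefix, assumed
TERA = 2**40  # binary prefix, assumed

def grid_nodes(side):
    """Number of samples in a cubic grid with `side` samples per axis."""
    return side ** 3

if __name__ == "__main__":
    print(grid_nodes(2048) / GIGA)  # 8.0 giganodes
    print(grid_nodes(8192) / TERA)  # 0.5 teranodes
```

Both grids are therefore several orders of magnitude larger than the 20 MB and 60 MB memory budgets, even at one byte per node, which is why the ordering of the data on disk dominates performance.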
On External Memory MST, SSSP and Multiway Planar Graph Separation (Extended Abstract)
2000
Cited by 33 (11 self)
Recently, external memory graph algorithms have received considerable attention because massive graphs arise naturally in many applications involving massive data sets. Even though a large number of I/O-efficient graph algorithms have been developed, a number of fundamental problems still remain open. In this paper we develop improved algorithms for the problem of computing a minimum spanning tree of a general graph G = (V, E), as well as new algorithms for the single source shortest paths and the multiway graph separation problems on planar graphs.
External-Memory Algorithms with Applications in Geographic Information Systems
Algorithmic Foundations of GIS, 1997
Cited by 27 (9 self)
In the design of algorithms for large-scale applications it is essential to consider the problem of minimizing Input/Output (I/O) communication. Geographical information systems (GIS) are good examples of such large-scale applications as they frequently handle huge amounts of spatial data. In this note we survey the recent developments in external-memory algorithms with applications in GIS. First we discuss the Aggarwal-Vitter I/O-model and illustrate why normal internal-memory algorithms for even very simple problems can perform terribly in an I/O-environment. Then we describe the fundamental paradigms for designing I/O-efficient algorithms by using them to design efficient sorting algorithms. We then go on and survey external-memory algorithms for computational geometry problems (with special emphasis on problems with applications in GIS) and techniques for designing such algorithms: Using the orthogonal line segment intersection problem we illustrate the distribution-sweeping and ...
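Sorting is the standard first example of an I/O-efficient design. The sketch below shows the merge-based approach in its simplest form: sorted runs are formed in memory, written to temporary files, and then merged in a streaming fashion. It is a minimal illustration and not code from the surveyed work: the function name and the `run_size` parameter are hypothetical, values are assumed to be integers, and all runs are merged in a single pass, whereas an I/O-optimal algorithm would limit the merge fan-in to roughly M/B and repeat.

```python
import heapq
import os
import tempfile

def external_sort(values, run_size):
    """Yield the integers in `values` in ascending order while holding at
    most `run_size` of them in memory during run formation."""
    run_paths = []
    run = []

    def flush():
        # Sort the current run in memory and write it to its own temp file.
        run.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in run)
        run_paths.append(path)
        run.clear()

    for v in values:
        run.append(v)
        if len(run) >= run_size:
            flush()
    if run:
        flush()

    def read_run(path):
        # Stream one sorted run back from disk, one value at a time.
        with open(path) as f:
            for line in f:
                yield int(line)

    try:
        # heapq.merge keeps only one pending value per run in memory.
        yield from heapq.merge(*(read_run(p) for p in run_paths))
    finally:
        for p in run_paths:
            os.remove(p)

if __name__ == "__main__":
    print(list(external_sort([5, 3, 9, 1, 7, 2, 8], run_size=3)))
```

The point of the design is that both phases touch the data sequentially, so each block read or written carries B useful items instead of one.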
Optimal sparse matrix dense vector multiplication in the I/O-Model
2010
Cited by 18 (5 self)
We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute y := Ax, where A is a sparse N × N matrix and x is a vector. We express sparsity by a parameter k, and for each choice of k consider the class of matrices where the number of nonzero entries is kN, i.e., where the average number of nonzero entries per column is k. We investigate what is the external worst-case complexity, i.e., the best possible upper bound on the number of I/Os, as a function of k, N and the parameters M (memory size) and B (track size) of the I/O-model. We determine this complexity up to a constant factor for all meaningful choices of these parameters, as long as $k \le N^{1-\varepsilon}$, where ε depends on the problem variant. Our model of computation for the lower bound is a combination of the I/O-models of Aggarwal and Vitter, and of Hong and Kung. We study variants of the problem, differing in the memory layout of A. If A is stored in column-major layout, we prove that SpMV has I/O complexity Θ(min{(kN/B) max{ ...
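To make the column-major setting concrete, the sketch below computes y := Ax for a matrix stored column by column in the common compressed sparse column (CSC) form. It is a plain in-memory illustration with assumed array names (`col_ptr`, `row_idx`, `vals`), not the I/O-optimal algorithm analysed in the paper.

```python
def spmv_csc(n_rows, col_ptr, row_idx, vals, x):
    """Compute y = A x for a sparse matrix A in CSC (column-major) form.

    col_ptr[j] .. col_ptr[j + 1] delimit the nonzeros of column j;
    row_idx[p] and vals[p] give the row and value of the p-th nonzero.
    """
    y = [0.0] * n_rows
    for j, xj in enumerate(x):
        # The nonzeros of column j are read sequentially...
        for p in range(col_ptr[j], col_ptr[j + 1]):
            # ...but each one updates an arbitrary position of y.
            y[row_idx[p]] += vals[p] * xj
    return y

if __name__ == "__main__":
    # A = [[1, 0, 2],
    #      [0, 3, 0],
    #      [4, 0, 5]]
    col_ptr = [0, 2, 3, 5]
    row_idx = [0, 2, 1, 0, 2]
    vals = [1.0, 4.0, 3.0, 2.0, 5.0]
    print(spmv_csc(3, col_ptr, row_idx, vals, [1.0, 2.0, 3.0]))  # [7.0, 6.0, 19.0]
```

The scattered updates to y are what make this direct approach expensive when y does not fit in memory: in the worst case each nonzero can trigger its own block transfer.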