Results 1 - 10
of
12
Improved Algorithms and Data Structures for Solving Graph Problems in External Memory
- In Proc. IEEE Symp. on Parallel and Distributed Processing
, 1996
"... Recently, the study of I/O-efficient algorithms has moved beyond fundamental problems of sorting and permuting and into wider areas such as computational geometry and graph algorithms. With this expansion has come a need for new algorithmic techniques and data structures. In this paper, we present I ..."
Abstract
-
Cited by 71 (0 self)
- Add to MetaCart
Recently, the study of I/O-efficient algorithms has moved beyond fundamental problems of sorting and permuting and into wider areas such as computational geometry and graph algorithms. With this expansion has come a need for new algorithmic techniques and data structures. In this paper, we present I/O-efficient analogues of well-known data structures that we show to be useful for obtaining simpler and improved algorithms for several graph problems. Our results include improved algorithms for minimum spanning trees, breadth-first search, and single-source shortest paths. The descriptions of these algorithms are greatly simplified by their use of well-defined I/O-efficient data structures with good amortized performance bounds. We expect that I/Oefficient data structures such as these will be a useful tool for the design of I/O-efficient algorithms. 1. Introduction 1.1. Background and model The study of I/O-efficient algorithms has been receiving increased attention as increases in pro...
Towards a Theory of Cache-Efficient Algorithms
, 1999
"... We describe a model that enables us to analyze the running time of an algorithm in a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our model, an extension of Aggarwal and Vitter's I/O model, enables us to establish useful relationships between the ..."
Abstract
-
Cited by 43 (3 self)
- Add to MetaCart
We describe a model that enables us to analyze the running time of an algorithm in a computer with a memory hierarchy with limited associativity, in terms of various cache parameters. Our model, an extension of Aggarwal and Vitter's I/O model, enables us to establish useful relationships between the cache complexity and the I/O complexity of computations. As a corollary, we obtain cache-optimal algorithms for some fundamental problems like sorting, FFT, and an important subclass of permutations in the single-level cache model. We also show that ignoring associativity concerns could lead to inferior performance, by analyzing the average-case cache behavior of mergesort. We further extend our model to multiple levels of cache with limited associativity and present optimal algorithms for matrix transpose and sorting. Our techniques may be used for systematic exploitation of the memory hierarchy starting from the algorithm design stage, and dealing with the hitherto unresolved problem of l...
Efficient External Memory Algorithms by Simulating Coarse-Grained Parallel Algorithms
, 2003
"... External memory (EM) algorithms are designed for large-scale computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Typical EM algorithms are specially crafted for the EM situation. In the past, several attempts have been made to ..."
Abstract
-
Cited by 39 (10 self)
- Add to MetaCart
External memory (EM) algorithms are designed for large-scale computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Typical EM algorithms are specially crafted for the EM situation. In the past, several attempts have been made to relate the large body of work on parallel algorithms to EM, but with limited success. The combination of EM computing, on multiple disks, with multiprocessor parallelism has been posted as a challenge by the ACMWorking Group on Storage I/O for Large-Scale Computing.
External-memory breadth-first search with sublinear I/O
- IN PROCEEDINGS OF THE 10TH ANNUAL EUROPEAN SYMPOSIUM ON ALGORITHMS
, 2002
"... Breadth-first search (BFS) is a basic graph exploration technique. We give the first external memory algorithm for sparse undirected graphs with sublinear I/O. The best previous algorithm requires \Theta (n + n+mD\Delta B \Delta logM=B n+mB) I/Os on a graph with n nodes and m edges and a machine w ..."
Abstract
-
Cited by 38 (11 self)
- Add to MetaCart
Breadth-first search (BFS) is a basic graph exploration technique. We give the first external memory algorithm for sparse undirected graphs with sublinear I/O. The best previous algorithm requires \Theta (n + n+mD\Delta B \Delta logM=B n+mB) I/Os on a graph with n nodes and m edges and a machine with main-memory of size M, D parallel disks, and block size B. We present two versions of a new algorithm which requires only O i (p 1D\Delta B + p nm) \Delta n+mpD\Delta B \Delta logM=B n+mB
STXXL: Standard template library for XXL data sets
- In: Proc. of ESA 2005. Volume 3669 of LNCS
, 2005
"... for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/O-efficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in ..."
Abstract
-
Cited by 30 (4 self)
- Add to MetaCart
for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/O-efficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in academic and industrial environments for a range of problems including text processing, graph algorithms, computational geometry, gaussian elimination, visualization, and analysis of microscopic images, differential cryptographic analysis, etc. The performance of STXXL and its applications is evaluated on synthetic and real-world inputs. We present the design of the library, how its performance features are supported, and demonstrate how the library integrates with STL. KEY WORDS: very large data sets; software library; C++ standard template library; algorithm engineering 1.
Cache-Efficient Matrix Transposition
"... We investigate the memory system performance of several algorithms for transposing an N N matrix in-place, where N is large. Specifically, we investigate the relative contributions of the data cache, the translation lookaside buffer, register tiling, and the array layout function to the overall runn ..."
Abstract
-
Cited by 22 (2 self)
- Add to MetaCart
We investigate the memory system performance of several algorithms for transposing an N N matrix in-place, where N is large. Specifically, we investigate the relative contributions of the data cache, the translation lookaside buffer, register tiling, and the array layout function to the overall running time of the algorithms. We use various memory models to capture and analyze the effect of various facets of cache memory architecture that guide the choice of a particular algorithm, and attempt to experimentally validate the predictions of the model. Our major conclusions are as follows: limited associativity in the mapping from main memory addresses to cache sets can significantly degrade running time; the limited number of TLB entries can easily lead to thrashing; the fanciest optimal algorithms are not competitive on real machines even at fairly large problem sizes unless cache miss penalties are quite high; low-level performance tuning “hacks”, such as register tiling and array alignment, can significantly distort the effects of improved algorithms; and hierarchical nonlinear layouts are inherently superior to the standard canonical layouts (such as row- or column-major) for
this problem.
External A*
- IN GERMAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (KI
, 2004
"... In this paper we study External A*, a variant of the conventional (internal) A* algorithm that makes use of external memory, e.g., a hard disk. The approach applies to implicit, undirected, unweighted state space problem graphs with consistent estimates. It combines all three aspects of best-firs ..."
Abstract
-
Cited by 17 (8 self)
- Add to MetaCart
In this paper we study External A*, a variant of the conventional (internal) A* algorithm that makes use of external memory, e.g., a hard disk. The approach applies to implicit, undirected, unweighted state space problem graphs with consistent estimates. It combines all three aspects of best-first search, frontier search and delayed duplicate detection and can still operate on very small internal memory. The complexity of the external algorithm is almost linear in external sorting time and accumulates to O(sort(|E|) + scan(|V I/O operations, where V and E are the set of nodes and edges in the explored portion of the state space graph. Given that delayed duplicate elimination has to be performed, the established bound is I/O optimal. In contrast to the internal algorithm, we exploit memory locality to allow blockwise rather than random access. The algorithmic design refers to external shortest path search in explicit graphs and extends the strategy of delayed duplicate detection recently suggested for breadth-first search to best-first search. We conduct experiments with sliding-tile puzzle instances.
Large-scale directed model checking LTL
- In Model Checking Software (SPIN
, 2006
"... Abstract. To analyze larger models for explicit-state model checking, directed model checking applies error-guided search, external model checking uses secondary storage media, and distributed model checking exploits parallel exploration on multiple processors. In this paper we propose an external, ..."
Abstract
-
Cited by 14 (6 self)
- Add to MetaCart
Abstract. To analyze larger models for explicit-state model checking, directed model checking applies error-guided search, external model checking uses secondary storage media, and distributed model checking exploits parallel exploration on multiple processors. In this paper we propose an external, distributed and directed on-the-fly model checking algorithm to check general LTL properties in the model checker SPIN. Previous attempts restricted to checking safety properties. The worst-case I/O complexity is bounded by O(sort(|F||R|)/p + l · scan(|F||S|)), where S and R are the sets of visited states and transitions in the synchronized product of the Büchi automata for the model and the property specification, F is the number of accepting states, l is the length of the shortest counterexample, and p is the number of processors. The algorithm we propose returns minimal lasso-shaped counterexamples and includes refinements for property-driven exploration. 1
Giga-Mining
, 1998
"... We describe an industrial-strength data mining application in telecommunications. The application requires building a short (7 byte) profile for all telephone numbers seen on a large telecom network. By large, we mean very large: we maintain approximately 350 million profiles. In addition, the ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
We describe an industrial-strength data mining application in telecommunications. The application requires building a short (7 byte) profile for all telephone numbers seen on a large telecom network. By large, we mean very large: we maintain approximately 350 million profiles. In addition, the procedure for updating these profiles is based on processing approximately 275 million call records per day. We discuss the motivation for massive tracking and fully describe the definition and the computation of one of the more interesting bytes in the profile.

