Results 1  10
of
16
A localitypreserving cacheoblivious dynamic dictionary
, 2002
"... This paper presents a simple dictionary structure designed for a hierarchical memory. The proposed data structure is cache oblivious and locality preserving. A cacheoblivious data structure has memory performance optimized for all levels of the memory hierarchy even though it has no memoryhierar ..."
Abstract

Cited by 71 (22 self)
 Add to MetaCart
This paper presents a simple dictionary structure designed for a hierarchical memory. The proposed data structure is cache oblivious and locality preserving. A cacheoblivious data structure has memory performance optimized for all levels of the memory hierarchy even though it has no memoryhierarchyspecific parameterization. A localitypreserving dictionary maintains elements of similar key values stored close together for fast access to ranges of data with consecutive keys. The data structure presented here is a simplification of the cacheoblivious Btree of Bender, Demaine, and FarachColton. Like the cacheoblivious Btree, this structure supports search operations using only O(logB N) block operations at a level of the memory hierarchy with block size B. Insertion and deletion operations use O(logB N + log2 N=B) amortized block transfers. Finally, the data structure returns all k data items in a given search range using O(logB N + kB) block operations. This data structure was implemented and its performance was evaluated on a simulated memory hierarchy. This paper presents the results of this simulation for various combinations of block and memory sizes.
Cacheoblivious priority queue and graph algorithm applications
 In Proc. 34th Annual ACM Symposium on Theory of Computing
, 2002
"... In this paper we develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hi ..."
Abstract

Cited by 68 (9 self)
 Add to MetaCart
(Show Context)
In this paper we develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cacheoblivious data structure, M and B are not used in the description of the structure. The bounds match the bounds of several previously developed externalmemory (cacheaware) priority queue data structures, which all rely crucially on knowledge about M and B. Priority queues are a critical component in many of the best known externalmemory graph algorithms, and using our cacheoblivious priority queue we develop several cacheoblivious graph algorithms.
Efficient ExternalMemory Data Structures and Applications
, 1996
"... In this thesis we study the Input/Output (I/O) complexity of largescale problems arising e.g. in the areas of database systems, geographic information systems, VLSI design systems and computer graphics, and design I/Oefficient algorithms for them. A general theme in our work is to design I/Oeffic ..."
Abstract

Cited by 38 (9 self)
 Add to MetaCart
(Show Context)
In this thesis we study the Input/Output (I/O) complexity of largescale problems arising e.g. in the areas of database systems, geographic information systems, VLSI design systems and computer graphics, and design I/Oefficient algorithms for them. A general theme in our work is to design I/Oefficient algorithms through the design of I/Oefficient data structures. One of our philosophies is to try to isolate all the I/O specific parts of an algorithm in the data structures, that is, to try to design I/O algorithms from internal memory algorithms by exchanging the data structures used in internal memory with their external memory counterparts. The results in the thesis include a technique for transforming an internal memory tree data structure into an external data structure which can be used in a batched dynamic setting, that is, a setting where we for example do not require that the result of a search operation is returned immediately. Using this technique we develop batched dynamic external versions of the (onedimensional) rangetree and the segmenttree and we develop an external priority queue. Following our general philosophy we show how these structures can be used in standard internal memory sorting algorithms
ExternalMemory Algorithms with Applications in Geographic Information Systems
, 1996
"... ..."
(Show Context)
A Unified Analysis of Paging and Caching
, 1998
"... Paging (caching) is the problem of managing a twolevel memory hierarchy in order to minimize the time required to process a sequence of memory accesses. In order to measure this quantity, which we refer to as the total memory access time, we define the system parameter miss penalty to represent th ..."
Abstract

Cited by 25 (0 self)
 Add to MetaCart
(Show Context)
Paging (caching) is the problem of managing a twolevel memory hierarchy in order to minimize the time required to process a sequence of memory accesses. In order to measure this quantity, which we refer to as the total memory access time, we define the system parameter miss penalty to represent the extra time required to access slow memory. We also introduce the system parameter page size. In the context of paging, miss penalty is quite large, so most previous studies of online paging have implicitly set miss penaltyD 1 in order to simplify the model. We show that this seemingly insignificant simplification substantially alters the precision of derived results. For example, previous studies have essentially ignored page size. Consequently, we reintroduce the miss penalty and page size parameters to the paging problem and present a more accurate analysis of online paging (and caching). We validate using this more accurate model by deriving intuitively appealing results for the paging problem which cannot be derived using the simplified model. First, we present a natural, quantifiable definition of the amount of locality of reference in any access sequence. We also point out that the amount of locality of reference in an access sequence should depend on page size among other factors. We then show that deterministic and randomized marking algorithms such as the popular least recently used (LRU) algorithm achieve constant competitive ratios when processing typical access sequences which exhibit significant locality of reference; this represents the first competitive analysis
Memory Paging for Connectivity and Path Problems in Graphs
, 1998
"... We extend the Paging Problem to the case in which the items that are stored in the cache memory represent information about a graph. We propose online algorithms for two dierent connectivity problems in this context, for particular classes of graphs and under dierent cost assumptions. In the Pathp ..."
Abstract

Cited by 14 (0 self)
 Add to MetaCart
We extend the Paging Problem to the case in which the items that are stored in the cache memory represent information about a graph. We propose online algorithms for two dierent connectivity problems in this context, for particular classes of graphs and under dierent cost assumptions. In the Pathpaging problem we assume that the cache contains edges of the graph and queries to be answered are of the kind \report a path from i to j"; to answer the query it is necessary to have in memory all the edges of a path from i to j. In this case the answer to a query is not a single piece of information stored in memory. In the Connectivity problem the edges of the transitive closure of a given graph are stored in memory and we want to answer connectivity queries. In order to positively answer connectivity queries of the type \is i connected with j?", it is possible to answer the query even if the cache does not contain the edge (i; j). Most of our algorithms are optimal and fairly simple. Co...
Cache oblivious algorithms
 Algorithms for Memory Hierarchies, LNCS 2625
, 2003
"... Abstract. The cache oblivious model is a simple and elegant model to design algorithms that perform well in hierarchical memory models ubiquitous on current systems. This model was first formulated in [22] and has since been a topic of intense research. Analyzing and designing algorithms and data st ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
(Show Context)
Abstract. The cache oblivious model is a simple and elegant model to design algorithms that perform well in hierarchical memory models ubiquitous on current systems. This model was first formulated in [22] and has since been a topic of intense research. Analyzing and designing algorithms and data structures in this model involves not only an asymptotic analysis of the number of steps executed in terms of the input size, but also the movement of data optimally among the different levels of the memory hierarchy. This chapter is aimed as an introduction to the “idealcache ” model of [22] and techniques used to design cache oblivious algorithms. The chapter also presents some experimental insights and results. Part of this work was done while the author was visiting MPISaarbrücken. The
Towards an Optimal BitReversal Permutation Program
 In Proceeding of IEEE Foundations of Computer Science
, 1998
"... The speed of many computations is limited not by the number of arithmetic operations but by the time it takes to move and rearrange data in the increasingly complicated memory hierarchies of modern computers. Array transpose and the bitreversal permutation  trivial operations on a RAM  present ..."
Abstract

Cited by 13 (2 self)
 Add to MetaCart
(Show Context)
The speed of many computations is limited not by the number of arithmetic operations but by the time it takes to move and rearrange data in the increasingly complicated memory hierarchies of modern computers. Array transpose and the bitreversal permutation  trivial operations on a RAM  present nontrivial problems when designing highlytuned scientific library functions, particular for the Fast Fourier Transform. We prove a precise bound for RoCol, a simple pebbletype game that is relevant to implementing these permutations. We use RoCol to give lower bounds on the amount of memory traffic in a computer with fourlevels of memory (registers, cache, TLB, and memory), taking into account such "messy" features as block moves and setassociative caches. The insights from this analysis lead to a bitreversal algorithm whose performance is close to the theoretical minimum. Experiments show it performs significantly better than every program in a comprehensive study of 30 published algo...
Cacheoblivious algorithms and data structures
 IN SWAT
, 2004
"... Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the idealcache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the terminology of cacheoblivious algorithms. Cacheoblivious algorithms are described as stand ..."
Abstract

Cited by 11 (1 self)
 Add to MetaCart
Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the idealcache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the terminology of cacheoblivious algorithms. Cacheoblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge about memory hierarchies, but are analyzed in the twolevel I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal offline cache replacement strategy. The result are algorithms that automatically apply to multilevel memory hierarchies. This paper gives an overview of the results achieved on cacheoblivious algorithms and data structures since the seminal paper by Frigo et al.
An Optimal CacheOblivious Priority Queue and its Application to Graph Algorithms
 SIAM JOURNAL ON COMPUTING
, 2007
"... We develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized memory transfers, where $M$ and $B$ are the memory and block transfer sizes of any two consecutive levels of a multilevel ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
(Show Context)
We develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized memory transfers, where $M$ and $B$ are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cacheoblivious data structure, $M$ and $B$ are not used in the description of the structure. Our structure is as efficient as several previously developed external memory (cacheaware) priority queue data structures, which all rely crucially on knowledge about $M$ and $B$. Priority queues are a critical component in many of the best known external memory graph algorithms, and using our cacheoblivious priority queue we develop several cacheoblivious graph algorithms.