Cache-oblivious B-trees
, 2000
Cited by 134 (22 self)
Abstract. This paper presents two dynamic search trees attaining near-optimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the block-transfer size at each level, and the relative speeds of memory levels. The performance is analyzed in terms of the number of memory transfers between two memory levels with an arbitrary block-transfer size of $B$; this analysis can then be applied to every adjacent pair of levels in a multilevel memory hierarchy. Both search trees match the optimal search bound of $\Theta(1 + \log_{B+1} N)$ memory transfers. This bound is also achieved by the classic B-tree data structure on a two-level memory hierarchy with a known block-transfer size $B$. The first search tree supports insertions and deletions in $\Theta(1 + \log_{B+1} N)$ amortized memory transfers, which matches the B-tree's worst-case bounds. The second search tree supports scanning $S$ consecutive elements optimally in $\Theta(1 + S/B)$ memory transfers and supports insertions and deletions in $\Theta(1 + \log_{B+1} N + \frac{\log^2 N}{B})$ amortized memory transfers, matching the performance of the B-tree for $B = \Omega(\log N \log\log N)$.
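To get a feel for the search bound quoted in this abstract, the following minimal Python sketch evaluates the growth term 1 + log_{B+1} N for a few block sizes. The function name `search_transfers` is hypothetical, constant factors hidden by the Θ-notation are ignored, and nothing here reproduces the paper's data structures.

```python
import math


def search_transfers(n: int, b: int) -> float:
    """Growth term 1 + log_{B+1} N of the search bound (constants ignored)."""
    if n < 1 or b < 1:
        raise ValueError("n and b must be positive")
    return 1 + math.log(n) / math.log(b + 1)


if __name__ == "__main__":
    n = 2 ** 30
    # b = 1 corresponds to plain binary search; larger blocks shorten the search path.
    for b in (1, 15, 1023):
        print(f"B = {b:5d}: about {search_transfers(n, b):.1f} memory transfers")
```

With B = 1 the term reduces to 1 + log_2 N, so the benefit of a larger block size shows up directly as a smaller logarithm base.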
Cache-Oblivious Data Structures and Algorithms for Undirected Breadth-First Search and Shortest Paths
 IN PROCEEDINGS OF THE 9TH SCANDINAVIAN WORKSHOP ON ALGORITHM THEORY
, 2004
Cited by 25 (9 self)
We present improved cache-oblivious data structures and algorithms for breadth-first search and the single-source shortest path problem on undirected graphs with non-negative edge weights. Our results close the performance gap between the currently best cache-aware algorithms for these problems and their cache-oblivious counterparts. Our shortest-path algorithm relies on a new data structure, called bucket heap, which is the first cache-oblivious priority queue to efficiently support a weak DecreaseKey operation.
Engineering a cache-oblivious sorting algorithm
 In Proc. 6th Workshop on Algorithm Engineering and Experiments
, 2004
Cited by 23 (1 self)
The cache-oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache-oblivious model is automatically efficient in a multilevel memory model. Since the introduction of the cache-oblivious model by Frigo et al. in 1999, a number of algorithms and data structures in the model have been proposed and analyzed. However, less attention has been given to whether the nice theoretical properties of cache-oblivious algorithms carry over into practice. This paper is an algorithm engineering study of cache-oblivious sorting. We investigate a number of implementation issues and parameter choices for the cache-oblivious sorting algorithm Lazy Funnelsort by empirical methods, and compare the final algorithm with Quicksort, the established standard for comparison-based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which we compare to the best implementation of Quicksort we can find, and find that it competes very well for input residing in RAM, and outperforms Quicksort for input on disk.
Cache-oblivious string dictionaries
 In Proc. 17th Annual Symposium on Discrete Algorithms (SODA)
, 2006
Cited by 18 (2 self)
Abstract. We present static cache-oblivious dictionary structures for strings which provide analogues of tries and suffix trees in the cache-oblivious model. Our construction takes as input either a set of strings to store, a single string for which all suffixes are to be stored, a trie, a compressed trie, or a suffix tree, and creates a cache-oblivious data structure which performs prefix queries in $O(\log_B n + |P|/B)$ I/Os, where $n$ is the number of leaves in the trie, $P$ is the query string, and $B$ is the block size. This query cost is optimal for unbounded alphabets. The data structure uses linear space. 1 Introduction. Strings are one of the basic data models of computer science. They have numerous applications, e.g. for textual and biological data, and generalize other models such as integers and multidimensional data. A basic problem in the model is to store a set of strings such that strings in the set having a given query string ...
Optimal sparse matrix dense vector multiplication in the I/O-Model
, 2010
Cited by 16 (5 self)
We study the problem of sparse-matrix dense-vector multiplication (SpMV) in external memory. The task of SpMV is to compute y := Ax, where A is a sparse N × N matrix and x is a vector. We express sparsity by a parameter k, and for each choice of k consider the class of matrices where the number of nonzero entries is kN, i.e., where the average number of nonzero entries per column is k. We investigate what is the external worst-case complexity, i.e., the best possible upper bound on the number of I/Os, as a function of k, N and the parameters M (memory size) and B (track size) of the I/O-model. We determine this complexity up to a constant factor for all meaningful choices of these parameters, as long as $k \le N^{1-\varepsilon}$, where $\varepsilon$ depends on the problem variant. Our model of computation for the lower bound is a combination of the I/O-models of Aggarwal and Vitter, and of Hong and Kung. We study variants of the problem, differing in the memory layout of A. If A is stored in column-major layout, we prove that SpMV has I/O complexity Θ(min{kN/B max{ ...
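As an illustration of the task itself (not of the paper's I/O-optimal algorithms), here is a minimal in-memory Python sketch of y := Ax with the nonzero entries of A visited in column-major order. The triple representation and the name `spmv_column_major` are assumptions made for this example only.

```python
from typing import List, Tuple

# Hypothetical representation: one (row, column, value) triple per nonzero entry.
Triple = Tuple[int, int, float]


def spmv_column_major(n: int, entries: List[Triple], x: List[float]) -> List[float]:
    """Compute y = A x for an n x n sparse matrix given as nonzero triples."""
    if len(x) != n:
        raise ValueError("x must have length n")
    y = [0.0] * n
    # Column-major order: sort by column first, then by row within a column.
    for row, col, value in sorted(entries, key=lambda t: (t[1], t[0])):
        # x is read in increasing column order, but writes to y jump between rows.
        y[row] += value * x[col]
    return y


if __name__ == "__main__":
    # A = [[2, 0, 0], [0, 0, 3], [1, 0, 4]]
    a = [(0, 0, 2.0), (1, 2, 3.0), (2, 0, 1.0), (2, 2, 4.0)]
    print(spmv_column_major(3, a, [1.0, 2.0, 3.0]))  # [2.0, 9.0, 13.0]
```

In this order the reads of x are sequential while the updates to y are scattered across rows, which is the kind of access pattern an external-memory analysis has to account for.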
Cache-Oblivious Planar Orthogonal Range Searching and Counting
 In Proc. ACM Symposium on Computational Geometry
, 2005
Cited by 15 (4 self)
We present the first cache-oblivious data structure for planar orthogonal range counting, and improve on previous results for cache-oblivious planar orthogonal range searching. Our range counting structure uses $O(N \log_2 N)$ space and answers queries using $O(\log_B N)$ memory transfers, where $B$ is the block size of any memory level in a multilevel memory hierarchy. Using bit manipulation techniques, the space can be further reduced to $O(N)$. The structure can also be modified to support more general semigroup range sum queries in $O(\log_B N)$ memory transfers, using $O(N \log_2 N)$ space for three-sided queries and $O(N \log_2^2 N / \log_2 \log_2 N)$ ...
Network-Oblivious Algorithms
 IN PROC. OF 21ST INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM
, 2007
Cited by 9 (4 self)
The design of algorithms that can run unchanged yet efficiently on a variety of machines characterized by different degrees of parallelism and communication capabilities is a highly desirable goal. We propose a framework for network-obliviousness based on a model of computation where the only parameter is the problem's input size. Algorithms are then evaluated on a model with two parameters, capturing parallelism and granularity of communication. We show that, for a wide class of network-oblivious algorithms, optimality in the latter model implies optimality in a block-variant of the Decomposable BSP model, which effectively describes a wide and significant class of parallel platforms. We illustrate our framework by providing optimal network-oblivious algorithms for a few key problems, and also establish some negative results.
Cache-aware and cache-oblivious adaptive sorting
 In Proc. 32nd International Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer Science
, 2005
Cited by 9 (4 self)
Abstract. Two new adaptive sorting algorithms are introduced which perform an optimal number of comparisons with respect to the number of inversions in the input. The first algorithm is based on a new linear time reduction to (non-adaptive) sorting. The second algorithm is based on a new division protocol for the GenericSort algorithm by Estivill-Castro and Wood. From both algorithms we derive I/O-optimal cache-aware and cache-oblivious adaptive sorting algorithms. These are the first I/O-optimal adaptive sorting algorithms.
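The number of inversions used as the presortedness measure in this abstract is the number of pairs i < j with a[i] > a[j]. The Python sketch below counts them with a standard merge-sort pass; it only illustrates the measure and is not one of the adaptive algorithms from the paper. The name `count_inversions` is hypothetical.

```python
from typing import List, Sequence, Tuple


def count_inversions(a: Sequence[int]) -> Tuple[List[int], int]:
    """Return a sorted copy of a and the number of pairs i < j with a[i] > a[j]."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = count_inversions(a[:mid])
    right, inv_right = count_inversions(a[mid:])
    merged: List[int] = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] precedes every remaining element of left: one inversion each.
            merged.append(right[j])
            j += 1
            inversions += len(left) - i
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions


if __name__ == "__main__":
    print(count_inversions([3, 1, 2]))  # ([1, 2, 3], 2)
```

A sorted input has zero inversions and a reversed input of length n has n(n-1)/2, so the measure spans the range over which an adaptive sorting algorithm is expected to adjust its work.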
Cache-oblivious algorithms and data structures
 IN SWAT
, 2004
Cited by 8 (1 self)
Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the ideal-cache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the terminology of cache-oblivious algorithms. Cache-oblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge about memory hierarchies, but are analyzed in the two-level I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal offline cache replacement strategy. The results are algorithms that automatically apply to multilevel memory hierarchies. This paper gives an overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al.
An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms
 SIAM JOURNAL ON COMPUTING
, 2007
Cited by 5 (0 self)
We develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and delete-min operations in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized memory transfers, where $M$ and $B$ are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cache-oblivious data structure, $M$ and $B$ are not used in the description of the structure. Our structure is as efficient as several previously developed external memory (cache-aware) priority queue data structures, which all rely crucially on knowledge about $M$ and $B$. Priority queues are a critical component in many of the best known external memory graph algorithms, and using our cache-oblivious priority queue we develop several cache-oblivious graph algorithms.
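To make the amortized bound in this abstract concrete, the following small Python sketch evaluates the growth term (1/B) log_{M/B}(N/B) for sample parameters. The function name `pq_amortized_transfers` and the sample values are assumptions, and the constant hidden by the O-notation is ignored.

```python
import math


def pq_amortized_transfers(n: int, m: int, b: int) -> float:
    """Growth term (1/B) * log_{M/B}(N/B) of the amortized bound (constants ignored)."""
    if not (n > b and m > b and b >= 1):
        raise ValueError("expected n > b, m > b and b >= 1")
    return math.log(n / b, m / b) / b


if __name__ == "__main__":
    # Assumed sample values: 2^30 elements, memory of 2^20 elements, blocks of 2^10.
    cost = pq_amortized_transfers(2 ** 30, 2 ** 20, 2 ** 10)
    print(f"about {cost:.6f} memory transfers per operation (amortized)")
```

For these values the term is well below one, which reflects that each operation costs only a small fraction of a memory transfer in the amortized sense.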