On the limits of cache-obliviousness
IN PROC. 35TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, 2003
Cited by 41 (8 self)
In this paper, we present lower bounds for permuting and sorting in the cache-oblivious model. We prove that (1) I/O-optimal cache-oblivious comparison-based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O-optimal cache-oblivious algorithm for permuting, not even in the presence of a tall cache assumption. Our results for sorting show the existence of an inherent trade-off in the cache-oblivious model between the strength of the tall cache assumption and the overhead for the case M >> B, and show that Funnelsort and recursive binary mergesort are optimal algorithms in the sense that they attain this trade-off.
Cache-Oblivious Data Structures and Algorithms for Undirected Breadth-First Search and Shortest Paths
IN PROCEEDINGS OF THE 9TH SCANDINAVIAN WORKSHOP ON ALGORITHM THEORY, 2004
Cited by 26 (10 self)
We present improved cache-oblivious data structures and algorithms for breadth-first search and the single-source shortest path problem on undirected graphs with non-negative edge weights. Our results close the performance gap between the currently best cache-aware algorithms for these problems and their cache-oblivious counterparts. Our shortest-path algorithm relies on a new data structure, called bucket heap, which is the first cache-oblivious priority queue to efficiently support a weak DecreaseKey operation.
Engineering a cache-oblivious sorting algorithm
In Proc. 6th Workshop on Algorithm Engineering and Experiments, 2004
Cited by 25 (1 self)
The cache-oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache-oblivious model is automatically efficient in a multilevel memory model. Since the introduction of the cache-oblivious model by Frigo et al. in 1999, a number of algorithms and data structures in the model have been proposed and analyzed. However, less attention has been given to whether the nice theoretical properties of cache-oblivious algorithms carry over into practice. This paper is an algorithmic engineering study of cache-oblivious sorting. We investigate a number of implementation issues and parameter choices for the cache-oblivious sorting algorithm Lazy Funnelsort by empirical methods, and compare the final algorithm with Quicksort, the established standard for comparison-based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which we compare to the best implementation of Quicksort we can find, and find that it competes very well for input residing in RAM, and outperforms Quicksort for input on disk.
Cache-Oblivious Planar Orthogonal Range Searching and Counting
In Proc. ACM Symposium on Computational Geometry, 2005
Cited by 18 (5 self)
We present the first cache-oblivious data structure for planar orthogonal range counting, and improve on previous results for cache-oblivious planar orthogonal range searching. Our range counting structure uses O(N log_2 N) space and answers queries using O(log_B N) memory transfers, where B is the block size of any memory level in a multilevel memory hierarchy. Using bit manipulation techniques, the space can be further reduced to O(N). The structure can also be modified to support more general semigroup range sum queries in O(log_B N) memory transfers, using O(N log_2 N) space for three-sided queries and O(N log_2^2 N / log_2 log_2 N)
The cost of cache-oblivious searching
IN PROC. 44TH ANN. SYMP. ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2003
Cited by 18 (9 self)
This paper gives tight bounds on the cost of cache-oblivious searching. The paper shows that no cache-oblivious search structure can guarantee a search performance of fewer than lg e · log_B N memory transfers between any two levels of the memory hierarchy. This lower bound holds even if all of the block sizes are limited to be powers of 2. The paper gives modified versions of the van Emde Boas layout, where the expected number of memory transfers between any two levels of the memory hierarchy is arbitrarily close to [lg e + O(lg lg B / lg B)] log_B N + O(1). This factor approaches lg e ≈ 1.443 as B increases. The expectation is taken over the random placement in memory of the first element of the structure. Because searching in the disk-access machine (DAM) model can be performed in log_B N + O(1) block transfers, this result establishes a separation between the (2-level) DAM model and cache-oblivious model. The DAM model naturally extends to k levels. The paper also shows that as k grows, the search costs of the optimal k-level DAM search structure and the optimal cache-oblivious search structure rapidly converge. This result demonstrates that for a multilevel memory hierarchy, a simple cache-oblivious structure almost replicates the performance of an optimal parameterized k-level DAM structure.
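For readers unfamiliar with the van Emde Boas layout mentioned in this abstract, the following is a minimal sketch of the standard (unmodified) layout for a complete binary tree, not the modified versions the paper analyzes. The function name `veb_order` and the heap-style node numbering (root 1, children 2i and 2i+1) are assumptions made for illustration only.

```python
def veb_order(root, height):
    """Return the nodes of a complete binary tree in van Emde Boas order.

    Nodes use heap-style numbering (an assumption of this sketch):
    the root is 1 and node i has children 2*i and 2*i + 1.
    `height` is the number of levels in the subtree rooted at `root`.
    """
    if height == 1:
        return [root]
    top_h = height // 2          # levels in the top recursive subtree
    bottom_h = height - top_h    # levels in each bottom recursive subtree
    # Lay out the top subtree first, contiguously.
    order = veb_order(root, top_h)
    # The leaves of the top subtree sit top_h - 1 levels below `root`.
    first_leaf = root << (top_h - 1)
    num_leaves = 1 << (top_h - 1)
    # Then lay out each bottom subtree contiguously, left to right.
    for leaf in range(first_leaf, first_leaf + num_leaves):
        for child in (2 * leaf, 2 * leaf + 1):
            order.extend(veb_order(child, bottom_h))
    return order


if __name__ == "__main__":
    print(veb_order(1, 4))
```

Because every recursive subtree occupies a contiguous range of memory, a root-to-leaf search touches only a few blocks at whatever block size B the hardware happens to use, which is the property the search bounds above rely on.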
Cache oblivious algorithms
Algorithms for Memory Hierarchies, LNCS 2625, 2003
Cited by 14 (0 self)
Abstract. The cache oblivious model is a simple and elegant model to design algorithms that perform well in hierarchical memory models ubiquitous on current systems. This model was first formulated in [22] and has since been a topic of intense research. Analyzing and designing algorithms and data structures in this model involves not only an asymptotic analysis of the number of steps executed in terms of the input size, but also the movement of data optimally among the different levels of the memory hierarchy. This chapter is aimed as an introduction to the “ideal-cache” model of [22] and techniques used to design cache oblivious algorithms. The chapter also presents some experimental insights and results. Part of this work was done while the author was visiting MPI-Saarbrücken. The
Cache-oblivious shortest paths in graphs using buffer heap
In Proceedings of the 16th ACM Symposium on Parallelism in Algorithms and Architectures, 2004
Cited by 12 (3 self)
We present the Buffer Heap (BH), a cache-oblivious priority queue that supports Delete-Min, Delete, and Decrease-Key operations in O((1/B) log_2 N) amortized block transfers from external memory, where B is the (unknown) block size and N is the maximum number of elements in the queue. As is common in cache-oblivious algorithms, we assume a 'tall cache' (i.e., M = Ω(B^{1+ε}), where M is the size of the main memory). We also assume the Decrease-Key operation only verifies that the element does not exist in the priority queue with a smaller key value, hence it also supports the insert operation in the same amortized bound. The amortized time bound for each operation is O(log N). We also present a Cache-Oblivious Tournament Tree (COTT), which is simpler than the Buffer Heap, but has weaker bounds. Using the Buffer Heap we present cache-oblivious algorithms for undirected and directed single-source shortest path (SSSP) problems for graphs with non-negative edge weights. On a graph with V vertices and E edges, our algorithm for the undirected case performs O(V + (E/B) log_2 (V/B)) block transfers and for the directed case performs O((V + E/B) · log_2 (V/B)) block transfers. The running time of both algorithms is O((V + E) · log V). For both priority queues with Decrease-Key operation, and for shortest path problems on general graphs, our results appear to give the first non-trivial cache-oblivious bounds.
Cache-Oblivious R-Trees
2005
Cited by 12 (3 self)
We develop a cache-oblivious data structure for storing a set S of N axis-aligned rectangles in the plane, such that all rectangles in S intersecting a query rectangle or point can be found efficiently. Our structure is an axis-aligned bounding-box hierarchy and as such it is the first cache-oblivious R-tree with provable performance guarantees. If no point in the plane is contained in B or more rectangles in S, the structure answers a rectangle query using O(\sqrt{N/B} + T/B) memory transfers and a point query using O((N/B)^ε) memory transfers for any ε > 0, where B is the block size of memory transfers between any two levels of a multilevel memory hierarchy. We also develop a variant of our structure that achieves the same performance on input sets with arbitrary overlap among the rectangles. The rectangle query bound matches the bound of the best known linear-space cache-aware structure.
Cache-oblivious algorithms and data structures
IN SWAT, 2004
Cited by 10 (2 self)
Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the ideal-cache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the terminology of cache-oblivious algorithms. Cache-oblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge about memory hierarchies, but are analyzed in the two-level I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal offline cache replacement strategy. The result is algorithms that automatically apply to multilevel memory hierarchies. This paper gives an overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al.
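To make the analysis style described in this abstract concrete, here is a rough sketch that counts block transfers for an address trace in a two-level memory with memory size M and block size B. It is only an illustration: the function name `count_transfers` is hypothetical, and it uses LRU replacement as a simple stand-in for the optimal offline replacement strategy that the ideal-cache model assumes.

```python
from collections import OrderedDict


def count_transfers(addresses, M, B):
    """Count block transfers for a sequence of element addresses.

    M is the cache size in elements and B the block size in elements.
    Replacement is LRU (an assumption of this sketch), not the optimal
    offline strategy of the ideal-cache model.
    """
    capacity = M // B            # number of blocks the cache can hold
    cache = OrderedDict()        # block id -> True, ordered by recency
    transfers = 0
    for addr in addresses:
        block = addr // B
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            transfers += 1                 # miss: one block is loaded
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return transfers


if __name__ == "__main__":
    # A sequential scan of N elements costs about N / B transfers.
    print(count_transfers(range(1000), M=64, B=8))
    # Touching one element per block costs one transfer per access.
    print(count_transfers(range(0, 1000, 8), M=64, B=8))
```

The scanning code itself never mentions M or B; only the analysis does, which is the sense in which an algorithm is cache-oblivious.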
On adaptive integer sorting
In 12th Annual European Symposium on Algorithms, ESA 2004, 2004
Cited by 5 (0 self)
Abstract. This paper considers integer sorting on a RAM. We show that adaptive sorting of a sequence with qn inversions is asymptotically equivalent to multisorting groups of at most q keys, and a total of n keys. Using the recent O(n √(log log n)) expected time sorting of Han and Thorup on each set, we immediately get an adaptive expected sorting time of O(n √(log log q)). Interestingly, for any positive constant ε, we show that multisorting and adaptive inversion sorting can be performed in linear time if q ≤ 2^{(log n)^{1−ε}}. We also show how to asymptotically improve the running time of any traditional sorting algorithm on a class of inputs much broader than those with few inversions.
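The abstract measures presortedness by the number of inversions, written qn for a sequence of n keys. As a small illustration of that measure only (not of the paper's sorting algorithm), the sketch below counts inversions with a merge-sort pass; the function name `count_inversions` is an assumption of this example.

```python
def count_inversions(seq):
    """Return the number of pairs (i, j) with i < j and seq[i] > seq[j]."""

    def sort_count(a):
        # Returns (sorted copy of a, number of inversions in a).
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, left_inv = sort_count(a[:mid])
        right, right_inv = sort_count(a[mid:])
        merged = []
        inv = left_inv + right_inv
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] precedes every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(list(seq))[1]


if __name__ == "__main__":
    keys = [2, 1, 4, 3, 6, 5]
    inversions = count_inversions(keys)
    # q is the average number of inversions per key, as in "qn inversions".
    print(inversions, inversions / len(keys))
```

A nearly sorted input has small q, which is the regime where an adaptive bound such as O(n √(log log q)) improves on the non-adaptive one.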