Results 1 -
4 of
4
On the limits of cache-obliviousness
- IN PROC. 35TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
, 2003
"... In this paper, we present lower bounds for permuting and sorting in the cache-oblivious model. We prove that (1) I/O optimal cache-oblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcache-oblivious algorithm for permuting, ..."
Abstract
-
Cited by 34 (7 self)
- Add to MetaCart
In this paper, we present lower bounds for permuting and sorting in the cache-oblivious model. We prove that (1) I/O optimal cache-oblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcache-oblivious algorithm for permuting, not even in the presence of a tall cache assumption.Our results for sorting show the existence of an inherent trade-off in the cache-oblivious model between the strength of the tall cache assumption and the overhead for the case M >> B, and show that Funnelsort and recursive binary mergesort are optimal algorithms in the sense that they attain this trade-off.
Engineering a cache-oblivious sorting algorithm
- In Proc. 6th Workshop on Algorithm Engineering and Experiments
, 2004
"... The cache-oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficient in a multi-level memory mod ..."
Abstract
-
Cited by 20 (1 self)
- Add to MetaCart
The cache-oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficient in a multi-level memory model. Since the introduction of the cache-oblivious model by Frigo et al. in 1999, a number of algorithms and data structures in the model has been proposed and analyzed. However, less attention has been given to whether the nice theoretical proporities of cache-oblivious algorithms carry over into practice. This paper is an algorithmic engineering study of cache-oblivious sorting. We investigate a number of implementation issues and parameters choices for the cache-oblivious sorting algorithm Lazy Funnelsort by empirical methods, and compare the final algorithm with Quicksort, the established standard for comparison based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which we compare to the best implementation of Quicksort we can find, and find that it competes very well for input residing in RAM, and outperforms Quicksort for input on disk. 1
The cost of cache-oblivious searching
- IN PROC. 44TH ANN. SYMP. ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS
, 2003
"... This paper gives tight bounds on the cost of cache-oblivious searching. The paper shows that no cache-oblivious search structure can guarantee a search performance of fewer than lgelog B N memory transfers between any two levels of the memory hierarchy. This lower bound holds even if all of the bloc ..."
Abstract
-
Cited by 17 (7 self)
- Add to MetaCart
This paper gives tight bounds on the cost of cache-oblivious searching. The paper shows that no cache-oblivious search structure can guarantee a search performance of fewer than lgelog B N memory transfers between any two levels of the memory hierarchy. This lower bound holds even if all of the block sizes are limited to be powers of 2. The paper gives modified versions of the van Emde Boas layout, where the expected number of memory transfers between any two levels of the memory hierarchy is arbitrarily close to [lge+O(lglgB/lgB)]log B N +O(1). This factor approaches lge ≈ 1.443 as B increases. The expectation is taken over the random placement in memory of the first element of the structure. Because searching in the disk-access machine (DAM) model can be performed in log B N+O(1) block transfers, thisresultestablishes aseparation between the (2-level) DAM model and cache-oblivious model. The DAM model naturally extends to k levels. The paper also shows that as k grows, the search costs of the optimal k-level DAM search structure and the optimal cache-oblivious search structure rapidly converge. This result demonstrates that for a multilevel memory hierarchy, a simple cache-oblivious structure almost replicates the performance of an optimal parameterized k-level DAM structure.
unknown title
"... Abstract The cache-oblivious model of computation is a two-level memory model with the assump-tion that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficientin a multi-level m ..."
Abstract
- Add to MetaCart
Abstract The cache-oblivious model of computation is a two-level memory model with the assump-tion that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficientin a multi-level memory model. Since the introduction of the cache-oblivious model by Frigo et al. in 1999, a number ofalgorithms and data structures in the model has been proposed and analyzed. However, less attention has been given to whether the nice theoretical proporities of cache-oblivious algorithmscarry over into practice. This paper is an algorithmic engineering study of cache-oblivious sorting. We investigate anumber of implementation issues and parameters choices for the cache-oblivious sorting algorithm Lazy Funnelsort by empirical methods, and compare the final algorithm with Quicksort,the established standard for comparison based sorting, as well as with recent cache-aware proposals.The main result is a carefully implemented cache-oblivious sorting algorithm, which we compare to the best implementation of Quicksort we can find, and find that it competes verywell for input residing in RAM, and outperforms Quicksort for input on disk. 1 Introduction Modern computers contain a hierarchy of memory levels, with each level acting as a cache for the next. Typical components of the memory hierarchy are: registers, level 1 cache, level 2 cache, level 3 cache, main memory, and disk. The time for accessing a level increases for each new level (most dramatically when going from main memory to disk), making the cost of a memory access depend highly on what is the current lowest memory level containing the element accessed.

