Results 1-10 of 14
Cache-Oblivious B-Trees, 2000
Cited by 135 (22 self)

Abstract. This paper presents two dynamic search trees attaining near-optimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the block-transfer size at each level, and the relative speeds of memory levels. The performance is analyzed in terms of the number of memory transfers between two memory levels with an arbitrary block-transfer size of B; this analysis can then be applied to every adjacent pair of levels in a multilevel memory hierarchy. Both search trees match the optimal search bound of Θ(1 + log_{B+1} N) memory transfers. This bound is also achieved by the classic B-tree data structure on a two-level memory hierarchy with a known block-transfer size B. The first search tree supports insertions and deletions in Θ(1 + log_{B+1} N) amortized memory transfers, which matches the B-tree's worst-case bounds. The second search tree supports scanning S consecutive elements optimally in Θ(1 + S/B) memory transfers and supports insertions and deletions in Θ(1 + log_{B+1} N + (log^2 N)/B) amortized memory transfers, matching the performance of the B-tree for B = Ω(log N log log N).
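The Θ(1 + log_{B+1} N) search bound is classically obtained by storing a balanced binary search tree in the recursive van Emde Boas layout, so that every root-to-leaf path crosses only O(log_B N) blocks simultaneously for all block sizes B. A minimal sketch of the layout and a search over the resulting array; the class and function names are my own, not the paper's:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.pos = None                      # slot in the vEB-ordered array

def build_bst(keys):
    """Perfect binary search tree over a sorted list of length 2^h - 1."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build_bst(keys[:mid]), build_bst(keys[mid + 1:]))

def tree_height(node):
    return 0 if node is None else 1 + tree_height(node.left)

def veb(node, h, out):
    """Append the top h levels below `node` in van Emde Boas order;
    return the subtree roots hanging just beneath those levels."""
    if h == 1:
        out.append(node)
        return [c for c in (node.left, node.right) if c is not None]
    top_h = h - h // 2                       # ceil(h/2) levels form the top tree
    hanging = []
    for b in veb(node, top_h, out):          # top tree first ...
        hanging.extend(veb(b, h // 2, out))  # ... then each bottom tree
    return hanging

def layout(keys):
    """Flatten the tree into an array of (key, left_slot, right_slot)."""
    root = build_bst(keys)
    order = []
    veb(root, tree_height(root), order)
    for i, n in enumerate(order):
        n.pos = i
    return [(n.key,
             n.left.pos if n.left else -1,
             n.right.pos if n.right else -1) for n in order], root

def search(arr, key):
    """Ordinary BST search over the laid-out array; returns the slot or -1."""
    i = 0
    while i != -1:
        k, left, right = arr[i]
        if key == k:
            return i
        i = left if key < k else right
    return -1

arr, _ = layout(list(range(15)))             # 15 = 2^4 - 1 keys
```

For 15 keys the array begins with 7, 3, 11: the three-node top tree is stored contiguously before any of the four bottom subtrees, which is exactly what keeps any root-to-leaf path inside few blocks.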
Engineering a cache-oblivious sorting algorithm
In Proc. 6th Workshop on Algorithm Engineering and Experiments, 2004
Cited by 25 (1 self)

Abstract:
The cache-oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache-oblivious model is automatically efficient in a multilevel memory model. Since the introduction of the cache-oblivious model by Frigo et al. in 1999, a number of algorithms and data structures in the model have been proposed and analyzed. However, less attention has been given to whether the nice theoretical properties of cache-oblivious algorithms carry over into practice. This paper is an algorithm engineering study of cache-oblivious sorting. We investigate a number of implementation issues and parameter choices for the cache-oblivious sorting algorithm Lazy Funnelsort by empirical methods, and compare the final algorithm with Quicksort, the established standard for comparison-based sorting, as well as with recent cache-aware proposals. The main result is a carefully implemented cache-oblivious sorting algorithm, which we compare to the best implementation of Quicksort we can find; it competes very well for input residing in RAM, and outperforms Quicksort for input on disk.
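The study's method is empirical: implement the competing sorts carefully and time them on identical inputs. A toy harness in that spirit, where a textbook quicksort and Python's built-in Timsort merely stand in for the paper's tuned implementations (all names and parameters here are illustrative):

```python
import random
import time

def quicksort(a):
    """Textbook three-way quicksort (a stand-in, not the paper's code)."""
    if len(a) <= 1:
        return a
    pivot = a[len(a) // 2]
    return (quicksort([x for x in a if x < pivot])
            + [x for x in a if x == pivot]
            + quicksort([x for x in a if x > pivot]))

def bench(sort_fn, data):
    """Time one sort on a private copy of the input."""
    start = time.perf_counter()
    result = sort_fn(list(data))
    return time.perf_counter() - start, result

random.seed(1)
data = [random.randrange(10**6) for _ in range(50_000)]
t_quick, out_quick = bench(quicksort, data)
t_tim, out_tim = bench(sorted, data)
print(f"quicksort: {t_quick:.3f}s  built-in sort: {t_tim:.3f}s")
```

The paper's point is that such measurements must be taken across input sizes spanning RAM-resident and disk-resident data; a single in-RAM timing like this one shows only the methodology, not the crossover.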
Cache-Oblivious Planar Orthogonal Range Searching and Counting
In Proc. ACM Symposium on Computational Geometry, 2005
Cited by 15 (4 self)

Abstract:
We present the first cache-oblivious data structure for planar orthogonal range counting, and improve on previous results for cache-oblivious planar orthogonal range searching. Our range counting structure uses O(N log_2 N) space and answers queries using O(log_B N) memory transfers, where B is the block size of any memory level in a multilevel memory hierarchy. Using bit manipulation techniques, the space can be further reduced to O(N). The structure can also be modified to support more general semigroup range sum queries in O(log_B N) memory transfers, using O(N log_2 N) space for three-sided queries and O(N log_2^2 N / log_2 log_2 N) space for four-sided queries.
Cache-oblivious algorithms and data structures
In SWAT, 2004
Cited by 10 (1 self)

Abstract:
Frigo, Leiserson, Prokop and Ramachandran in 1999 introduced the ideal-cache model as a formal model of computation for developing algorithms in environments with multiple levels of caching, and coined the term cache-oblivious algorithms. Cache-oblivious algorithms are described as standard RAM algorithms with only one memory level, i.e. without any knowledge of memory hierarchies, but are analyzed in the two-level I/O model of Aggarwal and Vitter for an arbitrary memory and block size and an optimal off-line cache replacement strategy. The result is algorithms that automatically apply to multilevel memory hierarchies. This paper gives an overview of the results achieved on cache-oblivious algorithms and data structures since the seminal paper by Frigo et al.
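A standard illustration of this recipe is recursive matrix transposition: the code below is an ordinary RAM algorithm that never mentions a cache size M or block size B, yet under the ideal-cache analysis (assuming a tall cache) it incurs only O(1 + n^2/B) transfers, because the recursion eventually reaches subproblems that fit in any cache. This is a sketch, with an arbitrary base-case cutoff of 16:

```python
def transpose(src, dst, r0, c0, rows, cols):
    """Write the transpose of src[r0:r0+rows][c0:c0+cols] into dst,
    splitting the longer dimension until the tile is small."""
    if rows <= 16 and cols <= 16:            # base case: tiny tile
        for i in range(r0, r0 + rows):
            for j in range(c0, c0 + cols):
                dst[j][i] = src[i][j]
    elif rows >= cols:                       # halve the row range
        h = rows // 2
        transpose(src, dst, r0, c0, h, cols)
        transpose(src, dst, r0 + h, c0, rows - h, cols)
    else:                                    # halve the column range
        h = cols // 2
        transpose(src, dst, r0, c0, rows, h)
        transpose(src, dst, r0, c0 + h, rows, cols - h)

n = 50
A = [[i * n + j for j in range(n)] for i in range(n)]
T = [[0] * n for _ in range(n)]
transpose(A, T, 0, 0, n, n)
```

Note how the divide-and-conquer structure, not any tuning parameter, is what adapts the access pattern to every level of the hierarchy at once.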
An Optimal Cache-Oblivious Priority Queue and its Application to Graph Algorithms
SIAM Journal on Computing, 2007
Cited by 5 (0 self)

Abstract:
We develop an optimal cache-oblivious priority queue data structure, supporting insertion, deletion, and delete-min operations in $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized memory transfers, where $M$ and $B$ are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cache-oblivious data structure, $M$ and $B$ are not used in the description of the structure. Our structure is as efficient as several previously developed external-memory (cache-aware) priority queue data structures, which all rely crucially on knowledge of $M$ and $B$. Priority queues are a critical component in many of the best known external-memory graph algorithms, and using our cache-oblivious priority queue we develop several cache-oblivious graph algorithms.
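The graph-algorithm connection in the last sentence rests on exactly the two operations insert and delete-min; external-memory priority queues generally make decrease-key expensive, so graph algorithms fall back on lazily discarding stale entries instead. A sketch of that pattern in Dijkstra's algorithm, with Python's `heapq` (a binary heap) standing in for the paper's structure — same interface, but without the cache-oblivious transfer bound:

```python
import heapq

def dijkstra(adj, src):
    """adj: {u: [(v, w), ...]}; returns shortest distances from src."""
    dist = {src: 0}
    pq = [(0, src)]                     # insert
    while pq:
        d, u = heapq.heappop(pq)        # delete-min
        if d > dist.get(u, float("inf")):
            continue                    # stale entry: lazy deletion
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))   # re-insert instead of decrease-key
    return dist

g = {0: [(1, 4), (2, 1)], 2: [(1, 2)], 1: [(3, 5)]}
```

The example graph `g` and names are mine; the point is only that any structure supporting these two operations, including the paper's, can drive the algorithm.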
Simple and semi-dynamic structures for cache-oblivious planar orthogonal range searching
In Proc. 22nd ACM Symposium on Computational Geometry, 2006
Cited by 4 (3 self)

Abstract:
In this paper, we develop improved cache-oblivious data structures for two- and three-sided planar orthogonal range searching. Our main result is an optimal static structure for two-sided range searching that uses linear space and supports queries in O(log_B N + T/B) memory transfers, where B is the block size of any level in a multilevel memory hierarchy and T is the number of reported points. Our structure is the first linear-space cache-oblivious structure for a planar range searching problem with the optimal O(log_B N + T/B) query bound. The structure is very simple, and we believe it to be of practical interest. We also show that our two-sided range search structure can be constructed cache-obliviously in O(N log_B N) memory transfers. Using the logarithmic method and fractional cascading, this leads to a semi-dynamic linear-space structure that supports two-sided range queries in O(log_2 N + T/B) memory transfers and insertions in O(log_2 N · log_B N) amortized memory transfers. This structure is the first (semi-)dynamic structure for any planar range searching problem with a query bound that is logarithmic in the number of elements in the structure and linear in the output size. Finally, using a simple standard construction, we also obtain a static O(N log_2 N)-space structure for three-sided range searching that supports queries in the optimal bound of O(log_B N + T/B) memory transfers. These bounds match the bounds of the best previously known structure for this problem.
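For concreteness, a two-sided (dominance) query asks for every point p with p.x ≤ qx and p.y ≤ qy. The baseline below only pins down these query semantics — binary search on x followed by a linear filter on y — and is what the paper's O(log_B N + T/B) structure improves on; function names and the sample points are mine:

```python
import bisect

def build(points):
    """Sort points by x (ties by y); the whole 'structure' of the baseline."""
    return sorted(points)

def query(pts, qx, qy):
    """Report all points dominated by (qx, qy)."""
    hi = bisect.bisect_right(pts, (qx, float("inf")))   # last x <= qx
    return [(x, y) for x, y in pts[:hi] if y <= qy]     # filter on y

pts = build([(1, 5), (2, 2), (4, 7), (6, 1), (8, 3)])
```

The filter step can scan far more points than it reports, which is exactly the gap between this baseline's O(log N + S) cost (S points with x ≤ qx) and the output-sensitive T/B term in the paper's bound.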
Cache-oblivious range reporting with optimal queries requires superlinear space
In Proceedings of the 25th ACM Symposium on Computational Geometry, 2009
Cited by 4 (2 self)

Abstract:
We consider a number of range reporting problems in two and three dimensions and prove lower bounds on the amount of space required by any cache-oblivious data structure for these problems that achieves an optimal query bound of O(log_B N + K/B) block transfers in the worst case, where K is the size of the query output. The problems we study are three-sided range reporting, 3-d dominance reporting, and 3-d halfspace range reporting. We prove that, in order to achieve the above query bound, or even a bound of O((log_B N)^c (1 + K/B)) for any constant c > 0, the structure has to use Ω(N (log log N)^ε) space, where ε > 0 is a constant that depends on c and on the constant hidden in the big-Oh notation of the query bound. Our result has a number of interesting consequences. The first is a new type of separation between the I/O model and the cache-oblivious model, as I/O-efficient data structures with the optimal query bound and using linear or O(N log* N) space are known for the above problems. The second is the non-existence of a linear-space cache-oblivious persistent B-tree with worst-case optimal 1-d range reporting queries.
HAT-trie: A Cache-conscious Trie-based Data Structure for Strings, 2007
Cited by 2 (1 self)

Abstract:
Tries are the fastest tree-based data structures for managing strings in memory, but are space-intensive. The burst trie is almost as fast but reduces space by collapsing trie chains into buckets. This is not, however, a cache-conscious approach, and it can lead to poor performance on current processors. In this paper, we introduce the HAT-trie, a cache-conscious trie-based data structure that is formed by carefully combining existing components. We evaluate performance using several real-world datasets and against other high-performance data structures. We show strong improvements in both time and space; in most cases approaching that of the cache-conscious hash table. Our HAT-trie is shown to be the most efficient trie-based data structure for managing variable-length strings in memory while maintaining sort order.
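The bucket-collapsing mechanism the abstract builds on can be sketched briefly: each node buffers whole string suffixes in a bucket until the bucket grows past a threshold, then "bursts" it into per-character children. The HAT-trie's contribution is replacing such buckets with cache-conscious array hash tables; the plain-set version below (threshold and names are mine) shows only the bursting behaviour:

```python
BURST_LIMIT = 4          # illustrative threshold, not the papers' tuning

class BurstNode:
    def __init__(self):
        self.children = {}     # next character -> BurstNode
        self.bucket = set()    # whole suffixes parked here until a burst
        self.terminal = False  # a stored string ends exactly at this node

    def insert(self, s):
        if not s:
            self.terminal = True
        elif s[0] in self.children:
            self.children[s[0]].insert(s[1:])
        else:
            self.bucket.add(s)
            if len(self.bucket) > BURST_LIMIT:
                for t in self.bucket:            # burst: push suffixes down
                    self.children.setdefault(t[0], BurstNode()).insert(t[1:])
                self.bucket.clear()

    def contains(self, s):
        if not s:
            return self.terminal
        if s in self.bucket:
            return True
        child = self.children.get(s[0])
        return child.contains(s[1:]) if child else False
```

Until a burst, a lookup is a single bucket probe rather than a chain of per-character node hops, which is the space/time trade-off the burst trie exploits.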
Redesigning the String Hash Table, Burst Trie, and BST to Exploit Cache, 2011
Cited by 2 (1 self)

Abstract:
A key decision when developing in-memory computing applications is the choice of a mechanism to store and retrieve strings. The most efficient current data structures for this task are the hash table with move-to-front chains and the burst trie, both of which use linked lists as a substructure, and variants of the binary search tree. These data structures are computationally efficient, but typical implementations use large numbers of nodes and pointers to manage strings, which is not efficient in use of cache. In this article, we explore two alternatives to the standard representation: the simple expedient of including the string in its node, and, for linked lists, the more drastic step of replacing each list of nodes by a contiguous array of characters. Our experiments show that, for large sets of strings, the improvement is dramatic. For hashing, in the best case the total space overhead is reduced to less than 1 bit per string. For the burst trie, over 300 MB of strings can be stored in a total of under 200 MB of memory with significantly improved search time. These results, on a variety of data sets, show that cache-friendly variants of fundamental data structures can yield remarkable gains in performance.
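The article's "more drastic step" — replacing a linked list of string nodes with one contiguous array of characters — turns a bucket scan into a single sequential pass instead of a pointer chase. A minimal sketch of such a bucket, using a one-byte length prefix per string (real implementations would use a variable-length prefix; the class and its fixed limit are illustrative):

```python
class ArrayBucket:
    """All strings of one bucket packed length-prefixed into one bytearray."""

    def __init__(self):
        self.data = bytearray()

    def add(self, s):
        b = s.encode()
        assert len(b) < 256            # one-byte length prefix in this sketch
        if not self.contains(s):       # keep the bucket duplicate-free
            self.data += bytes([len(b)]) + b

    def contains(self, s):
        b, i = s.encode(), 0
        while i < len(self.data):      # one sequential pass, no pointers
            n = self.data[i]
            if self.data[i + 1:i + 1 + n] == b:
                return True
            i += 1 + n                 # skip to the next length prefix
        return False
```

Each string costs its own bytes plus one prefix byte, which is how the per-node pointer overhead of a linked-list bucket disappears.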
Cache-Oblivious Hashing, 2010
Cited by 1 (0 self)

Abstract:
The hash table, especially its external-memory version, is one of the most important index structures in large databases. Assuming a truly random hash function, it is known that in a standard external hash table with block size b, searching for a particular key takes only expected average t_q = 1 + 1/2^{Ω(b)} disk accesses for any load factor α bounded away from 1. However, such near-perfect performance is achieved only when b is known and the hash table is particularly tuned for working with such a blocking. In this paper we study whether it is possible to build a cache-oblivious hash table that works well with any blocking. Such a hash table would automatically perform well across all levels of the memory hierarchy and would not need any hardware-specific tuning, an important feature in autonomous databases. We first show that linear probing, a classical collision resolution strategy for hash tables, can easily be made cache-oblivious, but it only achieves t_q = 1 + O(α/b). Then we demonstrate that it is possible to obtain t_q = 1 + 1/2^{Ω(b)}, thus matching the cache-aware bound, if the following two conditions hold: (a) b is a power of 2; and (b) every block starts at a memory address divisible by b. Both conditions hold on a real machine, although they are not stated in the cache-oblivious model. Interestingly, we also show that neither condition is dispensable: if either of them is removed, the best obtainable bound is t_q = 1 + O(α/b), which is exactly what linear probing achieves.
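The linear-probing half of the result is easy to visualize: probes walk consecutive table slots, so a search that scans a run of length L touches about 1 + L/b blocks for any block size b and any alignment, with no tuning. A minimal fixed-size sketch (the table size, Python's built-in hash, and the probe-counting return value are illustrative choices of mine, and unlike the paper no truly random hash function is assumed):

```python
TABLE_SIZE = 64            # keeps the load factor alpha below 1 here

table = [None] * TABLE_SIZE

def insert(key):
    i = hash(key) % TABLE_SIZE
    while table[i] is not None and table[i] != key:
        i = (i + 1) % TABLE_SIZE       # probe the next consecutive slot
    table[i] = key

def lookup(key):
    i = hash(key) % TABLE_SIZE
    probes = 0
    while table[i] is not None:
        probes += 1
        if table[i] == key:
            return probes              # found: number of slots touched
        i = (i + 1) % TABLE_SIZE
    return 0                           # hit an empty slot: key is absent
```

The sequential probe sequence is the entire cache-obliviousness argument; the paper's harder contribution is showing when the stronger 1 + 1/2^{Ω(b)} bound is and is not attainable.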