Results 1  10
of
72
Cacheoblivious Btrees
, 2000
"... Abstract. This paper presents two dynamic search trees attaining nearoptimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the blocktransfer size at each level, and the relative speeds of me ..."
Abstract

Cited by 155 (23 self)
 Add to MetaCart
Abstract. This paper presents two dynamic search trees attaining nearoptimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the blocktransfer size at each level, and the relative speeds of memory levels. The performance is analyzed in terms of the number of memory transfers between two memory levels with an arbitrary blocktransfer size of B; this analysis can then be applied to every adjacent pair of levels in a multilevel memory hierarchy. Both search trees match the optimal search bound of Θ(1+logB+1 N) memory transfers. This bound is also achieved by the classic Btree data structure on a twolevel memory hierarchy with a known blocktransfer size B. The first search tree supports insertions and deletions in Θ(1 + logB+1 N) amortized memory transfers, which matches the Btree’s worstcase bounds. The second search tree supports scanning S consecutive elements optimally in Θ(1 + S/B) memory transfers and supports insertions and deletions in Θ(1 + logB+1 N + log2 N) amortized memory transfers, matching the performance of the Btree for B = B Ω(log N log log N).
Cacheoblivious priority queue and graph algorithm applications
 In Proc. 34th Annual ACM Symposium on Theory of Computing
, 2002
"... In this paper we develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hi ..."
Abstract

Cited by 68 (9 self)
 Add to MetaCart
(Show Context)
In this paper we develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cacheoblivious data structure, M and B are not used in the description of the structure. The bounds match the bounds of several previously developed externalmemory (cacheaware) priority queue data structures, which all rely crucially on knowledge about M and B. Priority queues are a critical component in many of the best known externalmemory graph algorithms, and using our cacheoblivious priority queue we develop several cacheoblivious graph algorithms.
Cache Oblivious Search Trees via Binary Trees of Small Height
 IN PROC. ACMSIAM SYMP. ON DISCRETE ALGORITHMS
, 2002
"... We propose a version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and FarachColton and has the same complexity bounds. In particular, our data structure avoids the use of weight balanced Btrees, and can be implemented as just a single array of ..."
Abstract

Cited by 65 (8 self)
 Add to MetaCart
(Show Context)
We propose a version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and FarachColton and has the same complexity bounds. In particular, our data structure avoids the use of weight balanced Btrees, and can be implemented as just a single array of data elements, without the use of pointers. The structure also improves space utilization.
STXXL: Standard template library for XXL data sets
 In: Proc. of ESA 2005. Volume 3669 of LNCS
, 2005
"... for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/Oefficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in ..."
Abstract

Cited by 56 (5 self)
 Add to MetaCart
(Show Context)
for processing huge data sets that can fit only on hard disks. It supports parallel disks, overlapping between disk I/O and computation and it is the first I/Oefficient algorithm library that supports the pipelining technique that can save more than half of the I/Os. STXXL has been applied both in academic and industrial environments for a range of problems including text processing, graph algorithms, computational geometry, gaussian elimination, visualization, and analysis of microscopic images, differential cryptographic analysis, etc. The performance of STXXL and its applications is evaluated on synthetic and realworld inputs. We present the design of the library, how its performance features are supported, and demonstrate how the library integrates with STL. KEY WORDS: very large data sets; software library; C++ standard template library; algorithm engineering 1.
Cacheoblivious algorithms and data structures
 IN LECTURE NOTES FROM THE EEF SUMMER SCHOOL ON MASSIVE DATA SETS
, 2002
"... A recent direction in the design of cacheefficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cacheoblivious algorithms perform well on a multilevel memory hierarchy without knowing any pa ..."
Abstract

Cited by 42 (2 self)
 Add to MetaCart
A recent direction in the design of cacheefficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cacheoblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cacheoblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cacheoblivious algorithms and data structures that perform as well or nearly as well as standard externalmemory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately
Cache oblivious distribution sweeping
 IN PROC. 29TH INTERNATIONAL COLLOQUIUM ON AUTOMATA, LANGUAGES, AND PROGRAMMING (ICALP), VOLUME 2380 OF LNCS
, 2002
"... We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divideandconquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms ..."
Abstract

Cited by 42 (10 self)
 Add to MetaCart
(Show Context)
We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divideandconquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms for specific problems the feasibility of the method in a cache oblivious setting. The problems all come from computational geometry, and are: orthogonal line segment intersection reporting, the all nearest neighbors problem, the 3D maxima problem, computing the measure of a set of axisparallel rectangles, computing the visibility of a set of line segments from a point, batched orthogonal range queries, and reporting pairwise intersections of axisparallel rectangles. Our basic building block is a simplified version of the cache oblivious sorting algorithm Funnelsort of Frigo et al., which is of independent interest.
On the limits of cacheobliviousness
 IN PROC. 35TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
, 2003
"... In this paper, we present lower bounds for permuting and sorting in the cacheoblivious model. We prove that (1) I/O optimal cacheoblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcacheoblivious algorithm for permuting, ..."
Abstract

Cited by 42 (6 self)
 Add to MetaCart
(Show Context)
In this paper, we present lower bounds for permuting and sorting in the cacheoblivious model. We prove that (1) I/O optimal cacheoblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcacheoblivious algorithm for permuting, not even in the presence of a tall cache assumption.Our results for sorting show the existence of an inherent tradeoff in the cacheoblivious model between the strength of the tall cache assumption and the overhead for the case M >> B, and show that Funnelsort and recursive binary mergesort are optimal algorithms in the sense that they attain this tradeoff.
Monotone Minimal Perfect Hashing: Searching a Sorted Table with O(1) Accesses
"... A minimal perfect hash function maps a set S of n keys into the set { 0, 1,..., n − 1} bijectively. Classical results state that minimal perfect hashing is possible in constant time using a structure occupying space close to the lower bound of log e bits per element. Here we consider the problem of ..."
Abstract

Cited by 35 (11 self)
 Add to MetaCart
(Show Context)
A minimal perfect hash function maps a set S of n keys into the set { 0, 1,..., n − 1} bijectively. Classical results state that minimal perfect hashing is possible in constant time using a structure occupying space close to the lower bound of log e bits per element. Here we consider the problem of monotone minimal perfect hashing, in which the bijection is required to preserve the lexicographical ordering of the keys. A monotone minimal perfect hash function can be seen as a very weak form of index that provides ranking just on the set S (and answers randomly outside of S). Our goal is to minimise the description size of the hash function: we show that, for a set S of n elements out of a universe of 2 w elements, O(n log log w) bits are sufficient to hash monotonically with evaluation time O(log w). Alternatively, we can get space O(n log w) bits with O(1) query time. Both of these data structures improve a straightforward construction with O(n log w) space and O(log w) query time. As a consequence, it is possible to search a sorted table with O(1) accesses to the table (using additional O(n log log w) bits). Our results are based on a structure (of independent interest) that represents a trie in a very compact way, but admits errors. As a further application of the same structure, we show how to compute the predecessor (in the sorted order of S) of an arbitrary element, using O(1) accesses in expectation and an index of O(n log w) bits, improving the trivial result of O(nw) bits. This implies an efficient index for searching a blocked memory.
Funnel heap  a cache oblivious priority queue
 In Proc. 13th Annual International Symposium on Algorithms and Computation, volume 2518 of LNCS
, 2002
"... Abstract The cache oblivious model of computation is a twolevel memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficient in a multilevel m ..."
Abstract

Cited by 35 (6 self)
 Add to MetaCart
(Show Context)
Abstract The cache oblivious model of computation is a twolevel memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficient in a multilevel memory model. Arge et al. recently presented the first optimal cache oblivious priority queue, and demonstrated the importance of this result by providing the first cache oblivious algorithms for graph problems. Their structure uses cache oblivious sorting and selection as subroutines. In this paper, we devise an alternative optimal cache oblivious priority queue based only on binary merging. We also show that our structure can be made adaptive to different usage profiles. 1
CacheOblivious Streaming Btrees
, 2007
"... A streaming Btree is a dictionary that efficiently implements insertions and range queries. We present two cacheoblivious streaming Btrees, the shuttle tree, and the cacheoblivious lookahead array (COLA). For blocktransfer size B and on N elements, the shuttle tree implements searches in optima ..."
Abstract

Cited by 34 (10 self)
 Add to MetaCart
A streaming Btree is a dictionary that efficiently implements insertions and range queries. We present two cacheoblivious streaming Btrees, the shuttle tree, and the cacheoblivious lookahead array (COLA). For blocktransfer size B and on N elements, the shuttle tree implements searches in optimal O ` logB+1 N ´ transfers, range queries of L successive elements in optimal O ` logB+1 N + L/B ´ transfers, and insertions in O “ (logB+1 N)/BΘ(1/(loglogB)2 ”) +(log2 N)/B transfers, which is an asymptotic speedup over traditional Btrees if B ≥ (logN) 1+c/logloglog2 N for any constant c> 1. A COLA implements searches in O(logN) transfers, range queries in O(logN + L/B) transfers, and insertions in amortized O((logN)/B) transfers, matching the bounds for a (cacheaware) buffered repository tree. A partially deamortized COLA matches these bounds but reduces the worstcase insertion cost to O(logN) if memory size M = Ω(logN). We also present a cacheaware version of the COLA, the lookahead array, which achieves the same bounds as Brodal and Fagerberg’s (cacheaware) Bεtree. We compare our COLA implementation to a traditional Btree. Our COLA implementation runs 790 times faster for random insertions, 3.1 times slower for insertions of sorted data, and 3.5 times slower for searches.