Results 1  10
of
99
External Memory Data Structures
, 2001
"... In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worstcase efficient external memory dynami ..."
Abstract

Cited by 81 (36 self)
 Add to MetaCart
In many massive dataset applications the data must be stored in space and query efficient data structures on external storage devices. Often the data needs to be changed dynamically. In this chapter we discuss recent advances in the development of provably worstcase efficient external memory dynamic data structures. We also briefly discuss some of the most popular external data structures used in practice.
A LocalityPreserving CacheOblivious Dynamic Dictionary
, 2002
"... This paper presents a simple dictionary structure designed for a hierarchical memory. The proposed data structure is cache oblivious and locality preserving. A cacheoblivious data structure has memory performance optimized for all levels of the memory hierarchy even though it has no memoryhierarc ..."
Abstract

Cited by 73 (21 self)
 Add to MetaCart
This paper presents a simple dictionary structure designed for a hierarchical memory. The proposed data structure is cache oblivious and locality preserving. A cacheoblivious data structure has memory performance optimized for all levels of the memory hierarchy even though it has no memoryhierarchyspeci c parameterization. A localitypreserving dictionary maintains elements of similar key values stored close together for fast access to ranges of data with consecutive keys.
Cacheoblivious priority queue and graph algorithm applications
 In Proc. 34th Annual ACM Symposium on Theory of Computing
, 2002
"... In this paper we develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hi ..."
Abstract

Cited by 68 (10 self)
 Add to MetaCart
In this paper we develop an optimal cacheoblivious priority queue data structure, supporting insertion, deletion, and deletemin operations in O ( 1 B logM/B N) amortized memory B transfers, where M and B are the memory and block transfer sizes of any two consecutive levels of a multilevel memory hierarchy. In a cacheoblivious data structure, M and B are not used in the description of the structure. The bounds match the bounds of several previously developed externalmemory (cacheaware) priority queue data structures, which all rely crucially on knowledge about M and B. Priority queues are a critical component in many of the best known externalmemory graph algorithms, and using our cacheoblivious priority queue we develop several cacheoblivious graph algorithms.
Cache Oblivious Search Trees via Binary Trees of Small Height
 In Proc. ACMSIAM Symp. on Discrete Algorithms
, 2002
"... We propose a version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and FarachColton and has the same complexity bounds. In particular, our data structure avoids the use of weight balanced Btrees, and can be implemented as just a single array of ..."
Abstract

Cited by 64 (9 self)
 Add to MetaCart
We propose a version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and FarachColton and has the same complexity bounds. In particular, our data structure avoids the use of weight balanced Btrees, and can be implemented as just a single array of data elements, without the use of pointers. The structure also improves space utilization.
Cache oblivious distribution sweeping
 IN PROC. 29TH INTERNATIONAL COLLOQUIUM ON AUTOMATA, LANGUAGES, AND PROGRAMMING (ICALP), VOLUME 2380 OF LNCS
, 2002
"... We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divideandconquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms ..."
Abstract

Cited by 43 (11 self)
 Add to MetaCart
We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divideandconquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms for specific problems the feasibility of the method in a cache oblivious setting. The problems all come from computational geometry, and are: orthogonal line segment intersection reporting, the all nearest neighbors problem, the 3D maxima problem, computing the measure of a set of axisparallel rectangles, computing the visibility of a set of line segments from a point, batched orthogonal range queries, and reporting pairwise intersections of axisparallel rectangles. Our basic building block is a simplified version of the cache oblivious sorting algorithm Funnelsort of Frigo et al., which is of independent interest.
On the limits of cacheobliviousness
 IN PROC. 35TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING
, 2003
"... In this paper, we present lower bounds for permuting and sorting in the cacheoblivious model. We prove that (1) I/O optimal cacheoblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcacheoblivious algorithm for permuting, ..."
Abstract

Cited by 40 (7 self)
 Add to MetaCart
In this paper, we present lower bounds for permuting and sorting in the cacheoblivious model. We prove that (1) I/O optimal cacheoblivious comparison based sorting is not possible without a tall cache assumption, and (2) there does not exist an I/O optimalcacheoblivious algorithm for permuting, not even in the presence of a tall cache assumption.Our results for sorting show the existence of an inherent tradeoff in the cacheoblivious model between the strength of the tall cache assumption and the overhead for the case M >> B, and show that Funnelsort and recursive binary mergesort are optimal algorithms in the sense that they attain this tradeoff.
Improving Index Performance through Prefetching
, 2001
"... This paper proposes and evaluates Prefetching B Trees#, which use prefetching to accelerate two important operations on B Tree indices: searches and range scans. To accelerate searches, pB Trees use prefetching to e#ectively create wider nodes than the natural data transfer size: e.g., ..."
Abstract

Cited by 39 (8 self)
 Add to MetaCart
This paper proposes and evaluates Prefetching B Trees#, which use prefetching to accelerate two important operations on B Tree indices: searches and range scans. To accelerate searches, pB Trees use prefetching to e#ectively create wider nodes than the natural data transfer size: e.g., eight vs. one cache lines or disk pages. These wider nodes reduce the height of the B Tree, thereby decreasing the number of expensive misses when going from parenttochild without signi#cantly increasing the cost of fetching a given node. Our results show that this technique speeds up search and update times by a factor of 1.2#1.5 for mainmemory B Trees. In addition, it outperforms and is complementary to #CacheSensitiveB Trees." To accelerate range scans, pB Trees provide arrays of pointers to their leaf nodes. These allow the pB Tree to prefetch arbitrarily far ahead, even for nonclustered indices, thereby hiding the normally expensive cache misses associated with traversing the leaves within the range. Our results show that this technique yields over a sixfold speedup on range scans of 1000+ keys. Although our experimental evaluation focuses on main memory databases, the techniques that we propose are also applicable to hiding disk latency.
The Level Ancestor Problem Simplified
"... We present a very simple algorithm for the Level Ancestor Problem. A Level Ancestor Query LA(v; d) requests the depth d ancestor of node v. The Level Ancestor Problem is thus: preprocess a given rooted tree T to answer level ancestor queries. While optimal solutions to this problem already exist ..."
Abstract

Cited by 38 (0 self)
 Add to MetaCart
We present a very simple algorithm for the Level Ancestor Problem. A Level Ancestor Query LA(v; d) requests the depth d ancestor of node v. The Level Ancestor Problem is thus: preprocess a given rooted tree T to answer level ancestor queries. While optimal solutions to this problem already exist, our new optimal solution is simple enough to be taught and implemented.
Cacheoblivious algorithms and data structures
 IN LECTURE NOTES FROM THE EEF SUMMER SCHOOL ON MASSIVE DATA SETS
, 2002
"... A recent direction in the design of cacheefficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cacheoblivious algorithms perform well on a multilevel memory hierarchy without knowing any pa ..."
Abstract

Cited by 36 (3 self)
 Add to MetaCart
A recent direction in the design of cacheefficient and diskefficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cacheoblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cacheoblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cacheoblivious algorithms and data structures that perform as well or nearly as well as standard externalmemory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately
Provably good multicore cache performance for divideandconquer algorithms
 In Proc. 19th ACMSIAM Sympos. Discrete Algorithms
, 2008
"... This paper presents a multicorecache model that reflects the reality that multicore processors have both perprocessor private (L1) caches and a large shared (L2) cache on chip. We consider a broad class of parallel divideandconquer algorithms and present a new online scheduler, controlledpdf, t ..."
Abstract

Cited by 35 (12 self)
 Add to MetaCart
This paper presents a multicorecache model that reflects the reality that multicore processors have both perprocessor private (L1) caches and a large shared (L2) cache on chip. We consider a broad class of parallel divideandconquer algorithms and present a new online scheduler, controlledpdf, that is competitive with the standard sequential scheduler in the following sense. Given any dynamically unfolding computation DAG from this class of algorithms, the cache complexity on the multicorecache model under our new scheduler is within a constant factor of the sequential cache complexity for both L1 and L2, while the time complexity is within a constant factor of the sequential time complexity divided by the number of processors p. These are the first such asymptoticallyoptimal results for any multicore model. Finally, we show that a separatorbased algorithm for sparsematrixdensevectormultiply achieves provably good cache performance in the multicorecache model, as well as in the wellstudied sequential cacheoblivious model.