Results 1-10 of 16
Cache-oblivious B-trees, 2000
Cited by 135 (22 self)
This paper presents two dynamic search trees attaining near-optimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the block-transfer size at each level, and the relative speeds of memory levels. The performance is analyzed in terms of the number of memory transfers between two memory levels with an arbitrary block-transfer size of B; this analysis can then be applied to every adjacent pair of levels in a multilevel memory hierarchy. Both search trees match the optimal search bound of Θ(1 + log_{B+1} N) memory transfers. This bound is also achieved by the classic B-tree data structure on a two-level memory hierarchy with a known block-transfer size B. The first search tree supports insertions and deletions in Θ(1 + log_{B+1} N) amortized memory transfers, which matches the B-tree’s worst-case bounds. The second search tree supports scanning S consecutive elements optimally in Θ(1 + S/B) memory transfers and supports insertions and deletions in Θ(1 + log_{B+1} N + (log² N)/B) amortized memory transfers, matching the performance of the B-tree for B = Ω(log N log log N).
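To get a feel for the bounds in this abstract, the following minimal sketch evaluates their leading terms for concrete N and B. It is an illustration only: the function names are hypothetical, the constants hidden by Θ are ignored, the second tree's update bound is taken to be 1 + log_{B+1} N + (log² N)/B, and the base-2 logarithm in the last term is an assumption (the base does not matter inside Θ).

```python
import math


def search_transfers(n: int, b: int) -> float:
    """Leading terms 1 + log_{B+1} N of the search bound (constants ignored)."""
    return 1 + math.log(n) / math.log(b + 1)


def update_transfers_second_tree(n: int, b: int) -> float:
    """Leading terms 1 + log_{B+1} N + (log^2 N)/B (constants ignored).

    Assumption: log^2 N is evaluated with base-2 logarithms.
    """
    return search_transfers(n, b) + math.log2(n) ** 2 / b


if __name__ == "__main__":
    n = 2 ** 20
    for b in (1, 16, 1024):
        print(b, search_transfers(n, b), update_transfers_second_tree(n, b))
```

With B = 1 the search term reduces to 1 + log_2 N, the cost of an ordinary binary search tree; as B grows, the additive (log² N)/B term shrinks, which is consistent with the abstract's statement that the second tree matches the B-tree once B is large enough.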
Cache oblivious distribution sweeping. In Proc. 29th International Colloquium on Automata, Languages, and Programming (ICALP), volume 2380 of LNCS, 2002
Cited by 43 (11 self)
We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divide-and-conquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms for specific problems the feasibility of the method in a cache oblivious setting. The problems all come from computational geometry, and are: orthogonal line segment intersection reporting, the all nearest neighbors problem, the 3D maxima problem, computing the measure of a set of axis-parallel rectangles, computing the visibility of a set of line segments from a point, batched orthogonal range queries, and reporting pairwise intersections of axis-parallel rectangles. Our basic building block is a simplified version of the cache oblivious sorting algorithm Funnelsort of Frigo et al., which is of independent interest.
Cache-oblivious algorithms and data structures. In Lecture Notes from the EEF Summer School on Massive Data Sets, 2002
Cited by 36 (3 self)
A recent direction in the design of cache-efficient and disk-efficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately
Efficient tree layout in a multilevel memory hierarchy, arXiv:cs.DS/0211010, 2003
Cited by 31 (7 self)
We consider the problem of laying out a tree with fixed parent/child structure in hierarchical memory. The goal is to minimize the expected number of block transfers performed during a search along a root-to-leaf path, subject to a given probability distribution on the leaves. This problem was previously considered by Gil and Itai, who developed optimal but slow algorithms when the block-transfer size B is known. We present faster but approximate algorithms for the same problem; the fastest such algorithm runs in linear time and produces a solution that is within an additive constant of optimal. In addition, we show how to extend any approximately optimal algorithm to the cache-oblivious setting in which the block-transfer size is unknown to the algorithm. The query performance of the cache-oblivious layout is within a constant factor of the query performance of the optimal known-block-size layout. Computing the cache-oblivious layout requires only logarithmically many calls to the layout algorithm for known block size; in particular, the cache-oblivious layout can be computed in O(N lg N) time, where N is the number of nodes. Finally, we analyze two greedy strategies, and show that they have a performance ratio between Ω(lg B / lg lg B) and O(lg B) when compared to the optimal layout.
Alternation and Redundancy Analysis of the Intersection Problem
Cited by 10 (3 self)
The intersection of sorted arrays problem has applications in search engines such as Google. Previous work proposes and compares deterministic algorithms for this problem, in an adaptive analysis based on the encoding size of a certificate of the result (cost analysis). We define the alternation analysis, based on the nondeterministic complexity of an instance. In this analysis we prove that there is a deterministic algorithm asymptotically performing as well as any randomized algorithm in the comparison model. We define the redundancy analysis, based on a measure of the internal redundancy of the instance. In this analysis we prove that any algorithm optimal in the redundancy analysis is optimal in the alternation analysis, but that there is a randomized algorithm which performs strictly better than any deterministic algorithm in the comparison model. Finally, we describe how those results can be extended beyond the comparison model.
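For readers unfamiliar with the problem, here is a minimal sketch of one standard adaptive approach to intersecting two sorted arrays: doubling ("galloping") search followed by binary search. It is not taken from the paper and is not claimed to be any of the algorithms it analyzes; the function names `gallop` and `intersect` are hypothetical, and the inputs are assumed to be sorted lists of distinct values.

```python
from bisect import bisect_left


def gallop(arr, target, lo):
    """Return the smallest index i >= lo with arr[i] >= target (len(arr) if none).

    Probes positions at doubling distances from lo, then finishes with a
    binary search inside the last bracket.
    """
    step = 1
    hi = lo
    while hi < len(arr) and arr[hi] < target:
        lo = hi + 1          # arr[hi] < target, so the answer lies past hi
        hi += step
        step *= 2
    return bisect_left(arr, target, lo, min(hi, len(arr)))


def intersect(a, b):
    """Intersect two sorted lists of distinct values (assumed input format)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i = gallop(a, b[j], i)   # skip the run of a that is below b[j]
        else:
            j = gallop(b, a[i], j)   # skip the run of b that is below a[i]
    return out


if __name__ == "__main__":
    print(intersect([1, 3, 4, 7, 9, 20], [2, 3, 9, 10, 20, 25]))
```

The point of the doubling step is that long runs of one array falling between consecutive elements of the other are skipped in time logarithmic in the run length, so the work adapts to how much the two arrays interleave rather than to their total size.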
Optimal cache-oblivious implicit dictionaries. In Proceedings of the 30th International Colloquium on Automata, Languages and Programming, 2003
Cited by 7 (2 self)
We consider the issues of implicitness and cache-obliviousness in the classical dictionary problem for n distinct keys over an unbounded and ordered universe. One finding of this paper closes the long-standing open problem about the existence of an optimal implicit dictionary over an unbounded universe. Another finding is motivated by the antithetic features of implicit and cache-oblivious models in data structures. We show how to blend their best qualities, achieving O(log n) time and O(log_B n) block transfers for searching and for amortized updating, while using just n memory cells like sorted arrays and heaps. As a result, we avoid wasting space and provide fast data access at any level of the memory hierarchy.
Persistent Predecessor Search and Orthogonal Point Location on the Word RAM
Cited by 6 (3 self)
We answer a basic data structuring question (for example, raised by Dietz and Raman back in SODA 1991): can van Emde Boas trees be made persistent, without changing their asymptotic query/update time? We present a (partially) persistent data structure that supports predecessor search in a set of integers in {1,..., U} under an arbitrary sequence of n insertions and deletions, with O(log log U) expected query time and expected amortized update time, and O(n) space. The query bound is optimal in U for linear-space structures and improves previous near-O((log log U)^2) methods. The same method solves a fundamental problem from computational geometry: point location in orthogonal planar subdivisions (where edges are vertical or horizontal). We obtain the first static data structure achieving O(log log U) worst-case query time and linear space. This result is again optimal in U for linear-space structures and improves the previous O((log log U)^2) method by de Berg, Snoeyink, and van Kreveld (1992). The same result also holds for higher-dimensional subdivisions that are orthogonal binary space partitions, and for certain nonorthogonal planar subdivisions such as triangulations without small angles. Many geometric applications follow, including improved query times for orthogonal range reporting for dimensions ≥ 3 on the RAM. Our key technique is an interesting new van Emde Boas-style recursion that alternates between two strategies, both quite simple.
Project proposal: Associative containers with strong guarantees. CPH STL Report 2007-4. Worldwide Web document available at http://cphstl.dk, 2007
Cited by 3 (1 self)
The Standard Template Library (STL) is a collection of generic algorithms and data structures that is part of the standard runtime environment of the C++ programming language. The STL provides four kinds of associative element containers: set, multiset, map, and multimap. In this project the goal is to develop an associative container that is safer, more reliable, more usable, and/or more efficient (with respect to time and space) than any of the existing realizations.
External-memory search trees with fast insertions. Master’s thesis, MIT EECS, 2006
Cited by 3 (0 self)
This thesis provides both experimental and theoretical contributions regarding external-memory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered repository B^ε-tree, a data structure that provably outperforms B-trees for updates at the cost of a constant factor decrease in query performance. This thesis also describes the cache-oblivious lookahead array, which outperforms B-trees for updates at a logarithmic cost in query performance, and does so without knowing the cache parameters of the system it is being run on. The buffered repository B^ε-tree is an external-memory search tree that can be tuned for a tradeoff between queries and updates. Specifically, for any ε ∈ [1 / lg B, 1] this data structure achieves O((1/(εB^{1−ε}))(1 + log_B(N/B))) block transfers for Insert and Delete and O((1/ε)(1 + log_B(N/B))) block transfers for Search. The update complexity is amortized and is O((1/ε)(1 + log_B(N/B))) in the worst case. Using the value ε = 1/2, I was able to achieve a 17 times increase in insertion performance at the cost of only a 3 times decrease in search performance on a database with 12-byte
I/O-Efficient Map Overlay and Point Location in Low-Density Subdivisions
Cited by 3 (0 self)
We present improved and simplified I/O-efficient algorithms for two problems on planar low-density subdivisions, namely map overlay and point location. More precisely, we show how to preprocess a low-density subdivision with n edges in O(sort(n)) I/Os into a compressed linear quadtree such that one can: (i) compute the overlay of two preprocessed subdivisions in O(scan(n)) I/Os, where n is the total number of edges in the two subdivisions, (ii) answer a single point location query in O(log_B n) I/Os and k batched point location queries in O(scan(n) + sort(k)) I/Os. For the special case where the subdivision is a fat triangulation, we show how to obtain the same bounds with an ordinary (uncompressed) quadtree, and we show how to make the structure fully dynamic using O(log_B n) I/Os per update. Our algorithms and data structures improve on the previous best known bounds for general subdivisions both in the number of I/Os and storage usage, they are significantly simpler, and several of our algorithms are cache-oblivious.