Cache-oblivious B-trees
, 2000
Cited by 119 (21 self)
Abstract. This paper presents two dynamic search trees attaining near-optimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the block-transfer size at each level, and the relative speeds of memory levels. The performance is analyzed in terms of the number of memory transfers between two memory levels with an arbitrary block-transfer size of B; this analysis can then be applied to every adjacent pair of levels in a multilevel memory hierarchy. Both search trees match the optimal search bound of Θ(1 + log_{B+1} N) memory transfers. This bound is also achieved by the classic B-tree data structure on a two-level memory hierarchy with a known block-transfer size B. The first search tree supports insertions and deletions in Θ(1 + log_{B+1} N) amortized memory transfers, which matches the B-tree’s worst-case bounds. The second search tree supports scanning S consecutive elements optimally in Θ(1 + S/B) memory transfers and supports insertions and deletions in Θ(1 + log_{B+1} N + (log² N)/B) amortized memory transfers, matching the performance of the B-tree for B = Ω(log N log log N).
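As a rough illustration of the search and scan bounds quoted in this abstract, the following Python sketch evaluates 1 + log_{B+1} N and 1 + S/B for sample values. The function names and the sample values of N, B, and S are assumptions chosen for illustration, and the constants hidden by Θ are ignored, so the outputs indicate growth rates only, not measured transfer counts.

```python
import math


def search_transfers(n: int, b: int) -> float:
    """Growth term 1 + log_{B+1} N of the search bound (constants ignored)."""
    return 1 + math.log(n) / math.log(b + 1)


def scan_transfers(s: int, b: int) -> float:
    """Growth term 1 + S/B for scanning S consecutive elements (constants ignored)."""
    return 1 + s / b


if __name__ == "__main__":
    n = 10**9  # hypothetical number of keys
    for b in (1, 64, 1024):  # hypothetical block sizes, measured in elements
        print(f"B={b}: search ~ {search_transfers(n, b):.2f}, "
              f"scan of 10^6 elements ~ {scan_transfers(10**6, b):.2f}")
```

With B = 1 the search term degenerates to roughly 1 + log2 N, the cost of a binary search tree; larger blocks shrink it by a factor of about log2(B + 1).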
Cache oblivious distribution sweeping
- In Proc. 29th International Colloquium on Automata, Languages, and Programming (ICALP), volume 2380 of LNCS
, 2002
Cited by 37 (11 self)
We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divide-and-conquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms for specific problems the feasibility of the method in a cache oblivious setting. The problems all come from computational geometry, and are: orthogonal line segment intersection reporting, the all nearest neighbors problem, the 3D maxima problem, computing the measure of a set of axis-parallel rectangles, computing the visibility of a set of line segments from a point, batched orthogonal range queries, and reporting pairwise intersections of axis-parallel rectangles. Our basic building block is a simplified version of the cache oblivious sorting algorithm Funnelsort of Frigo et al., which is of independent interest.
Cache-oblivious algorithms and data structures
- In Lecture Notes from the EEF Summer School on Massive Data Sets
, 2002
Cited by 33 (2 self)
A recent direction in the design of cache-efficient and disk-efficient algorithms and data structures is the notion of cache obliviousness, introduced by Frigo, Leiserson, Prokop, and Ramachandran in 1999. Cache-oblivious algorithms perform well on a multilevel memory hierarchy without knowing any parameters of the hierarchy, only knowing the existence of a hierarchy. Equivalently, a single cache-oblivious algorithm is efficient on all memory hierarchies simultaneously. While such results might seem impossible, a recent body of work has developed cache-oblivious algorithms and data structures that perform as well or nearly as well as standard external-memory structures which require knowledge of the cache/memory size and block transfer size. Here we describe several of these results with the intent of elucidating the techniques behind their design. Perhaps the most exciting of these results are the data structures, which form general building blocks immediately
Alternation and Redundancy Analysis of the Intersection Problem
Cited by 7 (2 self)
The intersection of sorted arrays problem has applications in search engines such as Google. Previous work proposes and compares deterministic algorithms for this problem, in an adaptive analysis based on the encoding size of a certificate of the result (cost analysis). We define the alternation analysis, based on the non-deterministic complexity of an instance. In this analysis we prove that there is a deterministic algorithm asymptotically performing as well as any randomized algorithm in the comparison model. We define the redundancy analysis, based on a measure of the internal redundancy of the instance. In this analysis we prove that any algorithm optimal in the redundancy analysis is optimal in the alternation analysis, but that there is a randomized algorithm which performs strictly better than any deterministic algorithm in the comparison model. Finally, we describe how those results can be extended beyond the comparison model.
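For readers unfamiliar with the problem, a minimal sketch of intersecting two sorted arrays with doubling (galloping) search follows. It is a generic comparison-based technique chosen for illustration, not necessarily one of the algorithms the paper analyzes; the names `gallop` and `intersect_sorted` are hypothetical.

```python
from bisect import bisect_left


def gallop(arr, target, lo):
    """Return the first index i >= lo with arr[i] >= target (len(arr) if none).

    Probes positions at doubling distances from lo, then finishes with a
    binary search inside the last bracket.
    """
    step = 1
    hi = lo
    while hi < len(arr) and arr[hi] < target:
        lo = hi + 1  # everything up to and including hi is smaller than target
        hi += step
        step *= 2
    return bisect_left(arr, target, lo, min(hi, len(arr)))


def intersect_sorted(a, b):
    """Intersect two sorted lists, skipping runs of non-matching elements."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i = gallop(a, b[j], i)  # jump ahead in a to the first element >= b[j]
        else:
            j = gallop(b, a[i], j)  # jump ahead in b to the first element >= a[i]
    return result


if __name__ == "__main__":
    print(intersect_sorted([1, 3, 5, 7, 9, 11], [2, 3, 4, 9, 10, 11, 12]))
```

The number of comparisons depends on how often the two arrays interleave rather than only on their lengths, which is the kind of instance-dependent behavior an adaptive analysis is meant to capture.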
Optimal cache-oblivious implicit dictionaries
- In Proceedings of the 30th International Colloquium on Automata, Languages and Programming
, 2003
Cited by 6 (2 self)
Abstract. We consider the issues of implicitness and cache-obliviousness in the classical dictionary problem for n distinct keys over an unbounded and ordered universe. One finding in this paper is that of closing the longstanding open problem about the existence of an optimal implicit dictionary over an unbounded universe. Another finding is motivated by the antithetic features of implicit and cache-oblivious models in data structures. We show how to blend their best qualities achieving O(log n) time and O(log_B n) block transfers for searching and for amortized updating, while using just n memory cells like sorted arrays and heaps. As a result, we avoid space wasting and provide fast data access at any level of the memory hierarchy.
External-memory search trees with fast insertions
- Master’s thesis, MIT EECS
, 2006
Cited by 3 (0 self)
This thesis provides both experimental and theoretical contributions regarding external-memory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered repository B^ε-tree, a data structure that provably outperforms B-trees for updates at the cost of a constant factor decrease in query performance. This thesis also describes the cache-oblivious lookahead array, which outperforms B-trees for updates at a logarithmic cost in query performance, and does so without knowing the cache parameters of the system it is being run on. The buffered repository B^ε-tree is an external-memory search tree that can be tuned for a tradeoff between queries and updates. Specifically, for any ε ∈ [1/lg B, 1] this data structure achieves O((1/(εB^{1−ε}))(1 + log_B(N/B))) block transfers for Insert and Delete and O((1/ε)(1 + log_B(N/B))) block transfers for Search. The update complexity is amortized and is O((1/ε)(1 + log_B(N/B))) in the worst case. Using the value ε = 1/2, I was able to achieve a 17 times increase in insertion performance at the cost of only a 3 times decrease in search performance on a database with 12-byte
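The tradeoff described in this abstract can be explored numerically. The Python sketch below assumes the two bounds read O((1/(εB^{1−ε}))(1 + log_B(N/B))) for Insert/Delete and O((1/ε)(1 + log_B(N/B))) for Search; it drops the hidden constants, and the function names and sample values of N and B are hypothetical.

```python
import math


def _check_eps(b: int, eps: float) -> None:
    """Enforce the range eps in [1/lg B, 1] stated in the abstract."""
    if not (1 / math.log2(b) <= eps <= 1):
        raise ValueError("eps must lie in [1/lg B, 1]")


def search_transfers(n: int, b: int, eps: float) -> float:
    """Growth term (1/eps) * (1 + log_B(N/B)) for Search (constants ignored)."""
    _check_eps(b, eps)
    return (1 / eps) * (1 + math.log(n / b, b))


def insert_transfers(n: int, b: int, eps: float) -> float:
    """Growth term (1/(eps * B^(1-eps))) * (1 + log_B(N/B)) for Insert/Delete."""
    _check_eps(b, eps)
    return (1 / (eps * b ** (1 - eps))) * (1 + math.log(n / b, b))


if __name__ == "__main__":
    n, b = 2**30, 1024  # hypothetical key count and block size (in elements)
    for eps in (1.0, 0.5, 0.1):
        print(f"eps={eps}: insert ~ {insert_transfers(n, b, eps):.4f}, "
              f"search ~ {search_transfers(n, b, eps):.4f}")
```

At ε = 1 the two terms coincide, which corresponds to ordinary B-tree behavior; at ε = 1/2 the insert term shrinks by a factor of about √B/2 while the search term only doubles. These are asymptotic growth terms, so they are not expected to reproduce the measured speedups reported in the thesis.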
Project proposal: Associative containers with strong guarantees
- CPH STL Report 2007-4. Worldwide Web document available at http://cphstl.dk
, 2007
Cited by 3 (1 self)
Abstract. The Standard Template Library (STL) is a collection of generic algorithms and data structures that is part of the standard run-time environment of the C++ programming language. The STL provides four kinds of associative element containers: set, multiset, map, and multimap. In this project the goal is to develop an associative container that is safer, more reliable, more usable, and/or more efficient (with respect to time and space) than any of the existing realizations.
Persistent Predecessor Search and Orthogonal Point Location on the Word RAM
Cited by 3 (2 self)
We answer a basic data structuring question (for example, raised by Dietz and Raman back in SODA 1991): can van Emde Boas trees be made persistent, without changing their asymptotic query/update time? We present a (partially) persistent data structure that supports predecessor search in a set of integers in {1, ..., U} under an arbitrary sequence of n insertions and deletions, with O(log log U) expected query time and expected amortized update time, and O(n) space. The query bound is optimal in U for linear-space structures and improves previous near-O((log log U)²) methods. The same method solves a fundamental problem from computational geometry: point location in orthogonal planar subdivisions (where edges are vertical or horizontal). We obtain the first static data structure achieving O(log log U) worst-case query time and linear space. This result is again optimal in U for linear-space structures and improves the previous O((log log U)²) method by de Berg, Snoeyink, and van Kreveld (1992). The same result also holds for higher-dimensional subdivisions that are orthogonal binary space partitions, and for certain nonorthogonal planar subdivisions such as triangulations without small angles. Many geometric applications follow, including improved query times for orthogonal range reporting for dimensions ≥ 3 on the RAM. Our key technique is an interesting new van-Emde-Boas–style recursion that alternates between two strategies, both quite simple.
Algorithm Engineering
- The Algorithmics Column (J. Diaz), Bulletin of the EATCS
, 2003
Cited by 2 (1 self)
Algorithm Engineering is concerned with the design, analysis, implementation, tuning, debugging and experimental evaluation of computer programs for solving algorithmic problems. It provides methodologies and tools for developing and engineering efficient algorithmic codes and aims at integrating and reinforcing traditional theoretical approaches for the design and analysis of algorithms and data structures.
A general approach for cache-oblivious range reporting and approximate range counting
- Computational Geometry: Theory and Applications
Cited by 2 (2 self)
We present cache-oblivious solutions to two important variants of range searching: range reporting and approximate range counting. The main contribution of our paper is a general approach for constructing cache-oblivious data structures that provide relative (1+ε)-approximations for a general class of range counting queries. This class includes three-sided range counting, 3-d dominance counting, and 3-d halfspace range counting. Our technique allows us to obtain data structures that use linear space and answer queries in the optimal query bound of O(log_B(N/K)) block transfers in the worst case, where K is the number of points in the query range. Using the same technique, we also obtain the first approximate 3-d halfspace range counting and 3-d dominance counting data structures with a worst-case query time of O(log(N/K)) in internal memory. An easy but important consequence of our main result is the existence of O(N log N)-space cache-oblivious data structures with an optimal query bound of O(log_B N + K/B) block transfers for the reporting versions of the above problems. Using standard reductions, these data structures allow us to obtain the first cache-oblivious data structures that use near-linear space and achieve the optimal query bound for circular range reporting and K-nearest neighbour searching in the plane, as well as for orthogonal range reporting in three dimensions. Part of this work was done while visiting Dalhousie University.

