Results 1 - 10
of
35
Cache-oblivious B-trees
, 2000
"... Abstract. This paper presents two dynamic search trees attaining near-optimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the block-transfer size at each level, and the relative speeds of me ..."
Abstract
-
Cited by 119 (21 self)
- Add to MetaCart
Abstract. This paper presents two dynamic search trees attaining near-optimal performance on any hierarchical memory. The data structures are independent of the parameters of the memory hierarchy, e.g., the number of memory levels, the block-transfer size at each level, and the relative speeds of memory levels. The performance is analyzed in terms of the number of memory transfers between two memory levels with an arbitrary block-transfer size of B; this analysis can then be applied to every adjacent pair of levels in a multilevel memory hierarchy. Both search trees match the optimal search bound of Θ(1+logB+1 N) memory transfers. This bound is also achieved by the classic B-tree data structure on a two-level memory hierarchy with a known block-transfer size B. The first search tree supports insertions and deletions in Θ(1 + logB+1 N) amortized memory transfers, which matches the B-tree’s worst-case bounds. The second search tree supports scanning S consecutive elements optimally in Θ(1 + S/B) memory transfers and supports insertions and deletions in Θ(1 + logB+1 N + log2 N) amortized memory transfers, matching the performance of the B-tree for B = B Ω(log N log log N).
Scanning and traversing: maintaining data for traversals in a memory hierarchy
- In Proceedings of the 10th Annual European Symposium on Algorithms
, 2002
"... Abstract. We study the problem of maintaining a dynamic ordered set subject to insertions, deletions, and traversals of k consecutive elements. This problem is trivially solved on a RAM and on a simple two-level memory hierarchy. We explore this traversal problem on more realistic memory models: the ..."
Abstract
-
Cited by 29 (10 self)
- Add to MetaCart
Abstract. We study the problem of maintaining a dynamic ordered set subject to insertions, deletions, and traversals of k consecutive elements. This problem is trivially solved on a RAM and on a simple two-level memory hierarchy. We explore this traversal problem on more realistic memory models: the cache-oblivious model, which applies to unknown and multi-level memory hierarchies, and sequential-access models, where sequential block transfers are less expensive than random block transfers. 1
An experimental analysis of self-adjusting computation
- In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI
, 2006
"... Self-adjusting computation uses a combination of dynamic dependence graphs and memoization to efficiently update the output of a program as the input changes incrementally or dynamically over time. Related work showed various theoretical results, indicating that the approach can be effective for a r ..."
Abstract
-
Cited by 24 (11 self)
- Add to MetaCart
Self-adjusting computation uses a combination of dynamic dependence graphs and memoization to efficiently update the output of a program as the input changes incrementally or dynamically over time. Related work showed various theoretical results, indicating that the approach can be effective for a reasonably broad range of applications. In this article, we describe algorithms and implementation techniques to realize self-adjusting computation and present an experimental evaluation of the proposed approach on a variety of applications, ranging from simple list primitives to more sophisticated computational geometry algorithms. The results of the experiments show that the approach is effective in practice, often offering orders of magnitude speedup from recomputing the output from scratch. We believe this is the first experimental evidence that incremental computation of any type is effective in practice for a reasonably broad set of applications.
Cache-Oblivious String B-trees
- IN: PROC. OF PRINCIPLES OF DATABASE SYSTEMS
, 2006
"... B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally • when keys are long or of variable length, • when keys are compressed, even when using front compression, the standard B-tree compression scheme, • for range queries, and • with r ..."
Abstract
-
Cited by 23 (5 self)
- Add to MetaCart
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally • when keys are long or of variable length, • when keys are compressed, even when using front compression, the standard B-tree compression scheme, • for range queries, and • with respect to memory effects such as disk prefetching. This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways: • The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally. • It maintains an index whose size is proportional to the frontcompressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner. • It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block. • It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.
Boxes: Efficient maintenance of order-based labeling for dynamic XML data
- In Proc. of ICDE
, 2005
"... Order-based element labeling for tree-structured XML data is an important technique in XML processing. It lies at the core of many fundamental XML operations such as containment join and twig matching. While labeling for static XML documents is well understood, less is known about how to maintain ac ..."
Abstract
-
Cited by 15 (2 self)
- Add to MetaCart
Order-based element labeling for tree-structured XML data is an important technique in XML processing. It lies at the core of many fundamental XML operations such as containment join and twig matching. While labeling for static XML documents is well understood, less is known about how to maintain accurate labeling for dynamic XML documents, when elements and subtrees are inserted and deleted. Most existing approaches do not work well for arbitrary update patterns; they either produce unacceptably long labels or incur enormous relabeling costs. We present two novel I/O-efficient data structures, W-BOX and B-BOX, that efficiently maintain labeling for large, dynamic XML documents. We show analytically and experimentally that both, despite consuming minimal amounts of storage, gracefully handle arbitrary update patterns without sacrificing lookup efficiency. The two structures together provide a nice tradeoff between update and lookup costs: W-BOX has logarithmic amortized update cost and constant worst-case lookup cost, while B-BOX has constant amortized update cost and logarithmic worst-case lookup cost. We further propose techniques to eliminate the lookup cost for read-heavy workloads. 1.
Policy ratification
- In POLICY ’05: Proceedings of the Sixth IEEE International Workshop on Policies for Distributed Systems and Networks (POLICY’05
, 2005
"... It is not sufficient to merely check the syntax of new policies before they are deployed in a system; policies need to be analyzed for their interactions with each other and with their local environment. That is, policies need to go through a ratification process. We believe policy ratification beco ..."
Abstract
-
Cited by 11 (3 self)
- Add to MetaCart
It is not sufficient to merely check the syntax of new policies before they are deployed in a system; policies need to be analyzed for their interactions with each other and with their local environment. That is, policies need to go through a ratification process. We believe policy ratification becomes an essential part of system management as the number of policies in the system increases and as the system administration becomes more decentralized. In this paper, we focus on the basic tasks involved in policy ratification. To a large degree, these basic tasks can be performed independent of policy model and language and require little domain-specific knowledge. We present algorithms from constraint, linear, and logic programming disciplines to help perform ratification tasks. We provide an algorithm to efficiently assign priorities to the policies based on relative policy preferences indicated by policy administrators. Finally, with an example, we show how these algorithms have been integrated with our policy system to provide feedback to a policy administrator regarding potential interactions of policies with each other and with their deployment environment. 1
Making Data Structures Confluently Persistent
, 2001
"... We address a longstanding open problem of [10, 9], and present a general transformation that transforms any pointer based data structure to be confluently persistent. Such transformations for fully persistent data structures are given in [10], greatly improving the performance compared to the naive ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
We address a longstanding open problem of [10, 9], and present a general transformation that transforms any pointer based data structure to be confluently persistent. Such transformations for fully persistent data structures are given in [10], greatly improving the performance compared to the naive scheme of simply copying the inputs. Unlike fully persistent data structures, where both the naive scheme and the fully persistent scheme of [10] are feasible, we show that the naive scheme for confluently persistent data structures is itself infeasible (requires exponential space and time). Thus, prior to this paper there was no feasible method for implementing confluently persistent data structures at all. Our methods give an exponential reduction in space and time compared to the naive method, placing confluently persistent data structures in the realm of possibility.
DITTO: Automatic Incrementalization of Data Structure . . .
- IN PLDI
, 2007
"... We present DITTO, an automatic incrementalizer for dynamic, sideeffect-free data structure invariant checks. Incrementalization speeds up the execution of a check by reusing its previous executions, checking the invariant anew only on the changed parts of the data structure. DITTO exploits propertie ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
We present DITTO, an automatic incrementalizer for dynamic, sideeffect-free data structure invariant checks. Incrementalization speeds up the execution of a check by reusing its previous executions, checking the invariant anew only on the changed parts of the data structure. DITTO exploits properties specific to the domain of invariant checks to automate and simplify the process without restricting what mutations the program can perform. Our incrementalizer works for modern imperative languages such as Java and C#. It can incrementalize, for example, verification of red-black tree properties and the consistency of the hash code in a hash table bucket. Our source-tosource implementation for Java is automatic, portable, and efficient. DITTO provides speedups on data structures with as few as 100 elements; on larger data structures, its speedups are characteristic of non-automatic incrementalizers: roughly 5-fold at 5,000 elements, and growing linearly with data structure size.
Nested Parallelism in Transactional Memory
"... Abstract This paper describes XCilk, a runtime-system design forsoftware transactional memory in a Cilk-like parallel programming language, which uses a work-stealing sched-uler. XCilk supports transactions that themselves can contain nested parallelism and nested transactions, both of un-bounded ne ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
Abstract This paper describes XCilk, a runtime-system design forsoftware transactional memory in a Cilk-like parallel programming language, which uses a work-stealing sched-uler. XCilk supports transactions that themselves can contain nested parallelism and nested transactions, both of un-bounded nesting depth. Thus, XCilk allows users to call a library function within a transaction, even if that func-tion itself exploits concurrency and uses transactions. XCilk provides transactional memory with strong atomicity, eagerupdates, eager conflict detection and lazy cleanup on aborts. XCilk uses a new algorithm and data structure, calledXConflict, to facilitate conflict detection between transactions. Using XConflict, XCilk guarantees provably good per-formance for closed-nested transactions of arbitrary depth in the special case when all accesses are writes and there is nomemory contention. More precisely, XCilk executes a program with work T1 and critical-path length Te ^ in O(T1/p +pT e^) time on p processors if all memory accesses are writesand all concurrent paths of the XCilk program access disjoint sets of memory locations. Although this bound holdsonly under rather optimistic assumptions, to our knowledge, this result is the first theoretical performance bound on a TMsystem that supports transactions with nested parallelism.
Some directed graph algorithms and their application to pointer analysis (work in progress
, 2004
"... This thesis is focused on improving execution time and precision of scalable pointer analysis. Such an analysis statically determines the targets of all pointer variables in a program. We formulate the analysis as a directed graph problem, where the solution can be obtained by a computation similar, ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
This thesis is focused on improving execution time and precision of scalable pointer analysis. Such an analysis statically determines the targets of all pointer variables in a program. We formulate the analysis as a directed graph problem, where the solution can be obtained by a computation similar, in many ways, to transitive closure. As with transitive closure, identifying strongly connected components and transitive edges offers significant gains. However, our problem differs as the computation can result in new edges being added to the graph and, hence, dynamic algorithms are needed to efficiently identify these structures. Thus, pointer analysis has often been likened to the dynamic transitive closure problem. Two new algorithms for dynamically maintaining the topological order of a directed graph are presented. The first is a unit change algorithm, meaning the solution must be recomputed immediately following an edge insertion. While this has a marginally inferior worse-case time bound, compared with a previous solution, it is far simpler to implement and has fewer restrictions. For these reasons, we find it to be faster in practice and provide an experimental study over random graphs to support this. Our second is a batch algorithm, meaning the solution can be updated after

