Nectar: automatic management of data and computation in datacenters
In OSDI ’10, 2010
Cited by 35 (0 self)
Managing data and computation is at the heart of datacenter computing. Manual management of data can lead to data loss, wasteful consumption of storage, and laborious bookkeeping. Lack of proper management of computation can result in lost opportunities to share common computations across multiple jobs or to compute results incrementally. Nectar is a system designed to address the aforementioned problems. It automates and unifies the management of data and computation within a datacenter. In Nectar, data and computation are treated interchangeably by associating data with its computation. Derived datasets, which are the results of computations, are uniquely identified by the programs that produce them, and together with their programs, are automatically managed by a datacenter-wide caching service. Any derived dataset can be transparently regenerated by re-executing its program, and any computation can be transparently avoided by using previously cached results. This enables us to greatly improve datacenter management and resource utilization: obsolete or infrequently used derived datasets are automatically garbage collected, and shared common computations are computed only once and reused by others. This paper describes the design and implementation of Nectar, and reports on our evaluation of the system using analytic studies of logs from several production clusters and an actual deployment on a 240-node cluster.
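The idea of identifying a derived dataset by the program that produces it can be illustrated with a small sketch. This is not Nectar's implementation: the class `DerivedDataCache`, its methods, and the choice of SHA-256 over the program text plus pickled input are all assumptions made for illustration.

```python
import hashlib
import pickle


class DerivedDataCache:
    """Toy cache: a derived result is keyed by a fingerprint of (program, input).

    Hypothetical names throughout; a real datacenter service would also have
    to handle distribution, persistence, and garbage-collection policy.
    """

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def fingerprint(program_source, data):
        # Assumption: the program is identified by its source text and the
        # input by its pickled bytes.
        h = hashlib.sha256()
        h.update(program_source.encode("utf-8"))
        h.update(pickle.dumps(data))
        return h.hexdigest()

    def run(self, program_source, func, data):
        key = self.fingerprint(program_source, data)
        if key in self._store:
            # The computation is avoided by reusing the cached result.
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = func(data)
        self._store[key] = result
        return result

    def evict(self, program_source, data):
        # An evicted result can be regenerated later by calling run() again.
        self._store.pop(self.fingerprint(program_source, data), None)
```

Calling `run` twice with the same program text and input executes the function once; after `evict`, the next `run` regenerates the result by re-executing the function.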
Imperative self-adjusting computation
In POPL ’08: Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, 2008
Cited by 35 (17 self)
Recent work on self-adjusting computation showed how to systematically write programs that respond efficiently to incremental changes in their inputs. The idea is to represent changeable data using modifiable references, i.e., a special data structure that keeps track of dependencies between read and write operations, and to let computations construct traces that later, after changes have occurred, can drive a change propagation algorithm. The approach has been shown to be effective for a variety of algorithmic problems, including some for which ad hoc solutions had previously remained elusive. All previous work on self-adjusting computation, however, relied on a purely functional programming model. In this paper, we show that it is possible to remove this limitation and support modifiable references that can be written multiple times. We formalize this using a language AIL for which we define evaluation and change-propagation semantics. AIL closely resembles a traditional higher-order imperative programming language. For AIL we state and prove consistency, i.e., the property that although the semantics is inherently nondeterministic, different evaluation paths will still give observationally equivalent results. In the imperative setting where pointer graphs in the store can form cycles, our previous proof techniques do not apply. Instead, we make use of a novel form of a step-indexed logical relation that handles modifiable references. We show that AIL can be realized efficiently by describing implementation strategies whose overhead is provably constant-time per primitive. When the number of reads and writes per modifiable is bounded by a constant, we can show that change propagation becomes as efficient as it was in the pure case. The general case incurs a slowdown that is logarithmic in the maximum number of such operations. We use DFS and related algorithms on graphs as our running examples and prove that they respond to insertions and deletions of edges efficiently.
Dynamic programming via static incrementalization
In Proceedings of the 8th European Symposium on Programming, 1999
Cited by 31 (14 self)
Dynamic programming is an important algorithm design technique. It is used for solving problems whose solutions involve recursively solving subproblems that share subsubproblems. While a straightforward recursive program solves common subsubproblems repeatedly and often takes exponential time, a dynamic programming algorithm solves every subsubproblem just once, saves the result, reuses it when the subsubproblem is encountered again, and takes polynomial time. This paper describes a systematic method for transforming programs written as straightforward recursions into programs that use dynamic programming. The method extends the original program to cache all possibly computed values, incrementalizes the extended program with respect to an input increment to use and maintain all cached results, prunes out cached results that are not used in the incremental computation, and uses the resulting incremental program to form an optimized new program. Incrementalization statically exploits semantics of both control structures and data structures and maintains as invariants equalities characterizing cached results. The principle underlying incrementalization is general for achieving drastic program speedups. Compared with previous methods that perform memoization or tabulation, the method based on incrementalization is more powerful and systematic. It has been implemented and applied to numerous problems and succeeded on all of them.
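The contrast between a straightforward recursion and a dynamic programming version can be seen on a standard example, longest common subsequence. The sketch below is hand-written tabulation, not the static incrementalization transformation the abstract describes; the function names `lcs_naive` and `lcs_table` are chosen here for illustration.

```python
def lcs_naive(a, b):
    """Straightforward recursion: shared subproblems are solved repeatedly,
    so the running time can grow exponentially with the input length."""
    if not a or not b:
        return 0
    if a[-1] == b[-1]:
        return 1 + lcs_naive(a[:-1], b[:-1])
    return max(lcs_naive(a[:-1], b), lcs_naive(a, b[:-1]))


def lcs_table(a, b):
    """Dynamic programming: each subproblem (i, j) is solved once, saved in
    a table, and reused, giving O(len(a) * len(b)) time."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]
```

Both functions return the same length for any pair of strings; only the amount of repeated work differs.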
Incremental evaluation of tabled logic programs
In ICLP, volume 2916 of LNCS, 2003
DryadInc: Reusing work in large-scale computations
Cited by 27 (3 self)
Many large-scale (cloud) computations operate on append-only, partitioned datasets. We present two incremental computation frameworks to reuse prior work in these circumstances: (1) reusing identical computations already performed on data partitions, and (2) computing just on the newly appended data and merging the new and previous results.
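Both reuse strategies can be sketched together on a word count over partitions. This is an illustration only, not DryadInc's design: the class `IncrementalWordCount` is hypothetical, and it assumes a partition's contents never change once the partition has been seen, so that appends only ever add new partitions.

```python
from collections import Counter


class IncrementalWordCount:
    """Toy incremental word count over an append-only, partitioned dataset."""

    def __init__(self):
        self._partition_cache = {}  # partition id -> per-partition Counter
        self._total = Counter()     # merged result over all seen partitions
        self.partitions_computed = 0

    def update(self, partitions):
        """partitions maps partition id -> list of text lines."""
        for pid, lines in partitions.items():
            if pid in self._partition_cache:
                # (1) Identical computation already performed: reuse it.
                continue
            counts = Counter(word for line in lines for word in line.split())
            self._partition_cache[pid] = counts
            # (2) Compute only on the new partition, then merge its result
            # into the previous total.
            self._total += counts
            self.partitions_computed += 1
        return dict(self._total)
```

The merge step works here because word counts combine by addition; a computation without such a merge function would only benefit from the per-partition reuse.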
CEAL: a C-based language for self-adjusting computation
In ACM SIGPLAN Conference on Programming Language Design and Implementation, 2009
Cited by 25 (11 self)
Self-adjusting computation offers a language-centric approach to writing programs that can automatically respond to modifications to their data (e.g., inputs). Except for several domain-specific implementations, however, all previous implementations of self-adjusting computation assume mostly functional, higher-order languages such as Standard ML. Prior to this work, it was not known if self-adjusting computation can be made to work with low-level, imperative languages such as C without placing undue burden on the programmer. We describe the design and implementation of CEAL: a C-based language for self-adjusting computation. The language is fully general and extends C with a small number of primitives to enable writing self-adjusting programs in a style similar to conventional C programs. We present efficient compilation techniques for translating CEAL programs into C that can be compiled with existing C compilers using primitives supplied by a runtime library for self-adjusting computation. We implement the proposed compiler and evaluate its effectiveness. Our experiments show that CEAL is effective in practice: compiled self-adjusting programs respond to small modifications to their data by orders of magnitude faster than recomputing from scratch while slowing down a from-scratch run by a moderate constant factor. Compared to previous work, we
DITTO: Automatic Incrementalization of Data Structure . . .
In PLDI, 2007
Cited by 23 (0 self)
We present DITTO, an automatic incrementalizer for dynamic, side-effect-free data structure invariant checks. Incrementalization speeds up the execution of a check by reusing its previous executions, checking the invariant anew only on the changed parts of the data structure. DITTO exploits properties specific to the domain of invariant checks to automate and simplify the process without restricting what mutations the program can perform. Our incrementalizer works for modern imperative languages such as Java and C#. It can incrementalize, for example, verification of red-black tree properties and the consistency of the hash code in a hash table bucket. Our source-to-source implementation for Java is automatic, portable, and efficient. DITTO provides speedups on data structures with as few as 100 elements; on larger data structures, its speedups are characteristic of non-automatic incrementalizers: roughly 5-fold at 5,000 elements, and growing linearly with data structure size.
Caching intermediate results for program improvement
In Proceedings of the 1995 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation, PEPM ’95, 1995
Cited by 22 (6 self)
A systematic approach is given for symbolically caching intermediate results useful for deriving incremental programs from non-incremental programs. We exploit a number of program analysis and transformation techniques, centered around effective caching based on its utilization in deriving incremental programs, in order to increase the degree of incrementality not otherwise achievable by using only the return values of programs that are of direct interest. Our method can be applied straightforwardly to provide a systematic approach to program improvement via caching.
Strongly history-independent hashing with applications
In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, 2007
Cited by 20 (5 self)
We present a strongly history independent (SHI) hash table that supports search in O(1) worst-case time, and insert and delete in O(1) expected time using O(n) data space. This matches the bounds for dynamic perfect hashing, and improves on the best previous results by Naor and Teague on history independent hashing, which were either weakly history independent, or only supported insertion and search (no delete) each in O(1) expected time. The results can be used to construct many other SHI data structures. We show straightforward constructions for SHI ordered dictionaries: for n keys from {1, ..., n^k} searches take O(log log n) worst-case time and updates (insertions and deletions) O(log log n) expected time, and for keys in the comparison model searches take O(log n) worst-case time and updates O(log n) expected time. We also describe a SHI data structure for the order-maintenance problem. It supports comparisons in O(1) worst-case time, and updates in O(1) expected time. All structures use O(n) data space.
Maintaining Dynamic Sequences under Equality Tests in Polylogarithmic Time
1997
Cited by 19 (0 self)
We present a randomized and a deterministic data structure for maintaining a dynamic family of sequences under equality tests of pairs of sequences and creations of new sequences by joining or splitting existing sequences. Both data structures support equality tests in O(1) time. The randomized version supports new sequence creations in O(log^2 n) expected time where n is the length of the sequence created. The deterministic solution supports sequence creations in O(log n (log m log* m + log n)) time for the mth operation.