Results 1 - 10
of
45
A complete guide to the future
- Proc. 16th European Symposium on Programming (ESOP’07), volume 4421 of LNCS
, 2007
"... Abstract We present the semantics and proof system for an objectoriented language with active objects, asynchronous method calls, and futures. The language, based on Creol, distinguishes itself in that unlike active object models, it permits more than one thread of control within an object, though, ..."
Abstract
-
Cited by 31 (15 self)
- Add to MetaCart
Abstract We present the semantics and proof system for an objectoriented language with active objects, asynchronous method calls, and futures. The language, based on Creol, distinguishes itself in that unlike active object models, it permits more than one thread of control within an object, though, unlike Java, only one thread can be active within an object at a given time and rescheduling occurs only at specific release points. Consequently, reestablishing an object’s monitor invariant is possible at specific well-defined points in the code. The resulting proof system shows that this approach to concurrency is simpler for reasoning than, say, Java’s multithreaded concurrency model. From a methodological perspective, we identify constructs which admit a simple proof system and those which require, for example, interference freedom tests. 1
Alchemist: A transparent dependence distance profiling infrastructure
- In CGO ’09: Proceedings of the 2009 International Symposium on Code Generation and Optimization
, 2009
"... Abstract—Effectively migrating sequential applications to take advantage of parallelism available on multicore platforms is a well-recognized challenge. This paper addresses important aspects of this issue by proposing a novel profiling technique to automatically detect available concurrency in C pr ..."
Abstract
-
Cited by 8 (1 self)
- Add to MetaCart
Abstract—Effectively migrating sequential applications to take advantage of parallelism available on multicore platforms is a well-recognized challenge. This paper addresses important aspects of this issue by proposing a novel profiling technique to automatically detect available concurrency in C programs. The profiler, called Alchemist, operates completely transparently to applications, and identifies constructs at various levels of granularity (e.g., loops, procedures, and conditional statements) as candidates for asynchronous execution. Various dependences including read-after-write (RAW), write-after-read (WAR), and write-after-write (WAW), are detected between a construct and its continuation, the execution following the completion of the construct. The time-ordered distance between program points forming a dependence gives a measure of the effectiveness of parallelizing that construct, as well as identifying the transformations necessary to facilitate such parallelization. Using the notion of post-dominance, our profiling algorithm builds an execution index tree at run-time. This tree is used to differentiate among multiple instances of the same static construct, and leads to improved accuracy in the computed profile, useful to better identify constructs that are amenable to parallelization. Performance results indicate that the profiles generated by Alchemist pinpoint strong candidates for parallelization, and can help significantly ease the burden of application migration to multicore environments. Keywords-profiling; program dependence; parallelization; execution indexing I.
Software Thread-Level Speculation – An Optimistic Library Implementation
"... Software thread level speculation (tls) solutions tend to mirror the hardware ones, in the sense that they employ one, exact dependency-tracking mechanism. Our perspective is that software-flexibility is, perhaps, better exploited by a family of lighter, if less precise speculative models that can b ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
Software thread level speculation (tls) solutions tend to mirror the hardware ones, in the sense that they employ one, exact dependency-tracking mechanism. Our perspective is that software-flexibility is, perhaps, better exploited by a family of lighter, if less precise speculative models that can be combined together in an effective configuration, which takes advantage of the application’s code-patterns. This paper reports on two main contributions. First, it introduces splsc: a software tls model that trades the potential for false-positive violations for a small memory overhead and efficient implementation. Second, it presents PolyLibTLS: a library that encapsulates several lightweight models and enables their composition. In this context, we report on the template meta-programming techniques that we used to achieve performance and safety, while preserving library’s modularity, extensibility and usability properties. Furthermore, we demonstrate that the user-framework interaction is straightforward and present parallelization timing results that validate our high-level perspective.
Stabilizers: a modular checkpointing abstraction for concurrent functional programs
- In ICFP ’06
, 2006
"... Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execution in multi-threaded code is not obvious, however. For a thread to correctly re-execute a region of code, it must ensure ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execution in multi-threaded code is not obvious, however. For a thread to correctly re-execute a region of code, it must ensure that all other threads which have witnessed its unwanted effects within that region are also reverted to a meaningful earlier state. If not done properly, data inconsistencies and other undesirable behavior may result. However, automatically determining what constitutes a consistent global checkpoint is not straightforward since thread interactions are a dynamic property of the program. In this paper, we present a safe and efficient checkpointing mechanism for Concurrent ML (CML) that can be used to recover from transient faults. We introduce a new linguistic abstraction called stabilizers that permits the specification of per-thread monitors and the restoration of globally consistent checkpoints. Safe global states are computed through lightweight monitoring of communication events among threads (e.g. message-passing operations or updates to shared variables). Our experimental results on several realistic, multithreaded, server-style CML applications, including a web server and a windowing toolkit, show that the overheads to use stabilizers are small, and lead us to conclude that they are a viable mechanism for defining safe checkpoints in concurrent functional programs.
Language and Virtual Machine Support for Efficient Fine-Grained Futures in Java
- In International Conference on Parallel Architectures and Compilation Techniques (PACT
, 2007
"... In this work, we investigate the implementation of futures in Java J2SE v5.0. Java 5.0 provides an interface-based implementation of futures that enables users to encapsulate potentially asynchronous computation and to define their own execution engines for futures. Although this methodology decoupl ..."
Abstract
-
Cited by 6 (4 self)
- Add to MetaCart
In this work, we investigate the implementation of futures in Java J2SE v5.0. Java 5.0 provides an interface-based implementation of futures that enables users to encapsulate potentially asynchronous computation and to define their own execution engines for futures. Although this methodology decouples thread scheduling from application logic, for applications with fine-grained parallelism, this model imposes an undue burden on the average users and introduces significant performance overhead. To address these issues, we investigate the use of lazy futures and offer an alternative implementation to the Java 5.0 approach. In particular, we present a directive-based programming model for using futures in Java that uses annotations in Java 5.0 (as opposed to interfaces) and a lazy future implementation to significantly simplify programmer effort. Our directive-based future system employs novel compilation and runtime techniques that transparently and adaptively split and spawn futures for parallel execution. All such decisions are automatic and guided by dynamically determined future granularity and underlying resource availability. We empirically evaluate our future implementation using different Java Virtual Machine configurations and common Java benchmarks that implement fine-grained parallelism. We compare directive-based lazy futures with lazy and Java 5.0 futures and show that our approach is significantly more scalable. 1.
Modular reasoning for deterministic parallelism. Computer laboratory technical report
, 2010
"... Weaving a concurrency control protocol into a program is difficult and error-prone. One way to alleviate this burden is deterministic parallelism. In this well-studied approach to parallelisation, a sequential program is annotated with sections that can execute concurrently, with automatically injec ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
Weaving a concurrency control protocol into a program is difficult and error-prone. One way to alleviate this burden is deterministic parallelism. In this well-studied approach to parallelisation, a sequential program is annotated with sections that can execute concurrently, with automatically injected control constructs used to ensure observable behaviour consistent with the original program. This paper examines the formal specification and verification of these constructs. Our high-level specification defines the conditions necessary for correct execution; these conditions reflect program dependencies necessary to ensure deterministic behaviour. We connect the high-level specification used by clients of the library with the low-level library implementation, to prove that a client’s requirements for determinism are enforced. Significantly, we can reason about program and library correctness without breaking abstraction boundaries. To achieve this, we use concurrent abstract predicates, based on separation logic, to encapsulate racy behaviour in the library’s implementation. To allow generic specifications of libraries that can be instantiated by client programs, we extend the logic with higherorder parameters and quantification. We show that our high-level specification abstracts the details of deterministic parallelism by verifying two different low-level implementations of the library.
A lightweight in-place implementation for software thread-level speculation
- In SPAA ’09: Proc. 21st ACM Symposium on Parallelism in Algorithms and Architectures
, 2009
"... Thread-level speculation (TLS) is a technique that allows parts of a sequential program to be executed in parallel. TLS ensures the parallel program’s behaviour remains true to the language’s original sequential semantics; for example, allowing multiple iterations of a loop to run in parallel if the ..."
Abstract
-
Cited by 5 (1 self)
- Add to MetaCart
Thread-level speculation (TLS) is a technique that allows parts of a sequential program to be executed in parallel. TLS ensures the parallel program’s behaviour remains true to the language’s original sequential semantics; for example, allowing multiple iterations of a loop to run in parallel if there are no conflicts between them. Conventional software-TLS algorithms detect conflicts dynamically. They suffer from a number of problems. TLS implementations can impose large storage overheads caused by buffering speculative work. TLS implementations can offer disappointing scalability, if threads can only commit speculative work back to the “real ” heap sequentially. TLS implementations can be slow because speculative reads must consult look-aside tables to see earlier speculative writes, or because speculative operations replace normal reads and writes with expensive synchronisation primitives (e.g. CAS or memory fences). We present a streamlined software-TLS algorithm for mostlyparallel loops that aims to avoid these problems. We allow speculative work to be performed in place, so we avoid buffering, and so that reads naturally see earlier writes. We avoid needing a serialcommit protocol. We avoid the need for CAS or memory fences in common operations. We strive to reduce the size of TLS-related conflict-detection state, and to interact well with typical data-cache implementations. We evaluate our implementation on off-the-shelf hardware using seven applications from SciMark2, BYTEmark and JOlden. We achieve an average 77 % of the speed-up of manuallyparallelized versions of the benchmarks for fully parallel loops. We achieve a maximum of a 5.8x speed-up on an 8-core machine.
Adaptive Software Speculation for Enhancing the Cost-Efficiency of Behavior-Oriented Parallelization
"... Recently, software speculation has shown promising results in parallelizing complex sequential programs by exploiting dynamic high-level parallelism. The speculation however is cost-inefficient. Failed speculations may cause unnecessary shared resource contention, power consumption, and interference ..."
Abstract
-
Cited by 5 (3 self)
- Add to MetaCart
Recently, software speculation has shown promising results in parallelizing complex sequential programs by exploiting dynamic high-level parallelism. The speculation however is cost-inefficient. Failed speculations may cause unnecessary shared resource contention, power consumption, and interference to co-running applications. In this work, we propose adaptive speculation and design two algorithms to predict the profitability of a speculation and dynamically disable and enable the speculation of a region. Experimental results demonstrate significant improvement of computation efficiency without performance degradation. The adaptive speculation can also enhance the usability of behavior-oriented parallelization by allowing more flexibility in labeling possibly parallel regions. 1
Safe Programmable Speculative Parallelism
"... Execution order constraints imposed by dependences can serialize computation, preventing parallelization of code and algorithms. Speculating on the value(s) carried by dependences is one way to break such critical dependences. Value speculation has been used effectively at a low level, by compilers ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Execution order constraints imposed by dependences can serialize computation, preventing parallelization of code and algorithms. Speculating on the value(s) carried by dependences is one way to break such critical dependences. Value speculation has been used effectively at a low level, by compilers and hardware. In this paper, we focus on the use of speculation by programmers as an algorithmic paradigm to parallelize seemingly sequential code. We propose two new language constructs, speculative composition and speculative iteration. These constructs enable programmers to declaratively express speculative parallelism in programs: to indicate when and how to speculate, increasing the parallelism in the program, without concerning themselves with mundane implementation details. We present a core language with speculation constructs and mutable state and present a formal operational semantics for the language. We use the semantics to define the notion of a correct speculative execution as one that is equivalent to a non-speculative execution. In general, speculation requires a runtime mechanism to undo the effects of speculative computation in the case of mispredictions. We describe a set of conditions under which such rollback can be avoided. We present a static analysis that checks if a given program satisfies these conditions. This allows us to implement speculation efficiently, without the overhead required for rollbacks. We have implemented the speculation constructs as a C # library, along with the static checker for safety. We present an empirical evaluation of the efficacy of this approach to parallelization.
Semantics of Concurrent Revisions
"... Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. We recently developed a programming model based on concurrent revisions that addresses this challenge in a novel way: each forked task gets a conceptual copy of all the shared sta ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
Enabling applications to execute various tasks in parallel is difficult if those tasks exhibit read and write conflicts. We recently developed a programming model based on concurrent revisions that addresses this challenge in a novel way: each forked task gets a conceptual copy of all the shared state, and state changes are integrated only when tasks are joined, at which time write-write conflicts are deterministically resolved. In this paper, we study the precise semantics of this model, in particular its guarantees for determinacy and consistency. First, we introduce a revision calculus that concisely captures the programming model. Despite allowing concurrent execution and locally nondeterministic scheduling, we prove that the calculus is confluent and guarantees determinacy. We show that the consistency guarantees of our calculus are a logical extension of snapshot isolation with support for conflict resolution and nesting. Moreover, we discuss how custom merge functions can provide stronger guarantees for particular data types that are tailored to the needs of the application. Finally, we show we can visualize the nonlinear history of state in our computations using revision diagrams that clarify the synchronization between tasks and allow local reasoning about state updates. CR-number [sub-

