Results 1 - 10
of
38
CoreDet: A Compiler and Runtime System for Deterministic Multithreaded Execution
"... The behavior of a multithreaded program does not depend only on its inputs. Scheduling, memory reordering, timing, and low-level hardware effects all introduce nondeterminism in the execution of multithreaded programs. This severely complicates many tasks, including debugging, testing, and automatic ..."
Abstract
-
Cited by 34 (5 self)
- Add to MetaCart
The behavior of a multithreaded program does not depend only on its inputs. Scheduling, memory reordering, timing, and low-level hardware effects all introduce nondeterminism in the execution of multithreaded programs. This severely complicates many tasks, including debugging, testing, and automatic replication. In this work, we avoid these complications by eliminating their root cause: we develop a compiler and runtime system that runs arbitrary multithreaded C/C++ POSIX Threads programs deterministically. A trivial non-performant approach to providing determinism is simply deterministically serializing execution. Instead, we present a compiler and runtime infrastructure that ensures determinism but resorts to serialization rarely, for handling interthread communication and synchronization. We develop two basic approaches, both of which are largely dynamic with performance improved by some static compiler optimizations. First, an ownership-based approach detects interthread communication via an evolving table that tracks ownership of memory regions by threads. Second, a buffering approach uses versioned memory and employs a deterministic commit protocol to make changes visible to other threads. While buffering has larger single-threaded overhead than ownership, it tends to scale better (serializing less often). A hybrid system sometimes performs and scales better than either approach individually. Our implementation is based on the LLVM compiler infrastructure. It needs neither programmer annotations nor special hardware. Our empirical evaluation uses the PARSEC and SPLASH2 benchmarks and shows that our approach scales comparably to nondeterministic execution.
Parallel Programming Must Be Deterministic by Default
"... In today’s widely used parallel programming models, subtle programming errors can lead to unintended nondeterministic behavior and hard to catch bugs. In contrast, we argue for a parallel programming model that is deterministic by default: deterministic behavior is guaranteed unless the programmer e ..."
Abstract
-
Cited by 21 (4 self)
- Add to MetaCart
In today’s widely used parallel programming models, subtle programming errors can lead to unintended nondeterministic behavior and hard to catch bugs. In contrast, we argue for a parallel programming model that is deterministic by default: deterministic behavior is guaranteed unless the programmer explicitly uses nondeterministic constructs. This goal is particularly challenging for modern object-oriented languages with expressive use of reference aliasing and updates to shared mutable state. We propose a broad research agenda in support of this goal, and we describe some of our own work to further that agenda. 1
Memory Models: A Case for Rethinking Parallel Languages and Hardware
- COMMUNICATIONS OF THE ACM
, 2010
"... The era of parallel computing for the masses is here, but writing correct parallel programs remains far more difficult than writing sequential programs. Aside from a few domains, most parallel programs are written using a shared-memory approach. The memory model, which specifies the meaning of share ..."
Abstract
-
Cited by 16 (3 self)
- Add to MetaCart
The era of parallel computing for the masses is here, but writing correct parallel programs remains far more difficult than writing sequential programs. Aside from a few domains, most parallel programs are written using a shared-memory approach. The memory model, which specifies the meaning of shared variables, is at the heart of this programming model. Unfortunately, it has involved a tradeoff between programmability and performance, and has arguably been one of the most challenging and contentious areas in both hardware architecture and programming language specification. Recent broad community-scale efforts have finally led to a convergence in this debate, with popular languages such as Java and C++ and most hardware vendors publishing compatible memory model specifications. Although this convergence is a dramatic improvement, it has exposed fundamental shortcomings in current
Deterministic process groups in dOS
- In Proc. 9th USENIX Symposium on Operating System Design andImplementation
, 2010
"... Current multiprocessor systems execute parallel and concurrent software nondeterministically: even when given precisely the same input, two executions of the same program may produce different output. This severely complicates debugging, testing, and automatic replication for fault-tolerance. Previo ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
Current multiprocessor systems execute parallel and concurrent software nondeterministically: even when given precisely the same input, two executions of the same program may produce different output. This severely complicates debugging, testing, and automatic replication for fault-tolerance. Previous efforts to address this issue have focused primarily on record and replay, but making execution actually deterministic would address the problem at the root. Our goals in this work are twofold: (1) to provide fully deterministic execution of arbitrary, unmodified, multithreaded programs as an OS service; and (2) to make all sources of intentional nondeterminism, such as network I/O, be explicit and controllable. To this end we propose a new OS abstraction, the Deterministic Process Group (DPG). All communication between threads and processes internal to a DPG happens deterministically, including implicit communication via sharedmemory accesses, as well as communication via OS channels such as pipes, signals, and the filesystem. To deal with fundamentally nondeterministic external events, our abstraction includes the shim layer, a programmable interface that interposes on all interaction between a DPG and the external world, making determinism useful even for reactive applications. We implemented the DPG abstraction as an extension to Linux and demonstrate its benefits with three use cases: plain deterministic execution; replicated execution; and record and replay by logging just external input. We evaluated our implementation on both parallel and reactive workloads, including Apache, Chromium, and PARSEC. 1.
DoublePlay: Parallelizing Sequential Logging and Replay
"... Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently because of the need to reproduce the order or values read by sha ..."
Abstract
-
Cited by 11 (2 self)
- Add to MetaCart
Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently because of the need to reproduce the order or values read by shared memory operations performed by multiple threads. In this paper, we present DoublePlay, a new way to efficiently guarantee replay on commodity multiprocessors. Our key insight is that one can use the simpler and faster mechanisms of single-processor record and replay, yet still achieve the scalability offered by multiple cores, by using an additional execution to parallelize the record and replay of an application. DoublePlay timeslices multiple threads on a single processor, then runs multiple time intervals (epochs) of the program concurrently on separate processors. This strategy, which we call uniparallelism, makes logging much easier because each epoch runs on a single processor (so threads in an epoch never simultaneously access the same memory) and different epochs operate on different copies of the memory. Thus, rather than logging the order of shared-memory accesses, we need only log the order in which threads in an epoch are timesliced on the processor. DoublePlay runs an additional execution of the program on multiple processors to generate checkpoints so that epochs run in parallel. We evaluate DoublePlay on a variety of client, server, and scientific parallel benchmarks; with spare cores, DoublePlay reduces logging overhead to an average of 15 % with two worker threads and 28 % with four threads.
Conflict Exceptions: Simplifying Concurrent Language Semantics with Precise Hardware Exceptions for Data-Races
, 2010
"... We argue in this paper that concurrency errors should be treated as exceptions, i.e., have fail-stop behavior and precise semantics. We propose an exception model based on conflict of synchronizationfree regions, which precisely detects a broad class of data-races. We show that our exceptions provid ..."
Abstract
-
Cited by 8 (5 self)
- Add to MetaCart
We argue in this paper that concurrency errors should be treated as exceptions, i.e., have fail-stop behavior and precise semantics. We propose an exception model based on conflict of synchronizationfree regions, which precisely detects a broad class of data-races. We show that our exceptions provide enough guarantees to simplify high-level programming language semantics and debugging, but are significantly cheaper to enforce than traditional data-race detection. To make the performance cost of enforcement negligible, we propose architecture support for accurately detecting and precisely delivering these exceptions. We evaluate the suitability of our model as well as the behavior of our architectural mechanisms using the PARSEC benchmark suite and commercial applications. Our results show that the exception model largely reflects how programmers are already writing code and that the main memory, traffic and performance overheads of the enforcement mechanisms we propose are very low.
ReAssert: Suggesting repairs for broken unit tests,” University of Illinois at Urbana-Champaign
, 2009
"... Abstract—Developers often change software in ways that cause tests to fail. When this occurs, developers must determine whether failures are caused by errors in the code under test or in the test code itself. In the latter case, developers must repair failing tests or remove them from the test suite ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
Abstract—Developers often change software in ways that cause tests to fail. When this occurs, developers must determine whether failures are caused by errors in the code under test or in the test code itself. In the latter case, developers must repair failing tests or remove them from the test suite. Reparing tests is time consuming but beneficial, since removing tests reduces a test suite’s ability to detect regressions. Fortunately, simple program transformations can repair many failing tests automatically. We present ReAssert, a novel technique and tool that suggests repairs to failing tests ’ code which cause the tests to pass. Examples include replacing literal values in tests, changing assertion methods, or replacing one assertion with several. If the developer chooses to apply the repairs, ReAssert modifies the code automatically. Our experiments show that ReAssert can repair many common test failures and that its suggested repairs correspond to developers ’ expectations. I.
Modular reasoning for deterministic parallelism. Computer laboratory technical report
, 2010
"... Weaving a concurrency control protocol into a program is difficult and error-prone. One way to alleviate this burden is deterministic parallelism. In this well-studied approach to parallelisation, a sequential program is annotated with sections that can execute concurrently, with automatically injec ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
Weaving a concurrency control protocol into a program is difficult and error-prone. One way to alleviate this burden is deterministic parallelism. In this well-studied approach to parallelisation, a sequential program is annotated with sections that can execute concurrently, with automatically injected control constructs used to ensure observable behaviour consistent with the original program. This paper examines the formal specification and verification of these constructs. Our high-level specification defines the conditions necessary for correct execution; these conditions reflect program dependencies necessary to ensure deterministic behaviour. We connect the high-level specification used by clients of the library with the low-level library implementation, to prove that a client’s requirements for determinism are enforced. Significantly, we can reason about program and library correctness without breaking abstraction boundaries. To achieve this, we use concurrent abstract predicates, based on separation logic, to encapsulate racy behaviour in the library’s implementation. To allow generic specifications of libraries that can be instantiated by client programs, we extend the logic with higherorder parameters and quantification. We show that our high-level specification abstracts the details of deterministic parallelism by verifying two different low-level implementations of the library.
Inferring method effect summaries for nested heap regions
- In Int’l Conf. on Softw. Eng’g (ASE
, 2009
"... Abstract—Effect systems are important for reasoning about the side effects of a program. Although effect systems have been around for decades, they have not been widely adopted in practice because of the large number of annotations that they require. A tool that infers effects automatically can make ..."
Abstract
-
Cited by 5 (4 self)
- Add to MetaCart
Abstract—Effect systems are important for reasoning about the side effects of a program. Although effect systems have been around for decades, they have not been widely adopted in practice because of the large number of annotations that they require. A tool that infers effects automatically can make effect systems practical. We present an effect inference algorithm and an Eclipse plug-in, DPJIZER, which alleviate the burden of writing effect annotations for a language called Deterministic Parallel Java (DPJ). The key novel feature of the algorithm is the ability to infer effects on nested heap regions. Besides DPJ, we also illustrate how the algorithm can be used for a different effect system based on object ownership. Our experience shows that DPJIZER is both useful and effective: (i) inferring effect annotations automatically saves significant programming burden; and (ii) inferred effects are more precise than those written manually, and are fine-grained enough to enable the compiler to prove determinism of the program. I.
A Time-Aware Type System For Data-Race Protection and Guaranteed Initialization
"... We introduce a type system based on intervals, objects representing the time in which a block of code will execute. The type system can verify time-based properties such as when a field will be accessed or a method will be invoked. One concrete application of our type system is data-race protection: ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
We introduce a type system based on intervals, objects representing the time in which a block of code will execute. The type system can verify time-based properties such as when a field will be accessed or a method will be invoked. One concrete application of our type system is data-race protection: For fields which are initialized during one phase of the program and constant thereafter, users can designate the interval during which the field is mutable. Code which happens after this initialization interval can safely read the field in parallel. We also support fields guarded by a lock and even the use of dynamic race detectors. Another use for intervals is to designate different phases in the object’s lifetime, such as a constructor phase. The type system then ensures that only appropriate methods are invoked in each phase.

