Results 1 - 10
of
41
Semantics of transactional memory and automatic mutual exclusion
- In Proceedings of the 35th ACM SIGPLANSIGACT Symposium on Principles of Programming Languages (POPL
, 2008
"... Software Transactional Memory (STM) is an attractive basis for the development of language features for concurrent programming. However, the semantics of these features can be delicate and problematic. In this paper we explore the tradeoffs between semantic simplicity, the viability of efficient imp ..."
Abstract
-
Cited by 55 (8 self)
- Add to MetaCart
Software Transactional Memory (STM) is an attractive basis for the development of language features for concurrent programming. However, the semantics of these features can be delicate and problematic. In this paper we explore the tradeoffs between semantic simplicity, the viability of efficient implementation strategies, and the flexibility of language constructs. Specifically, we develop semantics and type systems for the constructs of the Automatic Mutual Exclusion (AME) programming model; our results apply also to other constructs, such as atomic blocks. With this semantics as a point of reference, we study several implementation strategies. We model STM systems that use in-place update, optimistic concurrency, lazy conflict detection, and roll-back. These strategies are correct only under non-trivial assumptions that we identify and analyze. One important source of errors is that some efficient implementations create dangerous “zombie” computations where a transaction keeps running after experiencing a conflict; the assumptions confine the effects of these computations.
FastTrack: Efficient and Precise Dynamic Race Detection
"... Multithreaded programs are notoriously prone to race conditions. Prior work on dynamic race detectors includes fast but imprecise race detectors that report false alarms, as well as slow but precise race detectors that never report false alarms. The latter typically use expensive vector clock operat ..."
Abstract
-
Cited by 52 (6 self)
- Add to MetaCart
Multithreaded programs are notoriously prone to race conditions. Prior work on dynamic race detectors includes fast but imprecise race detectors that report false alarms, as well as slow but precise race detectors that never report false alarms. The latter typically use expensive vector clock operations that require time linear in the number of program threads. This paper exploits the insight that the full generality of vector clocks is unnecessary in most cases. That is, we can replace heavyweight vector clocks with an adaptive lightweight representation that, for almost all operations of the target program, requires only constant space and supports constant-time operations. This representation change significantly improves time and space performance, with no loss in precision. Experimental results on Java benchmarks including the Eclipse development environment show that our FASTTRACK race detector is an order of magnitude faster than a traditional vector-clock race detector, and roughly twice as fast as the high-performance DJIT + algorithm. FASTTRACK is even comparable in speed to ERASER on our Java benchmarks, while never reporting false alarms.
A type and effect system for deterministic parallel java
- In Proc. Intl. Conf. on Object-Oriented Programming, Systems, Languages, and Applications
, 2009
"... Today’s shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we demonstrate that a practical type and effect syst ..."
Abstract
-
Cited by 38 (7 self)
- Add to MetaCart
Today’s shared-memory parallel programming models are complex and error-prone. While many parallel programs are intended to be deterministic, unanticipated thread interleavings can lead to subtle bugs and nondeterministic semantics. In this paper, we demonstrate that a practical type and effect system can simplify parallel programming by guaranteeing deterministic semantics with modular, compile-time type checking even in a rich, concurrent object-oriented language such as Java. We describe an object-oriented type and effect system that provides several new capabilities over previous systems for expressing deterministic parallel algorithms. We also describe a language called Deterministic Parallel Java (DPJ) that incorporates the new type system features, and we show that a core subset of DPJ is sound. We describe an experimental validation showing that DPJ can express a wide range of realistic parallel programs; that the new type system features are useful for such programs; and that the parallel programs exhibit good performance gains (coming close to or beating equivalent, nondeterministic multithreaded programs where those are available).
Goldilocks: A Race and Transaction-Aware Java Runtime
"... Data races often result in unexpected and erroneous behavior. In addition to causing data corruption and leading programs to crash, the presence of data races complicates the semantics of an execution which might no longer be sequentially consistent. Motivated by these observations, we have designed ..."
Abstract
-
Cited by 37 (3 self)
- Add to MetaCart
Data races often result in unexpected and erroneous behavior. In addition to causing data corruption and leading programs to crash, the presence of data races complicates the semantics of an execution which might no longer be sequentially consistent. Motivated by these observations, we have designed and implemented a Java runtime system that monitors program executions and throws a Data-RaceException when a data race is about to occur. Analogous to other runtime exceptions, the DataRaceException provides two key benefits. First, accesses causing race conditions are interrupted and handled before they cause errors that may be difficult to diagnose later. Second, if no DataRaceException is thrown in an execution, it is guaranteed to be sequentially consistent. This strong guarantee helps to rule out many concurrency-related possibilities as the cause of erroneous behavior. When a DataRaceException is caught, the operation, thread, or program causing it can be terminated gracefully. Alternatively, the DataRaceException can serve as a conflict-detection mechanism in optimistic uses of concurrency. We start with the definition of data-race-free executions in the Java memory model. We generalize this definition to executions that use transactions in addition to locks and volatile variables for synchronization. We present a precise and efficient algorithm for dynamically verifying that an execution is free of data races. This algorithm generalizes the Goldilocks algorithm for data-race detection by handling transactions and providing the ability to distinguish between read and write accesses. We have implemented our algorithm and the DataRaceException in the Kaffe Java Virtual Machine. We have evaluated our system on a variety of publicly available Java benchmarks and a few microbenchmarks that combine lock-based and transaction-based synchronization. Our experiments indicate that our implementation has reasonable overhead. Therefore, we believe that in addition to being a debugging tool, the DataRaceException may be a viable mechanism to enforce the safety of executions of multithreaded Java programs. Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification — formal methods, reliability,
Component-Based Lock Allocation
"... The allocation of lock objects to critical sections in concurrent programs affects both performance and correctness. Recent work explores automatic lock allocation, aiming primarily to minimize conflicts and maximize parallelism by allocating locks to individual critical section interferences. We in ..."
Abstract
-
Cited by 22 (1 self)
- Add to MetaCart
The allocation of lock objects to critical sections in concurrent programs affects both performance and correctness. Recent work explores automatic lock allocation, aiming primarily to minimize conflicts and maximize parallelism by allocating locks to individual critical section interferences. We investigate component-based lock allocation, which allocates locks to entire groups of interfering critical sections. Our allocator depends on a thread-based side effect analysis, and benefits from precise points-to and may happen in parallel information. Thread-local object information has a small impact, and dynamic locks do not improve significantly on static locks. We experiment with a range of small and large Java benchmarks on 2-way, 4-way, and 8-way machines, and find that a single static lock is sufficient for mtrt, that performance degrades by 10 % for hsqldb, that jbb2000 becomes mostly serialized, and that for lusearch, xalan, and jbb2005, component-based lock allocation recovers the performance of the original program. 1.
PACER: Proportional Detection of Data Races ∗
"... Data races indicate serious concurrency bugs such as order, atomicity, and sequential consistency violations. Races are difficult to find and fix, often manifesting only in deployment. The frequency of these bugs will likely increase as software adds parallelism to exploit multicore hardware. Unfort ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
Data races indicate serious concurrency bugs such as order, atomicity, and sequential consistency violations. Races are difficult to find and fix, often manifesting only in deployment. The frequency of these bugs will likely increase as software adds parallelism to exploit multicore hardware. Unfortunately, sound and precise race detectors slow programs by factors of eight or more, which is too expensive to deploy. This paper presents a precise, low-overhead sampling-based data race detector called PACER. PACER makes a proportionality guarantee: it detects any race at a rate equal to the sampling rate, by finding races whose first access occurs during a global sampling period. During sampling, PACER tracks all accesses using the sound and precise FASTTRACK algorithm. In non-sampling periods, PACER discards sampled access information that cannot be part of a reported race, and PACER simplifies tracking of the happens-before relationship, yielding near-constant, instead of linear, overheads. Experimental results confirm our design claims: time and space overheads scale with the sampling rate, and sampling rates of 1-3 % yield overheads low enough to consider in production software. Furthermore, our results suggest PACER reports each race at a rate equal to the sampling rate. The resulting system provides a “get what you pay for ” approach to race detection that is suitable for identifying real, hard-to-reproduce races in deployed systems. 1.
Automated Dynamic Analysis of CUDA Programs
"... Recent increases in the programmability and performance of GPUs have led to a surge of interest in utilizing them for general-purpose computations. Tools such as NVIDIA’s Cuda allow programmers to use a C-like language to code algorithms for execution on the GPU. Unfortunately, parallel programs are ..."
Abstract
-
Cited by 17 (3 self)
- Add to MetaCart
Recent increases in the programmability and performance of GPUs have led to a surge of interest in utilizing them for general-purpose computations. Tools such as NVIDIA’s Cuda allow programmers to use a C-like language to code algorithms for execution on the GPU. Unfortunately, parallel programs are prone to subtle correctness and performance bugs, and Cuda tool support for solving these remains a work in progress. As a first step towards addressing these problems, we present an automated analysis technique for finding two specific classes of bugs in Cuda programs: race conditions, which impact program correctness, and shared memory bank conflicts, which impact program performance. Our technique automatically instruments a program in two ways: to keep track of the memory locations accessed by different threads, and to use this data to determine whether bugs exist in the program. The instrumented source code can be run directly in Cuda’s device emulation mode, and any potential errors discovered will be automatically reported to the user. This automated analysis can help programmers find and solve subtle bugs in programs that are too complex to analyze manually. Although these issues are explored in the context of Cuda programs, similar issues will arise in any sufficiently “manycore ” architecture. 1.
Types for Atomicity: Static Checking and Inference for Java
"... Atomicity is a fundamental correctness property in multithreaded programs. A method is atomic if, for every execution, there is an equivalent serial execution in which the actions of the method are not interleaved with actions of other threads. Atomic methods are amenable to sequential reasoning, wh ..."
Abstract
-
Cited by 11 (6 self)
- Add to MetaCart
Atomicity is a fundamental correctness property in multithreaded programs. A method is atomic if, for every execution, there is an equivalent serial execution in which the actions of the method are not interleaved with actions of other threads. Atomic methods are amenable to sequential reasoning, which significantly facilitates subsequent analysis and verification. This article presents a type system for specifying and verifying the atomicity of methods in multithreaded Java programs using a synthesis of Lipton’s theory of reduction and type systems for race detection. The type system supports guarded, write-guarded, and unguarded fields, as well as thread-local data, parameterized classes and methods, and protected locks. We also present an algorithm for verifying atomicity via type inference. We have applied our type checker and type inference tools to a number of commonly used Java library classes and programs. These tools were able to verify the vast majority of methods in these benchmarks as atomic, indicating that atomicity is a widespread methodology for multithreaded programming. In addition, reported atomicity violations revealed some subtle errors in the synchronization disciplines of these programs.
Conflict Exceptions: Simplifying Concurrent Language Semantics with Precise Hardware Exceptions for Data-Races
, 2010
"... We argue in this paper that concurrency errors should be treated as exceptions, i.e., have fail-stop behavior and precise semantics. We propose an exception model based on conflict of synchronizationfree regions, which precisely detects a broad class of data-races. We show that our exceptions provid ..."
Abstract
-
Cited by 8 (5 self)
- Add to MetaCart
We argue in this paper that concurrency errors should be treated as exceptions, i.e., have fail-stop behavior and precise semantics. We propose an exception model based on conflict of synchronizationfree regions, which precisely detects a broad class of data-races. We show that our exceptions provide enough guarantees to simplify high-level programming language semantics and debugging, but are significantly cheaper to enforce than traditional data-race detection. To make the performance cost of enforcement negligible, we propose architecture support for accurately detecting and precisely delivering these exceptions. We evaluate the suitability of our model as well as the behavior of our architectural mechanisms using the PARSEC benchmark suite and commercial applications. Our results show that the exception model largely reflects how programmers are already writing code and that the main memory, traffic and performance overheads of the enforcement mechanisms we propose are very low.
Incremental testing of concurrent programs using value schedules
- IEEE/ACM International Conference on Automated Software Engineering
, 2008
"... Concurrent programs are difficult to debug and verify because of the nondeterministic nature of concurrent executions. A particular concurrency-related bug may only show up under certain rarely-executed thread interleavings. Therefore, commonly used debugging methodologies, such as inserting print s ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Concurrent programs are difficult to debug and verify because of the nondeterministic nature of concurrent executions. A particular concurrency-related bug may only show up under certain rarely-executed thread interleavings. Therefore, commonly used debugging methodologies, such as inserting print statements, are no longer sufficient for uncovering concurrency-related bugs. However, many existing bug detection methods, such as dynamic analysis and model checking, have a very high computational cost. In this paper, we introduce a new technique for uncovering concurrency-related bugs from multithreaded Java programs. Our technique uncovers concurrency-related bugs by generating and testing read-write assignment sequences, referred to as value schedules, of a multithreaded Java program. Our value-schedule-based technique distinguishes itself in its ability to avoid exploring superfluous program state space caused by speculative permutation on transitions. Therefore, our technique can achieve a higher degree of POR (Partial Order Reduction) than existing methods. We demonstrate our technique using some programs, with an implementation built using an explicit state model checker called JPF.

