Results 1 - 10
of
47
Enforcing isolation and ordering in STM
- In the Proceedings of the Conf. on Programming Language Design and Implementation
, 2007
"... Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can resu ..."
Abstract
-
Cited by 41 (6 self)
- Add to MetaCart
Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can result in unexpected, implementation-dependent behavior. To guarantee isolation and consistent ordering in such a system, programmers are expected to enclose all shared-memory accesses inside transactions. A system that provides strong atomicity guarantees isolation even in the presence of threads that access shared data outside transactions. A strongly-atomic system also orders transactions with conflicting non-transactional memory operations in a consistent manner. In this paper, we discuss some surprising pitfalls of weak atomicity, and we present an STM system that avoids these problems
Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs
"... Atomicity is a fundamental correctness property in multithreaded programs, both because atomic code blocks are amenable to sequential reasoning (which significantly simplifies correctness arguments), and because atomicity violations often reveal defects in a program’s synchronization structure. Unfo ..."
Abstract
-
Cited by 40 (10 self)
- Add to MetaCart
Atomicity is a fundamental correctness property in multithreaded programs, both because atomic code blocks are amenable to sequential reasoning (which significantly simplifies correctness arguments), and because atomicity violations often reveal defects in a program’s synchronization structure. Unfortunately, all atomicity analyses developed to date are incomplete in that they may yield false alarms on correctly synchronized programs, which limits their usefulness. We present the first dynamic analysis for atomicity that is both sound and complete. The analysis reasons about the exact dependencies between operations in the observed trace of the target program, and it reports error messages if and only if the observed trace is not conflict-serializable. Despite this significant increase in precision, the performance and coverage of our analysis is competitive with earlier incomplete dynamic analyses for atomicity.
Muvi: Automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs
- In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP07
, 2007
"... Software defects significantly reduce system dependability. Among various types of software bugs, semantic and concurrency bugs are two of the most difficult to detect. This paper proposes a novel method, called MUVI, that detects an important class of semantic and concurrency bugs. MUVI automatical ..."
Abstract
-
Cited by 25 (6 self)
- Add to MetaCart
Software defects significantly reduce system dependability. Among various types of software bugs, semantic and concurrency bugs are two of the most difficult to detect. This paper proposes a novel method, called MUVI, that detects an important class of semantic and concurrency bugs. MUVI automatically infers commonly existing multi-variable access correlations through code analysis and then detects two types of related bugs: (1) inconsistent updates—correlated variables are not updated in a consistent way, and (2) multivariable concurrency bugs—correlated accesses are not protected in the same atomic sections in concurrent programs. We evaluate MUVI on four large applications: Linux, Mozilla, MySQL, and PostgreSQL. MUVI automatically infers more than 6000 variable access correlations with high accuracy (83%). Based on the inferred correlations, MUVI detects 39 new inconsistent update semantic bugs from the latest versions of these applications, with 17 of them recently confirmed by the developers based on our reports. We also implemented MUVI multi-variable extensions to two representative data race bug detection methods (lockset and happens-before). Our evaluation on five real-world multi-variable concurrency bugs from Mozilla and MySQL shows that the MUVI-extension correctly identifies the root causes of four out of the five multi-variable concurrency bugs with 14 % additional overhead on average. Interestingly, MUVI also helps detect four new multi-variable concurrency bugs in Mozilla that have never been reported before. None of the nine bugs can be identified correctly by the original race detectors without our MUVI extensions.
Flux: A Language for Programming High-Performance Servers
- In Proceedings of USENIX Annual Technical Conference
, 2006
"... Programming high-performance server applications is challenging: it is both complicated and error-prone to write the concurrent code required to deliver high performance and scalability. Server performance bottlenecks are difficult to identify and correct. Finally, it is difficult to predict server ..."
Abstract
-
Cited by 24 (4 self)
- Add to MetaCart
Programming high-performance server applications is challenging: it is both complicated and error-prone to write the concurrent code required to deliver high performance and scalability. Server performance bottlenecks are difficult to identify and correct. Finally, it is difficult to predict server performance prior to deployment. This paper presents Flux, a language that dramatically simplifies the construction of scalable high-performance server applications. Flux lets programmers compose offthe-shelf, sequential C or C++ functions into concurrent servers. Flux programs are type-checked and guaranteed to be deadlock-free. We have built a number of servers in Flux, including a web server with PHP support, an image-rendering server, a BitTorrent peer, and a game server. These Flux servers match or exceed the performance of their counterparts written entirely in C. By tracking hot paths through a running server, Flux simplifies the identification of performance bottlenecks. The Flux compiler also automatically generates discrete event simulators that accurately predict actual server performance under load and with different hardware resources. 1
Component-Based Lock Allocation
"... The allocation of lock objects to critical sections in concurrent programs affects both performance and correctness. Recent work explores automatic lock allocation, aiming primarily to minimize conflicts and maximize parallelism by allocating locks to individual critical section interferences. We in ..."
Abstract
-
Cited by 22 (1 self)
- Add to MetaCart
The allocation of lock objects to critical sections in concurrent programs affects both performance and correctness. Recent work explores automatic lock allocation, aiming primarily to minimize conflicts and maximize parallelism by allocating locks to individual critical section interferences. We investigate component-based lock allocation, which allocates locks to entire groups of interfering critical sections. Our allocator depends on a thread-based side effect analysis, and benefits from precise points-to and may happen in parallel information. Thread-local object information has a small impact, and dynamic locks do not improve significantly on static locks. We experiment with a range of small and large Java benchmarks on 2-way, 4-way, and 8-way machines, and find that a single static lock is sufficient for mtrt, that performance degrades by 10 % for hsqldb, that jbb2000 becomes mostly serialized, and that for lusearch, xalan, and jbb2005, component-based lock allocation recovers the performance of the original program. 1.
Lock Allocation
- POPL'07
, 2007
"... We introduce lock allocation, an automatic technique that takes a multi-threaded program annotated with atomic sections (that must be executed atomically), and infers a lock assignment from global variables to locks and a lock instrumentation that determines where each lock should be acquired and re ..."
Abstract
-
Cited by 20 (0 self)
- Add to MetaCart
We introduce lock allocation, an automatic technique that takes a multi-threaded program annotated with atomic sections (that must be executed atomically), and infers a lock assignment from global variables to locks and a lock instrumentation that determines where each lock should be acquired and released such that the resulting instrumented program is guaranteed to preserve atomicity and deadlock freedom (provided all shared state is accessed only within atomic sections). Our algorithm works in the presence of pointers and procedures, and sets up the lock allocation problem as a 0-1 ILP which minimizes the conflict cost between atomic sections while simultaneously minimizing the number of locks. We have implemented our algorithm for both C with pthreads and Java, and have applied it to infer locks in 15K lines of AOLserver code. Our automatic allocation produces the same results as hand annotations for most of this code, while solving the optimization instances within a second for most programs.
Inferring locks for atomic sections
"... Atomic sections are a recent and popular idiom to support the development of concurrent programs. Updates performed within an atomic section should not be visible to other threads until the atomic section has been executed entirely. Traditionally, atomic sections are supported through the use of opt ..."
Abstract
-
Cited by 19 (0 self)
- Add to MetaCart
Atomic sections are a recent and popular idiom to support the development of concurrent programs. Updates performed within an atomic section should not be visible to other threads until the atomic section has been executed entirely. Traditionally, atomic sections are supported through the use of optimistic concurrency, either using a transactional memory hardware, or an equivalent software emulation (STM). This paper explores automatically supporting atomic sections using pessimistic concurrency. We present a system that combines compiler and runtime techniques to automatically transform programs written with atomic sections into programs that only use locking primitives. To minimize contention in the transformed programs, our compiler chooses from several lock granularities, using fine-grain locks whenever it is possible. This paper formally presents our framework, shows that our compiler is sound (i.e., it protects all shared locations accessed within atomic sections), and reports experimental results.
Interprocedural Analysis of Asynchronous Programs
, 2007
"... An asynchronous program is one that contains procedure calls which are not immediately executed from the callsite, but stored and “dispatched” in a non-deterministic order by an external scheduler at a later point. We formalize the problem of interprocedural dataflow analysis for asynchronous progra ..."
Abstract
-
Cited by 18 (1 self)
- Add to MetaCart
An asynchronous program is one that contains procedure calls which are not immediately executed from the callsite, but stored and “dispatched” in a non-deterministic order by an external scheduler at a later point. We formalize the problem of interprocedural dataflow analysis for asynchronous programs as AIFDS problems, a generalization of the IFDS problems for interprocedural dataflow analysis. We give an algorithm for computing the precise meet-over-valid-paths solution for any AIFDS instance, as well as a demand-driven algorithm for solving the corresponding demand AIFDS instances. Our algorithm can be easily implemented on top of any existing interprocedural dataflow analysis framework. We have implemented the algorithm on top of BLAST, thereby obtaining the first safety verification tool for unbounded asynchronous programs. Though the problem of solving AIFDS instances is EXPSPACE-hard, we find that in practice our technique can efficiently analyze programs by exploiting standard optimizations of interprocedural dataflow analyses.
Reliable and efficient programming abstractions for wireless sensor networks
- In Proc. of PLDI
, 2007
"... It is currently difficult to build practical and reliable programming systems out of distributed and resource-constrained sensor devices. The state of the art in today’s sensornet programming is centered around a component-based language called nesC. nesC is a nodelevel language—a program is written ..."
Abstract
-
Cited by 18 (2 self)
- Add to MetaCart
It is currently difficult to build practical and reliable programming systems out of distributed and resource-constrained sensor devices. The state of the art in today’s sensornet programming is centered around a component-based language called nesC. nesC is a nodelevel language—a program is written for an individual node in the network—and nesC programs use the services of an operating system called TinyOS. We are pursuing an approach to programming sensor networks that significantly raises the level of abstraction over this practice. The critical change is one of perspective: rather than writing programs from the point of view of an individual node, programmers implement a central program that conceptually has access to the entire network. This approach pushes to the compiler the task of producing node-level programs that implement the desired behavior. We present the Pleiades programming language, its compiler,
Lightweight annotations for controlling sharing in concurrent data structures
, 2009
"... SharC is a recently developed system for checking data-sharing in multithreaded programs. Programmers specify sharing rules (readonly, protected by a lock, etc.) for individual objects, and the SharC compiler enforces these rules using static and dynamic checks. Violations of these rules indicate un ..."
Abstract
-
Cited by 10 (2 self)
- Add to MetaCart
SharC is a recently developed system for checking data-sharing in multithreaded programs. Programmers specify sharing rules (readonly, protected by a lock, etc.) for individual objects, and the SharC compiler enforces these rules using static and dynamic checks. Violations of these rules indicate unintended data sharing, which is the underlying cause of harmful data-races. Additionally, SharC allows programmers to change the sharing rules for a specific object using a sharing cast, to capture the fact that sharing rules for an object often change during the object’s lifetime. SharC was successfully applied to a number of multi-threaded C programs. However, many programs are not readily checkable using SharC because their sharing rules, and changes to sharing rules, effectively apply to whole data structures rather than to individual objects. We have developed a system called Shoal to address this shortcoming.

