Results 1 - 10
of
76
Finding and Reproducing Heisenbugs in Concurrent Programs
"... Concurrency is pervasive in large systems. Unexpected interference among threads often results in “Heisenbugs” that are extremely difficult to reproduce and eliminate. We have implemented a tool called CHESS for finding and reproducing such bugs. When attached to a program, CHESS takes control of th ..."
Abstract
-
Cited by 51 (7 self)
- Add to MetaCart
Concurrency is pervasive in large systems. Unexpected interference among threads often results in “Heisenbugs” that are extremely difficult to reproduce and eliminate. We have implemented a tool called CHESS for finding and reproducing such bugs. When attached to a program, CHESS takes control of thread scheduling and uses efficient search techniques to drive the program through possible thread interleavings. This systematic exploration of program behavior enables CHESS to quickly uncover bugs that might otherwise have remained hidden for a long time. For each bug, CHESS consistently reproduces an erroneous execution manifesting the bug, thereby making it significantly easier to debug the problem. CHESS scales to large concurrent programs and has found numerous bugs in existing systems that had been tested extensively prior to being tested by CHESS. CHESS has been integrated into the test frameworks of many code bases inside Microsoft and is used by testers on a daily basis. 1
Ctrigger: exposing atomicity violation bugs from their hiding places
- In ASPLOS
, 2009
"... Multicore hardware is making concurrent programs pervasive. Unfortunately, concurrent programs are prone to bugs. Among different types of concurrency bugs, atomicity violation bugs are common and important. Existing techniques to detect atomicity violation bugs suffer from one limitation: requiring ..."
Abstract
-
Cited by 39 (10 self)
- Add to MetaCart
Multicore hardware is making concurrent programs pervasive. Unfortunately, concurrent programs are prone to bugs. Among different types of concurrency bugs, atomicity violation bugs are common and important. Existing techniques to detect atomicity violation bugs suffer from one limitation: requiring bugs to manifest during monitored runs, which is an open problem in concurrent program testing. This paper makes two contributions. First, it studies the interleaving characteristics of the common practice in concurrent program testing (i.e., running a program over and over) to understand why atomicity violation bugs are hard to expose. Second, it proposes CTrigger to effectively and efficiently expose atomicity violation bugs in large programs. CTrigger focuses on a special type of interleavings (i.e., unserializable
The Essence of Concurrent ML
- In ML with Concurrency: Design, Analysis, Implementation, and Application
, 1997
"... Concurrent ML (CML) is a high-level message-passing language that supports the construction of first-class synchronous abstractions called events. This mechanism has proven quite effective over the years and has been incorporated in a number of other languages. While CML provides a concurrent progra ..."
Abstract
-
Cited by 34 (1 self)
- Add to MetaCart
Concurrent ML (CML) is a high-level message-passing language that supports the construction of first-class synchronous abstractions called events. This mechanism has proven quite effective over the years and has been incorporated in a number of other languages. While CML provides a concurrent programming model, its implementation has always been limited to uniprocessors. This limitation is exploited in the implementation of the synchronization protocol that underlies the event mechanism, but with the advent of cheap parallel processing on the desktop (and laptop), it is time for Parallel CML. Parallel implementations of CML-like primitives for Java and Haskell exist, but build on high-level synchronization constructs that are unlikely to perform well. This paper presents a novel, parallel implementation of CML that exploits a purpose-built optimistic concurrency protocol designed for both correctness and performance on shared-memory multiprocessors. This work extends and completes an earlier protocol that supported just a strict subset of CML with synchronization on input, but not output events. Our main contributions are a model-checked reference implementation of the protocol and two concrete implementations. This paper focuses on Manticore’s functional, continuation-based implementation but briefly discusses an independent, thread-based implementation written in C # and running on Microsoft’s stock, parallel runtime. Although very different in detail, both derive from the same design. Experimental evaluation of the Manticore implementation reveals good performance, dispite the extra overhead of multiprocessor synchronization.
Reducing Concurrent Analysis Under a Context Bound to Sequential Analysis
"... Abstract. This paper addresses the analysis of concurrent programs with shared memory. Such an analysis is undecidable in the presence of multiple procedures. One approach used in recent work obtains decidability by providing only a partial guarantee of correctness: the approach bounds the number of ..."
Abstract
-
Cited by 28 (6 self)
- Add to MetaCart
Abstract. This paper addresses the analysis of concurrent programs with shared memory. Such an analysis is undecidable in the presence of multiple procedures. One approach used in recent work obtains decidability by providing only a partial guarantee of correctness: the approach bounds the number of context switches allowed in the concurrent program, and aims to prove safety, or find bugs, under the given bound. In this paper, we show how to obtain simple and efficient algorithms for the analysis of concurrent programs with a context bound. We give a general reduction from a concurrent program P, and a given context bound K, to a slightly larger sequential program P K s such that the analysis of P K s can be used to prove properties about P. The reduction introduces symbolic constants and assume statements in P K s. Thus, any sequential analysis that can deal with these two additions can be extended to handle concurrent programs as well, under the context bound. We give instances of the reduction for common program models used in model checking, such as Boolean programs, pushdown systems (PDSs), and symbolic PDSs. 1
Randomized active atomicity violation detection in concurrent programs
- In SIGSOFT ’08/FSE-16: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering
, 2008
"... Atomicity is an important specification that enables programmers to understand atomic blocks of code in a multi-threaded program as if they are sequential. This significantly simplifies the programmer’s job to reason about correctness. Several modern multithreaded programming languages provide no bu ..."
Abstract
-
Cited by 27 (5 self)
- Add to MetaCart
Atomicity is an important specification that enables programmers to understand atomic blocks of code in a multi-threaded program as if they are sequential. This significantly simplifies the programmer’s job to reason about correctness. Several modern multithreaded programming languages provide no built-in support to ensure atomicity; instead they rely on the fact that programmers would use locks properly in order to guarantee that atomic code blocks are indeed atomic. However, improper use of locks can sometimes fail to ensure atomicity. Therefore, we need tools that can check atomicity properties of lock-based code automatically. We propose a randomized dynamic analysis technique to detect a special, but important, class of atomicity violations that are often found in real-world programs. Specifically, our technique modifies the existing Java thread scheduler behavior to create atomicity violations with high probability. Our approach has several advantages over existing dynamic analysis tools. First, we can create a real atomicity violation and see if an exception can be thrown. Second, we can replay an atomicity violating execution by simply using the same seed for random number generation—we do not need to record the execution. Third, we give no false warnings unlike existing dynamic atomicity checking techniques. We have implemented the technique in a prototype tool for Java and have experimented on a number of large multi-threaded Java programs and libraries. We report a number of previously known and unknown bugs and atomicity violations in these Java programs.
stateless model checking
- In PLDI 08: Programming Language Design and Implementation
, 2008
"... Stateless model checking is a useful state-space exploration technique for systematically testing complex real-world software. Existing stateless model checkers are limited to the verification of safety properties on terminating programs. However, realistic concurrent programs are nonterminating, a ..."
Abstract
-
Cited by 25 (4 self)
- Add to MetaCart
Stateless model checking is a useful state-space exploration technique for systematically testing complex real-world software. Existing stateless model checkers are limited to the verification of safety properties on terminating programs. However, realistic concurrent programs are nonterminating, a property that significantly reduces the efficacy of stateless model checking in testing them. Moreover, existing stateless model checkers are unable to verify that a nonterminating program satisfies the important liveness property of livelock-freedom, a property that requires the program to make continuous progress for any input. To address these shortcomings, this paper argues for incorporating a fair scheduler in stateless exploration. The key contribution of this paper is an explicit scheduler that is (strongly) fair and at the same time sufficiently nondeterministic to guarantee full coverage of safety properties. We have implemented the fair scheduler in the CHESS model checker. We show through theoretical arguments and empirical evaluation that our algorithm satisfies two important properties: 1) it visits all states of a finite-state program achieving state coverage at a faster rate than existing techniques, and 2) it finds all livelocks in a finite-state program. Before this work, nonterminating programs had to be manually modified in order to apply CHESS to them. The addition of fairness has allowed CHESS to be effectively applied to real-world nonterminating programs without any modification. For example, we have successfully booted the Singularity operating system under the control of CHESS. Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification — formal methods, validation;
CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems
"... We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighbor ..."
Abstract
-
Cited by 22 (3 self)
- Add to MetaCart
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.
Interprocedural analysis of concurrent programs under a context bound
- In TACAS
, 2007
"... Abstract. Analysis of recursive programs in the presence of concurrency and shared memory is undecidable. In previous work, Qadeer and Rehof [23] showed that context-bounded analysis is decidable for recursive programs under a finite-state abstraction of program data. In this paper, we show that con ..."
Abstract
-
Cited by 21 (5 self)
- Add to MetaCart
Abstract. Analysis of recursive programs in the presence of concurrency and shared memory is undecidable. In previous work, Qadeer and Rehof [23] showed that context-bounded analysis is decidable for recursive programs under a finite-state abstraction of program data. In this paper, we show that context-bounded analysis is decidable for certain families of infinite-state abstractions, and also provide a new symbolic algorithm for the finite-state case. 1
Static and precise detection of concurrency errors in systems code using SMT solvers
- In CAV
, 2009
"... Abstract. Context-bounded analysis is an attractive approach to verification of concurrent programs. Bounding the number of contexts executed per thread not only reduces the asymptotic complexity, but also the complexity increases gradually from checking a purely sequential program. Lal and Reps [14 ..."
Abstract
-
Cited by 19 (4 self)
- Add to MetaCart
Abstract. Context-bounded analysis is an attractive approach to verification of concurrent programs. Bounding the number of contexts executed per thread not only reduces the asymptotic complexity, but also the complexity increases gradually from checking a purely sequential program. Lal and Reps [14] provided a method for reducing the context-bounded verification of a concurrent boolean program to the verification of a sequential boolean program, thereby allowing sequential reasoning to be employed for verifying concurrent programs. In this work, we adapt the encoding to work for systems programs written in C with the heap and accompanying low-level operations such as pointer arithmetic and casts. Our approach is completely automatic: we use a verification condition generator and SMT solvers, instead of a boolean model checker, in order to avoid manual extraction of boolean programs and false alarms introduced by the abstraction. We demonstrate the use of field slicing for improving the scalability and (in some cases) coverage of our checking. We evaluate our tool STORM on a set of real-world Windows device drivers, and has discovered a bug that could not be detected by extensive application of previous tools. 1

