Instrumentation and sampling strategies for Cooperative Concurrency Bug Isolation
In OOPSLA, 2010
Cited by 36 (9 self)
Fixing concurrency bugs (or crugs) is critical in modern software systems. Static analyses to find crugs such as data races and atomicity violations scale poorly, while dynamic approaches incur high run-time overheads. Crugs manifest only under specific execution interleavings that may not arise during in-house testing, thereby demanding a lightweight program monitoring technique that can be used post-deployment. We present Cooperative Crug Isolation (CCI), a low-overhead instrumentation framework to diagnose production-run failures caused by crugs. CCI tracks specific thread interleavings at run-time, and uses statistical models to identify strong failure predictors among these. We offer a varied suite of predicates that represent different trade-offs between complexity and fault isolation capability. We also develop variant random sampling strategies that suit different types of predicates and help keep the run-time overhead low. Experiments with 9 real-world bugs in 6 non-trivial C applications show that these schemes span a wide spectrum of performance and diagnosis capabilities, each suitable for different usage scenarios.
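The idea of ranking sampled predicates as failure predictors can be illustrated with a minimal sketch. It is not CCI's actual statistical model: the scoring rule (failure rate among runs where a predicate was observed true, minus the overall failure rate) is a simplification chosen for illustration, and the predicate names and run data are hypothetical.

```python
from collections import defaultdict

def rank_predicates(runs):
    """Rank predicates by how much observing them true raises the failure rate.

    `runs` is a list of (observed_true, failed) pairs, where `observed_true`
    is the set of predicate names sampled as true in that run.
    The score is an illustrative simplification, not CCI's own model.
    """
    baseline = sum(1 for _, failed in runs if failed) / len(runs)
    true_runs = defaultdict(int)   # runs in which the predicate was true
    true_fail = defaultdict(int)   # failing runs in which it was true
    for observed_true, failed in runs:
        for pred in observed_true:
            true_runs[pred] += 1
            if failed:
                true_fail[pred] += 1
    scores = {p: true_fail[p] / true_runs[p] - baseline for p in true_runs}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical feedback reports: (predicates observed true, did the run fail?)
runs = [
    ({"remote_write_between_accesses"}, True),
    ({"remote_write_between_accesses", "lock_held"}, True),
    ({"lock_held"}, False),
    ({"lock_held"}, False),
]
ranking = rank_predicates(runs)
```

In this toy data the interleaving predicate is true only in failing runs, so it ranks first, while `lock_held` scores below the baseline.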
Improved multithreaded unit testing.
In Proc. of the joint meeting of the European Software Engineering Conference and the ACM Symposium on the Foundations of Software Engineering (ESEC/FSE’11), 2011
Cited by 16 (1 self)
Multithreaded code is notoriously hard to develop and test. A multithreaded test exercises the code under test with two or more threads. Each test execution follows some schedule/interleaving of the multiple threads, and different schedules can give different results. Developers often want to enforce a particular schedule for test execution, and to do so, they use time delays (Thread.sleep in Java). Unfortunately, this approach can produce false positives or negatives, and can result in unnecessarily long testing time. This paper presents IMUnit, a novel approach to specifying and executing schedules for multithreaded tests. We introduce a new language that allows explicit specification of schedules as orderings on events encountered during test execution. We present a tool that automatically instruments the code to control test execution to follow the specified schedule, and a tool that helps developers migrate their legacy, sleep-based tests into event-based tests in IMUnit. The migration tool uses novel techniques for inferring events and schedules from the executions of sleep-based tests. We describe our experience in migrating over 200 tests. The inference techniques have high precision and recall of over 75%, and IMUnit reduces testing time compared to sleep-based tests by 3.39x on average.
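The contrast between sleep-based and event-based scheduling can be sketched as follows. This is a Python analogue using the standard `threading.Event`, not IMUnit's schedule language or its Java tooling; the event name `put_done` and the producer/consumer roles are hypothetical.

```python
import threading

def run_with_ordering():
    """Enforce "put happens before take" with an explicit event, not a delay."""
    log = []
    put_done = threading.Event()  # hypothetical event: "put has finished"

    def producer():
        log.append("put")
        put_done.set()            # fire the event once the put is complete

    def consumer():
        put_done.wait()           # block until the ordered-before event fired
        log.append("take")

    consumer_thread = threading.Thread(target=consumer)
    producer_thread = threading.Thread(target=producer)
    consumer_thread.start()       # started first on purpose; ordering still holds
    producer_thread.start()
    producer_thread.join()
    consumer_thread.join()
    return log

result = run_with_ordering()
```

Because the consumer waits on the event instead of sleeping for a guessed duration, the order does not depend on timing and the test spends no time idle.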
Evaluating Dynamic Software Update Safety Using Efficient Systematic Testing
IEEE Transactions on Software Engineering, 2010
Cited by 14 (10 self)
Dynamic software updating (DSU) systems, which allow programs to be patched on the fly, often employ automatic safety checks to avoid applying a patch that may lead to incorrect behavior. This paper presents what we believe is the first significant empirical evaluation of two DSU safety checks: activeness safety (AS) and con-freeness safety (CFS). To measure the checks’ effectiveness, we developed a novel approach to systematically test dynamic updates by forcing updates at each of the update points encountered during system test execution. To mitigate the increase in the number of tests, we developed an algorithm for test suite minimization which proved highly effective in our experiments. Using this approach, we systematically tested a series of dynamic patches to OpenSSH, vsftpd and ngIRCd. AS and CFS prevented most, but not all, dynamic update failures; CFS allowed more failures than AS, but AS was more restrictive, disallowing many more successful updates. Our results show that neither AS nor CFS can be completely relied upon to produce correct dynamic updates, and our investigation points to the reasons why. Our work represents an important step, and important insights, toward developing safe, easy-to-use DSU systems.
CONCURRIT: A Domain Specific Language for Reproducing Concurrency Bugs
Cited by 6 (0 self)
We present CONCURRIT, a domain-specific language (DSL) for reproducing concurrency bugs. Given some partial information about the nature of a bug in an application, a programmer can write a CONCURRIT script to formally and concisely specify a set of thread schedules to explore in order to find a schedule exhibiting the bug. Further, the programmer can specify how these thread schedules should be searched to find a schedule that reproduces the bug. We implemented CONCURRIT as an embedded DSL in C++, which uses manual or automatic source instrumentation to partially control the scheduling of the software under test. Using CONCURRIT, we were able to write concise tests to reproduce concurrency bugs in a variety of benchmarks, including the Mozilla’s SpiderMonkey
Fully Automatic and Precise Detection of Thread Safety Violations
Cited by 5 (3 self)
Concurrent, object-oriented programs often use thread-safe library classes. Existing techniques for testing a thread-safe class either rely on tests using the class, on formal specifications, or on both. Unfortunately, these techniques often are not fully automatic as they involve the user in analyzing the output. This paper presents an automatic testing technique that reveals concurrency bugs in supposedly thread-safe classes. The analysis requires as input only the class under test and reports only true positives. The key idea is to generate tests in which multiple threads call methods on a shared instance of the tested class. If a concurrent test exhibits an exception or a deadlock that cannot be triggered in any linearized execution of the test, the analysis reports a thread safety violation. The approach is easily applicable, because it is independent of handwritten tests and explicit specifications. The analysis finds 15 concurrency bugs in popular Java libraries, including two previously unknown bugs in the Java standard library.
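The oracle described above (a concurrent failure counts only if no linearized execution can produce it) can be sketched in a few lines. This is a simplified illustration rather than the paper's implementation: it treats any exception as a failure instead of matching the specific exception, it ignores deadlocks, and it handles only two threads. The function names are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def fails_in_some_linearization(make_obj, calls_a, calls_b):
    """Run every linearization sequentially on a fresh object; True if any raises."""
    for order in interleavings(calls_a, calls_b):
        obj = make_obj()
        try:
            for call in order:
                call(obj)
        except Exception:
            return True
    return False

def is_thread_safety_violation(concurrent_run_raised, make_obj, calls_a, calls_b):
    """Report a violation only if the concurrent failure has no sequential explanation."""
    return concurrent_run_raised and not fails_in_some_linearization(
        make_obj, calls_a, calls_b
    )

# Two appends can never fail sequentially, so a concurrent exception would be a violation.
appends = ([lambda o: o.append(1)], [lambda o: o.append(2)])
# append/pop can fail sequentially (pop on an empty list), so it would not be reported.
append_pop = ([lambda o: o.append(1)], [lambda o: o.pop()])
```

Checking against all linearizations is what keeps such an oracle from reporting failures that a purely sequential caller could also trigger.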
Setac: A Framework for Phased Deterministic Testing of Scala Actor Programs
Cited by 3 (2 self)
Scala provides an actor library where computation entities, called actors, communicate by exchanging messages. The schedule of message exchanges is in general nondeterministic. Testing non-deterministic programs is hard, because it is necessary to ensure that the system under test has executed all important schedules. Setac is our proposed framework for testing Scala actors that (1) allows programmers to specify constraints on schedules and (2) makes it easy to check test assertions that require actors to be in a stable state. Setac requires little change to the program under test and requires no change to the actor run-time system. In sum, Setac aims to make it much simpler to test nondeterministic actor programs in Scala.
MuTMuT: Efficient exploration for mutation testing of multithreaded code
In ICST, 2010
Cited by 3 (3 self)
Mutation testing is a method for measuring the quality of test suites. Given a system under test and a test suite, mutations are systematically inserted into the system, and the test suite is executed to determine which mutants it detects. A major cost of mutation testing is the time required to execute the test suite on all the mutants. This cost is even greater when the system under test is multithreaded: not only are test cases from the test suite executed on many mutants, but also each test case is executed for multiple possible thread schedules. We introduce a general framework that can reduce the time for mutation testing of multithreaded code. We present four techniques within the general framework and implement two of them in a tool called MuTMuT. We evaluate MuTMuT on eight multithreaded programs. The results show that MuTMuT reduces the time for mutation testing, substantially over a straightforward mutant execution and up to 77% with the advanced technique over the basic technique.
CONCURRIT: Testing Concurrent Programs with Programmable State-Space Exploration
Cited by 2 (0 self)
Testing is the most widely-used methodology for software validation. However, due to the nondeterministic interleavings of threads, traditional testing for concurrent programs is not as effective as for sequential programs. To attack the nondeterminism problem, software model checking techniques have been used to systematically enumerate all possible thread schedules of a test program. But such systematic and exhaustive exploration is typically too time-consuming for many test programs. We believe that the programmer’s help to guide the model checker towards interesting executions is critical to circumvent this problem. We propose a testing technique and a supporting tool called CONCURRIT, which provides a model checker that can be guided programmatically within test code. While writing a test, the programmer specifies a particular thread interleaving scenario in mind using an embedded domain-specific language (DSL), and CONCURRIT explores all and only the executions realizing the intended scenario. During the exploration, the programmer is also able to observe the execution (e.g., assert invariants) and constrain the future decisions of the model checker, all within the test code. We believe that providing the programmer the ability to observe and control the exploration of executions will lead to more effective and efficient testing for concurrent programs.
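Exploring "all and only" the executions that realize a scenario can be illustrated with a toy model. This sketch is not the CONCURRIT DSL, which is embedded in C++ and controls real threads; here two hypothetical threads each perform a non-atomic increment (a read followed by a write), and the scenario of interest is that both reads happen before either write.

```python
from itertools import permutations

# Each thread increments a shared counter in two steps: read, then write.
STEPS = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
THREADS = ("T1", "T2")

def respects_program_order(schedule):
    """Within each thread, the read must come before the write."""
    return all(
        schedule.index((t, "read")) < schedule.index((t, "write")) for t in THREADS
    )

def matches_scenario(schedule):
    """Scenario of interest: both reads happen before either write."""
    last_read = max(schedule.index((t, "read")) for t in THREADS)
    first_write = min(schedule.index((t, "write")) for t in THREADS)
    return last_read < first_write

def execute(schedule):
    """Simulate the schedule and return the final counter value."""
    counter, local = 0, {}
    for thread, op in schedule:
        if op == "read":
            local[thread] = counter
        else:
            counter = local[thread] + 1
    return counter

valid = [s for s in permutations(STEPS) if respects_program_order(s)]
explored = [s for s in valid if matches_scenario(s)]
finals_all = {execute(s) for s in valid}
finals_scenario = {execute(s) for s in explored}
```

Restricting the search to the scenario shrinks the explored set and makes the observation sharp: every schedule in it loses an update, whereas the full set of valid schedules mixes correct and incorrect outcomes.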
Bita: Coverage-Guided, Automatic Testing of Actor Programs
Cited by 2 (1 self)
Actor programs are concurrent programs where concurrent entities communicate asynchronously by exchanging messages. Testing actor programs is challenging because the order of message receives depends on the non-deterministic scheduler and because exploring all schedules does not scale to large programs. This paper presents Bita, a scalable, automatic approach for testing non-deterministic behavior of actor programs. The key idea is to generate and explore schedules that are likely to reveal concurrency bugs because these schedules increase the schedule coverage. We present three schedule coverage criteria for actor programs, an algorithm to generate feasible schedules that increase coverage, and a technique to force a program to comply with a schedule. Applying Bita to real-world actor programs implemented in Scala reveals eight previously unknown concurrency bugs, of which six have already been fixed by the developers. Furthermore, we show that our approach finds bugs 122x faster than random scheduling, on average.
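Coverage-guided schedule selection can be sketched with a greedy filter. The coverage criterion used here (ordered pairs of consecutive message receives) is an assumption made for illustration and is not claimed to be one of the paper's three criteria; the sketch also filters a given list of candidate schedules rather than generating feasible ones, and the message names are hypothetical.

```python
def covered_pairs(schedule):
    """Ordered pairs of consecutive message receives in a schedule (assumed criterion)."""
    return set(zip(schedule, schedule[1:]))

def select_schedules(candidates):
    """Keep only schedules that cover at least one pair not covered so far."""
    covered, chosen = set(), []
    for schedule in candidates:
        new_pairs = covered_pairs(schedule) - covered
        if new_pairs:
            chosen.append(schedule)
            covered |= new_pairs
    return chosen, covered

# Hypothetical receive orders for three messages delivered to one actor.
candidates = [
    ["m1", "m2", "m3"],
    ["m1", "m2", "m3"],  # duplicate: adds no coverage, so it is skipped
    ["m2", "m1", "m3"],
    ["m3", "m2", "m1"],
]
chosen, covered = select_schedules(candidates)
```

Schedules that add no new coverage are skipped, which is the intuition behind spending test time only on orders likely to expose new behavior.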
IMUnit: Improved Multithreaded Unit Testing Position Statement
Cited by 2 (1 self)
This position paper argues for an approach to bring several techniques successful for (regression) testing of sequential code over to multithreaded code. Multithreaded code is getting increasingly important but remains extremely hard to develop and test. Most recent research on testing multithreaded code focuses solely on finding bugs in one given version of code. While there are many promising results, the tools are fairly slow (as they, conceptually, explore a large number of schedules) and do not exploit the fact that code evolves over several versions during development and maintenance. Our proposal is to allow explicit specification of relevant schedules (either manually written or automatically generated) for multithreaded tests, which can substantially speed up testing, especially for evolving code. To enable the use of schedules, we propose to design a novel language for specifying schedules in multithreaded tests, and to develop tools for automatic generation of multithreaded tests and for improved regression testing with multithreaded tests.