Results 1 - 10
of
16
Instrumentation and sampling strategies for Cooperative Concurrency Bug Isolation
- In OOPSLA
, 2010
"... Fixing concurrency bugs (or crugs) is critical in modern software systems. Static analyses to find crugs such as data races and atomicity violations scale poorly, while dynamic approaches incur high run-time overheads. Crugs manifest only under specific execution interleavings that may not arise dur ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
Fixing concurrency bugs (or crugs) is critical in modern software systems. Static analyses to find crugs such as data races and atomicity violations scale poorly, while dynamic approaches incur high run-time overheads. Crugs manifest only under specific execution interleavings that may not arise during in-house testing, thereby demanding a lightweight program monitoring technique that can be used post-deployment. We present Cooperative Crug Isolation (CCI), a lowoverhead instrumentation framework to diagnose productionrun failures caused by crugs. CCI tracks specific thread interleavings at run-time, and uses statistical models to identify strong failure predictors among these. We offer a varied suite of predicates that represent different trade-offs between complexity and fault isolation capability. We also develop variant random sampling strategies that suit different types of predicates and help keep the run-time overhead low. Experiments with 9 real-world bugs in 6 non-trivial C applications show that these schemes span a wide spectrum of performance and diagnosis capabilities, each suitable for different usage scenarios.
Automated atomicity-violation fixing
- In PLDI
, 2011
"... Fixing software bugs has always been an important and timeconsuming process in software development. Fixing concurrency bugs has become especially critical in the multicore era. However, fixing concurrency bugs is challenging, in part due to nondeterministic failures and tricky parallel reasoning. B ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Fixing software bugs has always been an important and timeconsuming process in software development. Fixing concurrency bugs has become especially critical in the multicore era. However, fixing concurrency bugs is challenging, in part due to nondeterministic failures and tricky parallel reasoning. Beyond correctly fixing the original problem in the software, a good patch should also avoid introducing new bugs, degrading performance unnecessarily, or damaging software readability. Existing tools cannot automate the whole fixing process and provide good-quality patches. We present AFix, a tool that automates the whole process of fixing one common type of concurrency bug: single-variable atomicity violations. AFix starts from the bug reports of existing bugdetection tools. It augments these with static analysis to construct a suitable patch for each bug report. It further tries to combine the patches of multiple bugs for better performance and code readability. Finally, AFix’s run-time component provides testing customized for each patch. Our evaluation shows that patches automatically generated by AFix correctly eliminate six out of eight real-world bugs and significantly decrease the failure probability in the other two cases. AFix patches never introduce new bugs and usually have similar performance to manually-designed patches.
Breadcrumbs: Efficient Context Sensitivity for Dynamic Bug Detection Analyses ∗
"... Calling context—the set of active methods on the stack—is critical for understanding the dynamic behavior of large programs. Dynamic program analysis tools, however, are almost exclusively context insensitive because of the prohibitive cost of representing calling contexts at run time. Deployable dy ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
Calling context—the set of active methods on the stack—is critical for understanding the dynamic behavior of large programs. Dynamic program analysis tools, however, are almost exclusively context insensitive because of the prohibitive cost of representing calling contexts at run time. Deployable dynamic analyses, in particular, are limited to reporting only static program locations. This paper presents Breadcrumbs, an efficient technique for recording and reporting dynamic calling contexts. It builds on an existing technique for computing a compact (one word) encoding of each calling context that client analyses can use in place of a program location. The key feature of our system is a search algorithm that can reconstruct a calling context from its encoding using only a static call graph and a small amount of dynamic information collected in cold methods. Breadcrumbs requires no offline training or program modifications, and handles all language features, including dynamic class loading. On average, it adds 10% to 20 % overhead to existing dynamic analyses, depending on how much additional information it collects: more information slows down execution, but improves the decoding algorithm. We use Breadcrumbs to add context sensitivity to two dynamic analyses: a race detector and an analysis that identifies the origins of null pointer exceptions. Our system can reconstruct nearly all of the contexts for the reported bugs in a few seconds. These calling contexts are non-trivial, and they significantly improve both the precision of the analyses and the quality of the bug reports. 1.
Cooperative Reasoning for Preemptive Execution
"... We propose a cooperative methodology for multithreaded software, where threads use traditional synchronization idioms such as locks, but additionally document each point of potential thread interference with a “yield ” annotation. Under this methodology, code between two successive yield annotations ..."
Abstract
-
Cited by 3 (3 self)
- Add to MetaCart
We propose a cooperative methodology for multithreaded software, where threads use traditional synchronization idioms such as locks, but additionally document each point of potential thread interference with a “yield ” annotation. Under this methodology, code between two successive yield annotations forms a serializable transaction that is amenable to sequential reasoning. This methodology reduces the burden of reasoning about thread interleavings by indicating only those interference points that matter. We present experimental results showing that very few yield annotations are required, typically one or two per thousand lines of code. We also present dynamic analysis algorithms for detecting cooperability violations, where thread interference is not documented by a yield, and for yield annotation inference for legacy software.
Persuasive Prediction of Concurrency Access Anomalies
"... Predictive analysis is a powerful technique that exposes concurrency bugs in un-exercised program executions. However, current predictive analysis approaches lack the persuasiveness property as they offer little assistance in helping programmers fully understand the execution history that triggers t ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Predictive analysis is a powerful technique that exposes concurrency bugs in un-exercised program executions. However, current predictive analysis approaches lack the persuasiveness property as they offer little assistance in helping programmers fully understand the execution history that triggers the predicted bugs. We present a persuasive bug prediction technique as well as a prototype tool, PECAN, for detecting general access anomalies (AAs) in concurrent programs. The main characteristic of PECAN is that, in addition to predict AAs in a more general way, it generates “bug hatching clips ” that deterministically instruct the input program to exercise the predicted AAs. The key ingredient of PECAN is an efficient offline schedule generation algorithm, with proof of the soundness, that guarantees to generate a feasible schedule for every real AA in programs that use locks in a nested way. We evaluate PECAN using twenty-two multi-threaded subjects including six large concurrent systems, and our experiments demonstrate that PECAN is able to effectively predict and deterministically expose real AAs. Several serious and previously unknown bugs in large open source concurrent systems were also revealed in our experiments.
IFRit: Interference-Free Regions for Dynamic Data-Race Detection ∗
"... We propose a new algorithm for dynamic data-race detection. Our algorithm reports no false positives and runs on arbitrary C and C++ code. Unlike previous algorithms, we do not have to instrument every memory access or track a full happens-before relation. Our data-race detector, which we call IFRit ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
We propose a new algorithm for dynamic data-race detection. Our algorithm reports no false positives and runs on arbitrary C and C++ code. Unlike previous algorithms, we do not have to instrument every memory access or track a full happens-before relation. Our data-race detector, which we call IFRit, is based on a run-time abstraction called an interference-free region (IFR). An IFR is an interval of one thread’s execution during which any write to a specific variable by a different thread is a data race. We insert instrumentation at compile time to monitor active IFRs at run-time. If the runtime observes overlapping IFRs for conflicting accesses to the same variable in two different threads, it reports a race. The static analysis aggregates information for multiple accesses to the same variable, avoiding the expense of having to instrument every memory access in the program. We directly compare IFRit to FastTrack [10] and Thread-Sanitizer [25], two state-of-the-art fully-precise data-race detectors. We show that IFRit imposes a fraction of the overhead of these detectors. We show that for the PARSEC benchmarks, and several real-world applications, IFRit finds many of the races detected by a fully-precise detector. We also demonstrate that sampling can further reduce IFRit’s performance overhead without completely forfeiting precision. ∗ This material is based upon work supported by the National Science
Types for Precise Thread Interference
"... The potential for unexpected interference between threads makes multithreaded programming notoriously difficult. Programmers use a variety of synchronization idioms such as locks and barriers to restrict where interference may actually occur. Unfortunately, the resulting actual interference points a ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
The potential for unexpected interference between threads makes multithreaded programming notoriously difficult. Programmers use a variety of synchronization idioms such as locks and barriers to restrict where interference may actually occur. Unfortunately, the resulting actual interference points are typically never documented and must be manually reconstructed as the first step in any subsequent programming task (code review, refactoring, etc). This paper proposes explicitly documenting actual interference points in the program source code, and it presents a type and effect system for verifying the correctness of these interference specifications. Experimental results on a variety of Java benchmarks show that this approach provides a significant improvement over prior systems based on method-level atomicity specifications. In particular, it reduces the number of interference points one must consider from several hundred points per thousand lines of code to roughly 13 per thousand lines of code. Explicit interference points also serve to highlight all known concurrency defects in these benchmarks. 1.
Sound Predictive Race Detection in Polynomial Time
"... Data races are among the most reliable indicators of programming errors in concurrent software. For at least two decades, Lamport’s happens-before (HB) relation has served as the standard test for detecting races—other techniques, such as lockset-based approaches, fail to be sound, as they may false ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Data races are among the most reliable indicators of programming errors in concurrent software. For at least two decades, Lamport’s happens-before (HB) relation has served as the standard test for detecting races—other techniques, such as lockset-based approaches, fail to be sound, as they may falsely warn of races. This work introduces a new relation, causally-precedes (CP), which generalizes happens-before to observe more races without sacrificing soundness. Intuitively, CP tries to capture the concept of happens-before ordered events that must occur in the observed order for the program to observe the same values. What distinguishes CP from past predictive race detection approaches (which also generalize an observed execution to detect races in other plausible executions) is that CP-based race detection is both sound and of polynomial complexity. We demonstrate that the unique aspects of CP result in practical benefit. Applying CP to real-world programs, we successfully analyze server-level applications (e.g., Apache FtpServer) and show that traces longer than in past predictive race analyses can be analyzed in mere seconds to a few minutes. For these programs, CP race detection uncovers races that are hard to detect by repeated execution and HB race detection: a single run of CP race detection produces several races not discovered by 10 separate rounds of happens-before race detection.
ACCULOCK: Accurate and Efficient Detection of Data Races
"... Abstract—Happens-before detectors are precise but can be too conservative to detect certain data races in repeated test runs as they are sensitive to thread interleaving. By making the opposite tradeoffs, lockset detectors can detect more races but are not precise (by reporting false positives). For ..."
Abstract
- Add to MetaCart
Abstract—Happens-before detectors are precise but can be too conservative to detect certain data races in repeated test runs as they are sensitive to thread interleaving. By making the opposite tradeoffs, lockset detectors can detect more races but are not precise (by reporting false positives). For both types of detectors, happens-before detectors run more slowly as they use expensive vector clocks. Existing hybrid race detectors (combining lockset and happens-before) alleviate some of the limitations in both analysis techniques at the cost of additional analysis overhead. Recently, due to FASTTRACK, epoch-based happens-before and lockset detectors now exhibit comparable performance. It is the time to rethink how to design a hybrid race detector to balance precision and coverage, by leveraging the lightweightness of epoch clocks. ACCULOCK is the first such a solution. ACCULOCK analyzes a program by reasoning about the subset of the happens-before relation observed with lock acquires and releases excluded, thereby reducing its sensitivity to thread interleaving. When such a weaker happens-before relation is violated, ACCULOCK applies a new efficient lockset algorithm to enforce a lock-based synchronization discipline by distinguishing the locks protecting reads and writes. The key motivation behind is to ensure that ACCULOCK can improve happens-before detectors by discovering also data races in alternate thread interleavings when analyzing one program execution while limiting false warnings thus incurred in a controlled manner. In addition, ACCULOCK achieves these objectives by maintaining comparable performance as FASTTRACK, the fastest happens-before detector. All these properties of ACCULOCK are validated and confirmed by comparing it against six other detectors, all implemented in Jikes RVM using 11 benchmark programs. I.
Research Statement
"... I am excited by the challenge of making software significantly more reliable, scalable, and secure than it is today and thus helping achieve advances in areas such as science, education, health, and energy. I have focused on the problem of software bugs, which cause errors that cost billions of doll ..."
Abstract
- Add to MetaCart
I am excited by the challenge of making software significantly more reliable, scalable, and secure than it is today and thus helping achieve advances in areas such as science, education, health, and energy. I have focused on the problem of software bugs, which cause errors that cost billions of dollars annually and sometimes result in injury or death. These bugs are pervasive in modern software, which is only becoming more complex as developers add features, integrate components, and write concurrent software. State-of-the-art testing, static analysis, and modern language features eliminate some but not all bugs. In particular, thorough in-house testing cannot test all possible environments, configurations, and thread interleavings. Deployed software thus contains bugs that are hard to reproduce, find, and fix. Deployed systems need support for improving reliability in order to achieve highly robust software. My interests lie in solving these problems with programming languages and runtime systems. I focus on building analyses and systems that help developers and users make software more reliable and effective. I am particularly interested in approaches that are lightweight and flexible enough to run alongside deployed systems and that provide rigorous guarantees about accuracy and performance. My dissertation work introduces broadly applicable techniques that help developers find and fix bugs in deployed systems and help users by automatically tolerating the effects of errors. Two significant contributions are (1) efficient techniques that add context sensitivity to a broad set of analyses for reliability and security (first section below), and (2) an automatic approach for minimizing the effects of programmers’

