Results 1 - 10 of 55
LiteRace: effective sampling for lightweight data-race detection
In PLDI, 2009
"... Data races are one of the most common and subtle causes of pernicious concurrency bugs. Static techniques for preventing data races are overly conservative and do not scale well to large programs. Past research has produced several dynamic data race detectors that can be applied to large programs an ..."
Cited by 62 (3 self)
Data races are one of the most common and subtle causes of pernicious concurrency bugs. Static techniques for preventing data races are overly conservative and do not scale well to large programs. Past research has produced several dynamic data race detectors that can be applied to large programs and are precise in the sense that they only report actual data races. However, these dynamic data race detectors incur a high performance overhead, slowing down a program’s execution by an order of magnitude. In this paper we present LiteRace, a very lightweight data race detector that samples and analyzes only selected portions of a program’s execution. We show that it is possible to sample a multi-threaded program at a low frequency and yet find infrequently occurring data races. We implemented LiteRace using Microsoft’s Phoenix compiler. Our experiments with several Microsoft programs show that LiteRace is able to find more than 75% of data races by sampling less than 5% of memory accesses in a given program execution.
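The abstract only outlines the sampling idea. A rough, hypothetical sketch of an adaptive per-function sampler in that spirit is shown below; the class name, decay policy, and constants are invented for illustration and are not taken from the paper.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical adaptive sampler: cold code is sampled often, hot code rarely. */
public class AdaptiveSampler {
    // Current sampling probability per function, decaying as the function gets hotter.
    private final ConcurrentHashMap<String, Double> rate = new ConcurrentHashMap<>();

    /** Decide whether to log the memory accesses of this function invocation. */
    public boolean shouldSample(String function) {
        double p = rate.getOrDefault(function, 1.0);          // start at 100% for cold code
        boolean sample = ThreadLocalRandom.current().nextDouble() < p;
        rate.put(function, Math.max(0.001, p * 0.9));          // back off toward 0.1%
        return sample;
    }

    public static void main(String[] args) {
        AdaptiveSampler s = new AdaptiveSampler();
        for (int i = 0; i < 10; i++) {
            System.out.println("sample foo? " + s.shouldSample("foo"));
        }
    }
}
```

The intent, consistent with the abstract's claim that low-frequency sampling still catches rare races, is that rarely executed code keeps a high sampling rate while frequently executed code contributes little overhead.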
PACER: Proportional Detection of Data Races
"... Data races indicate serious concurrency bugs such as order, atomicity, and sequential consistency violations. Races are difficult to find and fix, often manifesting only in deployment. The frequency of these bugs will likely increase as software adds parallelism to exploit multicore hardware. Unfort ..."
Cited by 55 (5 self)
Data races indicate serious concurrency bugs such as order, atomicity, and sequential consistency violations. Races are difficult to find and fix, often manifesting only in deployment. The frequency of these bugs will likely increase as software adds parallelism to exploit multicore hardware. Unfortunately, sound and precise race detectors slow programs by factors of eight or more, which is too expensive to deploy. This paper presents a precise, low-overhead sampling-based data race detector called PACER. PACER makes a proportionality guarantee: it detects any race at a rate equal to the sampling rate, by finding races whose first access occurs during a global sampling period. During sampling, PACER tracks all accesses using the sound and precise FastTrack algorithm. In non-sampling periods, PACER discards sampled access information that cannot be part of a reported race, and PACER simplifies tracking of the happens-before relationship, yielding near-constant, instead of linear, overheads. Experimental results confirm our design claims: time and space overheads scale with the sampling rate, and sampling rates of 1-3% yield overheads low enough to consider in production software. Furthermore, our results suggest PACER reports each race at a rate equal to the sampling rate. The resulting system provides a “get what you pay for” approach to race detection that is suitable for identifying real, hard-to-reproduce races in deployed systems.
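A minimal sketch of the global-sampling-period idea described above follows; the FastTrack metadata itself is elided, and the class and hook names are illustrative rather than PACER's actual interface.

```java
import java.util.Random;

/** Illustrative global sampling controller in the spirit of PACER's sampling periods. */
public class SamplingController {
    private final double samplingRate;          // e.g. 0.01 for 1% sampling
    private final Random rng = new Random();
    private volatile boolean inSamplingPeriod;

    public SamplingController(double samplingRate) { this.samplingRate = samplingRate; }

    /** Called at period boundaries (e.g., at garbage collections): flip a biased coin. */
    public void startNewPeriod() { inSamplingPeriod = rng.nextDouble() < samplingRate; }

    /** During sampling, track accesses precisely; otherwise keep only cheap summaries. */
    public void onAccess(Object obj, int fieldId, boolean isWrite) {
        if (inSamplingPeriod) {
            // full happens-before tracking (FastTrack-style epochs) would go here
        } else {
            // discard or simplify metadata that cannot be part of a reportable race
        }
    }

    public static void main(String[] args) {
        SamplingController c = new SamplingController(0.03);   // 3% sampling
        c.startNewPeriod();
        c.onAccess(new Object(), 0, true);
    }
}
```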
CrystalBall: Predicting and Preventing Inconsistencies in Deployed Distributed Systems
"... We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighbor ..."
Cited by 51 (5 self)
We propose a new approach for developing and deploying distributed systems, in which nodes predict distributed consequences of their actions, and use this information to detect and avoid errors. Each node continuously runs a state exploration algorithm on a recent consistent snapshot of its neighborhood and predicts possible future violations of specified safety properties. We describe a new state exploration algorithm, consequence prediction, which explores causally related chains of events that lead to property violation. This paper describes the design and implementation of this approach, termed CrystalBall. We evaluate CrystalBall on RandTree, BulletPrime, Paxos, and Chord distributed system implementations. We identified new bugs in mature Mace implementations of three systems. Furthermore, we show that if the bug is not corrected during system development, CrystalBall is effective in steering the execution away from inconsistent states at runtime.
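The abstract describes consequence prediction only at a high level. A toy rendering of "explore causally related chains of events from a snapshot and test a safety property" might look like the following; the real systems are Mace implementations, so this Java sketch is purely illustrative and all names are invented.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

/** Toy consequence prediction: from a snapshot state, explore bounded chains of
 *  event-handler executions and return the first state violating a safety property. */
public class ConsequencePrediction<S, E> {
    private final Function<S, List<E>> enabledEvents;   // which handlers could fire next
    private final BiFunction<S, E, S> applyHandler;     // deterministically run a handler
    private final Predicate<S> isSafe;                  // the safety property to preserve

    public ConsequencePrediction(Function<S, List<E>> enabledEvents,
                                 BiFunction<S, E, S> applyHandler,
                                 Predicate<S> isSafe) {
        this.enabledEvents = enabledEvents;
        this.applyHandler = applyHandler;
        this.isSafe = isSafe;
    }

    /** Depth-bounded search over causally chained events; null means no predicted violation. */
    public S findViolation(S state, int depthBound) {
        if (!isSafe.test(state)) return state;
        if (depthBound == 0) return null;
        for (E event : enabledEvents.apply(state)) {
            S violation = findViolation(applyHandler.apply(state, event), depthBound - 1);
            if (violation != null) return violation;
        }
        return null;
    }
}
```

In the approach the abstract describes, a predicted violation is then used to steer the node away from the offending action at runtime.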
Finding low-utility data structures
In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2010
"... Many opportunities for easy, big-win, program optimizations are missed by compilers. This is especially true in highly layered Java applications. Often at the heart of these missed optimization opportunities lie computations that, with great expense, produce data values that have little impact on th ..."
Cited by 37 (15 self)
Many opportunities for easy, big-win, program optimizations are missed by compilers. This is especially true in highly layered Java applications. Often at the heart of these missed optimization opportunities lie computations that, with great expense, produce data values that have little impact on the program’s final output. Constructing a new date formatter to format every date, or populating a large set full of expensively constructed structures only to check its size: these involve costs that are out of line with the benefits gained. This disparity between the formation costs and accrued benefits of data structures is at the heart of much runtime bloat. We introduce a run-time analysis to discover these low-utility data structures. The analysis employs dynamic thin slicing, which naturally associates costs with value flows rather than raw data flows. It constructs a model of the incremental, hop-to-hop, costs …
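The abstract's key idea, that a structure whose construction cost far exceeds the benefit it contributes to the program's output is "low-utility", can be illustrated with a toy cost/benefit model. The real analysis derives these numbers via dynamic thin slicing over value flows; everything below, including the class and thresholds, is invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy utility model: a data structure with high construction cost but little
 *  contribution to output has low utility. */
public class UtilityModel {
    private final Map<String, double[]> costBenefit = new HashMap<>(); // {cost, benefit}

    public void addCost(String ds, double c)    { costBenefit.computeIfAbsent(ds, k -> new double[2])[0] += c; }
    public void addBenefit(String ds, double b) { costBenefit.computeIfAbsent(ds, k -> new double[2])[1] += b; }

    /** Report structures whose benefit/cost ratio falls below a threshold. */
    public void report(double threshold) {
        costBenefit.forEach((ds, cb) -> {
            double utility = cb[1] / Math.max(cb[0], 1e-9);
            if (utility < threshold) System.out.printf("low-utility: %s (%.3f)%n", ds, utility);
        });
    }

    public static void main(String[] args) {
        UtilityModel m = new UtilityModel();
        m.addCost("dateFormatter", 1000); m.addBenefit("dateFormatter", 1);   // built anew for every date
        m.addCost("resultCache", 50);     m.addBenefit("resultCache", 400);   // cheap, heavily used
        m.report(0.1);
    }
}
```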
Chameleon: Adaptive Selection of Collections
"... Languages such as Java and C#, as well as scripting languages like Python, and Ruby, make extensive use of Collection classes. A collection implementation represents a fixed choice in the dimensions of operation time, space utilization, and synchronization. Using the collection in a manner not consi ..."
Cited by 33 (2 self)
Languages such as Java and C#, as well as scripting languages like Python and Ruby, make extensive use of Collection classes. A collection implementation represents a fixed choice in the dimensions of operation time, space utilization, and synchronization. Using the collection in a manner not consistent with this fixed choice can cause significant performance degradation. In this paper, we present CHAMELEON, a low-overhead automatic tool that assists the programmer in choosing the appropriate collection implementation for her application. During program execution, CHAMELEON computes elaborate trace and heap-based metrics on collection behavior. These metrics are consumed on-the-fly by a rules engine which outputs a list of suggested collection adaptation strategies. The tool can apply these corrective strategies automatically or present them to the programmer. We have implemented CHAMELEON on top of IBM’s J9 production JVM, and evaluated it over a small set of benchmarks. We show that for some applications, using CHAMELEON leads to a significant improvement of the memory footprint of the application.
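As an illustration of the kind of rule such an engine might apply (the thresholds, rule text, and class below are hypothetical, not CHAMELEON's actual rules):

```java
import java.util.ArrayList;
import java.util.List;

/** Toy collection-selection rules: heavy membership testing on a list-backed
 *  collection suggests a hash-based set; tiny collections suggest leaner variants. */
public class CollectionAdvisor {
    static class Stats { long adds, containsCalls, maxSize; }

    static List<String> advise(String site, Stats s) {
        List<String> suggestions = new ArrayList<>();
        if (s.containsCalls > 10 * Math.max(s.adds, 1))
            suggestions.add(site + ": many contains() calls -> consider HashSet");
        if (s.maxSize <= 4)
            suggestions.add(site + ": tiny collection -> consider a lazily allocated variant");
        return suggestions;
    }

    public static void main(String[] args) {
        Stats s = new Stats();
        s.adds = 100; s.containsCalls = 5000; s.maxSize = 100;
        advise("Parser.java:42 ArrayList", s).forEach(System.out::println);
    }
}
```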
LeakChaser: Helping Programmers Narrow Down Causes of Memory Leaks
"... In large programs written in managed languages such as Java and C#, holding unnecessary references often results in memory leaks and bloat, degrading significantly their run-time performance and scalability. Despite the existence of many leak detectors for such languages, these detectors often targe ..."
Cited by 17 (6 self)
In large programs written in managed languages such as Java and C#, holding unnecessary references often results in memory leaks and bloat, significantly degrading their run-time performance and scalability. Despite the existence of many leak detectors for such languages, these detectors often target low-level objects; as a result, their reports contain many false warnings and lack sufficient semantic information to help diagnose problems. This paper introduces a specification-based technique called LeakChaser that can not only capture precisely the unnecessary references leading to leaks, but also explain, with high-level semantics, why these references become unnecessary. At the heart of LeakChaser is a three-tier approach that uses varying levels of abstraction to assist programmers with different skill levels and code familiarity to find leaks. At the highest tier …
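The abstract does not spell out the specification interface. The following hypothetical lifetime-assertion API merely illustrates the flavor of letting a programmer state that an object should become unreachable by the end of a program region; all names and the checking strategy are invented for this sketch.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lifetime assertions: register that an object should be dead
 *  (unreachable) by the time a program region ends, and check via weak references. */
public class LifetimeAssertions {
    private static final List<Object[]> pending = new ArrayList<>(); // {WeakReference, regionLabel}

    public static void assertDiesBefore(Object obj, String regionLabel) {
        pending.add(new Object[]{new WeakReference<>(obj), regionLabel});
    }

    /** Call at the end of the region; objects that survive are leak suspects. */
    public static void regionEnd(String regionLabel) {
        System.gc(); // best effort; a real tool would hook the collector instead
        for (Object[] p : pending) {
            WeakReference<?> ref = (WeakReference<?>) p[0];
            if (regionLabel.equals(p[1]) && ref.get() != null)
                System.out.println("possible leak: object expected to die before " + regionLabel);
        }
    }

    public static void main(String[] args) {
        Object sessionCache = new Object();
        assertDiesBefore(sessionCache, "request-done");
        regionEnd("request-done");                      // reported: still reachable below
        System.out.println("still holding: " + sessionCache);
    }
}
```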
2ndStrike: toward manifesting hidden concurrency typestate bugs
In ASPLOS, 2011
"... Concurrency bugs are becoming increasingly prevalent in the multi-core era. Recently, much research has focused on data races and atomicity violation bugs, which are related to low-level memory accesses. However, a large number of concurrency typestate bugs such as “invalid reads to a closed file fr ..."
Cited by 17 (4 self)
Concurrency bugs are becoming increasingly prevalent in the multi-core era. Recently, much research has focused on data races and atomicity violation bugs, which are related to low-level memory accesses. However, a large number of concurrency typestate bugs such as “invalid reads to a closed file from a different thread” are under-studied. These concurrency typestate bugs are important yet challenging to study since they are mostly relevant to high-level program semantics. This paper presents 2ndStrike, a method to manifest hidden concurrency typestate bugs in software testing. Given a state machine describing correct program behavior on certain object typestates, 2ndStrike profiles runtime events related to the typestates and thread synchronization. Based on the profiling results, 2ndStrike then identifies bug candidates, each of which is a pair of runtime …
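A toy monitor for the quoted example, "invalid reads to a closed file from a different thread", could look like this; the event hooks and reporting below are invented for illustration and are not 2ndStrike's interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy typestate monitor: track which thread closed each file handle and
 *  flag later reads of that handle from a different thread as bug candidates. */
public class FileTypestateMonitor {
    private final Map<Object, Long> closedBy = new ConcurrentHashMap<>(); // handle -> closing thread id

    public void onClose(Object handle) {
        closedBy.put(handle, Thread.currentThread().getId());
    }

    public void onRead(Object handle) {
        Long closer = closedBy.get(handle);
        if (closer != null && closer != Thread.currentThread().getId())
            System.out.println("bug candidate: read of a handle closed by thread " + closer);
    }

    public static void main(String[] args) throws InterruptedException {
        FileTypestateMonitor m = new FileTypestateMonitor();
        Object handle = new Object();
        Thread t = new Thread(() -> m.onClose(handle));
        t.start(); t.join();
        m.onRead(handle);   // flagged: closed by the other thread
    }
}
```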
Dynamic shape analysis via degree metrics
In ISMM, 2009
"... Applications continue to increase in size and complexity which makes debugging and program understanding more challenging. Programs written in managed languages, such as Java, C#, and Ruby, further exacerbate this challenge because they tend to encode much of their state in the heap. This paper intr ..."
Cited by 17 (1 self)
Applications continue to increase in size and complexity, which makes debugging and program understanding more challenging. Programs written in managed languages, such as Java, C#, and Ruby, further exacerbate this challenge because they tend to encode much of their state in the heap. This paper introduces dynamic shape analysis, which seeks to characterize data structures in the heap by dynamically summarizing the object pointer relationships and detecting dynamic degree metrics based on class. The analysis identifies recursive data structures, automatically discovers dynamic degree metrics, and reports errors when degree metrics are violated. Uses of dynamic shape analysis include helping programmers find data structure errors during development, generating assertions for verification with static or dynamic analysis, and detecting subtle errors in deployment. We implement dynamic shape analysis in a Java Virtual Machine (JVM). Using SPECjvm and DaCapo benchmarks, we show that most objects in the heap are part of recursive data structures that maintain strong dynamic degree metrics. We show that once dynamic shape analysis establishes degree metrics from correct executions, it can find automatically inserted errors on subsequent executions in microbenchmarks. This suggests it can be used in deployment for improving software reliability.
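As a small illustration of one degree metric, the sketch below records, per class, the maximum number of same-class ("recursive") outgoing pointers observed from any object; the real analysis works over the whole object graph inside the JVM, and the class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy degree summary: stable maxima learned from correct runs (1 for list nodes,
 *  2 for binary tree nodes) become invariants; larger degrees are reported. */
public class DegreeSummary {
    private final Map<String, Integer> maxRecursiveOutDegree = new HashMap<>();

    public void observe(String className, int recursiveOutDegree) {
        maxRecursiveOutDegree.merge(className, recursiveOutDegree, Math::max);
    }

    /** After training on correct executions, a larger degree than ever seen is suspicious. */
    public boolean violates(String className, int observedDegree) {
        Integer max = maxRecursiveOutDegree.get(className);
        return max != null && observedDegree > max;
    }

    public static void main(String[] args) {
        DegreeSummary s = new DegreeSummary();
        s.observe("ListNode", 1);                       // learned from correct executions
        System.out.println(s.violates("ListNode", 2));  // a node pointing to two successors -> true
    }
}
```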
Jinn: Synthesizing a Dynamic Bug Detector for Foreign Language Interfaces
"... Programming language specifications mandate static and dynamic analyses to preclude syntactic and semantic errors. Although individual languages are usually well-specified, composing languages in multilingual programs is not. Because multilingual programs are prevalent, poor specification is a sourc ..."
Cited by 16 (0 self)
Programming language specifications mandate static and dynamic analyses to preclude syntactic and semantic errors. Although individual languages are usually well-specified, composing languages in multilingual programs is not. Because multilingual programs are prevalent, poor specification is a source of many errors. For example, virtually all Java programs compose Java and C with the Java Native Interface (JNI). Unfortunately, JNI is informally specified, and thus Java compilers and virtual machines (VMs) check only a small subset of JNI constraints. Worse, Java compiler and VM implementations inconsistently check constraints. This paper’s most significant contribution is to show how to synthesize dynamic analyses from state machines to detect foreign function interface (FFI) violations. To demonstrate the generality of our approach, we build FFI state machines that encode specifications for JNI and Python/C. Although we identify over a thousand FFI correctness constraints, we show that they fall into three classes and a modest number of state machines encode them. From these state machines, we generate context-specific FFI dynamic analysis. For Java, we insert this analysis in a library that interposes on all language transitions and thus is compiler and VM independent. We call the resulting dynamic bug detection tool Jinn. We show Jinn detects and diagnoses a wide variety of FFI bugs that other tools do not. This paper lays the foundation for better specification and enforcement of FFIs and a more principled approach to developing correct multilingual software.
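One well-known JNI constraint, that a local reference must not be used after the native call that created it returns, can be rendered as a small state machine of the kind such analyses are synthesized from. The hook names and reporting below are invented for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy FFI state machine for one JNI rule: a local reference is only valid
 *  for the duration of the native call that created it. */
public class LocalRefStateMachine {
    private final Set<Long> liveLocalRefs = new HashSet<>();

    public void onLocalRefCreated(long ref) { liveLocalRefs.add(ref); }
    public void onNativeCallReturn()        { liveLocalRefs.clear(); }   // the frame's local refs die here
    public void onLocalRefUsed(long ref) {
        if (!liveLocalRefs.contains(ref))
            System.out.println("FFI violation: stale local reference " + ref);
    }

    public static void main(String[] args) {
        LocalRefStateMachine m = new LocalRefStateMachine();
        m.onLocalRefCreated(42); m.onLocalRefUsed(42);  // ok
        m.onNativeCallReturn();  m.onLocalRefUsed(42);  // flagged
    }
}
```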
What Can the GC Compute Efficiently? A Language for Heap Assertions at GC Time
"... We present the DeAL language for heap assertions that are efficiently evaluated during garbage collection time. DeAL is a rich, declarative, logic-based language whose programs are guaranteed to be executable with good whole-heap locality, i.e., within a single traversal over every live object on th ..."
Cited by 13 (2 self)
We present the DeAL language for heap assertions that are efficiently evaluated during garbage collection time. DeAL is a rich, declarative, logic-based language whose programs are guaranteed to be executable with good whole-heap locality, i.e., within a single traversal over every live object on the heap and a finite neighborhood around each object. As a result, evaluating DeAL programs incurs negligible cost: for simple assertion checking at each garbage collection, the end-to-end execution slowdown is below 2%. DeAL is integrated into Java as a VM extension and we demonstrate its efficiency and expressiveness with several applications and properties from the past literature. Compared to past systems for heap assertions, DeAL is distinguished by its very attractive expressiveness/efficiency tradeoff: it offers a significantly richer class of assertions than what past systems could check with a single traversal. Conversely, past systems that can express the same (or more) complex assertions as DeAL do so only by suffering orders-of-magnitude higher costs.
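The single-traversal guarantee means an assertion can be phrased as a per-object check applied once to each live object as the collector scans it. DeAL itself is a declarative logic language embedded in the VM, so the Java sketch below is only an analogy, with invented names and the live-object set passed in explicitly.

```java
import java.util.List;
import java.util.function.Predicate;

/** Toy single-traversal heap assertion: visit every live object once and
 *  check a per-object predicate over instances of a given class. */
public class GcTimeAssertion {
    /** In a real VM this check would run inside the collector's live-object scan. */
    public static <T> boolean forallLiveObjects(List<Object> liveObjects,
                                                Class<T> cls, Predicate<T> property) {
        for (Object o : liveObjects) {
            if (cls.isInstance(o) && !property.test(cls.cast(o))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Object> heap = List.of("a", 3, "bb");   // stand-in for the set of live objects
        System.out.println(forallLiveObjects(heap, String.class, s -> !s.isEmpty())); // true
    }
}
```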