Results 1 - 10
of
17
Valgrind: A framework for heavyweight dynamic binary instrumentation
- In Proceedings of the 2007 Programming Language Design and Implementation Conference
, 2007
"... Dynamic binary instrumentation (DBI) frameworks make it easy to build dynamic binary analysis (DBA) tools such as checkers and profilers. Much of the focus on DBI frameworks has been on performance; little attention has been paid to their capabilities. As a result, we believe the potential of DBI ha ..."
Abstract
-
Cited by 211 (3 self)
- Add to MetaCart
Dynamic binary instrumentation (DBI) frameworks make it easy to build dynamic binary analysis (DBA) tools such as checkers and profilers. Much of the focus on DBI frameworks has been on performance; little attention has been paid to their capabilities. As a result, we believe the potential of DBI has not been fully exploited. In this paper we describe Valgrind, a DBI framework designed for building heavyweight DBA tools. We focus on its unique support for shadow values—a powerful but previously little-studied and difficult-to-implement DBA technique, which requires a tool to shadow every register and memory value with another value that describes it. This support accounts for several crucial design features that distinguish Valgrind from other DBI frameworks. Because of these features, lightweight tools built with Valgrind run comparatively slowly, but Valgrind can be used to build more interesting, heavyweight tools that are difficult or impossible to build with other DBI frameworks such as Pin and DynamoRIO. Categories and Subject Descriptors D.2.5 [Software Engineering]: Testing and Debugging—debugging aids, monitors; D.3.4
The Daikon system for dynamic detection of likely invariants
, 2006
"... Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often used in assert statements, documentation, and formal speci ..."
Abstract
-
Cited by 89 (8 self)
- Add to MetaCart
Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often used in assert statements, documentation, and formal specifications. Examples include being constant (x = a), non-zero (x ̸ = 0), being in a
Discovering documentation for java container classes
- IEEE Transactions on Software Engineering
, 2007
"... Modern programs make extensive use of reusable software libraries. For example, we found that 17 % to 30 % of the classes in a number of large Java applications use the container classes from the java.util package. Given this extensive code reuse in Java programs, it is important for the reusable in ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
Modern programs make extensive use of reusable software libraries. For example, we found that 17 % to 30 % of the classes in a number of large Java applications use the container classes from the java.util package. Given this extensive code reuse in Java programs, it is important for the reusable interfaces to have clear and unambiguous documentation. Unfortunately, most documentation is expressed in English, and therefore does not always satisfy these requirements. Worse yet, there is no way of checking that the documentation is consistent with the associated code. Formal specifications present an alternative which does not suffer from these problems; however, formal specifications are notoriously hard to write. To alleviate this difficulty, we have implemented a tool which automatically derives documentation in the form of formal specifications. Our tool probes Java classes by invoking them on dynamically generated tests and captures the information observed during their execution as algebraic axioms. While the tool is not complete or correct from a formal perspective we demonstrate that it discovers many useful axioms when applied to container classes. These axioms then form an initial formal documentation of the class they describe. 1
Error propagation analysis for file systems
- In Proceedings of the ACM SIGPLAN 2009 Conference on Programming Language Design and Implementation
, 2009
"... Unchecked errors are especially pernicious in operating system file management code. Transient or permanent hardware failures are inevitable, and error-management bugs at the file system layer can cause silent, unrecoverable data corruption. We propose an interprocedural static analysis that tracks ..."
Abstract
-
Cited by 6 (6 self)
- Add to MetaCart
Unchecked errors are especially pernicious in operating system file management code. Transient or permanent hardware failures are inevitable, and error-management bugs at the file system layer can cause silent, unrecoverable data corruption. We propose an interprocedural static analysis that tracks errors as they propagate through file system code. Our implementation detects overwritten, out-ofscope, and unsaved unchecked errors. Analysis of four widely-used Linux file system implementations (CIFS, ext3, IBM JFS and ReiserFS), a relatively new file system implementation (ext4), and shared virtual file system (VFS) code uncovers 312 error propagation bugs. Our flow- and context-sensitive approach produces more precise results than related techniques while providing better diagnostic information, including possible execution paths that demonstrate each bug found.
Automatic Reverse Engineering of Data Structures from Binary Execution
"... With only the binary executable of a program, it is useful to discover the program’s data structures and infer their syntactic and semantic definitions. Such knowledge is highly valuable in a variety of security and forensic applications. Although there exist efforts in program data structure infere ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
With only the binary executable of a program, it is useful to discover the program’s data structures and infer their syntactic and semantic definitions. Such knowledge is highly valuable in a variety of security and forensic applications. Although there exist efforts in program data structure inference, the existing solutions are not suitable for our targeted application scenarios. In this paper, we propose a reverse engineering technique to automatically reveal program data structures from binaries. Our technique, called REWARDS, is based on dynamic analysis. More specifically, each memory location accessed by the program is tagged with a timestamped type attribute. Following the program’s runtime data flow, this attribute is propagated to other memory locations and registers that share the same type. During the propagation, a variable’s type gets resolved if it is involved in a type-revealing execution point or “type sink”. More importantly, besides the forward type propagation, REWARDS involves a backward type resolution procedure where the types of some previously accessed variables get recursively resolved starting from a type sink. This procedure is constrained by the timestamps of relevant memory locations to disambiguate variables reusing the same memory location. In addition, REWARDS is able to reconstruct in-memory data structure layout based on the type information derived. We demonstrate that REWARDS provides unique benefits to two applications: memory image forensics and binary fuzzing for vulnerability discovery. 1
SigGraph: Brute Force Scanning of Kernel Data Structure Instances Using Graph-based Signatures
"... Brute force scanning of kernel memory images for finding kernel data structure instances is an important function in many computer security and forensics applications. Brute force scanning requires effective, robust signatures of kernel data structures. Existing approaches often use the value invari ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Brute force scanning of kernel memory images for finding kernel data structure instances is an important function in many computer security and forensics applications. Brute force scanning requires effective, robust signatures of kernel data structures. Existing approaches often use the value invariants of certain fields as data structure signatures. However, they do not fully exploit the rich pointsto relations between kernel data structures. In this paper, we show that such points-to relations can be leveraged to generate graph-based structural invariant signatures. More specifically, we develop SigGraph, a framework that systematically generates non-isomorphic signatures for data structures in an OS kernel. Each signature is a graph rooted at a subject data structure with its edges reflecting the points-to relations with other data structures. Our experiments with a range of Linux kernels show that SigGraph-based signatures achieve high accuracy in recognizing kernel data structure instances via brute force scanning. We further show that SigGraph achieves better robustness against pointer value anomalies and corruptions, without requiring global memory mapping and object reachability. We demonstrate that SigGraph can be applied to kernel memory forensics, kernel rootkit detection, and kernel version inference. 1
Automatic Dimension Inference and Checking for Object-Oriented Programs
"... This paper introduces UniFi, a tool that attempts to automatically detect dimension errors in Java programs. UniFi infers dimensional relationships across primitive type and string variables in a program, using an inter-procedural, context-sensitive analysis. It then monitors these dimensional relat ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
This paper introduces UniFi, a tool that attempts to automatically detect dimension errors in Java programs. UniFi infers dimensional relationships across primitive type and string variables in a program, using an inter-procedural, context-sensitive analysis. It then monitors these dimensional relationships as the program evolves, flagging inconsistencies that may be errors. UniFi requires no programmer annotations, and supports arbitrary program-specific dimensions, thus providing fine-grained dimensional consistency checking. UniFi exploits features of object-oriented languages, but can be used for other languages as well. We have run UniFi on real-life Java code and found that it is useful in exposing dimension errors. We present a case study of using UniFi on nightly builds of a 19,000 line code base as it evolved over 10 months. 1.
Locating Need-to-Translate Constant Strings in Web Applications
"... Software internationalization aims to make software accessible and usable by users all over the world. For a Java application that does not consider internationalization at the beginning of its development stage, our previous work proposed an approach to locating need-to-translate constant strings i ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
Software internationalization aims to make software accessible and usable by users all over the world. For a Java application that does not consider internationalization at the beginning of its development stage, our previous work proposed an approach to locating need-to-translate constant strings in the Java code. However, when being applied on web applications, it can identify only constant strings that may go to the generated HTML texts, but cannot further distinguish constant strings visible at the browser side (needto-translate) from other constant strings (not need-to-translate). In this paper, to address significant challenges in internationalizing web applications, we propose a novel approach to locating need-totranslate constant strings in web applications. Among those constant strings that may go to the generated HTML texts, our approach further distinguishes strings visible at the browser side from non-visible strings via a novel technique called flag propagation. We evaluated our approach on three real-world open source PHPbased web applications (in total near 17 KLOC): Squirrel Mail, Lime Survey, and Mrbs. The empirical results demonstrate that our approach accurately distinguishes visible strings from non-visible strings among all the constant strings that may go to the generated HTML texts, and is effective for locating need-to-translate constant strings in web applications.
Toward Automatic Data Structure Replacement for Effective Parallelization
"... Data structures define how values being computed are stored and accessed within programs. By recognizing what data structures are being used in an application, tools can make applications more robust by enforcing data structure consistency properties, and developers can better understand and more ea ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Data structures define how values being computed are stored and accessed within programs. By recognizing what data structures are being used in an application, tools can make applications more robust by enforcing data structure consistency properties, and developers can better understand and more easily modify applications to suit the target architecture for a particular application. This paper presents the design and application of DDT, a new program analysis tool that automatically identifies data structures within an application. A binary application is instrumented to dynamically monitor how the data is stored and organized for a set of sample inputs. The instrumentation detects which functions interact with the stored data, and creates a signature for these functions using dynamic invariant detection. The invariants of these functions are then matched against a library of known data structures, providing a probable identification. That is, DDT uses program consistency properties to identify what data structures an application employs. The empirical evaluation shows that this technique is highly accurate across several different implementations of standard data structures. 1.
Directed Random Testing
, 2009
"... Random testing can quickly generate many tests, is easy to implement, scales to large software applications, and reveals software errors. But it tends to generate many tests that are illegal or that exercise the same parts of the code as other tests, thus limiting its effectiveness. Directed random ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Random testing can quickly generate many tests, is easy to implement, scales to large software applications, and reveals software errors. But it tends to generate many tests that are illegal or that exercise the same parts of the code as other tests, thus limiting its effectiveness. Directed random testing is a new approach to test generation that overcomes these limitations, by combining a bottom-up generation of tests with runtime guidance. A directed random test generator takes a collection of operations under test and generates new tests incrementally, by randomly selecting operations to apply and finding arguments from among previously-constructed tests. As soon as it generates a new test, the generator executes it, and the result determines whether the test is redundant, illegal, error-revealing, or useful for generating more tests. The technique outputs failing tests pointing to potential errors that should be corrected, and passing tests that can be used for regression testing. The thesis also contributes auxiliary techniques that post-process the generated tests, including a simplification technique that transforms a failing test into a smaller one that better

