Results 1 - 10
of
78
Modular Data Structure Verification
- EECS DEPARTMENT, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
, 2007
"... This dissertation describes an approach for automatically verifying data structures, focusing on techniques for automatically proving formulas that arise in such verification. I have implemented this approach with my colleagues in a verification system called Jahob. Jahob verifies properties of Java ..."
Abstract
-
Cited by 32 (21 self)
- Add to MetaCart
This dissertation describes an approach for automatically verifying data structures, focusing on techniques for automatically proving formulas that arise in such verification. I have implemented this approach with my colleagues in a verification system called Jahob. Jahob verifies properties of Java programs with dynamically allocated data structures. Developers write Jahob specifications in classical higher-order logic (HOL); Jahob reduces the verification problem to deciding the validity of HOL formulas. I present a new method for proving HOL formulas by combining automated reasoning techniques. My method consists of 1) splitting formulas into individual HOL conjuncts, 2) soundly approximating each HOL conjunct with a formula in a more tractable fragment and 3) proving the resulting approximation using a decision procedure or a theorem prover. I present three concrete logics; for each logic I show how to use it to approximate HOL formulas, and how to decide the validity of formulas in this logic. First, I present an approximation of HOL based on a translation to first-order logic, which enables the use of existing resolution-based theorem provers. Second, I present an approximation of HOL based on field constraint analysis, a new technique that enables
Tupni: Automatic Reverse Engineering of Input Formats
- In Proceedings of the 15th ACM Conference on Computer and Communications Security (CCS
, 2008
"... Recent work has established the importance of automatic reverse engineering of protocol or file format specifications. However, the formats reverse engineered by previous tools have missed important information that is critical for security applications. In this paper, we present Tupni, a tool that ..."
Abstract
-
Cited by 28 (2 self)
- Add to MetaCart
Recent work has established the importance of automatic reverse engineering of protocol or file format specifications. However, the formats reverse engineered by previous tools have missed important information that is critical for security applications. In this paper, we present Tupni, a tool that can reverse engineer an input format with a rich set of information, including record sequences, record types, and input constraints. Tupni can generalize the format specification over multiple inputs. We have implemented a prototype of Tupni and evaluated it on 10 different formats: five file formats (WMF, BMP, JPG, PNG and TIF) and five network protocols (DNS, RPC, TFTP, HTTP and FTP). Tupni identified all record sequences in the test inputs. We also show that, by aggregating over multiple WMF files, Tupni can derive a more complete format specification for WMF. Furthermore, we demonstrate the utility of Tupni by using the rich information it provides for zeroday vulnerability signature generation, which was not possible with previous reverse engineering tools.
Software testing research: Achievements, challenges, dreams
- Proceedings of the Future of Software Engineering at ICSE 2007
, 2007
"... Her research interests are in architecture-based, component-based and service-oriented test methodologies, as well as methods for analysis of non-functional properties. ..."
Abstract
-
Cited by 24 (0 self)
- Add to MetaCart
Her research interests are in architecture-based, component-based and service-oriented test methodologies, as well as methods for analysis of non-functional properties.
DySy: Dynamic symbolic execution for invariant inference
- In Proc. 30th ACM/IEEE International Conference on Software Engineering (ICSE). ACM
, 2007
"... Dynamically discovering likely program invariants from concrete test executions has emerged as a highly promising software engineering technique. Dynamic invariant inference has the advantage of succinctly summarizing both “expected” program inputs and the subset of program behaviors that is normal ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
Dynamically discovering likely program invariants from concrete test executions has emerged as a highly promising software engineering technique. Dynamic invariant inference has the advantage of succinctly summarizing both “expected” program inputs and the subset of program behaviors that is normal under those inputs. In this paper, we introduce a technique that can drastically increase the relevance of inferred invariants, or reduce the size of the test suite required to obtain good invariants. Instead of falsifying invariants produced by pre-set patterns, we determine likely program invariants by combining the concrete execution of actual test cases with a simultaneous symbolic execution of conditions over program variables that the concrete tests satisfy during their execution. In this way, we obtain the benefits of dynamic inference tools like Daikon: the inferred invariants correspond to the observed program behaviors. At the same time, however, our inferred invariants are much more suited to the program at hand than Daikon’s hardcoded invariant patterns. The symbolic invariants are literally derived from the program text itself, with appropriate value substitutions as dictated by symbolic execution. We implemented our technique in the DySy tool, which utilizes a powerful symbolic execution and simplification engine. The results confirm the benefits of our approach. In Daikon’s prime example benchmark, we infer the majority of the interesting Daikon invariants, while eliminating invariants that a human user is likely to consider irrelevant.
Parametric Prediction of Heap Memory Requirements
"... This work presents a technique to compute symbolic polynomial approximations of the amount of dynamic memory required to safely execute a method without running out of memory, for Javalike imperative programs. We consider object allocations and deallocations made by the method and the methods it tra ..."
Abstract
-
Cited by 19 (5 self)
- Add to MetaCart
This work presents a technique to compute symbolic polynomial approximations of the amount of dynamic memory required to safely execute a method without running out of memory, for Javalike imperative programs. We consider object allocations and deallocations made by the method and the methods it transitively calls. More precisely, given an initial configuration of the stack and the heap, the peak memory consumption is the maximum space occupied by newly created objects in all states along a run from it. We over-approximate the peak memory consumption using a scopedmemory management where objects are organized in regions associated with the lifetime of methods. We model the problem of computing the maximum memory occupied by any region configuration as a parametric polynomial optimization problem over a polyhedral domain and resort to Bernstein basis to solve it. We apply the developed tool to several benchmarks.
Automatically Patching Errors in Deployed Software
, 2009
"... We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging information, or other external information, and without human intervention. ClearView (1) observes normal executions to ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging information, or other external information, and without human intervention. ClearView (1) observes normal executions to learn invariants that characterize the application’s normal behavior, (2) uses error detectors to monitor the execution to detect failures, (3) identifies violations of learned invariants that occur during failed executions, (4) generates candidate repair patches that enforce selected invariants by changing the state or the flow of control to make the invariant true, and (5) observes the continued execution of patched applications to select the most successful patch. ClearView is designed to correct errors in software with high availability requirements. Aspects of ClearView that make it particularly
Automatic Inference and Enforcement of Kernel Data Structure Invariants
"... Kernel-level rootkits affect system security by modifying key kernel data structures to achieve a variety of malicious goals. While early rootkits modified control data structures, such as the system call table and values of function pointers, recent work has demonstrated rootkits that maliciously m ..."
Abstract
-
Cited by 17 (5 self)
- Add to MetaCart
Kernel-level rootkits affect system security by modifying key kernel data structures to achieve a variety of malicious goals. While early rootkits modified control data structures, such as the system call table and values of function pointers, recent work has demonstrated rootkits that maliciously modify non-control data. Prior techniques for rootkit detection fail to identify such rootkits either because they focus solely on detecting control data modifications or because they require elaborate, manually-supplied specifications to detect modifications of non-control data. This paper presents a novel rootkit detection technique that automatically detects rootkits that modify both control and non-control data. The key idea is to externally observe the execution of the kernel during a training period and hypothesize invariants on kernel data structures. These invariants are used as specifications of data structure integrity during an enforcement phase; violation of these invariants indicates the presence of a rootkit. We present the design and implementation of Gibraltar, a tool that uses the above approach to infer and enforce invariants. In our experiments, we found that Gibraltar can detect rootkits that modify both control and non-control data structures, and that its false positive rate and monitoring overheads are negligible. 1.
Qin Analysing memory resource bounds for low-level programs
- In ISMM 08
, 2008
"... Embedded systems are becoming more widely used but these systems are often resource constrained. Programming models for these systems should take into formal consideration resources such as stack and heap. In this paper, we show how memory resource bounds can be inferred for assembly-level programs. ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
Embedded systems are becoming more widely used but these systems are often resource constrained. Programming models for these systems should take into formal consideration resources such as stack and heap. In this paper, we show how memory resource bounds can be inferred for assembly-level programs. Our inference process captures the memory needs of each method in terms of the symbolic values of its parameters. For better precision, we infer path-sensitive information through a novel guarded expression format. Our current proposal relies on a Presburger solver to capture memory requirements symbolically, and to perform fixpoint analysis for loops and recursion. Apart from safety in memory adequacy, our proposal can provide estimate on memory costs for embedded devices and improve performance via fewer runtime checks against memory bound. 1.
Robust signatures for kernel data structures
- In Proc. 16th ACM CCS
, 2009
"... Kernel-mode rootkits hide objects such as processes and threads using a technique known as Direct Kernel Object Manipulation (DKOM). Many forensic analysis tools attempt to detect these hidden objects by scanning kernel memory with handmade signatures; however, such signatures are brittle and rely o ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
Kernel-mode rootkits hide objects such as processes and threads using a technique known as Direct Kernel Object Manipulation (DKOM). Many forensic analysis tools attempt to detect these hidden objects by scanning kernel memory with handmade signatures; however, such signatures are brittle and rely on non-essential features of these data structures, making them easy to evade. In this paper, we present an automated mechanism for generating signatures for kernel data structures and show that these signatures are robust: attempts to evade the signature by modifying the structure contents will cause the OS to consider the object invalid. Using dynamic analysis, we profile the target data structure to determine commonly used fields, and we then fuzz those fields to determine which are essential to the correct operation of the OS. These fields form the basis of a signature for the data structure. In our experiments, our new signature matched the accuracy of existing scanners for traditional malware and found processes hidden with our prototype rootkit that all current signatures missed. Our techniques significantly increase the difficulty of hiding objects from signature scanning.
Discovering documentation for java container classes
- IEEE Transactions on Software Engineering
, 2007
"... Modern programs make extensive use of reusable software libraries. For example, we found that 17 % to 30 % of the classes in a number of large Java applications use the container classes from the java.util package. Given this extensive code reuse in Java programs, it is important for the reusable in ..."
Abstract
-
Cited by 10 (0 self)
- Add to MetaCart
Modern programs make extensive use of reusable software libraries. For example, we found that 17 % to 30 % of the classes in a number of large Java applications use the container classes from the java.util package. Given this extensive code reuse in Java programs, it is important for the reusable interfaces to have clear and unambiguous documentation. Unfortunately, most documentation is expressed in English, and therefore does not always satisfy these requirements. Worse yet, there is no way of checking that the documentation is consistent with the associated code. Formal specifications present an alternative which does not suffer from these problems; however, formal specifications are notoriously hard to write. To alleviate this difficulty, we have implemented a tool which automatically derives documentation in the form of formal specifications. Our tool probes Java classes by invoking them on dynamically generated tests and captures the information observed during their execution as algebraic axioms. While the tool is not complete or correct from a formal perspective we demonstrate that it discovers many useful axioms when applied to container classes. These axioms then form an initial formal documentation of the class they describe. 1

