Results 1 - 10 of 56
EXE: Automatically generating inputs of death
- In Proceedings of the 13th ACM Conference on Computer and Communications Security (CCS), 2006
- Cited by 154 (11 self)
This article presents EXE, an effective bug-finding tool that automatically generates inputs that crash real code. Instead of running code on manually or randomly constructed input, EXE runs it on symbolic input initially allowed to be anything. As checked code runs, EXE tracks the constraints on each symbolic (i.e., input-derived) memory location. If a statement uses a symbolic value, EXE does not run it, but instead adds it as an input-constraint; all other statements run as usual. If code conditionally checks a symbolic expression, EXE forks execution, constraining the expression to be true on the true branch and false on the other. Because EXE reasons about all possible values on a path, it has much more power than a traditional runtime tool: (1) it can force execution down any feasible program path and (2) at dangerous operations (e.g., a pointer dereference), it detects if the current path constraints allow any value that causes a bug. When a path terminates or hits a bug, EXE automatically generates a test case by solving the current path constraints to find concrete values using its own co-designed constraint solver, STP. Because EXE’s constraints have no approximations, feeding this concrete input to an uninstrumented version of the checked code will cause it to follow the same path and hit the same bug (assuming deterministic code).
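The forking behaviour described in this abstract can be illustrated with a minimal Python sketch. It is not EXE's implementation: the toy program is hand-encoded as a decision tree (`PROGRAM`), the input domain (`DOMAIN`) is assumed to be small, and `solve` is a brute-force stand-in for a real constraint solver such as STP. All names are hypothetical.

```python
from itertools import product

# Toy program under test, encoded as a decision tree so the explorer can fork.
# A node is ("branch", predicate, true_subtree, false_subtree),
# ("bug", message) or ("ok",). Predicates take a dict of input values.
PROGRAM = (
    "branch", lambda v: v["x"] > 10,
    ("branch", lambda v: v["x"] + v["y"] == 15,
        ("bug", "out-of-bounds access"),
        ("ok",)),
    ("ok",),
)

# Assumed small input domain, so brute force can stand in for a solver.
DOMAIN = range(0, 16)


def solve(constraints):
    """Return one concrete input satisfying every constraint, or None."""
    for x, y in product(DOMAIN, DOMAIN):
        values = {"x": x, "y": y}
        if all(pred(values) == wanted for pred, wanted in constraints):
            return values
    return None


def explore(node, constraints=()):
    """Fork at every branch; yield (outcome, test_input) for each feasible path."""
    if node[0] == "branch":
        _, pred, on_true, on_false = node
        # Constrain the predicate to be true on one side and false on the other.
        yield from explore(on_true, constraints + ((pred, True),))
        yield from explore(on_false, constraints + ((pred, False),))
    else:
        test_input = solve(constraints)
        if test_input is not None:  # infeasible paths produce no test case
            yield node, test_input


def run_concrete(node, values):
    """Run the toy program on concrete input and return the leaf it reaches."""
    while node[0] == "branch":
        _, pred, on_true, on_false = node
        node = on_true if pred(values) else on_false
    return node


results = list(explore(PROGRAM))
for outcome, test_input in results:
    print(outcome, test_input)
```

Each entry in `results` pairs a path outcome with a concrete input that reaches it. Replaying that input with `run_concrete` follows the same path, which mirrors the replay property the abstract claims for deterministic code.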
Test Input Generation with Java PathFinder
- Cited by 111 (6 self)
We show how model checking and symbolic execution can be used to generate test inputs to achieve structural coverage of code that manipulates complex data structures. We focus on obtaining branch-coverage during unit testing of some of the core methods of the red-black tree implementation in the Java TreeMap library, using the Java PathFinder model checker. Three different test generation techniques will be introduced and compared, namely, straight model checking of the code, model checking used in a black-box fashion to generate all inputs up to a fixed size, and lastly, model checking used during white-box test input generation. The main contribution of this work is to show how efficient white-box test input generation can be done for code manipulating complex data, taking into account complex method preconditions.
Towards automatic generation of vulnerability-based signatures
- In Proceedings of the 2006 IEEE Symposium on Security and Privacy, 2006
- Cited by 102 (23 self)
In this paper we explore the problem of creating vulnerability signatures. A vulnerability signature matches all exploits of a given vulnerability, even polymorphic or metamorphic variants. Our work departs from previous approaches by focusing on the semantics of the program and vulnerability exercised by a sample exploit instead of the semantics or syntax of the exploit itself. We show the semantics of a vulnerability define a language which contains all and only those inputs that exploit the vulnerability. A vulnerability signature is a representation (e.g., a regular expression) of the vulnerability language. Unlike exploit-based signatures whose error rate can only be empirically measured for known test cases, the quality of a vulnerability signature can be formally quantified for all possible inputs. We provide a formal definition of a vulnerability signature and investigate the computational complexity of creating and matching vulnerability signatures. We also systematically explore the design space of vulnerability signatures. We identify three central issues in vulnerability-signature creation: how a vulnerability signature represents the set of inputs that may exercise a vulnerability, the vulnerability coverage (i.e., number of vulnerable program paths) that is subject to our analysis during signature creation, and how a vulnerability signature is then created for a given representation and coverage. We propose new data-flow analysis and novel adoption of existing techniques such as constraint solving for automatically generating vulnerability signatures. We have built a prototype system to test our techniques. Our experiments show that we can automatically generate a vulnerability signature using a single exploit which is of much higher quality than previous exploit-based signatures. In addition, our techniques have several other security applications, and thus may be of independent interest.
Execution generated test cases: How to make systems code crash itself
- 2005
- Cited by 70 (7 self)
This paper presents a technique that uses code to automatically generate its own test cases at run-time by using a combination of symbolic and concrete (i.e., regular) execution. The input values to a program (or software component) provide the standard interface of any testing framework with the program it is testing, and generating input values that will explore all the “interesting” behavior in the tested program remains an important open problem in software testing research. Our approach works by turning the problem on its head: we lazily generate, from within the program itself, the input values to the program (and values derived from input values) as needed. We applied the technique to real code and found numerous corner-case errors ranging from simple memory overflows and infinite loops to subtle issues in the interpretation of language standards.
Automated Test Data Generation Using An Iterative Relaxation Method
- In SIGSOFT ’98/FSE-6: Proceedings of the 6th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 1998
- Cited by 66 (6 self)
An important problem that arises in path oriented testing is the generation of test data that causes a program to follow a given path. In this paper, we present a novel program execution based approach using an iterative relaxation method to address the above problem. In this method, test data generation is initiated with an arbitrarily chosen input from a given domain. This input is then iteratively refined to obtain an input on which all the branch predicates on the given path evaluate to the desired outcome. In each iteration the program statements relevant to the evaluation of each branch predicate on the path are executed, and a set of linear constraints is derived. The constraints are then solved to obtain the increments for the input. These increments are added to the current input to obtain the input for the next iteration. The relaxation technique used in deriving the constraints provides feedback on the amount by which each input variable should be adjusted for the branches o...
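The refine-and-repeat loop in this abstract can be sketched in Python under strong simplifying assumptions. This is not the paper's algorithm: it handles exactly two inputs and two branch predicates, estimates the linear constraints by finite differences, and solves a square 2x2 system with Cramer's rule instead of a general set of linear constraints. Each desired branch outcome is assumed to be encoded as a residual function `f` with the path followed when `f(x) < 0`. The names `slopes`, `relax`, and `margin` are hypothetical.

```python
def slopes(f, x, h=1e-6):
    """Finite-difference estimate of the partial derivatives of f at x."""
    base = f(x)
    grads = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += h
        grads.append((f(bumped) - base) / h)
    return grads


def relax(residuals, x, margin=1.0, max_iters=50):
    """Refine x until both residuals are negative; return None on failure.

    residuals: two functions of the input; the desired path is followed
    when each returns a negative value.
    """
    x = list(x)
    for _ in range(max_iters):
        values = [f(x) for f in residuals]
        if all(v < 0 for v in values):
            return x  # every branch predicate has the desired outcome
        # Linearise both residuals around the current input.
        (a, b), (c, d) = (slopes(f, x) for f in residuals)
        det = a * d - b * c
        if abs(det) < 1e-12:
            raise ValueError("linearised constraints are singular at this input")
        # Solve J * dx = target - current, aiming each residual at -margin.
        r0, r1 = (-margin - v for v in values)
        dx = ((r0 * d - b * r1) / det, (a * r1 - r0 * c) / det)
        # Add the increments to obtain the input for the next iteration.
        x = [xi + di for xi, di in zip(x, dx)]
    return None


# Desired path: (a + b > 10) is true and (a * b <= 24) is true.
found = relax(
    [lambda v: 10 - v[0] - v[1], lambda v: v[0] * v[1] - 24],
    [0.0, 5.0],  # arbitrarily chosen starting input
)
print(found)
```

The `margin` parameter aims each residual slightly inside the desired region, so that small linearisation errors do not leave the input sitting exactly on a branch boundary.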
Generating Tests from Counterexamples
- In Proc. of the 26th ICSE, 2004
- Cited by 66 (6 self)
We have extended the software model checker BLAST to automatically generate test suites that guarantee full coverage with respect to a given predicate. More precisely, given a C program and a target predicate p, BLAST determines the set L of program locations which program execution can reach with p true, and automatically generates a set of test vectors that exhibit the truth of p at all locations in L. We have used BLAST to generate test suites and to detect dead code in C programs with up to 30 K lines of code. The analysis and test-vector generation is fully automatic (no user intervention) and exact (no false positives).
Exploring multiple execution paths for malware analysis
- In IEEE Symposium on Security and Privacy (SP ’07), 2007
- Cited by 60 (11 self)
Malicious code (or malware) is defined as software that fulfills the deliberately harmful intent of an attacker. Malware analysis is the process of determining the behavior and purpose of a given malware sample (such as a virus, worm, or Trojan horse). This process is a necessary step to be able to develop effective detection techniques and removal tools. Currently, malware analysis is mostly a manual process that is tedious and time-intensive. To mitigate this problem, a number of analysis tools have been proposed that automatically extract the behavior of an unknown program by executing it in a restricted environment and recording the operating system calls that are invoked. The problem of dynamic analysis tools is that only a single program execution is observed. Unfortunately, however, it is possible that certain malicious actions are only triggered under specific circumstances (e.g., on a particular day, when a certain file is present, or when a certain command is received). In this paper, we propose a system that allows us to explore multiple execution paths and identify malicious actions that are executed only when certain conditions are met. This enables us to automatically extract a more complete view of the program under analysis and identify under which circumstances suspicious actions are carried out. Our experimental results demonstrate that many malware samples show different behavior depending on input read from the environment. Thus, by exploring multiple execution paths, we can obtain a more complete picture of their actions.
Generating Test Data For Branch Coverage
- In Proc. of the International Conference on Automated Software Engineering, 2000
- Cited by 48 (1 self)
Branch coverage is an important criterion used during the structural testing of programs. In this paper, we present a new program execution based approach to generate input data that exercises a selected branch in a program. The test data generation is initiated with an arbitrarily chosen input from the input domain of the program. A new input is derived from the initial input in an attempt to force execution through any of the paths through the selected branch. The method dynamically switches among the paths that reach the branch by refining the input. Using a numerical iterative technique that attempts to generate an input to exercise the branch, it dynamically selects a path that offers less resistance. We have implemented the technique and present experimental results of its performance for some programs. Our results show that our method is feasible and practical. Keywords: path testing, branch testing, iterative relaxation technique, testing tools.
A theory of predicate-complete test coverage and generation
- In FMCO’2004: Symp. on Formal Methods for Components and Objects. SpringerPress, 2004
- Cited by 40 (4 self)
Consider a program with m statements and n predicates, where the predicates are derived from the conditional statements and assertions in a program, as well as from implicit run-time safety checks. An observable state is an evaluation of the n predicates under some state at a program statement. The goal of predicate-complete testing (PCT) is to cover every reachable observable state (at most m × 2^n of them) in a program. PCT coverage is a new form of coverage motivated by the observation that certain errors in a program only can be exposed by considering the complex dependences between the predicates in a program and the statements whose execution they control. PCT coverage subsumes many existing control-flow coverage criteria and is incomparable to path coverage. To support the generation of tests to achieve high PCT coverage, we show how to define an upper bound U and lower bound L to the (unknown) set of reachable observable states R. These bounds are constructed automatically using Boolean (predicate) abstraction over modal transition systems and can be used to guide test generation via symbolic execution. We define a static coverage metric as |L|/|U|, which measures the ability of the Boolean abstraction to achieve high PCT coverage. Finally we show how to increase this ratio by the addition of new predicates.
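The notion of an observable state and the m × 2^n bound can be made concrete with a small Python sketch. It does not compute the bounds L and U from Boolean abstraction that the abstract describes. It only enumerates the observable states that a hypothetical three-statement program reaches over an assumed small input range, and compares the count with the bound. The program, predicates, and labels are invented for illustration.

```python
from itertools import product

# Two predicates (n = 2) over the program state; hypothetical example.
PREDICATES = (
    lambda s: s["x"] > 0,
    lambda s: s["y"] > s["x"],
)
STATEMENTS = ("s0", "s1", "s2")  # m = 3 labelled program points


def observe(label, state, seen):
    """Record an observable state: (statement, truth value of every predicate)."""
    seen.add((label, tuple(p(state) for p in PREDICATES)))


def program(x, y, seen):
    state = {"x": x, "y": y}
    observe("s0", state, seen)        # at the conditional
    if state["x"] > 0:
        observe("s1", state, seen)    # at the assignment inside the branch
        state["y"] = state["x"] + 1
    observe("s2", state, seen)        # at the final statement


seen = set()
for x, y in product(range(-2, 3), repeat=2):  # assumed small input range
    program(x, y, seen)

bound = len(STATEMENTS) * 2 ** len(PREDICATES)  # m * 2^n
covered = len(seen)
print(f"covered {covered} of at most {bound} observable states")
```

In this toy program three of the twelve candidate states are unreachable: `s1` is only executed when `x > 0` holds, and after the assignment `y > x` always holds at `s2` on that path. This is the kind of dependence between predicates and statements that the abstract refers to.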
Automatically generating malicious disks using symbolic execution
- In Proceedings of the 2006 IEEE Symposium on Security and Privacy, 2006
- Cited by 37 (3 self)
Many current systems allow data produced by potentially malicious sources to be mounted as a file system. File system code must check this data for dangerous values or invariant violations before using it. Because file system code typically runs inside the operating system kernel, even a single unchecked value can crash the machine or lead to an exploit. Unfortunately, validating file system images is complex: they form DAGs with complex dependency relationships across massive amounts of data bound together with intricate, undocumented assumptions. This paper shows how to automatically find bugs in such code using symbolic execution. Rather than running the code on manually-constructed concrete input, we instead run it on symbolic input that is initially allowed to be “anything.” As the code runs, it observes (tests) this input and thus constrains its possible values. We generate test cases by solving these constraints for concrete values. The approach works well in practice: we checked the disk mounting code of three widely-used Linux file systems: ext2, ext3, and JFS and found bugs in all of them where malicious data could either cause a kernel panic or form the basis of a buffer overflow attack.

