WYSINWYX: What You See Is Not What You eXecute
, 2009
Cited by 33 (7 self)
Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically-allocated memory objects of a stripped executable, and to track the flow of values through them. The paper presents the algorithms that we developed, explains how they are used to recover intermediate representations (IRs) from executables that are similar to the IRs that would be available if one started from source code, and describes their application in the context of program understanding and automated bug hunting. Unlike algorithms for analyzing executables that existed prior to our work, the ones presented in this paper provide useful information about memory accesses, even in the absence of debugging information. The ideas described in the paper are incorporated in a tool for analyzing Intel x86 executables, called CodeSurfer/x86. CodeSurfer/x86 builds a system dependence graph for the program, and provides a GUI for exploring the graph by (i) navigating its edges, and (ii) invoking operations, such as forward slicing, backward slicing, and chopping, to discover how parts of the program can impact other parts. To assess the usefulness of the IRs recovered by CodeSurfer/x86 in the context of automated bug hunting, we built a tool on top of CodeSurfer/x86, called Device-Driver Analyzer for x86
DIVINE: DIscovering Variables IN Executables
- In VMCAI
, 2007
Cited by 18 (7 self)
This paper addresses the problem of recovering variable-like entities when analyzing executables in the absence of debugging information. We show that variable-like entities can be recovered by iterating Value-Set Analysis (VSA), a combined numeric-analysis and pointer-analysis algorithm, and Aggregate Structure Identification, an algorithm to identify the structure of aggregates. Our initial experiments show that the technique is successful in correctly identifying 88% of the local variables and 89% of the fields of heap-allocated objects. Previous techniques recovered 83% of the local variables, but 0% of the fields of heap-allocated objects. Moreover, the values computed by VSA using the variables recovered by our algorithm would allow any subsequent analysis to do a better job of interpreting instructions that use indirect addressing to access arrays and heap-allocated data objects: indirect operands can be resolved better at 4% to 39% of the sites of writes and up to 8% of the sites of reads. (These are the memory-access operations for which it is the most difficult for an analyzer to obtain useful results.)
Intermediate-Representation Recovery from Low-Level Code
- IN PEPM
, 2006
Cited by 14 (8 self)
The goal of our work is to create tools that an analyst can use to understand the workings of COTS components, plugins, mobile code, and DLLs, as well as memory snapshots of worms and virus-infected code. This paper describes how static analysis provides techniques that can be used to recover intermediate representations that are similar to those that can be created for a program written in a high-level language.
Improving Pushdown System Model Checking
- IN CAV
, 2006
Cited by 14 (9 self)
In this paper, we reduce pushdown system (PDS) model checking to a graph-theoretic problem, and apply a fast graph algorithm to improve the running time for model checking. Several other PDS questions and techniques can be carried out in the new setting, including witness tracing and incremental analysis, each of which benefits from the fast graph-based algorithm.
A Next-Generation Platform for Analyzing Executables
- In APLAS
, 2005
Cited by 13 (6 self)
In recent years, there has been a growing need for tools that an analyst can use to understand the workings of COTS components, plugins, mobile code, and DLLs, as well as memory snapshots of worms and virus-infected code. Static analysis provides techniques that can help with such problems; however, there are several obstacles that must be overcome:
– For many kinds of potentially malicious programs, symbol-table and debugging information is entirely absent. Even if it is present, it cannot be relied upon.
– To understand memory-access operations, it is necessary to determine the set of addresses accessed by each operation. This is difficult because:
  – While some memory operations use explicit memory addresses in the instruction (easy), others use indirect addressing via address expressions (difficult).
  – Arithmetic on addresses is pervasive. For instance, even when the value of a local variable is loaded from its slot in an activation record, address arithmetic is performed.
  – There is no notion of type at the hardware level, so address values cannot be distinguished from integer values.
  – Memory accesses do not have to be aligned, so word-sized address values could potentially be cobbled together from misaligned reads and writes.
We have developed static-analysis algorithms to recover information about the contents of memory locations and how they are manipulated by an executable. By combining these analyses with facilities provided by the IDAPro and CodeSurfer toolkits, we have created CodeSurfer/x86, a prototype tool for browsing, inspecting, and analyzing x86 executables. From an x86 executable, CodeSurfer/x86 recovers intermediate representations that are similar to what would be created by a compiler for a program written in a high-level language. CodeSurfer/x86 also supports a scripting language, as well as several kinds of sophisticated pattern-matching capabilities. These facilities provide a platform for the development of additional tools for analyzing the security properties of executables.
Model checking x86 executables with CodeSurfer/x86 and WPDS
- In CAV
, 2005
Cited by 10 (7 self)
This paper presents a toolset for model checking x86 executables. The members of the toolset are CodeSurfer/x86, WPDS++, and the Path Inspector. CodeSurfer/x86 is used to extract a model from an executable in the form of a weighted pushdown system. WPDS++ is a library for answering generalized reachability queries on weighted pushdown systems. The Path Inspector is a software model checker built on top of CodeSurfer and WPDS++ that supports safety queries about the program’s possible control configurations.
Boundedness vs. Unboundedness of Lock Chains: Characterizing Decidability of CFL-Reachability for Threads Communicating via Locks
Cited by 6 (3 self)
The problem of Pairwise CFL-reachability is to decide whether two given program locations in different threads are simultaneously reachable in the presence of recursion in threads and scheduling constraints imposed by synchronization primitives. Pairwise CFL-reachability is the core problem underlying concurrent program analysis, especially dataflow analysis. Unfortunately, it is undecidable even for the most commonly used synchronization primitive, i.e., mutex locks. Lock usage in concurrent programs can be characterized in terms of lock chains, where a sequence of mutex locks is said to be chained if the scopes of adjacent (non-nested) mutexes overlap. Although pairwise reachability is known to be decidable for threads interacting via nested locks, i.e., chains of length one, these techniques don’t extend to programs with non-nested locks used in crucial applications like databases and device drivers. In this paper, we exploit the fact that lock usage patterns in real-life programs do not produce unbounded lock chains. For such programs, we show that pairwise CFL-reachability becomes decidable. Towards that end, we formulate small model properties that bound the lengths of paths that need to be traversed in order to reach a given pair of control states. Our new results narrow the decidability gap for pairwise CFL-reachability by providing a more refined characterization for it in terms of boundedness of lock chains rather than the current state-of-the-art, i.e., nestedness of locks (chains of length one).
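To make the notion of a lock chain concrete, the following is a minimal sketch of one reading of the definition above, applied to a single thread's acquire/release trace. The trace encoding, the function names, and the assumption that each lock is held at most once at a time are illustrative choices, not taken from the paper.

```python
def lock_scopes(trace):
    """Return (start, end, lock) triples, sorted by start position.

    `trace` is a list of ("acq" | "rel", lock_name) pairs (a hypothetical
    encoding). Each lock is assumed to be held at most once at a time.
    """
    start = {}
    scopes = []
    for pos, (op, lock) in enumerate(trace):
        if op == "acq":
            start[lock] = pos
        elif op == "rel":
            scopes.append((start.pop(lock), pos, lock))
        else:
            raise ValueError(f"unknown operation: {op}")
    return sorted(scopes)


def longest_chain(trace):
    """Length of the longest sequence of locks whose adjacent scopes
    overlap without nesting (s1 < s2 < e1 < e2)."""
    scopes = lock_scopes(trace)
    best = [1] * len(scopes)
    for j, (s2, e2, _) in enumerate(scopes):
        for i in range(j):
            s1, e1, _ = scopes[i]
            if s1 < s2 < e1 < e2:  # overlapping, but not nested
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)


# Nested usage: b is released before a, so every chain has length one.
nested = [("acq", "a"), ("acq", "b"), ("rel", "b"), ("rel", "a")]
# Hand-over-hand usage: a overlaps b, and b overlaps c.
chained = [("acq", "a"), ("acq", "b"), ("rel", "a"),
           ("acq", "c"), ("rel", "b"), ("rel", "c")]
```

Under this reading, `longest_chain(nested)` is 1 and `longest_chain(chained)` is 3. Hand-over-hand locking over a list of unbounded length would yield chains of unbounded length.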
Language strength reduction
, 2008
Cited by 6 (4 self)
This paper concerns methods to check for atomic-set serializability violations in concurrent Java programs. The straightforward way to encode a reentrant lock is to model it with a context-free language to track the number of successive lock acquisitions. We present a construction that replaces the context-free language that describes a reentrant lock by a regular language that describes a non-reentrant lock. We call this replacement language strength reduction. Language strength reduction produces an average speedup (geometric mean) of 3.4. Moreover, for 2 programs that previously exhausted available space, the tool is now able to run to completion.
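The difference between the two lock disciplines can be illustrated on a single trace. The sketch below is not the paper's construction; it only shows why a reentrant lock needs a counter (an unbounded amount of state, hence a context-free description) while a non-reentrant lock needs just one bit (a regular description). The event names `"acq"` and `"rel"` and both function names are hypothetical.

```python
def strip_reentrancy(trace):
    """Drop nested re-acquisitions of one reentrant lock and their
    matching releases, keeping only the outermost acquire/release pair.

    A counter tracks the nesting depth; all other events pass through.
    """
    depth = 0
    out = []
    for event in trace:
        if event == "acq":
            if depth == 0:
                out.append(event)  # outermost acquisition
            depth += 1
        elif event == "rel":
            if depth == 0:
                raise ValueError("release without a matching acquire")
            depth -= 1
            if depth == 0:
                out.append(event)  # outermost release
        else:
            out.append(event)
    return out


def is_non_reentrant(trace):
    """Check that acquires and releases strictly alternate.

    A single boolean suffices, which is what makes the language regular.
    """
    held = False
    for event in trace:
        if event == "acq":
            if held:
                return False
            held = True
        elif event == "rel":
            if not held:
                return False
            held = False
    return True


reentrant = ["acq", "acq", "work", "rel", "rel"]
flattened = strip_reentrancy(reentrant)
```

Here `flattened` is `["acq", "work", "rel"]`, which passes `is_non_reentrant`, whereas the original `reentrant` trace does not.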
Improved Memory-Access Analysis for x86 Executables
Cited by 6 (1 self)
Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically allocated memory objects of a stripped executable, and to track the flow of values through them. It is relatively easy to track the effects of an instruction operand that refers to a global address (i.e., an access to a global variable) or that uses a stack-frame offset (i.e., an access to a local scalar variable via the frame pointer or stack pointer). In our work, our algorithms are able to provide useful information for close to 100% of such “direct” uses and defs. It is much harder for a static-analysis algorithm to track the effects of an instruction operand that uses a non-stack-frame register. These “indirect” uses and defs correspond to accesses to an array or a dynamically allocated memory object. In one study, our approach recovered useful information for only 29% of indirect uses and 33% of indirect defs. However, using the technique described in this paper, the algorithm recovered useful information for 81% of indirect uses and 90% of indirect defs.
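The direct/indirect distinction drawn above can be sketched as a small classifier. The `MemOperand` type, its fields, and the treatment of any indexed access as indirect are simplifying assumptions made for illustration; they are not the representation used by the authors' tool.

```python
from dataclasses import dataclass
from typing import Optional

# 32-bit x86 frame pointer and stack pointer.
STACK_FRAME_REGS = {"ebp", "esp"}


@dataclass(frozen=True)
class MemOperand:
    """Hypothetical, simplified memory operand: [base + index + disp]."""
    base: Optional[str]          # base register, or None for an absolute address
    disp: int = 0                # constant displacement or absolute address
    index: Optional[str] = None  # index register, if any


def classify(op: MemOperand) -> str:
    """Label an operand "direct" or "indirect" in the sense of the abstract.

    Direct: a global address, or a constant offset from ebp/esp.
    Indirect: anything addressed through another register.
    """
    if op.index is None and (op.base is None or op.base in STACK_FRAME_REGS):
        return "direct"
    return "indirect"


global_var = MemOperand(base=None, disp=0x403000)       # mov eax, [0x403000]
local_var = MemOperand(base="ebp", disp=-8)             # mov eax, [ebp-8]
heap_field = MemOperand(base="eax", disp=4)             # mov ecx, [eax+4]
local_array = MemOperand(base="ebp", disp=-40, index="ecx")
```

The first two operands classify as direct and the last two as indirect; the indirect ones are the cases for which the abstract reports the improved recovery rates.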
Error propagation analysis for file systems
- In Proceedings of the ACM SIGPLAN 2009 Conference on Programming Language Design and Implementation
, 2009
Cited by 6 (6 self)
Unchecked errors are especially pernicious in operating system file management code. Transient or permanent hardware failures are inevitable, and error-management bugs at the file system layer can cause silent, unrecoverable data corruption. We propose an interprocedural static analysis that tracks errors as they propagate through file system code. Our implementation detects overwritten, out-of-scope, and unsaved unchecked errors. Analysis of four widely-used Linux file system implementations (CIFS, ext3, IBM JFS and ReiserFS), a relatively new file system implementation (ext4), and shared virtual file system (VFS) code uncovers 312 error propagation bugs. Our flow- and context-sensitive approach produces more precise results than related techniques while providing better diagnostic information, including possible execution paths that demonstrate each bug found.
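Two of the bug classes named above, overwritten and out-of-scope unchecked errors, can be illustrated on straight-line code. The sketch below is a toy: it handles a single function with no branches or calls, whereas the analysis described in the abstract is interprocedural and flow- and context-sensitive. The statement encoding and the function name are hypothetical.

```python
def find_unchecked_errors(stmts):
    """Report unchecked errors in a straight-line statement list.

    Each statement is a (kind, var) pair (a hypothetical encoding):
      ("store_error", v)  v receives a possible error code
      ("store_value", v)  v receives a non-error value
      ("check", v)        v is tested, so its error counts as handled
    Returns (bug_class, var, index_of_the_store_that_set_the_error).
    """
    unchecked = {}  # var -> index of the statement that stored the error
    reports = []
    for i, (kind, var) in enumerate(stmts):
        if kind in ("store_error", "store_value"):
            if var in unchecked:
                # The previous error is lost before anyone looked at it.
                reports.append(("overwritten", var, unchecked.pop(var)))
            if kind == "store_error":
                unchecked[var] = i
        elif kind == "check":
            unchecked.pop(var, None)
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    # Errors still unchecked at the end of the function go out of scope.
    for var, i in unchecked.items():
        reports.append(("out-of-scope", var, i))
    return reports


body = [
    ("store_error", "err"),
    ("store_value", "err"),   # overwrites the unchecked error in err
    ("store_error", "rc"),
    ("check", "rc"),          # rc is handled correctly
    ("store_error", "ret"),   # never checked before the function ends
]
reports = find_unchecked_errors(body)
```

For `body`, the sketch reports `("overwritten", "err", 0)` and `("out-of-scope", "ret", 4)`, and nothing for `rc`.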

