• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

T.: Recency-Abstraction for Heap-Allocated Storage (2006)

by G Balakrishnan, Reps
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 18
Next 10 →

WYSINWYX: What You See Is Not What You eXecute

by Gogul Balakrishnan, Thomas Reps , 2009
"... Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically-allocated memory objects of a stripped executable, and to track the flow of values through them. The paper presents the algorithms that we developed, explains how the ..."
Abstract - Cited by 33 (7 self) - Add to MetaCart
Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically-allocated memory objects of a stripped executable, and to track the flow of values through them. The paper presents the algorithms that we developed, explains how they are used to recover intermediate representations (IRs) from executables that are similar to the IRs that would be available if one started from source code, and describes their application in the context of program understanding and automated bug hunting. Unlike algorithms for analyzing executables that existed prior to our work, the ones presented in this paper provide useful information about memory accesses, even in the absence of debugging information. The ideas described in the paper are incorporated in a tool for analyzing Intel x86 executables, called CodeSurfer/x86. CodeSurfer/x86 builds a system dependence graph for the program, and provides a GUI for exploring the graph by (i) navigating its edges, and (ii) invoking operations, such as forward slicing, backward slicing, and chopping, to discover how parts of the program can impact other parts. To assess the usefulness of the IRs recovered by CodeSurfer/x86 in the context of automated bug hunting, we built a tool on top of CodeSurfer/x86, called Device-Driver Analyzer for x86

Beyond reachability: Shape abstraction in the presence of pointer arithmetic

by Cristiano Calcagno, Dino Distefano, Peter W. O’hearn, Hongseok Yang - In SAS’06: Static Analysis Symposium, 2006. M. Colón , 2001
"... Abstract. Previous shape analysis algorithms use a memory model where the heap is composed of discrete nodes that can be accessed only via access paths built from variables and field names, an assumption that is violated by pointer arithmetic. In this paper we show how this assumption can be removed ..."
Abstract - Cited by 28 (4 self) - Add to MetaCart
Abstract. Previous shape analysis algorithms use a memory model where the heap is composed of discrete nodes that can be accessed only via access paths built from variables and field names, an assumption that is violated by pointer arithmetic. In this paper we show how this assumption can be removed, and pointer arithmetic embraced, by using an analysis based on separation logic. We describe an abstract domain whose elements are certain separation logic formulae, and an abstraction mechanism that automatically transits between a low-level RAM view of memory and a higher, fictional, view that abstracts from the representation of nodes and multiword linked-lists as certain configurations of the RAM. A widening operator is used to accelerate the analysis. We report experimental results obtained from running our analysis on a number of classic algorithms for dynamic memory management. 1

DIVINE: DIscovering Variables IN Executables

by Gogul Balakrishnan, Thomas Reps - In VMCAI , 2007
"... Abstract. This paper addresses the problem of recovering variable-like entities when analyzing executables in the absence of debugging information. We show that variable-like entities can be recovered by iterating Value-Set Analysis (VSA), a combined numeric-analysis and pointer-analysis algorithm, ..."
Abstract - Cited by 18 (7 self) - Add to MetaCart
Abstract. This paper addresses the problem of recovering variable-like entities when analyzing executables in the absence of debugging information. We show that variable-like entities can be recovered by iterating Value-Set Analysis (VSA), a combined numeric-analysis and pointer-analysis algorithm, and Aggregate Structure Identification, an algorithm to identify the structure of aggregates. Our initial experiments show that the technique is successful in correctly identifying 88 % of the local variables and 89 % of the fields of heap-allocated objects. Previous techniques recovered 83 % of the local variables, but 0 % of the fields of heap-allocated objects. Moreover, the values computed by VSA using the variables recovered by our algorithm would allow any subsequent analysis to do a better job of interpreting instructions that use indirect addressing to access arrays and heap-allocated data objects: indirect operands can be resolved better at 4 % to 39 % of the sites of writes and up to 8 % of the sites of reads. (These are the memory-access operations for which it is the most difficult for an analyzer to obtain useful results.) 1

Type Analysis for JavaScript

by Simon Holm Jensen, Anders Møller, Peter Thiemann
"... Abstract. JavaScript is the main scripting language for Web browsers, and it is essential to modern Web applications. Programmers have started using it for writing complex applications, but there is still little tool support available during development. We present a static program analysis infrastr ..."
Abstract - Cited by 15 (3 self) - Add to MetaCart
Abstract. JavaScript is the main scripting language for Web browsers, and it is essential to modern Web applications. Programmers have started using it for writing complex applications, but there is still little tool support available during development. We present a static program analysis infrastructure that can infer detailed interpretation. The analysis is designed to support the full language as defined in the ECMAScript standard, including its peculiar object model and all built-in functions. The analysis results can be used to detect common programming errors – or rather, prove their absence, and for producing type information for program comprehension. Preliminary experiments conducted on real-life JavaScript code indicate that the approach is promising regarding analysis precision on small and medium size programs, which constitute the majority of JavaScript applications. With potential for further improvement, we propose the analysis as a foundation for building tools that can aid JavaScript programmers. 1

Intermediate-Representation Recovery from Low-Level Code

by Thomas Reps, Gogul Balakrishnan, Junghee Lim - IN PEPM , 2006
"... The goal of our work is to create tools that an analyst can use to understand the workings of COTS components, plugins, mobile code, and DLLs, as well as memory snapshots of worms and virusinfected code. This paper describes how static analysis provides techniques that can be used to recover interme ..."
Abstract - Cited by 14 (8 self) - Add to MetaCart
The goal of our work is to create tools that an analyst can use to understand the workings of COTS components, plugins, mobile code, and DLLs, as well as memory snapshots of worms and virusinfected code. This paper describes how static analysis provides techniques that can be used to recover intermediate representations that are similar to those that can be created for a program written in a high-level language.

Finding Concurrency-Related Bugs using Random Isolation

by Nicholas Kidd, Thomas Reps, Julian Dolby, Ana Vaziri
"... Abstract. This paper describes the methods used in Empire, a tool to detect concurrency-related bugs, namely atomic-set serializability violations in Java programs. The correctness criterion is based on atomic sets of memory locations, which share a consistency property, and units of work, which pre ..."
Abstract - Cited by 12 (5 self) - Add to MetaCart
Abstract. This paper describes the methods used in Empire, a tool to detect concurrency-related bugs, namely atomic-set serializability violations in Java programs. The correctness criterion is based on atomic sets of memory locations, which share a consistency property, and units of work, which preserve consistency when executed sequentially. Empire checks that, for each atomic set, its units of work are serializable. This notion subsumes data races (single-location atomic sets), and serializability (all locations in one atomic set). To obtain a sound, finite model of locking behavior for use in Empire, we devised a new abstraction principle, random isolation, which allows strong updates to be performed on the abstract counterpart of each randomly-isolated object. This permits Empire to track the status of a Java lock, even for programs that use an unbounded number of locks. The advantage of random isolation is that properties proved about a randomly-isolated object can be generalized to all objects allocated at the same site. We ran Empire on eight programs from the ConTest benchmark suite, for which Empire detected numerous violations. 1

Improved Memory-Access Analysis for x86 Executables

by Thomas Reps, Gogul Balakrishnan
"... Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically allocated memory objects of a stripped executable, and to track the flow of values through them. It is relatively easy to track the effects of an instruction operand ..."
Abstract - Cited by 6 (1 self) - Add to MetaCart
Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically allocated memory objects of a stripped executable, and to track the flow of values through them. It is relatively easy to track the effects of an instruction operand that refers to a global address (i.e., an access to a global variable) or that uses a stack-frame offset (i.e., an access to a local scalar variable via the frame pointer or stack pointer). In our work, our algorithms are able to provide useful information for close to 100% of such “direct ” uses and defs. It is much harder for a static-analysis algorithm to track the effects of an instruction operand that uses a non-stack-frame register. These “indirect” uses and defs correspond to accesses to an array or a dynamically allocated memory object. In one study, our approach recovered useful information for only 29 % of indirect uses and 33 % of indirect defs. However, using the technique described in this paper, the algorithm recovered useful information for 81 % of indirect uses and 90 % of indirect defs.

Interprocedural Analysis with Lazy Propagation

by Simon Holm Jensen, Anders Møller, Peter Thiemann
"... Abstract. We propose lazy propagation as a technique for flow- and context-sensitive interprocedural analysis of programs with objects and first-class functions where transfer functions may not be distributive. The technique is described formally as a systematic modification of a variant of the mono ..."
Abstract - Cited by 4 (1 self) - Add to MetaCart
Abstract. We propose lazy propagation as a technique for flow- and context-sensitive interprocedural analysis of programs with objects and first-class functions where transfer functions may not be distributive. The technique is described formally as a systematic modification of a variant of the monotone framework and its theoretical properties are shown. It is implemented in a type analysis tool for JavaScript where it results in a significant improvement in performance. 1

A Static Birthmark of Binary Executables Based on API Call Structure ⋆

by Seokwoo Choi, Heewan Park, Hyun-il Lim, Taisook Han
"... Abstract. A software birthmark is a unique characteristic of a program that can be used as a software theft detection. In this paper we suggest and empirically evaluate a static birthmark of binary executables based on API call structure. The program properties employed in this birthmark are functio ..."
Abstract - Cited by 1 (1 self) - Add to MetaCart
Abstract. A software birthmark is a unique characteristic of a program that can be used as a software theft detection. In this paper we suggest and empirically evaluate a static birthmark of binary executables based on API call structure. The program properties employed in this birthmark are functions and standard API calls when the functions are executed. The API calls from a function includes the API calls explicitly found from the function and its descendants within limited depth in the call graph. To statically identify functions, call graphs and API calls, we utilizes IDAPro disassembler and its plug-ins. We define the similarity between two functions as the proportion of the number of all API calls to the number of the common API calls. The similarity between two programs is obtained by the maximum weight bipartite matching between two programs using the function similarity matrix. To show the credibility of the proposed techniques, we compare the same applications with different versions and the various types of applications which include text editors, picture viewers, multimedia players, P2P applications and ftp clients. To show the resilience, we compare binary executables compiled from various compilers. The empirical result shows that the similarities obtained using our birthmark sufficiently indicates the functional and structural similarities among programs.

A Dynamic Evaluation of the Precision of Static Heap Abstractions

by Percy Liang, Mayur Naik, Mooly Sagiv
"... The quality of a static analysis of heap-manipulating programs is largely determined by its heap abstraction. Object allocation sites are a commonly-used abstraction, but are too coarse for some clients. The goal of this paper is to investigate how various refinements of allocation sites can improve ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
The quality of a static analysis of heap-manipulating programs is largely determined by its heap abstraction. Object allocation sites are a commonly-used abstraction, but are too coarse for some clients. The goal of this paper is to investigate how various refinements of allocation sites can improve precision. In particular, we consider abstractions that use call stack, object recency, and heap connectivity information. We measure the precision of these abstractions dynamically for four different clients motivated by concurrency and on nine Java programs chosen from the DaCapo benchmark suite. Our dynamic results shed new light on aspects of heap abstractions that matter for precision, which allows us to more effectively navigate the large space of possible heap abstractions.
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University