Krakatoa: Decompilation in Java (Does Bytecode Reveal Source?)
- In Third USENIX Conference on Object-Oriented Technologies and Systems (COOTS)
, 1997
Cited by 24 (0 self)
This paper presents our technique for automatically decompiling Java bytecode into Java source. Our technique reconstructs source-level expressions from bytecode, and reconstructs readable, high-level control statements from primitive goto-like branches. Fewer than a dozen simple code-rewriting rules reconstruct the high-level statements.
1 Introduction
Decompilation transforms a low-level language into a high-level language. The Java Virtual Machine (JVM) specifies a low-level bytecode language for a stack-based machine [LY97]. This language defines 203 operators, with most of the control flow specified by simple explicit transfers and labels. Compiling a Java class yields a class file that contains type information and bytecode. The JVM requires a significant amount of type information from the class files for object linking. Furthermore, the bytecode must be verifiably well-behaved in order to ensure safe execution. Decompilation systems can exploit this type information and well...
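As a rough illustration of what reconstructing source-level expressions from stack-based bytecode can look like, the sketch below symbolically executes a toy instruction list and keeps expression strings on the stack instead of runtime values. The instruction names (iload, iconst, iadd, and so on) are only loosely modeled on JVM mnemonics, the named locals and the function rebuild_expressions are hypothetical, and nothing here is taken from the Krakatoa implementation itself.

```python
def rebuild_expressions(bytecode):
    """Symbolically execute a toy stack bytecode and return source-like statements.

    Assumption: each instruction is a string such as "iload a" or "iadd".
    This is a simplified, hypothetical instruction set, not real JVM bytecode.
    """
    binary_ops = {"iadd": "+", "isub": "-", "imul": "*", "idiv": "/"}
    stack = []        # holds expression strings instead of runtime values
    statements = []   # recovered assignment statements, in program order
    for instr in bytecode:
        op, *args = instr.split()
        if op in ("iload", "iconst"):
            # Push a variable reference or an integer literal.
            stack.append(args[0])
        elif op in binary_ops:
            # Pop two operands and push the combined expression.
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {binary_ops[op]} {right})")
        elif op == "istore":
            # Pop an expression and emit an assignment statement.
            statements.append(f"{args[0]} = {stack.pop()};")
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    return statements


program = ["iload a", "iload b", "iconst 2", "imul", "iadd", "istore c"]
print(rebuild_expressions(program))  # ['c = (a + (b * 2));']
```

The sketch handles straight-line code only; branches, which the abstract says are turned into high-level control statements by rewriting rules, are outside its scope.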
Assembly to High-Level Language Translation
- In Int. Conf. on Softw. Maint
, 1998
Cited by 23 (1 self)
Translation of assembly code to high-level language code is of importance in the maintenance of legacy code, as well as in the areas of program understanding, porting, and recovery of code. We present techniques used in the asm2c translator, a SPARC assembly to C translator. The techniques involve data and control flow analyses. The data flow analysis eliminates machine dependencies from the assembly code and recovers high-level language expressions. The control flow analysis recovers control structure statements. Simple data type recovery is also done. The presented techniques are extensions and improvements on previously developed CISC techniques. The choice of intermediate representation allows for both RISC and CISC assembly code to be supported by the analyses. We tested asm2c against SPEC95 SPARC assembly programs generated by a C compiler. Results using both unoptimized and optimized assembly code are presented.
1 Introduction
Recovery of high-level language cod...
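One common way to remove register references and recover expressions, in the spirit of the data flow analysis the abstract mentions, is forward substitution within a basic block. The sketch below is a minimal illustration under stated assumptions: the tuple format (opcode, destination, sources), the function recover_expressions, and the tiny opcode set are hypothetical and do not follow real SPARC operand order or the asm2c intermediate representation. It also ignores aliasing, so it assumes no variable is overwritten between its load and its last use.

```python
def recover_expressions(block):
    """Forward-substitute register definitions within one basic block.

    Assumption: instructions are tuples of the form (op, dest, *sources),
    a hypothetical format chosen for this sketch.
    """
    reg_expr = {}     # register name -> expression it currently holds
    statements = []   # recovered high-level assignments

    def operand(name):
        # A register is replaced by the expression last assigned to it;
        # anything else (a literal or variable name) is used unchanged.
        return reg_expr.get(name, name)

    for op, dest, *srcs in block:
        if op == "ld":
            # Load a variable into a register.
            reg_expr[dest] = srcs[0]
        elif op in ("add", "sub"):
            sign = "+" if op == "add" else "-"
            reg_expr[dest] = f"({operand(srcs[0])} {sign} {operand(srcs[1])})"
        elif op == "st":
            # Store a register to a variable: emit the recovered expression.
            statements.append(f"{dest} = {operand(srcs[0])};")
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return statements


block = [
    ("ld", "%o0", "x"),
    ("ld", "%o1", "y"),
    ("add", "%o2", "%o0", "%o1"),
    ("sub", "%o0", "%o2", "1"),
    ("st", "z", "%o0"),
]
print(recover_expressions(block))  # ['z = ((x + y) - 1);']
```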
Structuring Decompiled Graphs
- In Proceedings of the International Conference on Compiler Construction
, 1996
Cited by 21 (6 self)
A structuring algorithm for arbitrary control flow graphs is presented. Graphs are structured into functionally, semantically and structurally equivalent graphs, without code replication or introduction of new variables. The algorithm makes use of a set of generic high-level language structures that includes different types of loops and conditionals. Gotos are used only when the graph cannot be structured with the structures in the generic set. This algorithm is adequate for the control flow analysis required when decompiling programs, given that a pure binary program does not contain information on the high-level structures used by the initial high-level language program (i.e. before compilation). The algorithm has been implemented as part of the dcc decompiler, an i80286 decompiler of DOS binary programs, and has proved successful in its aim of structuring decompiled graphs.
1 Introduction
A decompiler is a software tool that reverses the compilation process by translating a pure binar...
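To make the idea of mapping a control flow graph onto loop structures more concrete, the sketch below finds back edges with a depth-first search and then guesses the loop kind from where the two-way branch sits. This is a deliberately simplified heuristic written for illustration, not the algorithm implemented in dcc; the graph encoding (a dict from node to successor list) and the names find_back_edges and classify_loop are assumptions of this sketch.

```python
def find_back_edges(cfg, entry):
    """Return (tail, header) pairs for edges that reach a node on the current DFS path."""
    state = {}        # node -> "active" while on the DFS path, "done" afterwards
    back_edges = []

    def visit(node):
        state[node] = "active"
        for succ in cfg.get(node, []):
            if state.get(succ) == "active":
                back_edges.append((node, succ))   # edge closes a cycle
            elif succ not in state:
                visit(succ)
        state[node] = "done"

    visit(entry)
    return back_edges


def classify_loop(cfg, tail, header):
    """Guess the loop kind from the position of the conditional branch (heuristic)."""
    header_out = len(cfg.get(header, []))
    tail_out = len(cfg.get(tail, []))
    if header_out == 2 and tail_out == 1:
        return "while"     # condition tested at the top of the loop
    if tail_out == 2:
        return "repeat"    # condition tested at the bottom of the loop
    return "endless"       # no conditional exit at either end


while_cfg = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
for tail, header in find_back_edges(while_cfg, "entry"):
    print(tail, header, classify_loop(while_cfg, tail, header))  # body head while
```

A full structuring pass would also need to determine the set of nodes belonging to each loop and the follow node after it; the sketch stops at classification.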
Optimizing Java Bytecodes
, 1997
Cited by 14 (1 self)
This paper concentrates on optimizations which rely on the knowledge of the target architecture, so they cannot be performed by the compiler which generates the class file, since the target machine is not known at that time. At the same time, the optimization techniques we consider cannot be easily performed on bytecodes directly and require the recovery of high-level representation of the code which is being optimized. Briki is a compiler developed to research the issues of potential benefits of high-level optimizations for Java programs. Briki reads in a Java program distributed in the bytecode form, converts it into JavaIR (an intermediate representation used to represent Java programs in our compiler), performs the optimizations, and writes out the optimized code. We are primarily interested in the configuration of Briki which performs JIT compilation, i.e., a compiler which is integrated with the virtual machine and generates machine code for immediate execution. The current implementation of Briki, which was used in the experiments presented in this paper, reads in a class file and writes the optimized code to another file in the form of Java source. We chose off-line compilation and Java source as the output form for the ease of debugging and better understanding of the quality of the recovered code. A JIT implementation of Briki which will integrate our compiler with kaffe [3], a publicly available JIT Java system, is under way. Section 2 presents the organization of Briki including a brief discussion of JavaIR,
Design and Implementation of Pep, a Java Just-In-Time Translator
, 1997
Cited by 11 (1 self)
Java, a new object-oriented member of the C family of languages, has become popular in part because it emphasizes portability. Portability is achieved by compiling programs to machine-independent bytecodes that can be interpreted on a Java virtual machine. Unfortunately, interpreted performance does not match native code performance. A just-in-time compiler can regain performance without sacrificing portability by turning the bytecodes into native code at runtime. This idea has a proven track record: Deutsch and Schiffman presented a dynamic Smalltalk compiler in 1984 [5], and the Self system currently sports a dynamic type-feedback based optimizing compiler [12]. To study the performance potential of Java with this state-of-the-art optimization technology, we built Pep, a just-in-time compiler from Java bytecodes to Self. Following translation by Pep, Java programs can execute on the Self virtual machine and benefit from the optimizations performed by Self's compiler. We describe the design and implementation of Pep, focusing on concepts and trade-offs, but also compare performance with the JDK 1.0.2 and 1.1 interpreters.
Parallelizing Compilers: Implementation and Effectiveness
, 1993
Cited by 8 (0 self)
An important thank you goes to one of my undergraduate professors, Ken Kennedy. He proposed the project that led to this thesis, and my desire to know the answer gave me the strength to complete this work. I would like to thank the languages group at Kubota Pacific Computers, Inc. for showing me that I could indeed be productive and that all problems in compilers did not take years to solve. My sanity is thanks to all of my friends from dancing, "O" runs, and everything else. They made it possible to return to work each day and eventually to graduate. I owe my parents a great debt for encouraging me to stay in graduate school even when I thought I would never finish. Last, but certainly not least, I would like to thank Don Ramsey for reading many drafts and listening to many dry runs. His input greatly helped the presentation of this thesis in both oral and written forms.
A Structuring Algorithm for Decompilation
- In XIX Conferencia Latinoamericana de Informática
, 1993
Cited by 8 (2 self)
This paper presents a structuring algorithm for arbitrary reducible, unstructured graphs. Graphs are structured into semantically equivalent graphs, without the need of code replication or introduction of new variables. The algorithm makes use of structures such as if..then..elses, while, repeat and loop loops, and case statements. Gotos are only used when the graph cannot be structured with any of the above constructs. This algorithm is adequate for the analysis needed in the decompilation of programs, given that a binary program does not contain information as to the language and compiler used to compile the original source program. And given that unstructuredness is introduced by the use of gotos (still widely available in today's compilers) and optimizations produced by the compiler, we have to assume an unstructured graph for our decompilation analysis. This algorithm has been implemented as part of the dcc decompiler, currently under development at the Queensland University of ...
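Since the abstract restricts the algorithm to reducible graphs, it may help to see one standard way of testing that property: repeatedly apply the classical T1 (remove a self-loop) and T2 (merge a node into its unique predecessor) transformations and check whether the graph collapses to a single node. The sketch below is a textbook-style check, not something taken from the paper; the function name is_reducible and the dict-based graph encoding are assumptions, and every node is assumed to appear as a key and to be reachable from the entry.

```python
def is_reducible(cfg, entry):
    """Apply T1/T2 until neither applies; the graph is reducible iff one node remains.

    Assumption: cfg maps every node to a list of successors, and every node
    is reachable from entry.
    """
    succs = {node: set(targets) for node, targets in cfg.items()}
    changed = True
    while changed:
        changed = False
        # T1: drop self-loops (merging can create new ones, so repeat each pass).
        for node in succs:
            succs[node].discard(node)
        # T2: merge one node that has exactly one predecessor into that predecessor.
        for node in list(succs):
            if node == entry:
                continue
            preds = [p for p in succs if node in succs[p]]
            if len(preds) == 1:
                parent = preds[0]
                succs[parent].discard(node)
                succs[parent] |= succs.pop(node)
                changed = True
                break
    return len(succs) == 1


loop_cfg = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
two_entry_cycle = {"entry": ["a", "b"], "a": ["b"], "b": ["a"]}
print(is_reducible(loop_cfg, "entry"))         # True
print(is_reducible(two_entry_cycle, "entry"))  # False
```

The second graph is the usual counterexample: a cycle that can be entered at two different nodes, so no single loop header exists.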
The Use of Control-Flow and Control Dependence in Software Tools
, 1993
Cited by 4 (0 self)
Program development, debugging, and maintenance can be greatly improved by the use of software tools that provide information about program behavior. This thesis focuses on a number of useful software tools and shows how their efficiency, generality, and precision can be increased through the use of control-flow and control dependence analysis. We consider two classes of tools: execution measurement tools, which collect information about a particular program execution; and program analysis tools, which provide information about potential program behavior by statically analyzing the program. We consider three tools that measure aspects of a program's execution: profiling, tracing, and event counting tools. We describe algorithms for profiling and tracing programs that use a combination of control-flow analysis and program instrumentation to produce exact profiles and traces with low run-time overhead. Rather than record information at every point in a program, the algorithms record info...
Is the Quality of Numerical Subroutine Code Improving?
, 1997
Cited by 4 (1 self)
We begin by using a software metric tool to generate a number of software complexity measures and we investigate how these values may be used to determine subroutines which are likely to be of substandard quality. Following this we look at how these metric values have changed over the years. First we consider a number of freely available Fortran libraries (Eispack, Linpack and Lapack) which have been constructed by teams. In order to ensure a fair comparison we use a restructuring tool to transform original Fortran 66 code into Fortran 77. We then consider the Fortran codes from the Collected Algorithms from the ACM (CALGO) to see whether we can detect the same trends in software written by the general numerical community. Our measurements show that although the standard of code in the freely available libraries does appear to have improved over time these libraries still contain routines which are effectively unmaintainable and untestable. Applied to the CALGO codes the metrics indica...

