Results 1 - 5 of 5
Assembly to High-Level Language Translation
- In Int. Conf. on Softw. Maint, 1998
Cited by 23 (1 self)
Translation of assembly code to high-level language code is of importance in the maintenance of legacy code, as well as in the areas of program understanding, porting, and recovery of code. We present techniques used in the asm2c translator, a SPARC assembly to C translator. The techniques involve data and control flow analyses. The data flow analysis eliminates machine dependencies from the assembly code and recovers high-level language expressions. The control flow analysis recovers control structure statements. Simple data type recovery is also done. The presented techniques are extensions and improvements on previously developed CISC techniques. The choice of intermediate representation allows for both RISC and CISC assembly code to be supported by the analyses. We tested asm2c against SPEC95 SPARC assembly programs generated by a C compiler. Results using both unoptimized and optimized assembly code are presented.
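To make the idea of recovering high-level expressions from register code concrete, here is a minimal Python sketch of forward substitution over straight-line code. It is an illustration only, not asm2c's actual analysis: the tuple format `(op, src1, src2, dst)`, the `OPS` table, and the symbolic operand names `a`, `b`, `c` are assumptions made for this example, and a real translator would also need def-use information and control flow.

```python
# Hypothetical toy format: (op, src1, src2, dst), following SPARC operand order.
# OPS maps a few SPARC mnemonics to C operators; the table is illustrative only.
OPS = {"add": "+", "sub": "-", "smul": "*"}

def recover_expressions(instrs):
    """Forward-substitute register definitions through straight-line code.

    Returns a dict mapping each defined register to a C-like expression
    string built from the operands that were live on entry.
    """
    env = {}
    for op, src1, src2, dst in instrs:
        # Look up the sources before overwriting dst, so that an instruction
        # such as "add %o0, 1, %o0" reads the old value of %o0.
        a = env.get(src1, src1)
        b = env.get(src2, src2)
        env[dst] = f"({a} {OPS[op]} {b})"
    return env

code = [
    ("add", "a", "b", "%o0"),     # %o0 = a + b
    ("smul", "%o0", "c", "%o1"),  # %o1 = %o0 * c
]
print(recover_expressions(code)["%o1"])  # prints ((a + b) * c)
```

The sketch keeps every substituted expression fully parenthesized, which avoids precedence mistakes at the cost of noisier output.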
Building a Control-Flow Graph from Scheduled Assembly Code
, 2002
Cited by 8 (2 self)
A variety of applications have arisen where it is worthwhile to apply code optimizations directly to the machine code (or assembly code) produced by a compiler. These include link-time whole-program analysis and optimization, code compression, binary-to-binary translation, and bit-transition reduction (for power). Many, if not most, optimizations assume the presence of a control-flow graph (CFG). Compiled, scheduled code has properties that can make CFG construction more complex than it is inside a typical compiler. In this paper, we examine the problems of scheduled code on architectures that have multiple delay slots. In particular, if branch delay slots contain other branches, the classic algorithms for building a CFG produce incorrect results. We explain the problem using two simple examples. We then present an algorithm for building correct CFGs from scheduled assembly code that includes branches in branch-delay slots. The algorithm works by building an approximate CFG and then refining it to reflect the actions of delayed branches. If all branches have explicit targets, the complexity of the refining step is linear with respect to the number of branches in the code. Analysis of the kind presented in this paper is a necessary first step for any system that analyzes or translates compiled, assembly-level code. We have implemented this algorithm in our power-consumption experiments based on the
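For orientation, the classic leader-based CFG construction that the abstract refers to can be sketched as follows. This is a minimal Python sketch under stated assumptions: it uses a hypothetical toy instruction format (`(op, target)` tuples with instruction indices as targets) and assumes there are no delay slots at all. It is the textbook baseline, not the paper's refinement algorithm for branches in delay slots.

```python
def build_cfg(instrs):
    """Classic leader-based CFG construction (assumes no delay slots).

    instrs: list of (op, target) tuples in a hypothetical toy format.
      op is 'br' (conditional branch), 'jmp' (unconditional jump), 'ret',
      or any other string for straight-line code; target is an instruction
      index for 'br'/'jmp' and None otherwise.
    Returns (blocks, edges): blocks is a list of (start, end) index ranges
    with end exclusive; edges is a set of (from_block, to_block) pairs.
    """
    n = len(instrs)
    if n == 0:
        return [], set()

    # Leaders: the first instruction, every branch target, and every
    # instruction that follows a control transfer.
    leaders = {0}
    for i, (op, target) in enumerate(instrs):
        if op in ("br", "jmp"):
            leaders.add(target)
        if op in ("br", "jmp", "ret") and i + 1 < n:
            leaders.add(i + 1)

    starts = sorted(leaders)
    blocks = list(zip(starts, starts[1:] + [n]))
    block_of = {start: b for b, (start, _) in enumerate(blocks)}

    edges = set()
    for b, (start, end) in enumerate(blocks):
        op, target = instrs[end - 1]  # last instruction decides the successors
        if op in ("br", "jmp"):
            edges.add((b, block_of[target]))
        if op not in ("jmp", "ret") and end < n:
            edges.add((b, block_of[end]))  # fall-through edge
    return blocks, edges

program = [
    ("add", None),  # 0
    ("br", 3),      # 1: conditional branch to instruction 3
    ("sub", None),  # 2
    ("mul", None),  # 3
    ("ret", None),  # 4
]
blocks, edges = build_cfg(program)
print(blocks)         # prints [(0, 2), (2, 3), (3, 5)]
print(sorted(edges))  # prints [(0, 1), (0, 2), (1, 2)]
```

The sketch assumes the branch takes effect immediately after the branch instruction. On a machine with delay slots that assumption no longer holds, which is the situation the abstract addresses.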
Automatically Generating the Back End of a Compiler Using Declarative Machine Descriptions
, 2008
Cited by 3 (2 self)
Although I have proven that the general problem is undecidable, I show how, for machines of practical interest, to generate the back end of a compiler. Unlike previous work on generating back ends, I generate the machine-dependent components of the back end using only information that is independent of the compiler’s internal data structures and intermediate form. My techniques substantially reduce the burden of retargeting the compiler: although it is still necessary to master the target machine’s instruction set, it is not necessary to master the data structures and algorithms in the compiler’s back end. Instead, the machine-dependent knowledge is isolated in the declarative machine descriptions. The largest machine-dependent component in a back end is the instruction selector. Previous work has shown that it is difficult to generate a high-quality instruction selector. But by adopting the compiler architecture developed by Davidson and Fraser (1984), I can generate a naïve instruction
a Retargetable Static Binary Translation Framework
, 2002
Binary translation, the process of translating binary executables, makes it possible to run code compiled for source (input) machine M_s on target (output) machine M_t. Unlike an interpreter or emulator, a binary translator makes it possible to approach the speed of native code on machine M_t. Translated code may still run slower than native code because low-level properties of machine M_s must often be modeled on machine M_t. The University of Queensland Binary Translation (UQBT) framework is a retargetable framework for experimenting with static binary translation on CISC and RISC machines.
Approaches for Universal Static Binary Translation
, 2006
Binary translation is the process of converting machine code created for one architecture to semantically equivalent machine code for another architecture. Static binary translation in particular is one of many ways to achieve Architecture-Independent Computing (AIC). AIC aims to provide the ability to execute code on any machine regardless of the original target. Unlike other solutions for AIC, such as portable virtual machines, emulators, or interpreters, static binary translation presents the possibility to create a framework that can translate code between any two machine code languages and provide results at near native speeds. In this thesis we present the Binary Translation System (BTS), a new static binary translator that can retarget machine code for arbitrary computing architectures. This thesis presents the extensible BTS framework that can be used to create translators for any common instruction set architecture. We also present our architecture-independent code representation and manipulations that we used to face some of the common problems with binary translation. Our prototype BTS implements a RISC-like, architecture-independent code representation, a PowerPC decoder, and a SPARC encoder to demonstrate the feasibility and problems of static binary translation. The PowerPC decoder, in particular, exemplifies many decompilation problems that must be dealt with when converting machine code to an architecture-independent code representation. Our test results show that our prototype can achieve comparable performance and compatibility to other AIC solutions. Our methods and approaches presented in this thesis may be of interest to binary translation writers, reverse engineers, decompiler/compiler designers, and engineers wanting to do binary program manipulation or instrumentation.
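The decoder, architecture-independent representation, and encoder pipeline described in this abstract can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the `IRInstr` class, the `REG_MAP` register mapping, and the restriction to a textual three-register `add` are hypothetical and are not taken from BTS, which works on machine code rather than assembly text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IRInstr:
    """Hypothetical RISC-like, architecture-independent instruction."""
    op: str
    dst: str
    src1: str
    src2: str

# Hypothetical register mapping from PowerPC names to SPARC names.
REG_MAP = {"r3": "%l0", "r4": "%l1", "r5": "%l2"}

def decode_ppc(line):
    """Decode textual PowerPC 'add rD, rA, rB' (rD = rA + rB) into the IR."""
    mnemonic, operands = line.split(None, 1)
    if mnemonic != "add":
        raise ValueError(f"unsupported mnemonic in this sketch: {mnemonic}")
    dst, src1, src2 = [reg.strip() for reg in operands.split(",")]
    return IRInstr("add", dst, src1, src2)

def encode_sparc(ir):
    """Encode the IR as textual SPARC, which puts the destination last."""
    return f"{ir.op} {REG_MAP[ir.src1]}, {REG_MAP[ir.src2]}, {REG_MAP[ir.dst]}"

def translate(line):
    """Decode one source instruction to the IR, then encode it for the target."""
    return encode_sparc(decode_ppc(line))

print(translate("add r3, r4, r5"))  # prints add %l1, %l2, %l0
```

Keeping the intermediate form free of source and target details is what lets a new decoder or encoder be added without touching the other side, which is the retargetability argument the abstract makes.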

