PLTO: A Link-Time Optimizer for the Intel IA-32 Architecture
- In Proc. 2001 Workshop on Binary Translation (WBT-2001)
, 2001
Cited by 50 (15 self)
tool we have developed for the Intel IA-32 architecture. A number of characteristics of this architecture complicate the task of link-time optimization. These include a large number of op-codes and addressing modes, which increases the complexity of program analysis; variable-length instructions, which complicates disassembly of machine code; a paucity of available registers, which limits the extent of some optimizations; and a reliance on using memory locations for holding values and for parameter passing, which complicates program analysis and optimization. We describe how PLTO addresses these problems and the resulting performance improvements it is able to achieve.
Transparent dynamic optimization: The design and implementation of Dynamo
, 1999
Cited by 49 (4 self)
Keywords: dynamic optimization, compiler, trace selection, binary translation. © Copyright Hewlett-Packard Company 1999. Dynamic optimization refers to the runtime optimization of a native program binary. This report describes the design and implementation of Dynamo, a prototype dynamic optimizer that is capable of optimizing a native program binary at runtime. Dynamo is a realistic implementation, not a simulation, that is written entirely in user-level software, and runs on a PA-RISC machine under the HPUX operating system. Dynamo does not depend on any special programming language,
Disassembly of Executable Code Revisited
- In Proc. IEEE 2002 Working Conference on Reverse Engineering (WCRE)
, 2002
Cited by 44 (7 self)
Machine code disassembly routines form a fundamental component of software systems that statically analyze or modify executable programs. The task of disassembly is complicated by indirect jumps and the presence of nonexecutable data (jump tables, alignment bytes, etc.) in the instruction stream. Existing disassembly algorithms are not always able to cope successfully with executable files containing such features and fail silently, i.e., produce incorrect disassemblies without any indication that the results they are producing are incorrect. This can be a serious problem, since it can compromise the correctness of a binary rewriting tool. In this paper we examine two commonly used disassembly algorithms and illustrate their shortcomings. We propose a hybrid approach that performs better than these algorithms in the sense that it is able to detect situations where the disassembly may be incorrect and limit the extent of such disassembly errors. Experimental results indicate that the algorithm is quite effective: the amount of code flagged as incurring disassembly errors is usually quite small.
alto: A Link-Time Optimizer for the Compaq Alpha
- Software - Practice and Experience
, 1999
Cited by 41 (13 self)
Traditional optimizing compilers are limited in the scope of their optimizations by the fact that only a single function, or possibly a single module, is available for analysis and optimization. In particular, this means that library routines cannot be optimized to specific calling contexts. Other optimization opportunities, exploiting information not available before link time such as addresses of variables and the final code layout, are often ignored because linkers are traditionally unsophisticated. A possible solution is to carry out whole-program optimization at link time. This paper describes alto, a link-time optimizer for the Compaq Alpha architecture. It is able to realize significant performance improvements even for programs compiled with a good optimizing compiler with a high level of optimization. The resulting code is considerably faster than that obtained using the OM link-time optimizer, even when the latter is used in conjunction with profile-guided and inter-fi...
Design and Implementation of a Lightweight Dynamic Optimization System
- Journal of Instruction-Level Parallelism
, 2004
Cited by 36 (7 self)
Many opportunities exist to improve micro-architectural performance due to performance events that are difficult to optimize at static compile time. Cache misses and branch mis-prediction patterns may vary for different micro-architectures using different inputs.
LLVM: An Infrastructure for Multi-Stage Optimization
, 2002
Cited by 31 (6 self)
Modern programming languages and software engineering principles are causing increasing problems for compiler systems. Traditional approaches, which use a simple compile-link-execute model, are unable to provide adequate application performance under the demands of the new conditions. Traditional approaches to interprocedural and profile-driven compilation can provide the application performance needed, but require infeasible amounts of compilation time to build the application. This thesis presents LLVM, a design and implementation of a compiler infrastructure which supports a unique multi-stage optimization system. This system is designed to support extensive interprocedural and profile-driven optimizations, while being efficient enough for use in commercial compiler systems. The LLVM virtual instruction set is the glue that holds the system together. It is a low-level representation, but with high-level type information. This provides the benefits of a low-level representation (compact representation, wide variety of available transformations, etc.) as well as providing high-level information to support aggressive interprocedural optimizations at link- and post-link time. In particular, this system is designed to support optimization in the field, both at run-time and during otherwise unused idle time on the machine. This thesis also describes an implementation of this compiler design, the LLVM compiler infrastructure, proving that the design is feasible. The LLVM compiler infrastructure is a maturing and efficient system, which we show is a good host for a variety of research. More information about LLVM can be found on its web site at: http://llvm.cs.uiuc.edu/
Feedback directed optimization in Compaq's compilation tools for Alpha
- In Proc. 2nd Workshop on Feedback Directed Optimization
, 1999
Cited by 21 (0 self)
This paper describes and evaluates the feedback directed optimizations that are used in the Compaq C compiler tool chain for Alpha. The optimizations include superblock formation, inlining, commando loop optimization, register allocation, code layout, and switch statement optimization. The optimizations either are extensions of classical optimizations or are restructuring transformations that enable classical optimizations. Feedback directed optimization is highly effective, achieving a 17% speedup over aggressive classical optimization. Inlining contributes the most performance and code layout, superblock formation, and loop restructuring are also important.
1 Introduction
When tuning programs, we often notice that the compiler has made poor optimization decisions. Compilers can only use the information they are given, and we usually know much more about a program than what is expressed in the source code. One important piece of information is the execution behavior of a program. How...
The StarJIT Compiler: A Dynamic Compiler for Managed Runtime Environments
, 2003
Cited by 20 (7 self)
Dynamic compilers (or Just-in-Time [JIT] compilers) are a key component of managed runtime environments. This paper describes the design and implementation of the StarJIT compiler, a dynamic compiler for Java Virtual Machines and Common Language Runtime platforms. The goal of the StarJIT compiler is to build an infrastructure to research the influence of managed runtime environments on Intel architectures. The StarJIT compiler can compile both Java and Common Language Infrastructure (CLI) bytecodes, and it uses a single intermediate representation and global optimization framework for both Java and CLI. The StarJIT compiler is designed to generate optimized code for the major Intel architectures and currently targets two Intel architectures: IA-32 and the Itanium Processor Family.
Code Layout Optimizations for Transaction Processing Workloads
- In Proc. 28th Annual Int. Symp. Computer Architecture
, 2001
Cited by 18 (4 self)
Commercial applications such as databases and Web servers constitute the most important market segment for high-performance servers. Among these applications, on-line transaction processing (OLTP) workloads provide a challenging set of requirements for system designs since they often exhibit inefficient executions dominated by a large memory stall component. This behavior arises from large instruction and data footprints and high communication miss rates. A number of recent studies have characterized the behavior of commercial workloads and proposed architectural features to improve their performance. However, there has been little research on the impact of software and compiler-level optimizations for improving the behavior of such workloads. This paper provides a detailed study of profile-driven compiler optimizations to improve the code layout in commercial workloads with
Dynamic Trace Selection Using Performance Monitoring Hardware Sampling
- In Proceedings of the 1st International Symposium on Code Generation and Optimization
, 2003
Cited by 15 (1 self)
Optimizing programs at run-time provides opportunities to apply aggressive optimizations to programs based on information that was not available at compile time. At run time, programs can be adapted to better exploit architectural features, optimize the use of dynamic libraries, and simplify code based on run-time constants. Our profiling system provides a framework for collecting information required for performing run-time optimization. We sample the performance hardware registers available on an Itanium processor, and select a set of code that is likely to lead to important performance-events. We gather distribution information about the performance-events we wish to monitor, and test our traces by estimating the ability for dynamic patching of a program to execute run-time generated traces. Our results show that we are able to capture 58% of execution time across various SPEC2000 integer benchmarks using our profile and patching techniques on a relatively small number of frequently executed execution paths. Our profiling and detection system overhead increases execution time by only 2-4%.

