Abstract:
This paper describes and evaluates the profile-based optimizations in the Compaq C compiler tool chain for Alpha. The optimizations include superblock formation, inlining, commando loop optimization, register allocation, code layout, and switch statement optimization. The optimizations either are extensions of classical optimizations or are restructuring transformations that enable classical optimizations. Profile-based optimization is highly effective, achieving a 17% speedup over aggressive classical optimization on the SPECInt95 benchmarks. Inlining contributes the most performance, and code layout, superblock formation, and loop restructuring are also important.

1. Introduction

When tuning programs, we often notice that the compiler has made poor optimization decisions. Compilers can only use the information they are given, and we usually know much more about a program than is expressed in the source code. One important piece of information is the execution behavior of a pr...

Citations
597 | Trace scheduling: A technique for global microcode compaction – Fisher – 1981
240 | Profile guided code positioning – Pettis, Hansen – 1990
239 | The superblock: an effective technique for VLIW and superscalar compilation. The Journal of Supercomputing (this issue) – Hwu, Mahlke, et al. – 1992
169 | The Multiflow Trace scheduling compiler – Lowney, Freudenberger, et al. – 1993
147 | Achieving High Instruction Cache Performance with an Optimizing Compiler – Hwu, Chang – 1989
137 | Program Optimization for Instruction Caches – McFarling – 1989
110 | Using Profile Information to Assist Classic Code Optimizations – Chang, Mahlke, et al. – 1991
73 | Minimizing register usage penalty at procedure calls – Chow – 1988
39 | Quality and speed in linear-scan register allocation – Traub, Holloway, et al. – 1998
37 | Optimizing Alpha Executables on Windows NT with Spike – Cohn, Goodwin, et al. – 1997
37 | Feedback-Directed Selection and Characterization of Compiler Optimizations – Chow, Wu – 1999
24 | Scalable Cross-Module Optimization – Ayers – 1998
11 | A Transparent Method for Correlating Profiles with Source Programs – Albert – 1999
10 | Delivering Binary Object Modification Tools for Program Analysis – Wilson, Neth, et al. – 1996
9 | Discourse analysis – Harris – 1952
9 | Profile-directed restructuring of operating system code – Schmidt, Roediger, et al. – 1998
9 | Performance Analysis Using Very Large Memory on the 64-bit AlphaServer System – Kawaf, Shakshober, et al. – 1996
2 | An Approach to Global Register Allocation – Johnson – 1975