Results 11 - 20 of 38
alto: A Link-Time Optimizer for the Compaq Alpha
Software - Practice and Experience, 1999
Cited by 41 (13 self)
Traditional optimizing compilers are limited in the scope of their optimizations by the fact that only a single function, or possibly a single module, is available for analysis and optimization. In particular, this means that library routines cannot be optimized to specific calling contexts. Other optimization opportunities, exploiting information not available before link time such as addresses of variables and the final code layout, are often ignored because linkers are traditionally unsophisticated. A possible solution is to carry out whole-program optimization at link time. This paper describes alto, a link-time optimizer for the Compaq Alpha architecture. It is able to realize significant performance improvements even for programs compiled with a good optimizing compiler with a high level of optimization. The resulting code is considerably faster than that obtained using the OM link-time optimizer, even when the latter is used in conjunction with profile-guided and inter-fi...
Annotation-Directed Run-Time Specialization in C
In PEPM'97 Proceedings, 1997
Cited by 39 (6 self)
We present the design of a dynamic compilation system for C. Directed by a few declarative user annotations specifying where and on what dynamic compilation is to take place, a binding time analysis computes the set of run-time constants at each program point in each annotated procedure's control flow graph; the analysis supports program-point-specific polyvariant division and specialization. The analysis results guide the construction of a specialized run-time specializer for each dynamically compiled region; the specializer supports various caching strategies for managing dynamically generated code and supports mixes of speculative and demand-driven specialization of dynamic branch successors. Most of the key cost/benefit trade-offs in the binding time analysis and the run-time specializer are open to user control through declarative policy annotations. Our design is being implemented in the context of an existing optimizing compiler.
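The binding time analysis mentioned above can be illustrated with a much simpler sketch. It assumes straight-line code only (no control flow graph, no polyvariant division), and the function name `runtime_constants` and the tuple encoding of assignments are hypothetical, not taken from the paper. A variable counts as a run-time constant when every operand of its most recent assignment is one.

```python
from typing import Iterable, List, Set, Tuple

# Each assignment is (target, operands); an empty operand list means a literal.
Assignment = Tuple[str, List[str]]


def runtime_constants(assignments: Iterable[Assignment],
                      annotated: Iterable[str]) -> Set[str]:
    """Forward pass over straight-line code.

    `annotated` holds the variables the user marked as run-time constants.
    A target becomes a run-time constant only if all of its operands are;
    otherwise it is dynamic from that point on.
    """
    static: Set[str] = set(annotated)
    for target, operands in assignments:
        if all(v in static for v in operands):
            static.add(target)
        else:
            static.discard(target)
    return static


program: List[Assignment] = [
    ("a", ["n"]),        # a = f(n)      depends only on annotated n
    ("b", ["a", "x"]),   # b = g(a, x)   x is dynamic, so b is dynamic
    ("c", ["a", "n"]),   # c = h(a, n)   all operands are run-time constants
]
constants = runtime_constants(program, annotated=["n"])
```

In this toy program `a` and `c` could be computed once when the region is specialized, while `b` must be recomputed on every execution.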
tcc: A Template-Based Compiler for `C
In Proceedings of the First Workshop on Compiler Support for Systems Software (WCSSS, 1995
Cited by 26 (2 self)
Dynamic code generation is an important technique for improving the performance of software by exploiting information known only at run time. `C (Tick C) is a superset of ANSI C that, unlike most prior systems, allows high-level, efficient, and machine-independent specification of dynamically generated code. `C provides facilities for dynamic code generation within the context of a statically typed, imperative language closely related to the language most widely used in systems development. This paper describes tcc, a compiler currently being written for `C. tcc has two objectives: (1) to deliver a complete, solid implementation of `C, and (2) to minimize the run-time costs of dynamic code generation. tcc implements dynamic code generation by emitting templates, segments of binary code which at run time can be combined and completed with the values of registers, stack offsets, and constants. tcc also allows some decisions about storage allocation and instruction selection to occur at ru...
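The template idea (code segments with holes that are combined and completed at run time) might be sketched as follows. This is an illustration only: it uses a tiny stack machine in place of binary code, and every name here (`Hole`, `SCALE`, `OFFSET`, `instantiate`, `run`) is a hypothetical stand-in rather than anything from tcc.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Hole:
    """Placeholder for a value that is only known at run time (hypothetical)."""
    name: str


Operand = Union[int, Hole, None]
Instr = Tuple[str, Operand]

# Templates prepared ahead of time; holes stand in for run-time constants.
SCALE: List[Instr] = [("push_arg", None), ("push", Hole("k")), ("mul", None)]
OFFSET: List[Instr] = [("push", Hole("b")), ("add", None)]


def instantiate(template: List[Instr], values: Dict[str, int]) -> List[Instr]:
    """Complete a template by replacing each hole with its run-time value."""
    return [(op, values[arg.name] if isinstance(arg, Hole) else arg)
            for op, arg in template]


def run(code: List[Instr], arg: int) -> int:
    """Execute completed code on a tiny stack machine (stand-in for hardware)."""
    stack: List[int] = []
    for op, operand in code:
        if op == "push_arg":
            stack.append(arg)
        elif op == "push":
            stack.append(operand)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


# Combine two templates, then fill the holes with values known only now.
code = instantiate(SCALE + OFFSET, {"k": 3, "b": 4})
result = run(code, 5)  # computes 5 * 3 + 4
```

The run-time work is limited to concatenating templates and patching holes, which is the cost the abstract says tcc aims to minimize.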
Dynamo: A Staged Compiler Architecture for Dynamic Program Optimization
1997
Cited by 26 (1 self)
[Figure 2: Staged Compiler Architecture (stages ALPHA, GAMMA, DELTA, EPSILON; Java VM code, high-level IR, mid-level IR, low-level IR, native code)]
- Dynamic method dispatch and higher-order procedures can make it impossible to determine a program's control flow graph until run time;
- Dynamically allocated data structures can make static loop transformations difficult, e.g., array bounds may not be known at compile time;
- Separate compilation makes it difficult to optimize data representations across module boundaries.
It is therefore possible to postpone part...
Two for the Price of One: Composing Partial Evaluation and Compilation
1997
Cited by 20 (3 self)
One of the flagship applications of partial evaluation is compilation and compiler generation. However, partial evaluation is usually expressed as a source-to-source transformation for high-level languages, whereas realistic compilers produce object code. We close this gap by composing a partial evaluator with a compiler by automatic means. Our work is a successful application of several meta-computation techniques to build the system, both in theory and in practice. The composition is an application of deforestation or fusion. The result is a run-time code generation system built from existing components. Its applications are numerous. For example, it allows the language designer to perform interpreter-based experiments with a source-to-source version of the partial evaluator before building a realistic compiler which generates object code automatically.
Modal Types as Staging Specifications for Run-time Code Generation
1998
Cited by 19 (8 self)
P. Wickline, P. Lee, F. Pfenning, and R. Davies
type poly = int list;
(* val evalPoly : int * poly -> int *)
fun evalPoly (x, nil) = 0
  | evalPoly (x, a::p) = a + (x * evalPoly (x, p));
If this function were called many times with the same polynomial but different bases, it might be profitable to specialize it to the particular polynomial, in effect synthesizing a function that directly computes the polynomial rather than interpreting its list representation. One way that we can accomplish this is by transforming the code as follows.
(* val specPoly : poly -> int -> int *)
fun specPoly (nil) = (fn x => 0)
  | specPoly (...
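The specialization described above can be sketched in Python. The `eval_poly` function mirrors the evalPoly excerpt; the non-empty case of `spec_poly` is an assumption, since the excerpt is cut off after the nil case, and it is written here as the natural staged counterpart rather than as the authors' exact code.

```python
from typing import Callable, List


def eval_poly(x: int, poly: List[int]) -> int:
    """Interpret the coefficient list on every call (lowest degree first)."""
    if not poly:
        return 0
    return poly[0] + x * eval_poly(x, poly[1:])


def spec_poly(poly: List[int]) -> Callable[[int], int]:
    """Traverse the list once and return a closure specialized to it.

    The recursive case is assumed; the source excerpt shows only the
    empty-list case.
    """
    if not poly:
        return lambda x: 0
    a = poly[0]
    rest = spec_poly(poly[1:])  # list traversal happens here, once
    return lambda x: a + x * rest(x)


# 1 + 2x + 3x^2, specialized once and then applied to several bases.
p = spec_poly([1, 2, 3])
values = [p(x) for x in (0, 1, 2)]
```

The closure returned by `spec_poly` no longer inspects the list, so repeated calls with different bases pay only for the arithmetic.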
A Declarative Approach to Run-Time Code Generation
In Workshop on Compiler Support for System Software (WCSSS, 1996
Cited by 18 (2 self)
Run-time code generation promises to improve the performance and reliability of current and future systems. Optimizations performed at run time make use of values and invariants that cannot be exploited at compile time, yielding code that is often superior to statically optimal code. Furthermore, run-time code generation encourages better structuring of systems: in our experience, the performance of special-purpose code can, in some cases, be matched or exceeded by automatically customized general-purpose code. Run-time code generation also promotes the use of advanced languages because it amortizes the cost of dynamic safety checks and optimizes across abstraction boundaries. Most previous approaches to run-time code generation have been imperative, relying on the programmer to:
- specify the code to be optimized at run time, usually as a "template" of machine code or as code that builds an abstract syntax tree,
- s...
Reverse Interpretation + Mutation Analysis = Automatic Retargeting
1997
Cited by 15 (2 self)
There are three popular methods for constructing highly retargetable compilers: (1) the compiler emits abstract machine code which is interpreted at run time, (2) the compiler emits C code which is subsequently compiled to machine code by the native C compiler, or (3) the compiler's code generator is generated by a back-end generator from a formal machine description produced by the compiler writer. These methods incur high costs at run time, compile time, or compiler-construction time, respectively. In this paper we will describe a novel method which promises to significantly reduce the effort required to retarget a compiler to a new architecture, while at the same time producing fast and effective compilers. The basic idea is to use the native C compiler at compiler construction time to discover architectural features of the new architecture. From this information a formal machine description is produced. Given this machine description, a native code-generator can be generated by a b...
Dynamic Optimization through the use of Automatic Runtime Specialization
1999
Cited by 14 (2 self)
Profile-driven optimizations and dynamic optimization through specialization have taken optimizations to a new level. By using actual runtime data, optimizers can generate code that is specially tuned for the task at hand. However, most existing compilers that perform these optimizations require separate test runs to gather profile information, and/or user annotations in the code. In this thesis, I describe runtime optimizations that a dynamic compiler can perform automatically, without user annotations, by utilizing real-time performance data. I describe the implementation of the dynamic optimizations in the framework of a Java Virtual Machine and give performance results.
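A rough sketch of annotation-free specialization driven by values observed at run time is given below. It is not the thesis's JVM implementation: the names `auto_specialize`, `make_power`, and the `threshold` parameter are hypothetical, and Python's `exec` stands in for a dynamic compiler.

```python
from collections import Counter
from typing import Callable, Dict, Tuple


def power(n: int, x: int) -> int:
    """Generic version: loops on every call."""
    result = 1
    for _ in range(n):
        result *= x
    return result


def make_power(n: int) -> Callable[[int], int]:
    """Generate and compile an unrolled x * x * ... * x for a fixed n."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    namespace: Dict[str, object] = {}
    exec(f"def specialized(x):\n    return {body}\n", namespace)
    return namespace["specialized"]  # type: ignore[return-value]


def auto_specialize(
    generic: Callable[[int, int], int],
    make_special: Callable[[int], Callable[[int], int]],
    threshold: int = 3,
) -> Tuple[Callable[[int, int], int], Dict[int, Callable[[int], int]]]:
    """Count how often each first argument occurs; specialize the hot ones."""
    counts: Counter = Counter()
    cache: Dict[int, Callable[[int], int]] = {}

    def dispatch(key: int, x: int) -> int:
        if key in cache:
            return cache[key](x)          # fast path: specialized code
        counts[key] += 1
        if counts[key] >= threshold:
            cache[key] = make_special(key)  # value is hot: specialize now
        return generic(key, x)

    return dispatch, cache


fast_power, specialized = auto_specialize(power, make_power, threshold=3)
warmup = [fast_power(2, 5) for _ in range(3)]  # third call triggers specialization
after = fast_power(2, 7)                       # served by the unrolled version
```

The caller never marks `n` as a constant; the profile gathered while the program runs decides which values are worth specializing.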
SEJITS: Getting Productivity and Performance With Selective Embedded JIT Specialization
Cited by 12 (6 self)
Today’s “high productivity” programming languages such as Python lack the performance of harder-to-program “efficiency” languages (CUDA, Cilk, C with OpenMP) that can exploit extensive programmer knowledge of parallel hardware architectures. We combine efficiency-language performance with productivity-language programmability using selective embedded just-in-time specialization (SEJITS). At runtime, we specialize (generate, compile, and execute efficiency-language source code for) an application-specific and platform-specific subset of a productivity language, largely invisibly to the application programmer. Because the specialization machinery is implemented in the productivity language itself, it is easy for efficiency programmers to incrementally add specializers for new domain abstractions, new hardware, or both. SEJITS has the potential to bridge productivity-layer research and efficiency-layer research, allowing domain experts to exploit different parallel hardware architectures with a fraction of the programmer time and effort usually required.

