ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING
, 1994
Cited by 95 (6 self)
Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide
the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries
often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction,
while desirable from a design standpoint, may be impractical when it leads to inefficient programs.
Aggressive compiler optimizations can reduce the overhead of abstraction. However, the long compilation times
introduced by optimizing compilers delay the programming environment’s responses to changes in the program.
Furthermore, optimization also conflicts with source-level debugging. Thus, programmers are caught on the horns of
two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments
and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals by performing
optimizations lazily.
Four new techniques work together to achieve high performance and high responsiveness:
• Type feedback achieves high performance by allowing the compiler to inline message sends based on information
extracted from the runtime system. On average, programs run 1.5 times faster than the previous SELF system;
compared to a commercial Smalltalk implementation, two medium-sized benchmarks run about three times faster.
This level of performance is obtained with a compiler that is both simpler and faster than previous SELF compilers.
• Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast nonoptimizing
compiler to generate initial code while automatically recompiling heavily used parts of the program
with an optimizing compiler. On a previous-generation workstation like the SPARCstation-2, fewer than 200
pauses exceeded 200 ms during a 50-minute interaction, and 21 pauses exceeded one second. On a current-generation
workstation, only 13 pauses exceed 400 ms.
• Dynamic deoptimization shields the programmer from the complexity of debugging optimized code by
transparently recreating non-optimized code as needed. No matter whether a program is optimized or not, it can
always be stopped, inspected, and single-stepped. Compared to previous approaches, deoptimization allows more
debugging while placing fewer restrictions on the optimizations that can be performed.
• Polymorphic inline caching generates type-case sequences on-the-fly to speed up messages sent from the same
call site to several different types of object. More significantly, polymorphic inline caches collect concrete type information for the
optimizing compiler.
With better performance yet good interactive behavior, these techniques make exploratory programming possible
both for pure object-oriented languages and for application domains requiring higher ultimate performance, reconciling
exploratory programming, ubiquitous abstraction, and high performance.
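As a rough illustration of the polymorphic inline caching idea named in the abstract above, the following Python sketch models one call site that remembers which receiver types it has seen and dispatches through a short type-case list before falling back to a full lookup. All names here (InlineCacheSite, send, observed_types, Circle, Square) are hypothetical and are not taken from the SELF system; a real implementation patches generated machine code rather than keeping a Python list.

```python
class InlineCacheSite:
    """Illustrative model of one call site with a polymorphic inline cache."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector        # message name sent from this site
        self.max_entries = max_entries  # assumed cap on cached receiver types
        self.entries = []               # (receiver type, method) pairs, the "type-case"
        self.misses = 0                 # number of full lookups performed

    def send(self, receiver, *args):
        rtype = type(receiver)
        # Fast path: walk the type-case sequence built so far.
        for cached_type, method in self.entries:
            if cached_type is rtype:
                return method(receiver, *args)
        # Slow path: full lookup, then extend the cache for next time.
        self.misses += 1
        method = getattr(rtype, self.selector)
        if len(self.entries) < self.max_entries:
            self.entries.append((rtype, method))
        return method(receiver, *args)

    def observed_types(self):
        """Concrete receiver types seen here, as an optimizer might consume them."""
        return [cached_type for cached_type, _ in self.entries]


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return 3 * self.r * self.r  # integer stand-in for pi, to keep results exact


class Square:
    def __init__(self, s):
        self.s = s

    def area(self):
        return self.s * self.s


site = InlineCacheSite("area")
shapes = [Circle(1), Square(2), Circle(2), Square(3)]
results = [site.send(shape) for shape in shapes]
```

In this sketch the list returned by observed_types plays the role of the concrete type information that the abstract says is passed to the optimizing compiler.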
Managing Interprocedural Optimization
, 1991
Cited by 60 (9 self)
This dissertation addresses a number of important issues related to interprocedural optimization. Interprocedural optimization is an integral component in a compilation system for high-performance computing. The importance of interprocedural optimization stems from two sources: it increases the context available to the optimizing compiler, and it enables programmers to use procedure calls without the concern of hurting execution time. While important, interprocedural optimization can introduce some significant compile-time costs. When interprocedural information is used to optimize a procedure, the procedure is then dependent on those interprocedural facts. Thus, even if the procedure is not edited, it may require recompilation due to changes in the interprocedural facts. In addition to these effects on recompilation, interprocedural information can also be expensive to compute. Furthermore, interprocedural optimizations can increase program size which can in turn increase compile tim...
Reconciling responsiveness with performance in pure object-oriented languages
- ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 1996
Cited by 55 (0 self)
Dynamically-dispatched calls often limit the performance of object-oriented programs since object-oriented programming encourages factoring code into small, reusable units, thereby increasing the frequency of these expensive operations. Frequent calls not only slow down execution with the dispatch overhead per se, but more importantly they hinder optimization by limiting the range and effectiveness of standard global optimizations. In particular, dynamically-dispatched calls prevent standard interprocedural optimizations that depend on the availability of a static call graph. The SELF implementation described here offers two novel approaches to optimization. Type feedback speculatively inlines dynamically-dispatched calls based on profile information that predicts likely receiver classes. Adaptive optimization reconciles optimizing compilation with interactive performance by incrementally optimizing only the frequently-executed parts of a program. When combined, these two techniques result in a system that can execute programs significantly faster than previous systems while retaining much of the interactiveness of an interpreted system.
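The adaptive optimization idea in the abstract above (start with cheap code, then recompile only what runs often) can be sketched in Python with an invocation counter. This is a minimal illustration under stated assumptions: the class AdaptiveFunction, its threshold parameter, and the two sum-of-squares helpers are hypothetical, and the "optimizing compiler" is simulated by swapping in a hand-written faster function.

```python
class AdaptiveFunction:
    """Runs a baseline implementation until it becomes hot, then swaps in a faster one."""

    def __init__(self, baseline, optimize, threshold=3):
        self.impl = baseline        # code produced by the fast, non-optimizing path
        self.optimize = optimize    # callable that returns an optimized implementation
        self.threshold = threshold  # assumed number of calls before recompiling
        self.calls = 0              # invocations counted while still unoptimized
        self.optimized = False

    def __call__(self, *args):
        if not self.optimized:
            self.calls += 1
            if self.calls >= self.threshold:
                # The function is hot: pay the optimization cost once.
                self.impl = self.optimize()
                self.optimized = True
        return self.impl(*args)


def slow_sum_squares(n):
    # Straightforward loop, standing in for unoptimized code.
    total = 0
    for i in range(n + 1):
        total += i * i
    return total


def make_fast_sum_squares():
    # Closed form, standing in for the output of an optimizing compiler.
    return lambda n: n * (n + 1) * (2 * n + 1) // 6


sum_squares = AdaptiveFunction(slow_sum_squares, make_fast_sum_squares, threshold=3)
outputs = [sum_squares(10) for _ in range(5)]
```

Rarely called functions never reach the threshold, so they never incur the cost of optimization, which is the trade-off the abstract describes.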
An Experiment with Inline Substitution
, 1991
Cited by 42 (9 self)
This paper describes an experiment undertaken to evaluate the effectiveness of inline substitution as a method of improving the running time of compiled code. Our particular interests are in the interaction between inline substitution and aggressive code optimization. To understand this relationship, we used commercially available FORTRAN optimizing compilers as the basis for our study. This paper reports on the effectiveness of the various compilers at optimizing the inlined code. We examine both the runtime performance of the resulting code and the compile-time performance of the compilers. This work can be viewed as a study of the effectiveness of inlining in modern optimizers; alternatively, it can be viewed as one data point on the overall effectiveness of modern optimizing compilers. We discovered that, with optimizing FORTRAN compilers, (1) object-code growth from inlining is substantially smaller than source-code growth, (2) compile-time growth from inlining is smaller than source-code growth, and (3) the compilers we tested were not able to capitalize consistently inlining
Lambda-Splitting: A Higher-Order Approach to Cross-Module Optimizations
- In Proc. 1997 ACM SIGPLAN International Conference on Functional Programming (ICFP’97)
, 1997
Cited by 20 (7 self)
We describe an algorithm for automatic inline expansion across module boundaries that works in the presence of higher-order functions and free variables; it rearranges bindings and scopes as necessary to move nonexpansive code from one module to another. We describe (and implement) the algorithm as transformations on λ-calculus. Our inliner interacts well with separate compilation and is efficient, robust, and practical enough for everyday use in the SML/NJ compiler. Inlining improves performance by 4-8% on existing code, and makes it possible to use much more data abstraction by consistently eliminating penalties for modularity.
1 Introduction
Abstraction and modular design of software promote clarity and provide clear lines along which large projects can be subdivided. But one often pays a large performance penalty for using abstraction. Cross-module inlining can bridge the gap between abstract design and high performance by transparently moving the border between compilation unit...
Isolation and Analysis of Optimization Errors
- In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, 1993
Cited by 14 (7 self)
This paper describes two related tools developed to support the isolation and analysis of optimization errors in the vpo optimizer. Both tools rely on vpo identifying sequences of changes, referred to as transformations, that result in semantically equivalent (and usually improved) code. One tool determines the first transformation that causes incorrect output of the execution of the compiled program. This tool not only automatically isolates the illegal transformation, but also identifies the location and instant the transformation is performed in vpo. To assist in the analysis of an optimization error, a graphical optimization viewer was also implemented that can display the state of the generated instructions before and after each transformation performed by vpo. Unique features of the optimization viewer include reverse viewing (or undoing) of transformations and the ability to stop at breakpoints associated with the generated instructions. Both tools are useful independently. Together these tools form a powerful environment for facilitating the retargeting of vpo to a new machine and supporting experimentation with new optimizations. In addition, the optimization viewer can be used as a teaching aid in compiler classes.
Hierarchical Modularity And Intermodule Optimization
, 1997
Cited by 9 (1 self)
Separate compilation is an important tool for coping with design complexity in large software projects. When done right it can also be used to create software libraries, thus promoting code reuse. But separate compilation comes in various flavors and has many facets: namespace management, linking, optimization, dependencies. Many programming languages identify modular units with units of compilation, while only a few extend this to permit hierarchies of language-level modules within individual compilation units. When the number of compilation units is large, then it becomes increasingly important that the mechanism of separate compilation itself can be used to control namespaces. The group model implemented in SML/NJ's compilation manager CM provides the necessary facilities to avoid unwanted interferences between unrelated parts of large programs. Compilation units are arranged into groups, and explicit export interfaces can be used to control namespaces. When there are many groups, t...
Recursion Unrolling for Divide and Conquer Programs
, 2000
Cited by 5 (0 self)
This paper presents recursion unrolling, a technique for improving the performance of recursive computations. Conceptually, recursion unrolling inlines recursive calls to reduce control flow overhead and increase the size of the basic blocks in the computation, which in turn increases the effectiveness of standard compiler optimizations such as register allocation and instruction scheduling. We have identified two transformations that significantly improve the effectiveness of the basic recursion unrolling technique. Conditional fusion merges conditionals with identical expressions, considerably simplifying the control flow in unrolled procedures. Recursion re-rolling rolls back the recursive part of the procedure to ensure that a large unrolled base case is always executed, regardless of the input problem size. We have implemented our techniques and applied them to an important class of recursive programs, divide and conquer programs. Our experimental results show that recursion unrolling can improve the performance of our programs by a factor of between 3.6 and 10.8, depending on the combination of the program and the architecture.
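To make the basic transformation concrete, here is a hand-written Python sketch of one level of recursion unrolling applied to a divide and conquer array sum. It only illustrates the idea of inlining recursive calls; it is not the paper's compiler algorithm, it omits conditional fusion and recursion re-rolling, and the function names dc_sum and dc_sum_unrolled are hypothetical.

```python
def dc_sum(a, lo, hi):
    """Plain divide and conquer: sum of a[lo:hi]."""
    n = hi - lo
    if n == 0:
        return 0
    if n == 1:
        return a[lo]
    mid = (lo + hi) // 2
    return dc_sum(a, lo, mid) + dc_sum(a, mid, hi)


def dc_sum_unrolled(a, lo, hi):
    """Same computation with both recursive calls inlined by one level."""
    n = hi - lo
    if n == 0:
        return 0
    if n == 1:
        return a[lo]
    mid = (lo + hi) // 2

    # Inlined body of the call on a[lo:mid]; that half has at least one
    # element because n >= 2, so the empty case cannot occur here.
    if mid - lo == 1:
        left = a[lo]
    else:
        lmid = (lo + mid) // 2
        left = dc_sum_unrolled(a, lo, lmid) + dc_sum_unrolled(a, lmid, mid)

    # Inlined body of the call on a[mid:hi]; also non-empty.
    if hi - mid == 1:
        right = a[mid]
    else:
        rmid = (mid + hi) // 2
        right = dc_sum_unrolled(a, mid, rmid) + dc_sum_unrolled(a, rmid, hi)

    return left + right


data = [3, 1, 4, 1, 5, 9, 2, 6]
plain_total = dc_sum(data, 0, len(data))
unrolled_total = dc_sum_unrolled(data, 0, len(data))
```

Each activation of the unrolled version does the work of up to three activations of the original, so fewer calls are made and each function body is larger, which is the effect the abstract attributes to the technique.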
The space overhead of customization
, 1997
Cited by 3 (0 self)
Customization aims to improve the performance of pure object-oriented languages by compiling multiple copies of a source method, each of them specialized for a certain receiver type. Like other code duplication techniques, it gains performance by trading code space for better speed. Unfortunately, customization can significantly increase code space, especially for larger programs. We show that customization increases memory usage by almost a factor of three for some applications in the Self-93 system. We analyze and quantify the factors that lead to this space overhead and identify strategies to eliminate most of it. We focus on dynamically-compiled systems like Self-93 where it is impractical or undesirable to use whole-program analysis or programmer-directed profiling to guide customization decisions. Our experiments show that a combination of relatively straightforward strategies can bring the code space consumption of customization to within 34% or less of a completely non-customizing system. Thus, even in dynamically-compiled systems customization and efficient memory usage need not be mutually exclusive.
Decreasing process memory requirements by overlapping program portions
- In Proceedings of the Hawaii International Conference on System Sciences
, 1998
Cited by 2 (0 self)
Most of the time, faced with a time/space trade-off, a compiler writer will choose to optimize time, even at the cost of space. This was not always the case. Early in the history of computers, programmers would try everything they could think of to reduce the size of their code to get it to fit in the computer’s constrained space. As memory and

