ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING
, 1994
Cited by 95 (6 self)
Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide
the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries
often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction,
while desirable from a design standpoint, may be impractical when it leads to inefficient programs.
Aggressive compiler optimizations can reduce the overhead of abstraction. However, the long compilation times
introduced by optimizing compilers delay the programming environment’s responses to changes in the program.
Furthermore, optimization also conflicts with source-level debugging. Thus, programmers are caught on the horns of
two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments
and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals by performing
optimizations lazily.
Four new techniques work together to achieve high performance and high responsiveness:
• Type feedback achieves high performance by allowing the compiler to inline message sends based on information
extracted from the runtime system. On average, programs run 1.5 times faster than the previous SELF system;
compared to a commercial Smalltalk implementation, two medium-sized benchmarks run about three times faster.
This level of performance is obtained with a compiler that is both simpler and faster than previous SELF compilers.
• Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast nonoptimizing
compiler to generate initial code while automatically recompiling heavily used parts of the program
with an optimizing compiler. On a previous-generation workstation like the SPARCstation-2, fewer than 200
pauses exceeded 200 ms during a 50-minute interaction, and 21 pauses exceeded one second. On a current-generation
workstation, only 13 pauses exceed 400 ms.
• Dynamic deoptimization shields the programmer from the complexity of debugging optimized code by
transparently recreating non-optimized code as needed. No matter whether a program is optimized or not, it can
always be stopped, inspected, and single-stepped. Compared to previous approaches, deoptimization allows more
debugging while placing fewer restrictions on the optimizations that can be performed.
• Polymorphic inline caching generates type-case sequences on-the-fly to speed up messages sent from the same
call site to several different types of object. More significantly, the caches collect concrete type information for the
optimizing compiler.
With better performance yet good interactive behavior, these techniques make exploratory programming possible
both for pure object-oriented languages and for application domains requiring higher ultimate performance, reconciling
exploratory programming, ubiquitous abstraction, and high performance.
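As an illustration of the polymorphic inline caching idea described above, here is a minimal sketch in Python. It is not the SELF implementation: the class name `PolymorphicInlineCache`, the `max_entries` limit, and the `Circle`/`Square` example types are all hypothetical, and a real system would emit the type-case sequence as machine code at the call site rather than walk a list.

```python
import math


class PolymorphicInlineCache:
    """Hypothetical per-call-site cache: receiver type -> resolved method."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector        # message name sent at this call site
        self.max_entries = max_entries  # assumed cap on cached receiver types
        self.entries = []               # (type, method) pairs, tested in order
        self.misses = 0                 # number of full lookups performed

    def send(self, receiver, *args):
        rtype = type(receiver)
        # The "type-case sequence": compare against each cached type in turn.
        for cached_type, method in self.entries:
            if cached_type is rtype:
                return method(receiver, *args)
        # Miss: fall back to a full lookup, then extend the cache if room remains.
        self.misses += 1
        method = getattr(rtype, self.selector)
        if len(self.entries) < self.max_entries:
            self.entries.append((rtype, method))
        return method(receiver, *args)

    def observed_types(self):
        """Concrete receiver types seen so far; usable as type feedback."""
        return [cached_type for cached_type, _ in self.entries]


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r * self.r


class Square:
    def __init__(self, s):
        self.s = s

    def area(self):
        return self.s * self.s
```

In this sketch, `observed_types()` stands in for the type information that an optimizing compiler could later read back to decide which message sends to inline.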
Interprocedural Conditional Branch Elimination
, 1997
Cited by 66 (15 self)
The existence of statically detectable correlation among conditional branches enables their elimination, an optimization that has a number of benefits. This paper presents techniques to determine whether an interprocedural execution path leading to a conditional branch exists along which the branch outcome is known at compile time, and then to eliminate the branch along this path through code restructuring. The technique consists of a demand driven interprocedural analysis that determines whether a specific branch outcome is correlated with prior statements or branch outcomes. The optimization is performed using a code restructuring algorithm that replicates code to separate out the paths with correlation. When the correlated path is affected by a procedure call, the restructuring is based on procedure entry splitting and exit splitting. The entry splitting transformation creates multiple entries to a procedure, and the exit splitting transformation allows a procedure to return control...
BPF+: Exploiting Global Data-flow Optimization in a Generalized Packet Filter Architecture
- In SIGCOMM
, 1999
Cited by 53 (0 self)
A packet filter is a programmable selection criterion for classifying or selecting packets from a packet stream in a generic, reusable fashion. Previous work on packet filters falls roughly into two categories, namely those efforts that investigate flexible and extensible filter abstractions but sacrifice performance, and those that focus on low-level, optimized filtering representations but sacrifice flexibility. Applications like network monitoring and intrusion detection, however, require both high-level expressiveness and raw performance. In this paper, we propose a fully general packet filter framework that affords both a high degree of flexibility and good performance. In our framework, a packet filter is expressed in a high-level language that is compiled into a highly efficient native implementation. The optimization phase of the compiler uses a flowgraph set relation called edge dominators and the novel application of an optimization technique that we call "redundant predicate...
Using Paths to Measure, Explain, and Enhance Program Behavior
, 2000
Cited by 25 (0 self)
Paths can reveal a program’s dynamic behavior and uncover patterns of path locality that can be exploited to increase program performance. The authors explore several methods for doing so.
Improving Semi-static Branch Prediction by Code Replication
- Proc. ACM SIGPLAN Conf. on Programming Language Design and Implementation
, 1994
Cited by 24 (0 self)
Speculative execution on superscalar processors demands substantially better branch prediction than what has been previously available. In this paper we present code replication techniques that improve the accuracy of semi-static branch prediction to a level comparable to dynamic branch prediction schemes. Our technique uses profiling to collect information about the correlation between different branches and about the correlation between the subsequent outcomes of a single branch. Using this information and code replication, the outcome of branches is represented in the program state. Our experiments have shown that the misprediction rate can almost be halved while the code size is increased by one third.
1 Introduction
Branch prediction forecasts the direction a conditional branch will take. It reduces the branch penalty in a processor and is a basis for the application of compiler optimization techniques. In this paper we are mainly interested in the latter use, since we will appl...
Control CPR: A Branch Height Reduction Optimization for EPIC Architectures
- In Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation
, 1999
Cited by 19 (1 self)
The challenge of exploiting high degrees of instruction-level parallelism is often hampered by frequent branching. Both exposed branch latency and low branch throughput can restrict parallelism. Control critical path reduction (control CPR) is a compilation technique to address these problems. Control CPR can reduce the dependence height of critical paths through branch operations as well as decrease the number of executed branches. In this paper, we present an approach to control CPR that recognizes sequences of branches using profiling statistics. The control CPR transformation is applied to the predominant path through this sequence. Our approach, its implementation, and experimental results are presented. This work demonstrates that control CPR enhances instruction-level parallelism for a variety of application programs and improves their performance across a range of processors.
1 Introduction
Increases in microprocessor performance are driven by both increased clock speed and...
Isolation and Analysis of Optimization Errors
- In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation
, 1993
Cited by 14 (7 self)
This paper describes two related tools developed to support the isolation and analysis of optimization errors in the vpo optimizer. Both tools rely on vpo identifying sequences of changes, referred to as transformations, that result in semantically equivalent (and usually improved) code. One tool determines the first transformation that causes incorrect output of the execution of the compiled program. This tool not only automatically isolates the illegal transformation, but also identifies the location and instant the transformation is performed in vpo. To assist in the analysis of an optimization error, a graphical optimization viewer was also implemented that can display the state of the generated instructions before and after each transformation performed by vpo. Unique features of the optimization viewer include reverse viewing (or undoing) of transformations and the ability to stop at breakpoints associated with the generated instructions. Both tools are useful independently. Together these tools form a powerful environment for facilitating the retargeting of vpo to a new machine and supporting experimentation with new optimizations. In addition, the optimization viewer can be used as a teaching aid in compiler classes.
Automatic isolation of compiler errors
- ACM Transactions on Programming Languages and Systems
, 1994
Cited by 12 (1 self)
This paper describes a tool called vpoiso that was developed to automatically isolate errors in the vpo compiler system. The two general types of compiler errors isolated by this tool are optimization and nonoptimization errors. When isolating optimization errors, vpoiso relies on the vpo optimizer to identify sequences of changes, referred to as transformations, that result in semantically equivalent code and to provide the ability to stop performing improving (or unnecessary) transformations after a specified number have been performed. A compilation of a typical program by vpo often results in thousands of improving transformations being performed. The vpoiso tool can automatically isolate the first improving transformation that causes incorrect output of the execution of the compiled program by using a binary search that varies the number of improving transformations performed. Not only is the illegal transformation automatically isolated, but vpoiso also identifies the location and instant the transformation is performed in vpo. Nonoptimization errors occur from problems in the front end, code generator, and necessary transformations in the optimizer. If another compiler is available that can produce correct (but perhaps more inefficient) code, then vpoiso can isolate nonoptimization errors to a single function. Automatic isolation of compiler errors facilitates retargeting a compiler to a new machine, maintenance of the compiler, and supporting experimentation with new optimizations.
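The binary search over the number of improving transformations mentioned in this abstract can be sketched as follows. This is only an illustration, not vpoiso itself: the function name `first_bad_transformation` and the callback `output_is_correct` are hypothetical, and the sketch assumes that once the faulty transformation is included, the program output stays incorrect for every larger transformation count.

```python
def first_bad_transformation(output_is_correct, total):
    """Return the 1-based index of the first transformation that breaks the output.

    output_is_correct(n) is a hypothetical callback that compiles the program
    with only the first n improving transformations applied, runs it, and
    reports whether the output matches the expected output.

    Assumptions (not taken from the paper): output_is_correct(0) is True,
    output_is_correct(total) is False, and correctness never returns once lost.
    """
    lo, hi = 0, total  # invariant: correct with lo transformations, incorrect with hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if output_is_correct(mid):
            lo = mid   # the fault lies among transformations mid+1 .. hi
        else:
            hi = mid   # the fault lies among transformations lo+1 .. mid
    return hi
```

With thousands of transformations per compilation, this needs only a logarithmic number of compile-and-run cycles, which is what makes the automatic isolation practical.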
Target-specific Global Code Improvement: Principles and Applications
, 1994
Cited by 8 (0 self)
This article describes the key principles behind the design and implementation of a global code improver that has been used to construct several high-quality compilers and other program transformation and analysis tools. The code improver, called vpo, employs a paradigm of compilation that has proven to be flexible and adaptable: all code improving transformations are performed on a target-specific representation of the program. The aggressive use of this paradigm yields a code improver with several valuable properties. Four properties stand out. First, vpo is language and compiler independent. That is, it has been used to implement compilers for several different computer languages. For the C programming language, it has been used with several front ends, each of which generates a different intermediate language. Second, because all code improvements are applied to a single low-level intermediate representation, phase ordering problems are minimized. Third, vpo is easily retargeted and handles a wide variety of architectures. In particular, vpo's structure allows new architectures and new implementations of existing architectures to be accommodated quickly and easily. Fourth and finally, because of its flexible structure, vpo has several other interesting uses in addition to its primary use in an optimizing compiler. This article describes the principles that have driven the design of vpo and the implications of these principles on vpo's implementation. The article concludes with a brief description of vpo's use as a back end with front ends for several different languages, and its use as a key component
Programs Follow Paths
- Microsoft Research
, 1999
Cited by 8 (0 self)
Program paths—sequences of executed basic blocks—have proven to be an effective way to capture a program’s elusive dynamic behavior. This paper shows how paths and path spectra compactly and precisely record many aspects of programs’ execution-time control flow behavior and explores applications of these paths in computer architecture, compilers, debugging, program testing, and software maintenance.

