Optimizing Alpha Executables on Windows NT with Spike
1997
Cited by 37 (5 self)
This paper discusses the Spike performance tool and its use in optimizing Windows NT-based applications running on Alpha processors. In the following section, we describe the characteristics of Windows NT-based
The StarJIT Compiler: A Dynamic Compiler for Managed Runtime Environments
2003
Cited by 20 (7 self)
Dynamic compilers (or Just-in-Time [JIT] compilers) are a key component of managed runtime environments. This paper describes the design and implementation of the StarJIT compiler, a dynamic compiler for Java Virtual Machines and Common Language Runtime platforms. The goal of the StarJIT compiler is to build an infrastructure to research the influence of managed runtime environments on Intel architectures. The StarJIT compiler can compile both Java and Common Language Infrastructure (CLI) bytecodes, and it uses a single intermediate representation and global optimization framework for both Java and CLI. The StarJIT compiler is designed to generate optimized code for the major Intel architectures and currently targets two of them: IA-32 and the Itanium Processor Family.
Partial Redundancy Elimination Driven by a Cost-Benefit Analysis
1997
Cited by 15 (0 self)
Partial redundancy elimination has become a major compiler optimization that subsumes various ad hoc code motion optimizations. However, partial redundancy elimination is extremely conservative, failing to take advantage of many opportunities for optimization. We describe a new formulation of partial redundancy elimination based on a cost-benefit analysis of the flowgraph. Costs and benefits are measured by the number of evaluations of an expression.
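The idea of measuring costs and benefits by the number of evaluations of an expression can be illustrated with a toy sketch. This is not the paper's algorithm: the block names, the execution frequencies, and the helper functions `evaluations` and `net_benefit` are all hypothetical, and the sketch assumes each candidate placement is already known to be semantically valid.

```python
def evaluations(placement, block_freq):
    """Total dynamic evaluations if the expression is computed once
    per execution of every block in `placement`."""
    return sum(block_freq[block] for block in placement)


def net_benefit(original, candidate, block_freq):
    """Evaluations saved by moving from `original` to `candidate`.
    Positive means the move pays off; negative means it costs more."""
    return evaluations(original, block_freq) - evaluations(candidate, block_freq)


# Hypothetical profile: how often each basic block executed.
block_freq = {"preheader": 1, "loop_body": 100, "rare_branch": 2}

# Hoisting out of a hot loop saves evaluations.
hoist = net_benefit({"loop_body"}, {"preheader"}, block_freq)

# Moving a rarely executed computation into the hot loop costs evaluations.
sink = net_benefit({"rare_branch"}, {"loop_body"}, block_freq)
```

With this profile, `hoist` is 99 and `sink` is -98, so a profile-driven optimizer would accept the first move and reject the second.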
Improving region selection in dynamic optimization systems
In MICRO 38: Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, 2005
Cited by 11 (1 self)
The performance of a dynamic optimization system depends heavily on the code it selects to optimize. Many current systems follow the design of HP Dynamo and select a single interprocedural path, or trace, as the unit of code optimization and code caching. Though this approach to region selection has worked well in practice, we show that it is possible to adapt this basic approach to produce regions with greater locality, less needless code duplication, and fewer profiling counters. In particular, we propose two new region-selection algorithms and evaluate them against Dynamo’s selection mechanism, Next-Executing Tail (NET). Our first algorithm, Last-Executed Iteration (LEI), identifies cyclic paths of execution better than NET, improving locality of execution while reducing the size of the code cache. Our second algorithm allows overlapping traces of similar execution frequency to be combined into a single large region. This second technique can be applied to both NET and LEI, and we find that it significantly improves metrics of locality and memory overhead for each.
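A heavily simplified sketch of NET-style trace selection may help make the baseline concrete. It rests on assumptions the abstract does not state: blocks are identified by their start addresses, a transfer to an equal or lower address counts as a backward branch, only backward-branch targets are profiled, and the threshold value is arbitrary. The function `select_net_traces` is hypothetical and is not taken from Dynamo or from the paper.

```python
from collections import defaultdict


def select_net_traces(block_stream, hot_threshold=3):
    """Return {trace_head: [blocks]} for a stream of executed block addresses."""
    counters = defaultdict(int)  # one counter per backward-branch target
    traces = {}                  # finished traces, keyed by head block
    recording = None             # trace currently being recorded, if any
    prev = None
    for block in block_stream:
        backward = prev is not None and block <= prev
        if recording is not None:
            # A trace ends at a backward branch or at an existing trace head.
            if backward or block in traces:
                traces[recording[0]] = recording
                recording = None
            else:
                recording.append(block)
        if recording is None and backward and block not in traces:
            counters[block] += 1
            if counters[block] >= hot_threshold:
                # The head is hot: record the next-executing tail.
                recording = [block]
        prev = block
    return traces


# A three-block loop executed repeatedly (addresses are made up).
stream = [10, 20, 30] * 4 + [10]
traces = select_net_traces(stream)
```

For this stream, block 10 becomes hot on its third backward arrival, and the path executed immediately afterwards is stored as the trace `[10, 20, 30]`.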
Phased Behavior and Its Impact on Program Optimization
Cited by 2 (0 self)
Run-time optimization systems are gaining in popularity because they can automatically restructure an executable based on the current program behavior and specifics of the underlying machine. Software-based versions of these systems have several benefits over hardware-based versions: they can support more sophisticated optimizations; they can apply optimizations to larger program regions; and they, like any other software system, can be upgraded as new optimization techniques become available. These benefits, however, come at the expense of a higher run-time overhead. Unlike a hardware-based system, which can run continuously during a program's execution, we must selectively apply the power of a software-based system so that its higher overhead is appropriately amortized. This paper investigates phased behavior, an aspect of program behavior that software-based systems can use to make them cost effective. This paper illustrates some types of phased behavior that occur in (non-...
Superblock-Based Source Code Optimizations for WCET Reduction
Superblocks represent regions in program code that consist of multiple basic blocks. Compilers benefit from this structure since it enables optimization across block boundaries. This increased optimization potential was thoroughly studied in the past for average-case execution time (ACET) reduction at assembly level. In this paper, the concept of superblocks is exploited for the optimization of embedded real-time systems that have to meet stringent timing constraints specified by the worst-case execution time (WCET). To achieve this goal, our superblock formation is based on a novel trace selection algorithm which is driven by WCET data. Moreover, we translate superblocks for the first time from assembly to source code level. This approach enables an early code restructuring in the optimizer, providing more optimization opportunities for both subsequent source code and assembly level transformations. An adaptation of the traditional optimizations common subexpression elimination and dead code elimination to our WCET-aware superblocks allows an effective WCET reduction. Using our techniques, we significantly outperform standard optimizations and achieve an average WCET reduction of up to 10.2% for a total of 55 real-life benchmarks.

