Results 11 - 20
of
277
Iterated Register Coalescing
- ACM Transactions on Programming Languages and Systems
, 1996
"... An important function of any register allocator is to target registers so as to eliminate copy instructions. Graph-coloring register allocation is an elegant approach to this problem. If the source and destination of a move instruction do not interfere, then their nodes can be coalesced in the inter ..."
Abstract
-
Cited by 132 (4 self)
- Add to MetaCart
An important function of any register allocator is to target registers so as to eliminate copy instructions. Graph-coloring register allocation is an elegant approach to this problem. If the source and destination of a move instruction do not interfere, then their nodes can be coalesced in the interference graph. Chaitin's coalescing heuristic could make a graph uncolorable (i.e., introduce spills); Briggs et al. demonstrated a conservative coalescing heuristic that preserves colorability. But Briggs's algorithm is too conservative, and leaves too many move instructions in our programs. We show how to interleave coloring reductions with Briggs's coalescing heuristic, leading to an algorithm that is safe but much more aggressive. 1 Introduction Graph coloring is a powerful approach to register allocation and can have a significant impact on the execution of compiled code. A good register allocator does copy propagation, eliminating many move instructions by "coloring" the source tempor...
The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages
, 1992
"... Object-oriented programming languages promise to improve programmer productivity by supporting abstract data types, inheritance, and message passing directly within the language. Unfortunately, traditional implementations of object-oriented language features, particularly message passing, have been ..."
Abstract
-
Cited by 120 (15 self)
- Add to MetaCart
Object-oriented programming languages promise to improve programmer productivity by supporting abstract data types, inheritance, and message passing directly within the language. Unfortunately, traditional implementations of object-oriented language features, particularly message passing, have been much slower than traditional implementations of their non-object-oriented counterparts: the fastest existing implementation of Smalltalk-80 runs at only a tenth the speed of an optimizing C implementation. The dearth of suitable implementation technology has forced most object-oriented languages to be designed as hybrids with traditional non-object-oriented languages, complicating the languages and making programs harder to extend and reuse. This dissertation describes a collection of implementation techniques that can improve the run-time performance of object-oriented languages, in hopes of reducing the need for hybrid languages and encouraging wider spread of purely object-oriented langu...
Register allocation via hierarchical graph coloring
- In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation
, 1991
"... We present a graph coloring register allocator de- signed to minimize the number of dynamic memory references. We cover the program with sets of blocks called tiles and group these tiles into a tree reflecting the program's hierarchical control structure. Registers are allocated for each tile using ..."
Abstract
-
Cited by 93 (0 self)
- Add to MetaCart
We present a graph coloring register allocator de- signed to minimize the number of dynamic memory references. We cover the program with sets of blocks called tiles and group these tiles into a tree reflecting the program's hierarchical control structure. Registers are allocated for each tile using standard graph coloring techniques and the local allocation and conflict information is passed around the tree in a two phase algorithm. This results in an allocation of reg- isters that is sensitive to local usage patterns while retaining a global perspective. Spill code is placed in less frequently executed portions of the program and the choice of variables to spill is based on usage pat- terns between the spills and the reloads rather than usage patterns over the entire program. 1
Hybrid Evolutionary Algorithms for Graph Coloring
, 1998
"... A recent and very promising approach for combinatorial optimization is to embed local search into the framework of evolutionary algorithms. In this paper, we present such hybrid algorithms for the graph coloring problem. These algorithms combine a new class of highly specialized crossover operators ..."
Abstract
-
Cited by 79 (9 self)
- Add to MetaCart
A recent and very promising approach for combinatorial optimization is to embed local search into the framework of evolutionary algorithms. In this paper, we present such hybrid algorithms for the graph coloring problem. These algorithms combine a new class of highly specialized crossover operators and a well-known tabu search algorithm. Experiments of such a hybrid algorithm are carried out on large DIMACS Challenge benchmark graphs. Results prove very competitive with and even better than those of state-of-the-art algorithms. Analysis of the behavior of the algorithm sheds light on ways to further improvement. Keywords: Graph coloring, solution recombination, tabu search, combinatorial optimization. 1 Introduction A recent and very promising approach for combinatorial optimization is to embed local search into the framework of population based evolutionary algorithms, leading to hybrid evolutionary algorithms (HEA). Such an algorithm is essentially based on two key elements: an eff...
Combining Analyses, Combining Optimizations
, 1995
"... This thesis presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. Also presented is a simple iter ..."
Abstract
-
Cited by 67 (4 self)
- Add to MetaCart
This thesis presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. Also presented is a simple iterative algorithm for solving these frameworks. A framework is shown that combines Constant Propagation, Unreachable Code Elimination, Global Congruence Finding and Global Value Numbering. For these optimizations, the iterative algorithm runs in O(n^2) time.
This thesis then presents an O(n log n) algorithm for combining the same optimizations. This technique also finds many of the common subexpressions found by Partial Redundancy Elimination. However, it requires a global code motion pass to make the optimized code correct, also presented. The global code motion algorithm removes some Partially Dead Code as a side-effect. An implementation demonstrates that the algorithm has shorter compile times than repeated passes of the separate optimizations while producing run-time speedups of 4%–7%.
While global analyses are stronger, peephole analyses can be unexpectedly powerful. This thesis demonstrates parse-time peephole optimizations that find more than 95% of the constants and common subexpressions found by the best combined analysis. Finding constants and common subexpressions while parsing reduces peak intermediate representation size. This speeds up the later global analyses, reducing total compilation time by 10%. In conjunction with global code motion, these peephole optimizations generate excellent code very quickly, a useful feature for compilers that stress compilation speed over code quality.
Marmot: An Optimizing Compiler for Java
, 1998
"... The Marmot system is a research platform for studying the implementation of high level programming languages. It currently comprises an optimizing native-code compiler, runtime system, and libraries for a large subset of Java. Marmot integrates well-known representation, optimization, code generat ..."
Abstract
-
Cited by 63 (6 self)
- Add to MetaCart
The Marmot system is a research platform for studying the implementation of high level programming languages. It currently comprises an optimizing native-code compiler, runtime system, and libraries for a large subset of Java. Marmot integrates well-known representation, optimization, code generation, and runtime techniques with a few Java-specific features to achieve competitive performance. This paper contains a description of the Marmot system design, along with highlights of our experience applying and adapting traditional implementation techniques to Java. A detailed performance evaluation assesses both Marmot's overall performance relative to other Java and C++ implementations and the relative costs of various Java language features in Marmot-compiled code. Our experience with Marmot has demonstrated that well-known compilation techniques can produce very good performance for static Java applications---comparable or superior to other Java systems, and approaching that o...
A Novel Framework of Register Allocation for Software Pipelining
, 1993
"... ing with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM Inc., fax +1 (212) 869-0481, or (permissions@acm.org). Qi Ning Guang R. Gao School of Com ..."
Abstract
-
Cited by 59 (11 self)
- Add to MetaCart
ing with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept., ACM Inc., fax +1 (212) 869-0481, or (permissions@acm.org). Qi Ning Guang R. Gao School of Computer Science McGill University Montreal, Quebec Canada H3A 2A7 email: ning@cs.mcgill.ca gao@cs.mcgill.ca Abstract Although software pipelining has been proposed as one of the most important loop scheduling methods, simultaneous scheduling and register allocation is less understood and remains an open problem [28]. The objective of this paper is to develop a unified algorithmic framework for concurrent scheduling and register allocation to support time-optimal software pipelining. A key intuition leading to this surprisingly simple formulation and its efficient solution is the association of maximum computation rate of a program graph with its critical cycles due to Reiter's pioneering work...
A Practical Data Flow Framework for Array Reference Analysis and its Use in Optimizations
- In ACM SIGPLAN'93 Conf. on Prog. Lang. Design and Implementation
, 1993
"... Data flow analysis techniques have traditionally been restricted to the analysis of scalar variables. This restriction, however, imposes a limitation on the kinds of optimizations that can be performed in loops containing array references. We present a data flow framework for array reference analysi ..."
Abstract
-
Cited by 55 (2 self)
- Add to MetaCart
Data flow analysis techniques have traditionally been restricted to the analysis of scalar variables. This restriction, however, imposes a limitation on the kinds of optimizations that can be performed in loops containing array references. We present a data flow framework for array reference analysis that provides the information needed in various optimizations targeted at sequential or fine-grained parallel architectures. The framework extends the traditional scalar framework by incorporating iteration distance values into the analysis to qualify the computed data flow solution during the fixed point iteration. Analyses phrased in this framework are capable of discovering recurrent access patterns among array references that evolve during the execution of a loop. The framework is practical in that the fixed point solution requires at most three passes over the body of structured loops. Applications of our framework are discussed for register allocation, load/store optimizations, and controlled loop unrolling.

