Results 1 - 10 of 96
Improving Register Allocation for Subscripted Variables, 1990
Cited by 192 (34 self)
INTRODUCTION By the late 1980s, memory system performance and CPU performance had already begun to diverge. This trend made effective use of the register file imperative for excellent performance. Although most compilers at that time allocated scalar variables to registers using graph coloring with marked success [12, 13, 14, 6], allocation of array values to registers only occurred in rare circumstances because standard data-flow analysis techniques could not uncover the available reuse of array memory locations. This deficiency was especially problematic for scientific codes since a majority of the computation involves array references. Our original paper addressed this problem by presenting an algorithm and experiment for a loop transformation, called scalar replacement, that exposed the reuse available in array references in an innermost loop. It also demonstrated experimentally how another loop transformation, called unroll-and-jam [2], could expose more opportunities for scalar…
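The reuse that scalar replacement exposes can be illustrated at the source level. The following is a minimal sketch, not the paper's algorithm: the function names and the two-point sum are hypothetical, and Python has no register file, so only the shape of the transformation is shown. The value read as `a[i]` in one iteration is carried in a scalar and reused as `a[i - 1]` in the next, leaving one array load per iteration.

```python
def smooth_naive(a):
    """Reference loop: reads both a[i - 1] and a[i] from the array each iteration."""
    b = [0.0] * len(a)
    for i in range(1, len(a)):
        b[i] = a[i - 1] + a[i]
    return b


def smooth_scalar_replaced(a):
    """Same computation after hand-applied scalar replacement (illustrative only)."""
    b = [0.0] * len(a)
    if len(a) < 2:
        return b
    prev = a[0]            # scalar holding the value the next iteration needs
    for i in range(1, len(a)):
        cur = a[i]         # the only array load left in the loop body
        b[i] = prev + cur
        prev = cur         # carry the loaded value into the next iteration
    return b
```

In a compiled language the scalars `prev` and `cur` would be candidates for registers under an ordinary scalar allocator, which is the point of the transformation.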
Approximate graph coloring by semidefinite programming - Proc. 35th IEEE FOCS, IEEE, 1994
Cited by 154 (7 self)
a coloring is called the chromatic number of $G$, and is usually denoted by $\chi(G)$. Determining the chromatic number of a graph is known to be NP-hard (cf. [19]). Besides its theoretical significance as a canonical NP-hard problem, graph coloring arises naturally in a variety of applications such as register allocation [11, 12, 13] and timetable/examination scheduling [8, 40]. In many ...
We consider the problem of coloring $k$-colorable graphs with the fewest possible colors. We give a randomized polynomial time algorithm which colors a 3-colorable graph on $n$ vertices with […] colors, where $\Delta$ is the maximum degree of any vertex. Besides giving the best known approximation ratio in terms of $n$, this marks the first non-trivial approximation result as a function of the maximum degree $\Delta$. This result can be generalized to $k$-colorable graphs to obtain a coloring using […] colors. Our results are inspired by the recent work of Goemans and Williamson, who used an algorithm for semidefinite optimization problems, which generalize linear programs, to obtain improved approximations for the MAX CUT and MAX 2-SAT problems. An intriguing outcome of our work is a duality relationship established between the value of the optimum solution to our semidefinite program and the Lovász $\vartheta$-function. We show lower bounds on the gap between the optimum solution of our semidefinite program and the actual chromatic number; by duality this also demonstrates interesting new facts about the $\vartheta$-function.
Register Allocation via Graph Coloring, 1992
Cited by 133 (4 self)
Chaitin and his colleagues at IBM in Yorktown Heights built the first global register allocator based on graph coloring. This thesis describes a series of improvements and extensions to the Yorktown allocator. There are four primary results. Optimistic coloring: Chaitin's coloring heuristic pessimistically assumes any node of high degree will not be colored and must therefore be spilled. By optimistically assuming that nodes of high degree will receive colors, I often achieve lower spill costs and faster code; my results are never worse. Coloring pairs: The pessimism of Chaitin's coloring heuristic is emphasized when trying to color register pairs. My heuristic handles pairs as a natural consequence of its optimism. Rematerialization: Chaitin et al. introduced the idea of rematerialization to avoid the expense of spilling and reloading certain simple values. By propagating rematerialization information around the SSA graph using a simple variation of Wegman and Zadeck's constant propag...
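The difference between the pessimistic and optimistic heuristics described above can be sketched on a toy interference graph. This is an illustrative simplification, not the thesis's allocator: the function `color_graph`, the graph encoding, and the choice of the highest-degree node as spill candidate are all assumptions (a real allocator would weigh spill costs). When no node has degree below `k`, the pessimistic variant spills the candidate immediately, while the optimistic variant pushes it on the stack and only spills if no color is free during selection.

```python
def color_graph(adj, k, optimistic=True):
    """Color an interference graph {node: set(neighbors)} with k colors.

    Returns (colors, spilled). Simplify/select sketch; the spill candidate is
    simply the highest-degree remaining node (an assumption for illustration).
    """
    degree = {n: len(adj[n]) for n in adj}
    remaining = set(adj)
    stack, spilled = [], set()

    def remove(n):
        # Take n out of the graph and lower its remaining neighbors' degrees.
        remaining.discard(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1

    # Simplify phase: repeatedly remove nodes from the graph.
    while remaining:
        low = [n for n in sorted(remaining) if degree[n] < k]
        if low:
            n = low[0]
            stack.append(n)          # trivially colorable later
        else:
            n = max(sorted(remaining), key=lambda x: degree[x])
            if optimistic:
                stack.append(n)      # defer the spill decision to select
            else:
                spilled.add(n)       # pessimistic: spill right away
        remove(n)

    # Select phase: pop nodes and assign the lowest free color.
    colors = {}
    while stack:
        n = stack.pop()
        used = {colors[m] for m in adj[n] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[n] = free[0]
        else:
            spilled.add(n)           # optimism failed for this node
    return colors, spilled
```

On a four-node cycle with `k = 2`, every node has degree 2, so the pessimistic variant spills one node even though the cycle is 2-colorable; the optimistic variant colors all four.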
The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages, 1992
Cited by 120 (15 self)
Object-oriented programming languages promise to improve programmer productivity by supporting abstract data types, inheritance, and message passing directly within the language. Unfortunately, traditional implementations of object-oriented language features, particularly message passing, have been much slower than traditional implementations of their non-object-oriented counterparts: the fastest existing implementation of Smalltalk-80 runs at only a tenth the speed of an optimizing C implementation. The dearth of suitable implementation technology has forced most object-oriented languages to be designed as hybrids with traditional non-object-oriented languages, complicating the languages and making programs harder to extend and reuse. This dissertation describes a collection of implementation techniques that can improve the run-time performance of object-oriented languages, in hopes of reducing the need for hybrid languages and encouraging wider spread of purely object-oriented langu...
Register allocation via hierarchical graph coloring - In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, 1991
Cited by 93 (0 self)
We present a graph coloring register allocator designed to minimize the number of dynamic memory references. We cover the program with sets of blocks called tiles and group these tiles into a tree reflecting the program's hierarchical control structure. Registers are allocated for each tile using standard graph coloring techniques, and the local allocation and conflict information is passed around the tree in a two-phase algorithm. This results in an allocation of registers that is sensitive to local usage patterns while retaining a global perspective. Spill code is placed in less frequently executed portions of the program, and the choice of variables to spill is based on usage patterns between the spills and the reloads rather than usage patterns over the entire program.
Software pipelining showdown: Optimal vs. heuristic methods in a production compiler - In Proc. of the ACM SIGPLAN '96 Conf. on Programming Languages Design and Implementation, 1996
Cited by 53 (9 self)
This paper is a scientific comparison of two code generation techniques with identical goals: generation of the best possible software pipelined code for computers with instruction level parallelism. Both are variants of modulo scheduling, a framework for generation of software pipelines pioneered by Rau and Glaser [RaGl81], but are otherwise quite dissimilar. One technique was developed at Silicon Graphics and is used in the MIPSpro compiler. This is the production compiler for SGI's systems which are based on the MIPS R8000 processor [Hsu94]. It is essentially a branch-and-bound enumeration of possible schedules with extensive pruning. This method is heuristic because of the way it prunes and also because of the interaction between register allocation and scheduling. The second technique aims to produce optimal results by formulat-
A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs - In International Workshop on Compiler Construction, Paderborn, 1993
Cited by 52 (13 self)
In this paper, we propose the use of cyclic interval graphs as an alternative representation for register allocation. The "thickness" of the cyclic interval graph captures the notion of overlap between live ranges of variables relative to each particular point of time in the program execution. We demonstrate that cyclic interval graphs provide a feasible and effective representation that accurately captures the periodic nature of live ranges found in loops. A new heuristic algorithm for minimum register allocation, the fat cover algorithm, has been developed and implemented to exploit such program structure. In addition, a new spilling algorithm is proposed that makes use of the extra information available in the interval graph representation. These two algorithms work together to provide a two-phase register allocation process that does not require iteration of the spilling or coloring phases. We extend the notion of cyclic interval graphs to hierarchical cyclic interval graphs and we...
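The notion of "thickness" can be made concrete with a small sketch. This is not the paper's fat cover algorithm; it only counts how many live ranges overlap at each time step of a loop body. The representation is an assumption made for illustration: time steps run from `0` to `period - 1`, each live range is a half-open pair `(start, end)`, and a pair with `start > end` wraps around the loop back-edge. A range spanning the whole cycle is not representable in this encoding.

```python
def covers(interval, t):
    """True if time step t lies inside the half-open cyclic interval (start, end)."""
    start, end = interval
    if start < end:
        return start <= t < end
    return t >= start or t < end      # interval wraps past the end of the loop body


def thickness(intervals, period):
    """Number of live ranges overlapping each time step 0 .. period - 1."""
    return [sum(covers(iv, t) for iv in intervals) for t in range(period)]


# Hypothetical live ranges in a loop body of four time steps.
live_ranges = [(0, 2), (1, 3), (3, 1)]   # the last one is live across the back-edge
profile = thickness(live_ranges, 4)
```

The maximum of `profile` is a lower bound on the number of registers needed to hold all of these ranges without spilling.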
Structured Programs have Small Tree-Width and Good Register Allocation - Information and Computation, 1995
Cited by 47 (1 self)
The register allocation problem for an imperative program is often modelled as the coloring problem of the interference graph of the control-flow graph of the program. The interference graph of a flow graph G is the intersection graph of some connected subgraphs of G. These connected subgraphs represent the lives, or life times, of variables, so the coloring problem models that two variables with overlapping life times should be in different registers. For general programs with unrestricted gotos, the interference graph can be any graph, and hence we cannot in general color within a factor $O(n^{\varepsilon})$ from optimality unless NP=P. It is shown that if a graph has tree-width $k$, we can efficiently color any intersection graph of connected subgraphs within a factor $(\lfloor k/2 \rfloor + 1)$ from optimality. Moreover, it is shown that structured (i.e., goto-free) programs, including, for example, short circuit evaluations and multiple exits from loops, have tree-width at most 6. Thus, for every structured progr...
Coloring Random and Semi-Random k-Colorable Graphs, 1995
Cited by 44 (0 self)
The problem of coloring a graph with the minimum number of colors is well known to be NP-hard, even restricted to k-colorable graphs for constant $k \geq 3$. On the other hand, it is known that random k-colorable graphs are easy to k-color. The algorithms for coloring random k-colorable graphs require fairly high edge densities, however. In this paper we present algorithms that color randomly generated k-colorable graphs for much lower edge densities than previous approaches. In addition, to study a wider variety of graph distributions, we also present a model of graphs generated by the semi-random source of Santha and Vazirani that provides a smooth transition between the worst-case and random models. In this model, the graph is generated by a "noisy adversary": an adversary whose decisions (whether or not to insert a particular edge) have some small (random) probability of being reversed. We show that even for quite low noise rates, semi-random k-colorable graphs can be optimally colored with high probability.
Optimization of Array Accesses by Collective Loop Transformations, 1990
Cited by 36 (2 self)
In this paper, we investigate the problem of optimizing array accesses across a collection of loops. We demonstrate that a good solution to such a problem should be based on an optimization scheme, called collective loop transformations, that considers all loops simultaneously. In particular, loop reversal, loop interchange and loop fusion are performed collectively on a set of loop nests. The main impact of these transformations is an optimization called array contraction, that saves space and time by converting an array variable into a scalar variable or a buffer containing a small number of scalar variables. This optimization is applicable to general-purpose high-performance architectures. For a multiprocessor architecture, array contraction is performed by executing the producer and consumer loops on separate processors, and by using a smaller buffer for the array communication. For a uniprocessor architecture, array contraction is performed by fusing the producer and consumer loop...
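Array contraction after loop fusion can be shown with a small before-and-after pair. The sketch below is hand-written for illustration and does not reproduce the paper's collective transformation scheme; the function names and the arithmetic are hypothetical. A producer loop fills a temporary array that a consumer loop then reads element by element; once the loops are fused, each element is consumed in the same iteration that produces it, so the temporary array contracts to a single scalar.

```python
def producer_consumer(a):
    """Original form: a full temporary array passes values between two loops."""
    tmp = [0] * len(a)
    for i in range(len(a)):      # producer loop
        tmp[i] = a[i] * 2
    out = [0] * len(a)
    for i in range(len(a)):      # consumer loop
        out[i] = tmp[i] + 1
    return out


def fused_contracted(a):
    """After fusion, tmp[i] is produced and consumed in one iteration,
    so the array tmp is contracted to the scalar t."""
    out = [0] * len(a)
    for i in range(len(a)):
        t = a[i] * 2             # scalar replaces tmp[i]
        out[i] = t + 1
    return out
```

The fused version saves the storage for `tmp` and the second pass over memory, which is the space and time benefit the abstract attributes to array contraction.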

