Optimistic Register Coalescing
- In Proceedings of the 1998 International Conference on Parallel Architecture and Compilation Techniques
, 1998
Cited by 34 (1 self)
Graph-coloring register allocators eliminate copies by coalescing the source and target node of a copy if they do not interfere in the interference graph. Coalescing is, however, known to be harmful to the colorability of the graph because it tends to yield a graph with nodes of higher degrees. Unlike aggressive coalescing, which coalesces any pair of non-interfering copy-related nodes, conservative coalescing or iterated coalescing perform safe coalescing that preserves the colorability. Unfortunately, these heuristics give up coalescing too early, losing many opportunities for coalescing that would turn out to be safe. Moreover, they ignore the fact that coalescing may even improve the colorability of the graph by reducing the degree of neighbor nodes that are interfering with both the source and target nodes being coalesced. This paper proposes a new heuristic called optimistic coalescing which optimistically performs aggressive coalescing, thus fully exploiting the positive impact of ...
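The effect described in this abstract, where merging two copy-related nodes lowers the degree of a neighbor that interferes with both, can be illustrated with a small sketch. This is not the paper's algorithm: the graph representation (a dict from node name to a set of neighbors), the function names `coalesce` and `briggs_safe`, and the example graph are all assumptions made for illustration. `briggs_safe` implements the conservative test as it is commonly described in the literature (the merged node must have fewer than k neighbors of degree k or more).

```python
def coalesce(graph, a, b):
    """Return a new interference graph in which node b is merged into node a.

    `graph` maps each node to the set of nodes it interferes with.
    The representation and names are illustrative assumptions.
    """
    if b in graph[a]:
        raise ValueError("interfering nodes cannot be coalesced")
    merged = {}
    for node, nbrs in graph.items():
        if node == b:
            continue
        # Any edge that pointed at b now points at a; duplicates collapse.
        merged[node] = {a if n == b else n for n in nbrs}
    merged[a] = (graph[a] | graph[b]) - {a, b}
    return merged


def briggs_safe(graph, a, b, k):
    """Conservative test: the merged node must have fewer than k
    neighbors of significant degree (degree >= k)."""
    merged = coalesce(graph, a, b)
    significant = [n for n in merged[a] if len(merged[n]) >= k]
    return len(significant) < k


# Hypothetical example: x and y are copy-related; n interferes with both.
graph = {
    "x": {"n"},
    "y": {"n"},
    "n": {"x", "y", "m"},
    "m": {"n"},
}
after = coalesce(graph, "x", "y")
print(len(graph["n"]), len(after["n"]))
```

In this example the degree of `n` drops from 3 to 2 after the merge, which is the positive impact on colorability that the abstract says conservative heuristics overlook.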
Register allocation: what does the NP-Completeness proof of Chaitin et al. really prove? Or revisiting register allocation: why and how
- In Proc. of the 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC ’06)
, 2006
Cited by 9 (3 self)
Register allocation is one of the most studied problems in compilation. It is considered as an NP-complete problem since Chaitin et al., in 1981, modeled the problem of assigning temporary variables to k machine registers as the problem of coloring, with k colors, the interference graph associated with the variables. The fact that the interference graph can be arbitrary proves the NP-completeness of this formulation. However, this original proof does not really show where the complexity of register allocation comes from. Recently, the re-discovery that interference graphs of SSA programs can be colored in polynomial time raised the question: Can we exploit SSA form to perform register allocation in polynomial time, without contradicting Chaitin et al.'s NP-completeness result? To address such a question and, more generally, the complexity of register allocation, we revisit Chaitin et al.'s proof to better identify the interactions between spilling (load/store insertion), coalescing/splitting (removal/insertion of moves between registers), critical edges (a property of the control-flow graph), and coloring (assignment to registers). In particular, we show that, in general (we will make clear when), it is easy to decide if temporary variables can be assigned to k registers or if some spilling is necessary. In other words, the real complexity does not come from the coloring itself (as a wrong interpretation of the proof of Chaitin et al. may suggest) but comes from the presence of critical edges and from the optimizations of spilling and coalescing.
On the complexity of register coalescing
- In Proc. of the International Symposium on Code Generation and Optimization (CGO ’07)
, 2006
Cited by 8 (2 self)
Memory transfers are becoming more important to optimize, for both performance and power consumption. With this goal in mind, new register allocation schemes are developed, which revisit not only the spilling problem but also the coalescing problem. Indeed, a more aggressive strategy to avoid load/store instructions may increase the constraints to suppress (coalesce) move instructions. This paper is devoted to the complexity of the coalescing phase, in particular in the light of recent developments on the SSA form. We distinguish several optimizations that occur in coalescing heuristics: a) aggressive coalescing removes as many moves as possible, regardless of the colorability of the resulting interference graph; b) conservative coalescing removes as many moves as possible while keeping the colorability of the graph; c) incremental conservative coalescing removes one particular move while keeping the colorability of the graph; d) optimistic coalescing coalesces moves aggressively, then gives up as few moves as possible so that the graph becomes colorable again. We almost completely classify the NP-completeness of these problems, also discussing the structure of the interference graph: arbitrary, chordal, or k-colorable in a greedy fashion. We believe that such a study is a necessary step for designing new coalescing strategies.
A Progressive Register Allocator for Irregular Architectures
Cited by 7 (4 self)
... a compiler performs. Conventional graph-coloring based register allocators are fast and do well on regular, RISC-like, architectures, but perform poorly on irregular, CISC-like, architectures with few registers and nonorthogonal instruction sets. At the other extreme, optimal register allocators based on integer linear programming are capable of fully modeling and exploiting the peculiarities of irregular architectures but do not scale well. We introduce the idea of a progressive allocator. A progressive allocator finds an initial allocation of quality comparable to a conventional allocator, but as more time is allowed for computation the quality of the allocation approaches optimal. This paper presents a progressive register allocator which uses a multi-commodity network flow model to elegantly represent the intricacies of irregular architectures. We evaluate our allocator as a substitute for gcc's local register allocation pass.
Register Allocation and Optimal Spill Code Scheduling in Software Pipelined Loops Using 0-1 Integer Linear Programming Formulation
Cited by 4 (0 self)
In achieving higher instruction level parallelism, software pipelining increases the register pressure in the loop. The usefulness of the generated schedule may be restricted to cases where the register pressure is less than the available number of registers. Spill instructions need to be introduced otherwise. But scheduling these spill instructions in the compact schedule is a difficult task. Several heuristics have been proposed to schedule spill code. These heuristics may generate more spill code than necessary, and scheduling them may necessitate increasing the initiation interval. We model the problem of register allocation with spill code generation and scheduling in software pipelined loops as a 0-1 integer linear program. The formulation minimizes the increase in initiation interval (II) by optimally placing spill code and simultaneously minimizes the amount of spill code produced. To the best of our knowledge, this is the first integrated formulation for register allocation, optimal spill code generation and scheduling for software pipelined loops. The proposed formulation performs better than the existing heuristics by preventing an increase in II in 11.11% of the loops and generating 18.48% less spill code on average among the loops extracted from Perfect Club and SPEC benchmarks with a moderate increase in compilation time.
Revisiting graph coloring register allocation: A study of the Chaitin-Briggs and Callahan-Koblenz algorithms
- In Proc. of the Workshop on Languages and Compilers for Parallel Computing (LCPC’05)
, 2005
Cited by 2 (0 self)
Techniques for global register allocation via graph coloring have been extensively studied and widely implemented in compiler frameworks. This paper examines a particular variant – the Callahan-Koblenz allocator – and compares it to the Chaitin-Briggs graph coloring register allocator. Both algorithms were published in the 1990s, yet the academic literature does not contain an assessment of the Callahan-Koblenz allocator. This paper evaluates and contrasts the allocation decisions made by both algorithms. In particular, we focus on two key differences between the allocators: Spill code: The Callahan-Koblenz allocator attempts to minimize the effect of spill code by using program structure to guide allocation and spill code placement. We evaluate the impact of this strategy on allocated code. Copy elimination: Effective register-to-register copy removal is important for producing good code. The allocators use different techniques to eliminate these copies. We compare the mechanisms and provide insights into the relative performance of the contrasting techniques. The Callahan-Koblenz allocator may potentially insert extra branches as part of the allocation process. We also measure the performance overhead due to these branches.
An analysis of graph coloring register allocation
, 2006
Cited by 2 (1 self)
Graph coloring is the de facto standard technique for register allocation within a compiler. In this paper we examine the importance of the quality of the coloring algorithm and various extensions of the basic graph coloring technique by replacing the coloring phase of the GNU compiler’s register allocator with an optimal coloring algorithm. We then extend this optimal algorithm to incorporate various extensions such as coalescing and preferential register assignment. We find that using an optimal coloring algorithm has surprisingly little benefit and empirically demonstrate the benefit of the various extensions.
Effective Instruction Scheduling with Limited Registers
, 2001
Cited by 2 (0 self)
Effective global instruction scheduling techniques have become an important component in modern compilers for exposing more instruction-level parallelism (ILP) and exploiting the ever-increasing number of parallel function units. Effective register allocation has long been an essential component of a good compiler for reducing memory references. While instruction scheduling and register allocation are both essential compiler optimizations for fully exploiting the capability of modern high-performance microprocessors, there is a phase-ordering problem when we perform these two optimizations separately: instruction scheduling before register allocation may create insatiable demands for registers; register allocation before instruction scheduling may reduce the amount of parallelism that instruction scheduling can exploit. In this thesis, we propose to solve this phase-ordering problem by inserting a moderating optimization called code reorganization between prepass instruction scheduling and register allocation. Code reorganization adjusts the prepass scheduling results to make them demand fewer registers (i.e. exhibit lower register pressure) and guides register allocation to insert spill code that has less impact on schedule length. Our new approach avoids the complexity of simultaneous instruction scheduling and register allocation algorithms. In fact, it does not modify either instruction scheduling or register allocation algorithms. Therefore instruction scheduling can focus on maximizing instruction-level parallelism, and register allocation can focus on minimizing the cost of spill code. We compare the performance of our approach with a particular successful register-pressure-sensitive scheduling algorithm, and show an average of 18% improvement in speedup for an 8...
Compiler Optimizations for Nondeferred Reference-Counting Garbage Collection
- In Proceedings of the International Symposium on Memory Management
, 2006
Cited by 2 (1 self)
Reference counting is a well-known technique for automatic memory management, offering unique advantages over other forms of garbage collection. However, on account of the high costs associated with the maintenance of up-to-date tallies of references from the stack, deferred variants are typically used in modern implementations. This partially sacrifices some of the benefits of nondeferred reference-counting (RC) garbage collection, like the immediate reclamation of garbage and short collector pause times. This paper presents a series of optimizations that target the stack and substantially enhance the throughput of nondeferred RC collection. A key enabler is a new static analysis and optimization called RC subsumption that significantly reduces the overhead of maintaining the stack contribution to reference counts. We report execution time improvements on a benchmark suite of ten C# programs, and show how RC subsumption, aided with other optimizations, improves the performance of nondeferred RC collection by as much as a factor of 10, making possible running times that are within 32% of that with an advanced traversal-based collector on seven programs, and 19% of that with a deferred RC collector on eight programs. This is in the context of a baseline RC implementation that is typically at least a factor of 6 slower than the tracing collector and a factor of 5 slower than the deferred RC collector.
A combined algorithm for graph-coloring in register allocation
- Proceedings of the Computational Symposium on Graph Coloring and its Generalizations
, 2002
Cited by 2 (0 self)
Our research involves improved algorithms for graph-coloring, in the context of register allocation. We extend the usual algorithm, first proposed by Chaitin, adding two further routines, one a form of semi-randomized greedy allocation of colors, and the second using local search with random restarts, a method developed in the context of logical satisfiability problems. For typical register-set sizes, the extended algorithm can color graphs using significantly less time and spilling significantly smaller numbers of nodes to memory than does the Chaitin algorithm; these advantages become less pronounced, as register-set size decreases, and Chaitin can offer some small measure of better performance for very small sizes. Our algorithm has some interesting additional features. On the one hand, it can be extended to the general graph-coloring problem, outside of the particular application considered. In addition, it makes available adjustable parameters for producing more- or less-optimal executables during compiler runs.
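For readers unfamiliar with the baseline this abstract extends, the following is a minimal sketch of a Chaitin-style simplify/select coloring loop. It is not the paper's combined algorithm and omits its randomized and local-search routines. The function name `chaitin_color`, the dict-of-sets graph representation, and the spill heuristic (highest remaining degree) are assumptions chosen for brevity; a real allocator would insert spill code and rebuild the graph rather than simply setting spilled nodes aside.

```python
def chaitin_color(graph, k):
    """Color an interference graph with at most k colors, Chaitin-style.

    `graph` maps each node to the set of nodes it interferes with.
    Returns (coloring, spilled): a dict node -> color in range(k) and a
    list of nodes set aside as spill candidates.
    """
    work = {n: set(nbrs) for n, nbrs in graph.items()}
    stack, spilled = [], []
    while work:
        # Simplify: a node with fewer than k remaining neighbors can
        # always be colored later, so remove it and push it on the stack.
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            # No such node: pick a spill candidate (assumed heuristic:
            # highest remaining degree) and drop it from the graph.
            node = max(sorted(work), key=lambda n: len(work[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for nbr in work.pop(node):
            work[nbr].discard(node)
    # Select: pop nodes in reverse removal order and give each the
    # lowest color not used by its already-colored neighbors.
    coloring = {}
    while stack:
        node = stack.pop()
        used = {coloring[n] for n in graph[node] if n in coloring}
        coloring[node] = next(c for c in range(k) if c not in used)
    return coloring, spilled


# Hypothetical example: a triangle needs three registers.
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
print(chaitin_color(triangle, 3))
print(chaitin_color(triangle, 2))
```

With three colors the triangle is colored without spilling; with two, one node is set aside and the remaining pair receives distinct colors.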

