Results 1-10 of 26
Optimistic Register Coalescing
In Proceedings of the 1998 International Conference on Parallel Architecture and Compilation Techniques, 1998
Cited by 51 (1 self)
Graph-coloring register allocators eliminate copies by coalescing the source and target node of a copy if they do not interfere in the interference graph. Coalescing is, however, known to be harmful to the colorability of the graph because it tends to yield a graph with nodes of higher degrees. Unlike aggressive coalescing, which coalesces any pair of non-interfering copy-related nodes, conservative coalescing or iterated coalescing perform safe coalescing that preserves the colorability. Unfortunately, these heuristics give up coalescing too early, losing many opportunities of coalescing that would turn out to be safe. Moreover, they ignore the fact that coalescing may even improve the colorability of the graph by reducing the degree of neighbor nodes that are interfering with both the source and target nodes being coalesced. This paper proposes a new heuristic called optimistic coalescing which optimistically performs aggressive coalescing, thus fully exploiting the positive impact of ...
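The two effects this abstract contrasts can be illustrated with a minimal sketch. It assumes an interference graph stored as a dictionary of neighbor sets, and it uses the conservative test usually attributed to Briggs et al. (merge only if the merged node has fewer than k neighbors of degree at least k) as one example of a "safe" criterion. The function names `coalesce` and `briggs_safe` are hypothetical, and none of this is taken from the paper itself.

```python
def coalesce(graph, a, b):
    """Return a new interference graph in which node b is merged into node a.

    graph maps each node to the set of nodes it interferes with (symmetric).
    """
    if b in graph[a]:
        raise ValueError("interfering nodes cannot be coalesced")
    merged = {}
    for node, neighbors in graph.items():
        if node in (a, b):
            continue
        # Edges that pointed at b now point at a; the set removes duplicates.
        merged[node] = {a if n == b else n for n in neighbors}
    merged[a] = graph[a] | graph[b]
    return merged


def briggs_safe(graph, a, b, k):
    """Conservative test: the merged node must have fewer than k
    neighbors whose degree (after merging) is at least k."""
    merged = coalesce(graph, a, b)
    significant = sum(1 for n in merged[a] if len(merged[n]) >= k)
    return significant < k


# a and b are copy-related and do not interfere; x interferes with both.
graph = {
    "a": {"x", "y"},
    "b": {"x"},
    "x": {"a", "b"},
    "y": {"a"},
}
merged = coalesce(graph, "a", "b")
# The degree of x drops from 2 to 1 because its two edges collapse into one.
print(len(graph["x"]), len(merged["x"]))
```

The merged node takes the union of both neighbor sets, which is how coalescing can raise degrees, while any node adjacent to both endpoints loses one edge, which is the positive effect the abstract points out.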
On the complexity of register coalescing
In Proc. of the International Symposium on Code Generation and Optimization (CGO ’07), 2006
Cited by 15 (4 self)
Memory transfers are becoming more important to optimize, for both performance and power consumption. With this goal in mind, new register allocation schemes are developed, which revisit not only the spilling problem but also the coalescing problem. Indeed, a more aggressive strategy to avoid load/store instructions may increase the constraints to suppress (coalesce) move instructions. This paper is devoted to the complexity of the coalescing phase, in particular in the light of recent developments on the SSA form. We distinguish several optimizations that occur in coalescing heuristics: a) aggressive coalescing removes as many moves as possible, regardless of the colorability of the resulting interference graph; b) conservative coalescing removes as many moves as possible while keeping the colorability of the graph; c) incremental conservative coalescing removes one particular move while keeping the colorability of the graph; d) optimistic coalescing coalesces moves aggressively, then gives up about as few moves as possible so that the graph becomes colorable again. We almost completely classify the NP-completeness of these problems, discussing also on the structure of the interference graph: arbitrary, chordal, or k-colorable in a greedy fashion. We believe that such a study is a necessary step for designing new coalescing strategies.
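As a rough illustration of what "k-colorable in a greedy fashion" usually means, the sketch below repeatedly removes a node with fewer than k remaining neighbors and, if the graph empties, colors the nodes in reverse removal order. The graph representation (a dictionary of neighbor sets) and the function names are assumptions made for this example, not the paper's notation.

```python
def greedy_elimination_order(graph, k):
    """Return a removal order if the graph is greedy-k-colorable, else None."""
    remaining = {node: set(adj) for node, adj in graph.items()}
    order = []
    while remaining:
        # Pick any node that currently has fewer than k neighbors.
        candidate = next(
            (n for n, adj in remaining.items() if len(adj) < k), None
        )
        if candidate is None:
            return None  # every remaining node has degree >= k
        for neighbor in remaining.pop(candidate):
            remaining[neighbor].discard(candidate)
        order.append(candidate)
    return order


def greedy_color(graph, k):
    """Color with at most k colors by reinserting nodes in reverse order."""
    order = greedy_elimination_order(graph, k)
    if order is None:
        return None
    colors = {}
    for node in reversed(order):
        used = {colors[m] for m in graph[node] if m in colors}
        # Fewer than k neighbors are colored at this point, so a color is free.
        colors[node] = next(c for c in range(k) if c not in used)
    return colors


# A 4-cycle is 2-colorable, yet the greedy scheme fails for k = 2
# because every node has degree 2.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
two = greedy_color(cycle, 2)
three = greedy_color(cycle, 3)
```

The 4-cycle shows that greedy k-colorability is a stricter property than k-colorability, which is one reason the abstract treats it as a separate class of interference graphs.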
Register allocation: what does the NP-Completeness proof of Chaitin et al. really prove?
In Proc. of the 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC ’06), 2006
Cited by 12 (5 self)
Register allocation is one of the most studied problems in compilation. It is considered as an NP-complete problem since Chaitin et al., in 1981, modeled the problem of assigning temporary variables to k machine registers as the problem of coloring, with k colors, the interference graph associated to the variables. The fact that the interference graph can be arbitrary proves the NP-completeness of this formulation. However, this original proof does not really show where the complexity of register allocation comes from. Recently, the rediscovery that interference graphs of SSA programs can be colored in polynomial time raised the question: Can we exploit SSA form to perform register allocation in polynomial time, without contradicting Chaitin et al.’s NP-completeness result? To address such a question and, more generally, the complexity of register allocation, we revisit Chaitin et al.’s proof to better identify the interactions between spilling (load/store insertion), coalescing/splitting (removal/insertion of moves between registers), critical edges (a property of the control-flow graph), and coloring (assignment to registers). In particular, we show that, in general (we will make clear when), it is easy to decide if temporary variables can be assigned to k registers or if some spilling is necessary. In other words, the real complexity does not come from the coloring itself (as a wrong interpretation of the proof of Chaitin et al. may suggest) but comes from the presence of critical edges and from the optimizations of spilling and coalescing.
A progressive register allocator for irregular architectures
In Proceedings of the International Symposium on Code Generation and Optimization, CGO ’05, 2005
Cited by 11 (4 self)
Register allocation is one of the most important optimizations a compiler performs. Conventional graph-coloring based register allocators are fast and do well on regular, RISC-like, architectures, but perform poorly on irregular, CISC-like, architectures with few registers and non-orthogonal instruction sets. At the other extreme, optimal register allocators based on integer linear programming are capable of fully modeling and exploiting the peculiarities of irregular architectures but do not scale well. We introduce the idea of a progressive allocator. A progressive allocator finds an initial allocation of quality comparable to a conventional allocator, but as more time is allowed for computation the quality of the allocation approaches optimal. This paper presents a progressive register allocator which uses a multi-commodity network flow model to elegantly represent the intricacies of irregular architectures. We evaluate our allocator as a substitute for gcc’s local register allocation pass.
Tailoring graph-coloring register allocation for runtime compilation
In International Symposium on Code Generation and Optimization (CGO ’06), 2006
Cited by 8 (0 self)
Just-in-time compilers are invoked during application execution and therefore need to ensure fast compilation times. Consequently, runtime compiler designers are averse to implementing compile-time intensive optimization algorithms. Instead, they tend to select faster but less effective transformations. In this paper, we explore this tradeoff for an important optimization – global register allocation. We present a graph-coloring register allocator that has been redesigned for runtime compilation. Compared to Chaitin-Briggs [7], a standard graph-coloring technique, the reformulated algorithm requires considerably less allocation time and produces allocations that are only marginally worse than those of Chaitin-Briggs. Our experimental results indicate that the allocator performs better than the linear-scan and Chaitin-Briggs allocators on most benchmarks in a runtime compilation environment. By increasing allocation efficiency and preserving optimization quality, the presented algorithm increases the suitability and profitability of a graph-coloring register allocation strategy for a runtime compiler.
Register Allocation and Optimal Spill Code Scheduling in Software Pipelined Loops Using 0-1 Integer Linear Programming Formulation
Cited by 6 (0 self)
In achieving higher instruction level parallelism, software pipelining increases the register pressure in the loop. The usefulness of the generated schedule may be restricted to cases where the register pressure is less than the available number of registers. Spill instructions need to be introduced otherwise. But scheduling these spill instructions in the compact schedule is a difficult task. Several heuristics have been proposed to schedule spill code. These heuristics may generate more spill code than necessary, and scheduling them may necessitate increasing the initiation interval. We model the problem of register allocation with spill code generation and scheduling in software pipelined loops as a 0-1 integer linear program. The formulation minimizes the increase in initiation interval (II) by optimally placing spill code and simultaneously minimizes the amount of spill code produced. To the best of our knowledge, this is the first integrated formulation for register allocation, optimal spill code generation and scheduling for software pipelined loops. The proposed formulation performs better than the existing heuristics by preventing an increase in II in 11.11% of the loops and generating 18.48% less spill code on average among the loops extracted from Perfect Club and SPEC benchmarks with a moderate increase in compilation time.
Revisiting graph coloring register allocation: A study of the Chaitin-Briggs and Callahan-Koblenz algorithms
In Proc. of the Workshop on Languages and Compilers for Parallel Computing (LCPC ’05), 2005
Cited by 4 (0 self)
Techniques for global register allocation via graph coloring have been extensively studied and widely implemented in compiler frameworks. This paper examines a particular variant – the Callahan-Koblenz allocator – and compares it to the Chaitin-Briggs graph coloring register allocator. Both algorithms were published in the 1990’s, yet the academic literature does not contain an assessment of the Callahan-Koblenz allocator. This paper evaluates and contrasts the allocation decisions made by both algorithms. In particular, we focus on two key differences between the allocators: Spill code: The Callahan-Koblenz allocator attempts to minimize the effect of spill code by using program structure to guide allocation and spill code placement. We evaluate the impact of this strategy on allocated code. Copy elimination: Effective register-to-register copy removal is important for producing good code. The allocators use different techniques to eliminate these copies. We compare the mechanisms and provide insights into the relative performance of the contrasting techniques. The Callahan-Koblenz allocator may potentially insert extra branches as part of the allocation process. We also measure the performance overhead due to these branches.
Effective Instruction Scheduling with Limited Registers
2001
Cited by 3 (0 self)
Effective global instruction scheduling techniques have become an important component in modern compilers for exposing more instruction-level parallelism (ILP) and exploiting the ever-increasing number of parallel function units. Effective register allocation has long been an essential component of a good compiler for reducing memory references. While instruction scheduling and register allocation are both essential compiler optimizations for fully exploiting the capability of modern high-performance microprocessors, there is a phase-ordering problem when we perform these two optimizations separately: instruction scheduling before register allocation may create insatiable demands for registers; register allocation before instruction scheduling may reduce the amount of parallelism that instruction scheduling can exploit. In this thesis, we propose to solve this phase-ordering problem by inserting a moderating optimization called code reorganization between prepass instruction scheduling and register allocation. Code reorganization adjusts the prepass scheduling results to make them demand fewer registers (i.e. exhibit lower register pressure) and guides register allocation to insert spill code that has less impact on schedule length. Our new approach avoids the complexity of simultaneous instruction scheduling and register allocation algorithms. In fact, it does not modify either instruction scheduling or register allocation algorithms. Therefore instruction scheduling can focus on maximizing instruction-level parallelism, and register allocation can focus on minimizing the cost of spill code. We compare the performance of our approach with a particular successful register-pressure-sensitive scheduling algorithm, and show an average of 18% improvement in speedup for an 8...
A combined algorithm for graph-coloring in register allocation
Proceedings of the Computational Symposium on Graph Coloring and its Generalizations, 2002
Cited by 3 (0 self)
Our research involves improved algorithms for graph-coloring, in the context of register allocation. We extend the usual algorithm, first proposed by Chaitin, adding two further routines, one a form of semi-randomized greedy allocation of colors, and the second using local search with random restarts, a method developed in the context of logical satisfiability problems. For typical register-set sizes, the extended algorithm can color graphs using significantly less time and spilling significantly smaller numbers of nodes to memory than does the Chaitin algorithm; these advantages become less pronounced, as register-set size decreases, and Chaitin can offer some small measure of better performance for very small sizes. Our algorithm has some interesting additional features. On the one hand, it can be extended to the general graph-coloring problem, outside of the particular application considered. In addition, it makes available adjustable parameters for producing more-or-less-optimal executables during compiler runs.
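The abstract does not spell out its routines, so the following is only a simplified stand-in for the general idea of randomized greedy coloring with restarts: color nodes greedily in a random order, treat nodes that find no free color as spilled, and keep the attempt with the fewest spills. The function names, the `restarts` and `seed` parameters, and the spill rule are all assumptions made for this sketch, not the paper's algorithm.

```python
import random


def greedy_in_order(graph, k, order):
    """Color nodes in the given order; nodes with no free color are spilled."""
    colors, spilled = {}, []
    for node in order:
        used = {colors[m] for m in graph[node] if m in colors}
        free = [c for c in range(k) if c not in used]
        if free:
            colors[node] = free[0]
        else:
            spilled.append(node)
    return colors, spilled


def color_with_restarts(graph, k, restarts=20, seed=0):
    """Try several random orders and keep the attempt with the fewest spills."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    nodes = sorted(graph)
    best = None
    for _ in range(restarts):
        rng.shuffle(nodes)
        attempt = greedy_in_order(graph, k, list(nodes))
        if best is None or len(attempt[1]) < len(best[1]):
            best = attempt
        if not best[1]:
            break  # nothing spilled, no need to keep searching
    return best


# A complete graph on four nodes needs four colors, so with k = 3
# exactly one node is spilled whatever order is tried.
k4 = {n: {m for m in range(4) if m != n} for n in range(4)}
colors, spilled = color_with_restarts(k4, 3)
```

A real allocator would weigh spill costs rather than count spilled nodes; the count is used here only to keep the example short.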
Live-range Unsplitting for Faster Optimal Coalescing
Cited by 3 (0 self)
Register allocation is often a two-phase approach: spilling of registers to memory, followed by coalescing of registers. Extreme live-range splitting (i.e. live-range splitting after each statement) enables optimal solutions based on ILP, for both spilling and coalescing. However, while the solutions are easily found for spilling, for coalescing they are more elusive. This difficulty stems from the huge size of interference graphs resulting from live-range splitting. This report focuses on optimal coalescing in the context of extreme live-range splitting. We present some theoretical properties that give rise to an algorithm for reducing interference graphs, while preserving optimality. This reduction consists mainly in finding and removing useless splitting points. It is followed by a graph decomposition based on clique separators. The last optimization consists in two preprocessing rules. Any coalescing technique can be applied after these optimizations. Our optimizations have been tested on a standard benchmark, the optimal coalescing challenge. For this benchmark, the cutting-plane algorithm for optimal coalescing (the only optimal algorithm for coalescing) runs 300 times faster when combined with our optimizations. Moreover, we provide all the solutions of the optimal coalescing challenge, including the 3 instances that were previously unsolved.