Optimistic Register Coalescing
 In Proceedings of the 1998 International Conference on Parallel Architecture and Compilation Techniques
, 1998
Cited by 43 (1 self)
Graph-coloring register allocators eliminate copies by coalescing the source and target node of a copy if they do not interfere in the interference graph. Coalescing is, however, known to be harmful to the colorability of the graph because it tends to yield a graph with nodes of higher degrees. Unlike aggressive coalescing, which coalesces any pair of non-interfering copy-related nodes, conservative coalescing or iterated coalescing perform safe coalescing that preserves the colorability. Unfortunately, these heuristics give up coalescing too early, losing many opportunities of coalescing that would turn out to be safe. Moreover, they ignore the fact that coalescing may even improve the colorability of the graph by reducing the degree of neighbor nodes that are interfering with both the source and target nodes being coalesced. This paper proposes a new heuristic called optimistic coalescing which optimistically performs aggressive coalescing, thus fully exploiting the positive impact of ...
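As a rough illustration of the degree argument in this abstract, the sketch below merges two non-interfering, copy-related nodes in a small interference graph. The function name `coalesce`, the adjacency-set representation, and the example graph are assumptions made for illustration; this is not the paper's implementation.

```python
def coalesce(graph, a, b):
    """Return a new interference graph in which node b is merged into node a.

    graph maps each node to the set of nodes it interferes with
    (hypothetical representation, assumed symmetric).
    """
    if b in graph[a]:
        raise ValueError("interfering nodes cannot be coalesced")
    # Copy every adjacency set except b's, so the input is left untouched.
    merged = {n: set(adj) for n, adj in graph.items() if n != b}
    # The merged node interferes with everything either original node did.
    merged[a] |= graph[b]
    # Redirect b's neighbors to point at the merged node instead.
    for n in graph[b]:
        merged[n].discard(b)
        merged[n].add(a)
    return merged


# Hypothetical graph: a and b are copy-related and both interfere with c.
graph = {
    "a": {"c", "d"},
    "b": {"c", "e"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": {"b"},
}
merged = coalesce(graph, "a", "b")
print(len(graph["c"]), "->", len(merged["c"]))  # degree of the common neighbor c
print(len(graph["a"]), "->", len(merged["a"]))  # degree of the merged node
```

In this example the merged node ends up with a higher degree than either original node (the harmful effect), while the common neighbor `c` loses one edge (the beneficial effect the abstract says conservative heuristics ignore).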
On the complexity of register coalescing
 In Proc. of the International Symposium on Code Generation and Optimization (CGO ’07)
, 2006
Cited by 11 (3 self)
Memory transfers are becoming more important to optimize, for both performance and power consumption. With this goal in mind, new register allocation schemes are developed, which revisit not only the spilling problem but also the coalescing problem. Indeed, a more aggressive strategy to avoid load/store instructions may increase the constraints to suppress (coalesce) move instructions. This paper is devoted to the complexity of the coalescing phase, in particular in the light of recent developments on the SSA form. We distinguish several optimizations that occur in coalescing heuristics: a) aggressive coalescing removes as many moves as possible, regardless of the colorability of the resulting interference graph; b) conservative coalescing removes as many moves as possible while keeping the colorability of the graph; c) incremental conservative coalescing removes one particular move while keeping the colorability of the graph; d) optimistic coalescing coalesces moves aggressively, then gives up about as few moves as possible so that the graph becomes colorable again. We almost completely classify the NP-completeness of these problems, also discussing the structure of the interference graph: arbitrary, chordal, or k-colorable in a greedy fashion. We believe that such a study is a necessary step for designing new coalescing strategies.
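The phrase "k-colorable in a greedy fashion" is commonly read as: the graph can be emptied by repeatedly removing a node that has fewer than k remaining neighbors. The sketch below checks that property under this assumed reading; the function name `is_greedy_k_colorable` and the example graphs are hypothetical and not taken from the paper.

```python
def is_greedy_k_colorable(graph, k):
    """Check whether the graph can be fully simplified with k colors.

    graph maps each node to the set of its neighbors (assumed symmetric).
    Repeatedly removes any node with fewer than k remaining neighbors and
    reports whether every node could be removed this way.
    """
    remaining = {n: set(adj) for n, adj in graph.items()}
    progress = True
    while remaining and progress:
        progress = False
        for n in list(remaining):
            if len(remaining[n]) < k:
                # Removing n lowers the degree of each of its neighbors.
                for m in remaining[n]:
                    remaining[m].discard(n)
                del remaining[n]
                progress = True
    return not remaining


# A 4-cycle is 2-colorable (it is bipartite), but the greedy scheme with
# k = 2 gets stuck because every node has exactly 2 neighbors.
cycle4 = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "a"},
}
print(is_greedy_k_colorable(cycle4, 2))
print(is_greedy_k_colorable(cycle4, 3))
```

The 4-cycle shows why the distinction matters: a graph can be k-colorable without being greedy-k-colorable, so a coalescing test based on the greedy scheme is stricter than one based on colorability in general.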
Register allocation: what does the NP-Completeness proof of Chaitin et al. really prove?
 In Proc. of the 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC ’06)
, 2006
Cited by 11 (4 self)
Register allocation is one of the most studied problems in compilation. It is considered as an NP-complete problem since Chaitin et al., in 1981, modeled the problem of assigning temporary variables to k machine registers as the problem of coloring, with k colors, the interference graph associated to the variables. The fact that the interference graph can be arbitrary proves the NP-completeness of this formulation. However, this original proof does not really show where the complexity of register allocation comes from. Recently, the rediscovery that interference graphs of SSA programs can be colored in polynomial time raised the question: Can we exploit SSA form to perform register allocation in polynomial time, without contradicting Chaitin et al.’s NP-completeness result? To address such a question and, more generally, the complexity of register allocation, we revisit Chaitin et al.’s proof to better identify the interactions between spilling (load/store insertion), coalescing/splitting (removal/insertion of moves between registers), critical edges (a property of the control-flow graph), and coloring (assignment to registers). In particular, we show that, in general (we will make clear when), it is easy to decide if temporary variables can be assigned to k registers or if some spilling is necessary. In other words, the real complexity does not come from the coloring itself (as a wrong interpretation of the proof of Chaitin et al. may suggest) but comes from the presence of critical edges and from the optimizations of spilling and coalescing.
A progressive register allocator for irregular architectures
 In Proceedings of the International Symposium on Code Generation and Optimization, CGO ’05
, 2005
Cited by 8 (4 self)
Register allocation is one of the most important optimizations a compiler performs. Conventional graph-coloring based register allocators are fast and do well on regular, RISC-like, architectures, but perform poorly on irregular, CISC-like, architectures with few registers and non-orthogonal instruction sets. At the other extreme, optimal register allocators based on integer linear programming are capable of fully modeling and exploiting the peculiarities of irregular architectures but do not scale well. We introduce the idea of a progressive allocator. A progressive allocator finds an initial allocation of quality comparable to a conventional allocator, but as more time is allowed for computation the quality of the allocation approaches optimal. This paper presents a progressive register allocator which uses a multi-commodity network flow model to elegantly represent the intricacies of irregular architectures. We evaluate our allocator as a substitute for gcc’s local register allocation pass.
Tailoring graph-coloring register allocation for runtime compilation
 In International Symposium on Code Generation and Optimization (CGO ’06)
, 2006
Cited by 6 (0 self)
Just-in-time compilers are invoked during application execution and therefore need to ensure fast compilation times. Consequently, runtime compiler designers are averse to implementing compile-time intensive optimization algorithms. Instead, they tend to select faster but less effective transformations. In this paper, we explore this trade-off for an important optimization: global register allocation. We present a graph-coloring register allocator that has been redesigned for runtime compilation. Compared to Chaitin-Briggs [7], a standard graph-coloring technique, the reformulated algorithm requires considerably less allocation time and produces allocations that are only marginally worse than those of Chaitin-Briggs. Our experimental results indicate that the allocator performs better than the linear-scan and Chaitin-Briggs allocators on most benchmarks in a runtime compilation environment. By increasing allocation efficiency and preserving optimization quality, the presented algorithm increases the suitability and profitability of a graph-coloring register allocation strategy for a runtime compiler.
Register Allocation and Optimal Spill Code Scheduling in Software Pipelined Loops Using 0-1 Integer Linear Programming Formulation
Cited by 4 (0 self)
Abstract. In achieving higher instruction level parallelism, software pipelining increases the register pressure in the loop. The usefulness of the generated schedule may be restricted to cases where the register pressure is less than the available number of registers. Spill instructions need to be introduced otherwise. But scheduling these spill instructions in the compact schedule is a difficult task. Several heuristics have been proposed to schedule spill code. These heuristics may generate more spill code than necessary, and scheduling them may necessitate increasing the initiation interval. We model the problem of register allocation with spill code generation and scheduling in software pipelined loops as a 0-1 integer linear program. The formulation minimizes the increase in initiation interval (II) by optimally placing spill code and simultaneously minimizes the amount of spill code produced. To the best of our knowledge, this is the first integrated formulation for register allocation, optimal spill code generation and scheduling for software pipelined loops. The proposed formulation performs better than the existing heuristics by preventing an increase in II in 11.11% of the loops and generating 18.48% less spill code on average among the loops extracted from Perfect Club and SPEC benchmarks with a moderate increase in compilation time.
Effective Instruction Scheduling with Limited Registers
, 2001
Cited by 3 (0 self)
Effective global instruction scheduling techniques have become an important component in modern compilers for exposing more instruction-level parallelism (ILP) and exploiting the ever-increasing number of parallel function units. Effective register allocation has long been an essential component of a good compiler for reducing memory references. While instruction scheduling and register allocation are both essential compiler optimizations for fully exploiting the capability of modern high-performance microprocessors, there is a phase-ordering problem when we perform these two optimizations separately: instruction scheduling before register allocation may create insatiable demands for registers; register allocation before instruction scheduling may reduce the amount of parallelism that instruction scheduling can exploit. In this thesis, we propose to solve this phase-ordering problem by inserting a moderating optimization called code reorganization between prepass instruction scheduling and register allocation. Code reorganization adjusts the prepass scheduling results to make them demand fewer registers (i.e. exhibit lower register pressure) and guides register allocation to insert spill code that has less impact on schedule length. Our new approach avoids the complexity of simultaneous instruction scheduling and register allocation algorithms. In fact, it does not modify either instruction scheduling or register allocation algorithms. Therefore instruction scheduling can focus on maximizing instruction-level parallelism, and register allocation can focus on minimizing the cost of spill code. We compare the performance of our approach with a particularly successful register-pressure-sensitive scheduling algorithm, and show an average of 18% improvement in speedup for an 8...
Revisiting graph coloring register allocation: A study of the Chaitin-Briggs and Callahan-Koblenz algorithms
 In Proc. of the Workshop on Languages and Compilers for Parallel Computing (LCPC ’05)
, 2005
Cited by 3 (0 self)
Techniques for global register allocation via graph coloring have been extensively studied and widely implemented in compiler frameworks. This paper examines a particular variant, the Callahan-Koblenz allocator, and compares it to the Chaitin-Briggs graph coloring register allocator. Both algorithms were published in the 1990s, yet the academic literature does not contain an assessment of the Callahan-Koblenz allocator. This paper evaluates and contrasts the allocation decisions made by both algorithms. In particular, we focus on two key differences between the allocators. Spill code: The Callahan-Koblenz allocator attempts to minimize the effect of spill code by using program structure to guide allocation and spill code placement. We evaluate the impact of this strategy on allocated code. Copy elimination: Effective register-to-register copy removal is important for producing good code. The allocators use different techniques to eliminate these copies. We compare the mechanisms and provide insights into the relative performance of the contrasting techniques. The Callahan-Koblenz allocator may potentially insert extra branches as part of the allocation process. We also measure the performance overhead due to these branches.
Scratchpad Memory Allocation for Data Aggregates via Interval Coloring in Superperfect Graphs
Cited by 2 (2 self)
Existing methods place data or code in scratchpad memory (SPM) by relying on heuristics, resorting to integer programming, or mapping the problem to a graph coloring problem. In this paper, the SPM allocation problem for arrays is formulated as an interval coloring problem. The key observation is that in many embedded C programs, two arrays can be modeled such that either their live ranges do not interfere or one contains the other (with good accuracy). As a result, array interference graphs often form a special class of superperfect graphs (known as comparability graphs) and their optimal interval colorings become efficiently solvable. This insight has led to the development of an SPM allocation algorithm that places arrays in an interference graph in SPM by examining its maximal cliques. If the SPM is no smaller than the clique number of an interference graph, then all arrays in the graph can be placed in SPM optimally. Otherwise, we rely on containment-motivated heuristics to split or spill array live ranges until the resulting graph is optimally colorable. We have implemented our algorithm in SUIF/machSUIF and evaluated it using a set of embedded C benchmarks from MediaBench and MiBench. Compared to a graph coloring algorithm and an optimal ILP algorithm (when it runs to completion), our algorithm achieves close-to-optimal results and is superior to graph coloring for the benchmarks tested.
An analysis of graph coloring register allocation
, 2006
Cited by 2 (1 self)
Graph coloring is the de facto standard technique for register allocation within a compiler. In this paper we examine the importance of the quality of the coloring algorithm and various extensions of the basic graph coloring technique by replacing the coloring phase of the GNU compiler’s register allocator with an optimal coloring algorithm. We then extend this optimal algorithm to incorporate various extensions such as coalescing and preferential register assignment. We find that using an optimal coloring algorithm has surprisingly little benefit and empirically demonstrate the benefit of the various extensions.