Register allocation for programs in SSA-form
In Compiler Construction 2006, volume 3923 of LNCS, 2006
Cited by 26 (3 self)
In this technical report, we present an architecture for register allocation on the SSA-form. We show how the properties of SSA-form programs and their interference graphs can be exploited to develop new methods for spilling, coloring and coalescing. We present heuristic and optimal solution methods for these three subtasks.
Register allocation: what does the NP-completeness proof of Chaitin et al. really prove? Or revisiting register allocation: why and how
In Proc. of the 19th International Workshop on Languages and Compilers for Parallel Computing (LCPC ’06), 2006
Cited by 10 (4 self)
Register allocation is one of the most studied problems in compilation. It has been considered an NP-complete problem since Chaitin et al., in 1981, modeled the problem of assigning temporary variables to k machine registers as the problem of coloring, with k colors, the interference graph associated with the variables. The fact that the interference graph can be arbitrary proves the NP-completeness of this formulation. However, this original proof does not really show where the complexity of register allocation comes from. Recently, the rediscovery that interference graphs of SSA programs can be colored in polynomial time raised the question: Can we exploit SSA form to perform register allocation in polynomial time, without contradicting Chaitin et al.’s NP-completeness result? To address this question and, more generally, the complexity of register allocation, we revisit Chaitin et al.’s proof to better identify the interactions between spilling (load/store insertion), coalescing/splitting (removal/insertion of moves between registers), critical edges (a property of the control-flow graph), and coloring (assignment to registers). In particular, we show that, in general (we will make clear when), it is easy to decide if temporary variables can be assigned to k registers or if some spilling is necessary. In other words, the real complexity does not come from the coloring itself (as a wrong interpretation of the proof of Chaitin et al. may suggest) but from the presence of critical edges and from the optimizations of spilling and coalescing.
On the complexity of register coalescing
In Proc. of the International Symposium on Code Generation and Optimization (CGO ’07), 2006
Cited by 9 (3 self)
Memory transfers are becoming more important to optimize, for both performance and power consumption. With this goal in mind, new register allocation schemes are developed, which revisit not only the spilling problem but also the coalescing problem. Indeed, a more aggressive strategy to avoid load/store instructions may increase the constraints to suppress (coalesce) move instructions. This paper is devoted to the complexity of the coalescing phase, in particular in the light of recent developments on the SSA form. We distinguish several optimizations that occur in coalescing heuristics: a) aggressive coalescing removes as many moves as possible, regardless of the colorability of the resulting interference graph; b) conservative coalescing removes as many moves as possible while keeping the colorability of the graph; c) incremental conservative coalescing removes one particular move while keeping the colorability of the graph; d) optimistic coalescing coalesces moves aggressively, then gives up as few moves as possible so that the graph becomes colorable again. We almost completely classify the NP-completeness of these problems, also discussing the structure of the interference graph: arbitrary, chordal, or k-colorable in a greedy fashion. We believe that such a study is a necessary step for designing new coalescing strategies.
Optimal Register Sharing for High-Level Synthesis of SSA . . .
IEEE TRANS. COMPUTER AIDED DESIGN, 2006
Cited by 8 (3 self)
Register sharing for high-level synthesis of programs represented in static single assignment (SSA) form is proven to have a polynomial-time solution. Register sharing is modeled as a graph-coloring problem. Although graph coloring is NP-complete in the general case, an interference graph constructed for a program in SSA form provably belongs to the class of chordal graphs that have an optimal O(V time algorithm. Chordal graph coloring reduces the number of registers allocated to the program by as much as 86% and 64.93% on average compared to linear scan register allocation.
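The chordal-graph coloring mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it uses the textbook combination of maximum cardinality search followed by greedy coloring, and the names `mcs_order`, `greedy_color`, and the example graph `adj` are hypothetical. This simple version picks the next vertex with a linear scan, so it runs in quadratic time; a linear-time version would need bucketed vertex selection.

```python
def mcs_order(adj):
    """Maximum cardinality search: repeatedly visit the unvisited vertex
    that has the most already-visited neighbours."""
    weight = {v: 0 for v in adj}
    unvisited = set(adj)
    order = []
    while unvisited:
        v = max(unvisited, key=lambda u: weight[u])  # linear scan (simple, not optimal)
        order.append(v)
        unvisited.remove(v)
        for w in adj[v]:
            if w in unvisited:
                weight[w] += 1
    return order


def greedy_color(adj, order):
    """Give each vertex, in the given order, the smallest colour not used
    by its already-coloured neighbours."""
    color = {}
    for v in order:
        used = {color[w] for w in adj[v] if w in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical interference graph: two triangles sharing the edge b-c (chordal).
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
coloring = greedy_color(adj, mcs_order(adj))
```

On a chordal graph, the already-visited neighbours of each vertex in this order form a clique, so the greedy pass never needs more colours than the largest clique. For the example graph that is three colours (registers).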
A Framework for End-to-End Verification and Evaluation of Register Allocators
Cited by 2 (1 self)
Abstract. This paper presents a framework for designing, verifying, and evaluating register allocation algorithms. The proposed framework has three main components. The first component is MIRA, a language for describing programs prior to register allocation. The second component is FORD, a language that describes the results produced by the register allocator. The third component is a type checker for the output of a register allocator, which helps to find bugs. To illustrate the effectiveness of the framework, we present RALF, a tool that allows a register allocator to be integrated into the gcc compiler for the StrongARM architecture. RALF simplifies the development of register allocators by sheltering the programmer from the internal complexity of gcc. MIRA and FORD’s features are sufficient to implement most of the register allocators currently in use and are independent of any particular register allocation algorithm or compiler. To demonstrate the generality of our framework, we have used RALF to evaluate eight different register allocators, including iterated register coalescing, linear scan, a chordal-based allocator, and two integer linear programming approaches.
Live-range Unsplitting for Faster Optimal Coalescing
Cited by 2 (0 self)
Register allocation is often a two-phase approach: spilling of registers to memory, followed by coalescing of registers. Extreme live-range splitting (i.e., live-range splitting after each statement) enables optimal solutions based on ILP, for both spilling and coalescing. However, while the solutions are easily found for spilling, for coalescing they are more elusive. This difficulty stems from the huge size of interference graphs resulting from live-range splitting. This report focuses on optimal coalescing in the context of extreme live-range splitting. We present some theoretical properties that give rise to an algorithm for reducing interference graphs, while preserving optimality. This reduction consists mainly in finding and removing useless splitting points. It is followed by a graph decomposition based on clique separators. The last optimization consists of two preprocessing rules. Any coalescing technique can be applied after these optimizations. Our optimizations have been tested on a standard benchmark, the optimal coalescing challenge. For this benchmark, the cutting-plane algorithm for optimal coalescing (the only optimal algorithm for coalescing) runs 300 times faster when combined with our optimizations. Moreover, we provide all the solutions of the optimal coalescing challenge, including the 3 instances that were previously unsolved.
Register Allocation Deconstructed
2009
Cited by 2 (0 self)
Register allocation is a fundamental part of any optimizing compiler. Effectively managing the limited register resources of the constrained architectures commonly found in embedded systems is essential in order to maximize code quality. In this paper we deconstruct the register allocation problem into distinct components: coalescing, spilling, move insertion, and assignment. Using an optimal register allocation framework, we empirically evaluate the importance of each of the components, the impact of component integration, and the effectiveness of existing heuristics. We evaluate code quality both in terms of code performance and code size and consider four distinct instruction set architectures: ARM, Thumb, x86, and x86-64. The results of our investigation reveal general principles for register allocation design.
Approximating Maximum Weight K-Colorable Subgraphs in Chordal Graphs
Cited by 2 (0 self)
We present a 2-approximation algorithm for the problem of finding the maximum weight K-colorable subgraph in a given chordal graph with node weights. The running time of the algorithm is O(K(n+m)), where n and m are the number of vertices and edges in the given graph.
Scratchpad Memory Allocation for Data Aggregates via Interval Coloring in Superperfect Graphs
Cited by 2 (2 self)
Existing methods place data or code in scratchpad memory (SPM) by relying on heuristics, resorting to integer programming, or mapping the problem to a graph coloring problem. In this paper, the SPM allocation problem for arrays is formulated as an interval coloring problem. The key observation is that in many embedded C programs, two arrays can be modeled such that either their live ranges do not interfere or one contains the other (with good accuracy). As a result, array interference graphs often form a special class of superperfect graphs (known as comparability graphs) and their optimal interval colorings become efficiently solvable. This insight has led to the development of an SPM allocation algorithm that places arrays in an interference graph in SPM by examining its maximal cliques. If the SPM is no smaller than the clique number of an interference graph, then all arrays in the graph can be placed in SPM optimally. Otherwise, we rely on containment-motivated heuristics to split or spill array live ranges until the resulting graph is optimally colorable. We have implemented our algorithm in SUIF/machSUIF and evaluated it using a set of embedded C benchmarks from MediaBench and MiBench. Compared to a graph coloring algorithm and an optimal ILP algorithm (when it runs to completion), our algorithm achieves close-to-optimal results and is superior to graph coloring for the benchmarks tested.
Nearly optimal register allocation with PBQP
In Proceedings of the 7th Joint Modular Languages Conference (JMLC’06). LNCS, 2006
Cited by 1 (1 self)
Abstract. For irregular architectures global register allocation remains a challenging problem, and has received a lot of attention in recent years. The classical graph-colouring analogy used by Chaitin and Briggs is not adequate for irregular architectures featuring non-orthogonal instruction sets and irregular register sets. Previous work [1, 2] on register allocation based on partitioned boolean quadratic programming (PBQP) has demonstrated that this approach is effective for highly irregular architectures and small benchmarks. However, experiments have shown that the heuristic used for non-reducible nodes performs poorly for larger benchmarks and more regular architectures. In this paper we present a new heuristic for PBQP, which significantly outperforms the old heuristic, and produces register allocations equal to those of the state-of-the-art graph-colouring approach. We also present a new solver for PBQP which is based on branch-and-bound and is able to solve register allocations optimally. The branch-and-bound solver allows PBQP to be used as a progressive register allocator, where programmers may explicitly trade extra compile time for a better register allocation. Experiments were conducted using the register allocation problems in the SPEC2000 benchmark suite as input, with IA32 as the target architecture. Using an optimal solver for PBQP we were able to solve 97.4% of the register allocation problems in SPEC2000 optimally.