On the complexity of register coalescing
- In Proc. of the International Symposium on Code Generation and Optimization (CGO ’07), 2006
Cited by 8 (2 self)
Memory transfers are becoming more important to optimize, for both performance and power consumption. With this goal in mind, new register allocation schemes are developed, which revisit not only the spilling problem but also the coalescing problem. Indeed, a more aggressive strategy to avoid load/store instructions may increase the constraints to suppress (coalesce) move instructions. This paper is devoted to the complexity of the coalescing phase, in particular in the light of recent developments on the SSA form. We distinguish several optimizations that occur in coalescing heuristics: a) aggressive coalescing removes as many moves as possible, regardless of the colorability of the resulting interference graph; b) conservative coalescing removes as many moves as possible while keeping the colorability of the graph; c) incremental conservative coalescing removes one particular move while keeping the colorability of the graph; d) optimistic coalescing coalesces moves aggressively, then gives up as few moves as possible so that the graph becomes colorable again. We almost completely classify the NP-completeness of these problems, also discussing the structure of the interference graph: arbitrary, chordal, or k-colorable in a greedy fashion. We believe that such a study is a necessary step for designing new coalescing strategies.
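As a rough illustration of variant (c), the sketch below tests whether one particular move can be coalesced while keeping the interference graph k-colorable in a greedy fashion. It assumes the usual simplification-style reading of "greedy" (repeatedly remove a vertex with fewer than k remaining neighbors) and a symmetric adjacency dictionary. The function names and the brute-force merge-then-recheck strategy are illustrative assumptions, not the algorithms analyzed in the paper.

```python
def is_greedy_k_colorable(adj, k):
    """True if every vertex can be removed in some order such that each
    vertex has fewer than k remaining neighbors when it is removed."""
    remaining = {v: set(ns) for v, ns in adj.items()}
    worklist = [v for v, ns in remaining.items() if len(ns) < k]
    while worklist:
        v = worklist.pop()
        if v not in remaining:          # already removed via a duplicate entry
            continue
        for n in remaining.pop(v):
            remaining[n].discard(v)
            if len(remaining[n]) < k:
                worklist.append(n)
    return not remaining


def merge(adj, a, b):
    """Return a new graph with vertex b merged into vertex a."""
    merged = {}
    for v, ns in adj.items():
        if v == b:
            continue
        merged[v] = {a if n == b else n for n in ns}
    merged[a] |= adj[b]
    merged[a].discard(a)
    merged[a].discard(b)
    return merged


def can_coalesce_conservatively(adj, a, b, k):
    """Hypothetical check: coalesce the move between a and b only if the
    merged graph is still greedy-k-colorable."""
    if b in adj[a]:
        return False                    # interfering variables cannot be merged
    return is_greedy_k_colorable(merge(adj, a, b), k)


# Path a - x - y - b: merging a and b closes a triangle.
path = {"a": {"x"}, "x": {"a", "y"}, "y": {"x", "b"}, "b": {"y"}}
print(can_coalesce_conservatively(path, "a", "b", 2))
print(can_coalesce_conservatively(path, "a", "b", 3))
```

With two registers the merged graph is a triangle and the move must be kept; with three registers the same move can be removed safely.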
A fast cutting-plane algorithm for optimal coalescing
- In Proc. of the 16th International Conference on Compiler Construction (CC ’07)
Cited by 7 (1 self)
Abstract. Recent work has shown that the subtasks of register allocation (spilling, register assignment, and coalescing) can be completely separated. This work presents an algorithm for the coalescing subproblem that relies on this separation. The algorithm uses 0/1 Linear Programming (ILP), a general-purpose optimization technique, to derive optimal solutions. We provide the first optimal solutions for a benchmark called “Optimal Coalescing Challenge”, i.e., our ILP model outperforms previous approaches. Additionally, we use these optimal solutions to assess the quality of well-known heuristics. A second benchmark on SPEC CPU2000 programs emphasizes the practicality of our algorithm.
Fast Liveness Checking for SSA-Form Programs
Cited by 5 (1 self)
Liveness analysis is an important analysis in optimizing compilers. Liveness information is used in several optimizations and is mandatory during the code-generation phase. Two drawbacks of conventional liveness analyses are that their computations are fairly expensive and their results are easily invalidated by program transformations. We present a method to check liveness of variables that overcomes both obstacles. The major advantage of the proposed method is that the analysis result survives all program transformations except for changes in the control-flow graph. For common program sizes our technique is faster and consumes less memory than conventional data-flow approaches. To achieve this, we make heavy use of SSA-form properties, which allow us to completely circumvent data-flow equation solving. We evaluate the competitiveness of our approach in an industrial-strength compiler. Our measurements use the integer part of the SPEC2000 benchmarks and investigate the liveness analysis used by the SSA destruction pass. We compare the net time spent in liveness computations of our implementation against the one provided by that compiler. The results show that in the vast majority of cases our algorithm, while providing the same quality of information, needs less time: an average speed-up of 16%.
Scratchpad Allocation for Data Aggregates in Superperfect Graphs
Cited by 4 (4 self)
Existing methods place data or code in scratchpad memory (SPM) by relying on heuristics, resorting to integer programming, or mapping the problem to graph coloring. In this work, the SPM allocation problem is formulated as an interval coloring problem. The key observation is that in many embedded applications, arrays (including structs as a special case) are often related in the following way: for any two arrays, their live ranges are often such that one is either disjoint from or contains the other. As a result, array interference graphs are often superperfect graphs and optimal interval colorings for such array interference graphs are possible. This has led to the development of two new SPM allocation algorithms. While differing in whether live range splits and spills are done sequentially or together, both algorithms place arrays in SPM based on examining the cliques in an interference graph. In both cases, we guarantee optimally that all arrays in an interference graph can be placed in SPM if its size is no smaller than the clique number of the graph. In the case that the SPM is not large enough, we rely on heuristics to split or spill a live range until the graph is colorable. Our experimental results using embedded benchmarks show that our algorithms can outperform graph coloring when their interference graphs are superperfect or nearly so, although graph coloring is admittedly more general and may also be effective for applications with arbitrary interference graphs.
SSA Elimination after Register Allocation
Cited by 2 (0 self)
Abstract. Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after register allocation. SSA elimination before register allocation is straightforward and standard, while previously described approaches to SSA elimination after register allocation have shortcomings; in particular, they have problems with implementing copies between memory locations. We present spill-free SSA elimination, a simple and efficient algorithm for SSA elimination after register allocation that avoids increasing the number of spilled variables. We also present three optimizations of the core algorithm. Our experiments show that spill-free SSA elimination takes less than five percent of the total compilation time of a JIT compiler. Our optimizations reduce the number of memory accesses by more than 9% and improve the program execution time by more than 1.8%.
Lecture Notes on Register Allocation 15-411: Compiler Design
, 2009
In this lecture we discuss register allocation, which is one of the last steps in a compiler before code emission. Its task is to map the potentially unbounded numbers of variables or “temps” in pseudo-assembly to the actually available registers on the target machine. If not enough registers are
SSA-Form-Based Register Allocation for the Java HotSpot™ Server Compiler
Register allocation, i.e., the task of assigning processor registers to local variables and temporary values, is one of the most important compiler optimizations. A vast amount of research has led to algorithms ranging from simple and fast heuristics to optimal algorithms with exponential time complexity. Because the problem is known to be NP-complete [2], algorithms must balance the time necessary for allocation against the resulting code quality. Two common algorithms in modern compilers are graph coloring (see for example [1, 2]), which is suitable when compilation time is not a major concern, and linear scan [6, 8, 10], which is faster and therefore frequently used for just-in-time compilers where compilation time adds to run time. Static single assignment (SSA) form [3] is a type of intermediate representation that simplifies many compiler optimizations. All variables have only a single point of definition. At control flow joins, phi functions are used to merge different variables of the predecessor blocks. Because processors cannot execute phi functions, it is necessary to replace them with move instructions during code generation (SSA form deconstruction). Traditionally, SSA form deconstruction was performed before register allocation. Only recently has it been observed that register allocation on SSA form has several advantages due to additional guarantees on variable lifetime. Lifetime information is essential for register allocation because two variables that interfere,
Scratchpad Memory Allocation for Data Aggregates via Interval Coloring in Superperfect Graphs
Existing methods place data or code in scratchpad memory (SPM) by relying on heuristics, resorting to integer programming, or mapping the problem to graph coloring. In this paper, the SPM allocation problem for arrays is formulated as an interval coloring problem. The key observation is that in many embedded C programs, two arrays can be modeled such that either their live ranges do not interfere or one contains the other (with good accuracy). As a result, array interference graphs often form a special class of superperfect graphs (known as comparability graphs) and their optimal interval colorings become efficiently solvable. This insight has led to the development of an SPM allocation algorithm that places arrays in an interference graph in SPM by examining its maximal cliques. If the SPM is no smaller than the clique number of an interference graph, then all arrays in the graph can be placed in SPM optimally. Otherwise, we rely on containment-motivated heuristics to split or spill array live ranges until the resulting graph is optimally colorable. We have implemented our algorithm in SUIF/machSUIF and evaluated it using a set of embedded C benchmarks from MediaBench and MiBench. Compared to a graph coloring algorithm and an optimal ILP algorithm (when it runs to completion), our algorithm achieves close-to-optimal results and is superior to graph coloring for the benchmarks tested.
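The "disjoint or contained" observation can be made concrete with a small check. The sketch below is only an illustration under simplifying assumptions that do not come from the abstract: each array has a single half-open live range `(start, end)` and a size in bytes, and the size-weighted clique number is approximated by the largest total size live at any one program point. All names and the sample data are hypothetical; this is not the paper's allocation algorithm.

```python
from itertools import combinations


def nested_or_disjoint(r1, r2):
    """True if two half-open live ranges are disjoint or one contains the other."""
    (s1, e1), (s2, e2) = r1, r2
    disjoint = e1 <= s2 or e2 <= s1
    nested = (s1 <= s2 and e2 <= e1) or (s2 <= s1 and e1 <= e2)
    return disjoint or nested


def is_containment_structured(arrays):
    """arrays maps a name to (start, end, size). Checks every pair of ranges."""
    ranges = [(s, e) for s, e, _ in arrays.values()]
    return all(nested_or_disjoint(a, b) for a, b in combinations(ranges, 2))


def max_live_size(arrays):
    """Largest total size of arrays live at the same program point."""
    events = []
    for start, end, size in arrays.values():
        events.append((start, size))    # array becomes live
        events.append((end, -size))     # array dies (sorted before starts at the same point)
    live = peak = 0
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak


# Hypothetical arrays: B and C are nested in A, D is disjoint from all of them.
arrays = {"A": (0, 10, 64), "B": (2, 5, 32), "C": (6, 9, 16), "D": (10, 12, 48)}
print(is_containment_structured(arrays))
print(max_live_size(arrays))
```

Under these assumptions, an SPM of at least `max_live_size(arrays)` bytes corresponds to the abstract's condition that the SPM be no smaller than the clique number.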
SSA Back-Translation: Faster Results with Edge Splitting and Post Optimization
, 2011
A compiler translates one representation of a software program into another. Besides translation, compilers often have other tasks, such as optimizing the result and warning the programmer about mistakes. Internally, a compiler uses an Intermediate Representation (IR) for analysis and manipulation of the program at hand. Data dependencies in most programming languages are implicit. To simplify the analysis of data dependencies, some compilers use an IR in Static Single Assignment (SSA) form, in which each local variable is defined only once. If the number of assignments in the IR is not restricted, it is said to be in normal form. The input of a compiler is in normal form, and translation is needed to bring the IR into SSA form. SSA form contains phi functions to merge values based on control flow. After optimizations on SSA form are performed, it is not trivial to translate SSA form back to normal form because the properties of phi nodes cannot be translated directly to processor instructions. The algorithms of Briggs
Title omitted for double-blind reasons
- PLDI (2010)
, 2010
Recent results on the static single assignment (SSA) form open promising directions for the design of new register allocation heuristics for just-in-time (JIT) compilation. In particular, heuristics based on tree scans with two decoupled phases, one for spilling, one for splitting/coloring/coalescing, seem good candidates for designing memory-friendly, fast, and competitive register allocators. Another class of register allocators, well-suited for JIT compilation, are those based on linear scans. Most of them perform coalescing poorly but also do live-range splitting (mostly on control-flow edges) to avoid spilling. This leads to a large amount of register-to-register copies inside basic blocks but also, implicitly, on critical edges, i.e., edges that flow from a block with several successors to a block with several predecessors. This paper presents a new back-end optimization that we call

