On the complexity of spill everywhere under SSA form
 LCTES’07
, 2007
Cited by 8 (2 self)
Compilation for embedded processors can be either aggressive (time-consuming cross-compilation) or just in time (embedded and usually dynamic). The heuristics used in dynamic compilation are highly constrained by limited resources, time and memory in particular. Recent results on the SSA form open promising directions for the design of new register allocation heuristics for embedded systems and especially for embedded compilation. In particular, heuristics based on tree scan with two separated phases (one for spilling, then one for coloring/coalescing) seem good candidates for designing memory-friendly, fast, and competitive register allocators. Still, also because of the side effect on power consumption, the minimization of loads and stores overhead (spilling problem) is an important issue. This paper provides an exhaustive study of the complexity of the “spill everywhere” problem in the context of the SSA form. Unfortunately, contrary to our initial hopes, many of the questions we raised lead to NP-completeness results. We identify some polynomial cases, but they are impractical in a JIT context. Nevertheless, they can give hints to simplify formulations for the design of aggressive allocators.
Linear scan register allocation on SSA form
 In Proceedings of the International Symposium on Code Generation and Optimization
, 2010
Cited by 6 (0 self)
The linear scan algorithm for register allocation provides a good register assignment with a low compilation overhead and is thus frequently used for just-in-time compilers. Although most of these compilers use static single assignment (SSA) form, the algorithm has not yet been applied on SSA form, i.e., SSA form is usually deconstructed before register allocation. However, the structural properties of SSA form can be used to simplify the algorithm. With only one definition per variable, lifetime intervals (the main data structure) can be constructed without data flow analysis. During allocation, some tests of interval intersection can be skipped because SSA form guarantees non-intersection. Finally, deconstruction of SSA form after register allocation can be integrated into the resolution phase of the register allocator without much additional code. We modified the linear scan register allocator of the Java HotSpot™ client compiler so that it operates on SSA form. The evaluation shows that our simpler and faster version generates equally good or slightly better machine code.
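For readers unfamiliar with the baseline, the following is a minimal sketch of classic linear scan over precomputed lifetime intervals. It is not the SSA-based variant described in the abstract above. The function name `linear_scan`, the interval encoding (inclusive start and end positions), and the spill rule (evict the interval that ends last) are illustrative assumptions.

```python
def linear_scan(intervals, num_regs):
    """Assign registers to lifetime intervals in order of start position.

    intervals: dict mapping a variable name to (start, end), end inclusive.
    Returns (assignment, spilled): a dict name -> register index and a set
    of names that were spilled to memory.
    """
    free = list(range(num_regs))   # registers not currently in use
    active = []                    # (end, name) pairs holding a register
    assignment = {}
    spilled = set()

    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before the current start position.
        still_active = []
        for e, n in active:
            if e < start:
                free.append(assignment[n])
            else:
                still_active.append((e, n))
        active = still_active

        if free:
            assignment[name] = free.pop()
            active.append((end, name))
        else:
            # No register left: spill whichever interval ends last.
            last_end, victim = max(active)
            if last_end > end:
                assignment[name] = assignment.pop(victim)
                spilled.add(victim)
                active.remove((last_end, victim))
                active.append((end, name))
            else:
                spilled.add(name)

    return assignment, spilled


if __name__ == "__main__":
    demo = {"a": (0, 4), "b": (1, 2), "c": (3, 5)}
    print(linear_scan(demo, 2))
    print(linear_scan(demo, 1))
```

With two registers, none of the three demo intervals needs to be spilled; with a single register, the long-lived interval `a` is evicted in favour of the shorter ones.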
SSA Elimination after Register Allocation
Cited by 6 (0 self)
Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after register allocation. SSA elimination before register allocation is straightforward and standard, while previously described approaches to SSA elimination after register allocation have shortcomings; in particular, they have problems with implementing copies between memory locations. We present spill-free SSA elimination, a simple and efficient algorithm for SSA elimination after register allocation that avoids increasing the number of spilled variables. We also present three optimizations of the core algorithm. Our experiments show that spill-free SSA elimination takes less than five percent of the total compilation time of a JIT compiler. Our optimizations reduce the number of memory accesses by more than 9% and improve the program execution time by more than 1.8%.
Efficient Alias Set Analysis Using SSA Form
Cited by 5 (2 self)
Precise, flow-sensitive analyses of pointer relationships often represent each object using the set of local variables that point to it (the alias set), possibly augmented with additional predicates. Many such analyses are difficult to scale due to the size of the abstraction and due to flow sensitivity. The focus of this paper is on efficient representation and manipulation of the alias set. Taking advantage of certain properties of static single assignment (SSA) form, we propose an efficient data structure that allows much of the representations of sets at different points in the program to be shared. The transfer function for each statement, instead of creating an updated set, makes only local changes to the existing data structure representing the set. The key enabling properties of SSA form are that every point at which a variable is live is dominated by its definition, and that the definitions of any set of simultaneously live variables are totally ordered according to the dominance relation. We represent the variables pointing to an object using a list ordered consistently with the dominance relation. Thus, when a variable is newly defined to point to the object, it need only be added to the head of the list. A back edge at which some variables cease to be live requires only dropping variables from the head of the list. We prove that the analysis using the proposed data structure computes the same result as a set-based analysis. We empirically show that the proposed data structure is more efficient in both time and memory requirements than set implementations using hash tables and balanced trees.
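The head-insertion and head-dropping behaviour described in the abstract above can be pictured with a persistent singly linked list in which new versions share their tail with older ones. This is only one plausible reading of the idea, not the paper's data structure; the names `Node`, `define`, `drop_dead_prefix`, and `to_names` are hypothetical.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """One cell of an immutable list of variables pointing to an object."""
    var: str
    tail: Optional["Node"]


def define(var, alias_list):
    """A newly defined variable is added at the head; the old list is reused."""
    return Node(var, alias_list)


def drop_dead_prefix(alias_list, live):
    """Drop variables from the head of the list until a live one is found."""
    while alias_list is not None and alias_list.var not in live:
        alias_list = alias_list.tail
    return alias_list


def to_names(alias_list):
    """Flatten the list into plain names, most recent definition first."""
    names = []
    while alias_list is not None:
        names.append(alias_list.var)
        alias_list = alias_list.tail
    return names


if __name__ == "__main__":
    before = define("b", define("a", None))
    after = define("c", before)          # shares every cell of `before`
    print(to_names(after))
    print(drop_dead_prefix(after, {"a", "b"}) is before)
```

Because cells are never mutated, the list at an earlier program point remains valid after a later definition is pushed, which is the kind of sharing between program points that the abstract refers to.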
Optimal polynomial-time interprocedural register allocation for high-level synthesis and ASIP design
 In Proc. Int. Conf. Comput.-Aided Design
, 2007
Cited by 3 (1 self)
Register allocation, in high-level synthesis and ASIP design, is the process of determining the number of registers to include in the resulting circuit or processor. The goal is to allocate the minimum number of registers such that no scalar variable is spilled to memory. Previously, an optimal polynomial-time algorithm for this problem has been presented for individual procedures represented in Static Single Assignment (SSA) Form. This result is now extended to complete programs (or subprograms), as long as: (1) each procedure is represented in SSA Form; and (2) at every procedure call, all live variables are split at the call point. With this representation, it is possible to ensure that the interprocedural interference graph (IIG) is chordal, and can therefore be colored optimally in polynomial time. An optimal coloring of the IIG can be achieved by allocating registers for each procedure individually. Previous work has shown that optimal register allocation in SSA Form does not require an interference graph. Optimal interprocedural register allocation, therefore, is achieved without constructing an interference graph, giving the optimal algorithm a significant runtime advantage over prior suboptimal heuristics.
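The statement that a chordal graph can be colored optimally in polynomial time can be illustrated with a well-known textbook technique: maximum cardinality search followed by greedy coloring in the visit order. This is a sketch under stated assumptions, not the paper's algorithm (which, per the abstract, avoids building an interference graph at all); `mcs_order`, `color_chordal`, and the adjacency-dict encoding are hypothetical.

```python
def mcs_order(graph):
    """Maximum cardinality search.

    Repeatedly visits the unvisited vertex with the most visited neighbours.
    For a chordal graph, the reverse of the visit order is a perfect
    elimination order.
    """
    weight = {v: 0 for v in graph}   # visited-neighbour count per unvisited vertex
    order = []
    while weight:
        v = max(weight, key=weight.get)
        order.append(v)
        del weight[v]
        for u in graph[v]:
            if u in weight:
                weight[u] += 1
    return order


def color_chordal(graph):
    """Greedy coloring in MCS order; uses the minimum number of colors
    when the input graph is chordal."""
    color = {}
    for v in mcs_order(graph):
        used = {color[u] for u in graph[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


if __name__ == "__main__":
    # Triangle a-b-c plus a pendant vertex d attached to c (a chordal graph).
    g = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b", "d"},
        "d": {"c"},
    }
    print(color_chordal(g))
```

In the demo graph the largest clique has three vertices, so three colors (registers) are both necessary and sufficient. On a non-chordal graph the same code still yields a valid coloring, but without the optimality guarantee.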
Register Allocation Deconstructed
, 2009
Cited by 3 (0 self)
Register allocation is a fundamental part of any optimizing compiler. Effectively managing the limited register resources of the constrained architectures commonly found in embedded systems is essential in order to maximize code quality. In this paper we deconstruct the register allocation problem into distinct components: coalescing, spilling, move insertion, and assignment. Using an optimal register allocation framework, we empirically evaluate the importance of each of the components, the impact of component integration, and the effectiveness of existing heuristics. We evaluate code quality both in terms of code performance and code size and consider four distinct instruction set architectures: ARM, Thumb, x86, and x86-64. The results of our investigation reveal general principles for register allocation design.
An Optimistic and Conservative Register Assignment Heuristic for Chordal Graphs
, 2007
Cited by 3 (0 self)
This paper presents a new register assignment heuristic for procedures in SSA Form, whose interference graphs are chordal; the heuristic is called optimistic chordal coloring (OCC). Previous register assignment heuristics eliminate copy instructions via coalescing, in other words, merging nodes in the interference graph. Node merging, however, cannot preserve the chordal graph property, making it unappealing for SSA-based register allocation. OCC is based on graph coloring but does not employ coalescing; consequently, it preserves graph chordality and does not increase the chromatic number. In this sense, OCC is conservative as well as optimistic. OCC is observed to eliminate at least as many dynamically executed copy instructions as iterated register coalescing (IRC) for a set of chordal interference graphs generated from several Mediabench and MiBench applications. In many cases, OCC and IRC were able to find optimal or near-optimal solutions for these graphs. OCC ran 1.89x faster than IRC, on average.
Coordinated Resource Optimization in Behavioral Synthesis
Cited by 2 (1 self)
Reducing resource usage is one of the most important optimization objectives in behavioral synthesis due to its direct impact on power, performance and cost. The datapath in a typical design is composed of different kinds of components, including functional units, registers and multiplexers. To optimize the overall resource usage, a behavioral synthesis tool should consider all kinds of components at the same time. However, most previous work on behavioral synthesis has the limitations of (i) not being able to consider all kinds of resources globally, and/or (ii) separating the synthesis process into a sequence of optimization steps without a consistent optimization objective. In this paper we present a behavioral synthesis flow in which all types of components in the datapath are modeled and optimized consistently. The key idea is to feed to the scheduler the intentions for sharing functional units and registers in favor of the global optimization goal (such as total area), so that the scheduler could generate a schedule that makes the sharing intentions feasible. Experiments show that compared to the solution of minimizing functional unit requirements in scheduling and using the least number of functional units and registers in binding, our solution achieves a 24% reduction in total area; compared to the online tool provided by ctoverilog.com, our solution achieves a 30% reduction on average.
Decoupled (SSA-based) Register Allocators: from Theory to Practice, Coping with Just-In-Time Compilation and Embedded Processors Constraints
, 2013
Cited by 1 (0 self)
In compilation, register allocation is the optimization that chooses which variables of the source program, in unlimited number, are mapped to the actual registers, in limited number. Parts of the live ranges of the variables that cannot be mapped to registers are placed in memory. This eviction is called spilling. Until recently, compilers mainly addressed register allocation via graph coloring using an idea developed by Chaitin et al. [33] in 1981. This approach addresses the spilling and the mapping of the variables to registers in one phase. In 2001, Appel and George [3] proposed to split the register allocation in two separate phases. This idea yields better and independent solutions for both problems, but requires a very aggressive form of live-range splitting, split everywhere, which renames all variables between all instructions of the program. However, in 2005, several groups [27, 84, 56, 16] observed that the static single assignment (SSA) form provides sufficient split points to decouple the register allocation as Appel and George suggested, unless register aliasing or precoloring constraints are involved.
An Efficient Storeless Heap Abstraction Using SSA Form
, 2008
Precise, flow-sensitive analyses of pointer relationships often use a storeless heap abstraction. In this model, an object is represented using some abstraction of the expressions that refer to it (i.e. access paths). Many analyses using such an abstraction are difficult to scale due to the size of the abstraction and due to flow sensitivity. Typically, an object is represented by the set of local variables pointing to it, together with additional predicates representing pointers from other objects. The focus of this paper is on the set of local variables, the core of any such abstraction. Taking advantage of certain properties of static single assignment (SSA) form, we propose an efficient data structure that allows much of the representation of an object at different points in the program to be shared. The transfer function for each statement, instead of creating an updated set, makes only local changes to the existing data structure representing the set. The key enabling properties of SSA form are that every point at which a variable is live is dominated by its definition, and that the definitions of any set of simultaneously live variables are totally ordered according to the dominance relation. We represent the variables pointing to an object using a list ordered consistently with the dominance relation. Thus, when a variable is newly defined to point to the object, it need only be added to the head of the list. A back edge at which some variables cease to be live requires only dropping variables from the head of the list. We prove that the analysis using the proposed data structure computes the same result as a set-based analysis. We empirically show that the proposed data structure is more efficient in both time and memory requirements than set implementations using hash tables and balanced trees.