Efficient Alias Set Analysis Using SSA Form
Cited by 6 (2 self)
Precise, flow-sensitive analyses of pointer relationships often represent each object using the set of local variables that point to it (the alias set), possibly augmented with additional predicates. Many such analyses are difficult to scale due to the size of the abstraction and due to flow sensitivity. The focus of this paper is on efficient representation and manipulation of the alias set. Taking advantage of certain properties of static single assignment (SSA) form, we propose an efficient data structure that allows much of the representations of sets at different points in the program to be shared. The transfer function for each statement, instead of creating an updated set, makes only local changes to the existing data structure representing the set. The key enabling properties of SSA form are that every point at which a variable is live is dominated by its definition, and that the definitions of any set of simultaneously live variables are totally ordered according to the dominance relation. We represent the variables pointing to an object using a list ordered consistently with the dominance relation. Thus, when a variable is newly defined to point to the object, it need only be added to the head of the list. A back edge at which some variables cease to be live requires only dropping variables from the head of the list. We prove that the analysis using the proposed data structure computes the same result as a set-based analysis. We empirically show that the proposed data structure is more efficient in both time and memory requirements than set implementations using hash tables and balanced trees.
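The list representation described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `Node`, `add_var`, `drop_dead`, and `members` are hypothetical, and the liveness predicate is supplied by the caller rather than computed from a program.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Node:
    """One cell of an immutable list of variables pointing to an object."""
    var: str
    rest: Optional["Node"]


def add_var(alias_list: Optional[Node], var: str) -> Node:
    """A newly defined variable goes at the head; the old list is shared as the tail."""
    return Node(var, alias_list)


def drop_dead(alias_list: Optional[Node],
              is_live: Callable[[str], bool]) -> Optional[Node]:
    """Drop variables from the head for as long as they are no longer live."""
    while alias_list is not None and not is_live(alias_list.var):
        alias_list = alias_list.rest
    return alias_list


def members(alias_list: Optional[Node]) -> List[str]:
    """Collect the variables in the list, most recently defined first."""
    out = []
    while alias_list is not None:
        out.append(alias_list.var)
        alias_list = alias_list.rest
    return out


# Hypothetical SSA names: a1 is defined first, then b1, then c1.
base = add_var(add_var(None, "a1"), "b1")
after_def = add_var(base, "c1")
```

Because `after_def` reuses `base` as its tail, the sets at the two program points share storage, and dropping `c1` at a back edge returns the very same `base` object instead of building a new set.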
On the Complexity of Spill Everywhere under SSA Form
LCTES’07, 2007
Cited by 5 (0 self)
Compilation for embedded processors can be either aggressive (time-consuming cross-compilation) or just-in-time (embedded and usually dynamic). The heuristics used in dynamic compilation are highly constrained by limited resources, time and memory in particular. Recent results on the SSA form open promising directions for the design of new register allocation heuristics for embedded systems and especially for embedded compilation. In particular, heuristics based on tree scan with two separated phases (one for spilling, then one for coloring/coalescing) seem good candidates for designing memory-friendly, fast, and competitive register allocators. Still, also because of the side effect on power consumption, the minimization of load and store overhead (the spilling problem) is an important issue. This paper provides an exhaustive study of the complexity of the “spill everywhere” problem in the context of the SSA form. Unfortunately, contrary to our initial hopes, many of the questions we raised lead to NP-completeness results. We identify some polynomial cases, but they are impractical in a JIT context. Nevertheless, they can give hints to simplify formulations for the design of aggressive allocators.
Coordinated Resource Optimization in Behavioral Synthesis
Cited by 2 (1 self)
Reducing resource usage is one of the most important optimization objectives in behavioral synthesis due to its direct impact on power, performance and cost. The datapath in a typical design is composed of different kinds of components, including functional units, registers and multiplexers. To optimize the overall resource usage, a behavioral synthesis tool should consider all kinds of components at the same time. However, most previous work on behavioral synthesis has the limitations of (i) not being able to consider all kinds of resources globally, and/or (ii) separating the synthesis process into a sequence of optimization steps without a consistent optimization objective. In this paper we present a behavioral synthesis flow in which all types of components in the datapath are modeled and optimized consistently. The key idea is to feed to the scheduler the intentions for sharing functional units and registers in favor of the global optimization goal (such as total area), so that the scheduler could generate a schedule that makes the sharing intentions feasible. Experiments show that compared to the solution of minimizing functional unit requirements in scheduling and using the least number of functional units and registers in binding, our solution achieves a 24% reduction in total area; compared to the online tool provided by ctoverilog.com, our solution achieves a 30% reduction on average.
Optimal polynomial-time interprocedural register allocation for high level synthesis and ASIP design
Proc. Int. Conf. Comput.-Aided Design, 2007
Cited by 2 (1 self)
Register allocation, in high-level synthesis and ASIP design, is the process of determining the number of registers to include in the resulting circuit or processor. The goal is to allocate the minimum number of registers such that no scalar variable is spilled to memory. Previously, an optimal polynomial-time algorithm for this problem has been presented for individual procedures represented in Static Single Assignment (SSA) Form. This result is now extended to complete programs (or subprograms), as long as: (1) each procedure is represented in SSA Form; and (2) at every procedure call, all live variables are split at the call point. With this representation, it is possible to ensure that the interprocedural interference graph (IIG) is chordal, and can therefore be colored optimally in polynomial time. An optimal coloring of the IIG can be achieved by allocating registers for each procedure individually. Previous work has shown that optimal register allocation in SSA Form does not require an interference graph. Optimal interprocedural register allocation, therefore, is achieved without constructing an interference graph, giving the optimal algorithm a significant runtime advantage over prior suboptimal heuristics.
An Optimistic and Conservative Register Assignment Heuristic for Chordal Graphs
2007
Cited by 2 (0 self)
This paper presents a new register assignment heuristic for procedures in SSA Form, whose interference graphs are chordal; the heuristic is called optimistic chordal coloring (OCC). Previous register assignment heuristics eliminate copy instructions via coalescing, in other words, merging nodes in the interference graph. Node merging, however, cannot preserve the chordal graph property, making it unappealing for SSA-based register allocation. OCC is based on graph coloring, but does not employ coalescing, and, consequently, preserves graph chordality, and does not increase its chromatic number; in this sense, OCC is conservative as well as optimistic. OCC is observed to eliminate at least as many dynamically executed copy instructions as iterated register coalescing (IRC) for a set of chordal interference graphs generated from several Mediabench and MiBench applications. In many cases, OCC and IRC were able to find optimal or near-optimal solutions for these graphs. OCC ran 1.89x faster than IRC, on average.
Linear Scan Register Allocation on SSA Form
In Proceedings of the International Symposium on Code Generation and Optimization, 2010
Cited by 2 (0 self)
The linear scan algorithm for register allocation provides a good register assignment with a low compilation overhead and is thus frequently used for just-in-time compilers. Although most of these compilers use static single assignment (SSA) form, the algorithm has not yet been applied on SSA form, i.e., SSA form is usually deconstructed before register allocation. However, the structural properties of SSA form can be used to simplify the algorithm. With only one definition per variable, lifetime intervals (the main data structure) can be constructed without data flow analysis. During allocation, some tests of interval intersection can be skipped because SSA form guarantees non-intersection. Finally, deconstruction of SSA form after register allocation can be integrated into the resolution phase of the register allocator without much additional code. We modified the linear scan register allocator of the Java HotSpot™ client compiler so that it operates on SSA form. The evaluation shows that our simpler and faster version generates equally good or slightly better machine code.
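For readers unfamiliar with the baseline, the following is a minimal Python sketch of the classic linear scan idea over simple lifetime intervals. It is not the SSA-based variant described in this abstract and not the HotSpot implementation: intervals here have no lifetime holes, no interval splitting is done, and the function name `linear_scan` and the `(start, end)` interval encoding are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def linear_scan(intervals: Dict[str, Tuple[int, int]],
                num_regs: int) -> Tuple[Dict[str, int], Set[str]]:
    """Assign registers to intervals (start, end), both ends inclusive.

    Returns (register assignment, set of spilled variables).
    """
    free = list(range(num_regs))
    active = []                      # (end, name) pairs, kept sorted by end
    assignment: Dict[str, int] = {}
    spilled: Set[str] = set()

    # Visit intervals in order of increasing start position.
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before the current one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(assignment[old])

        if free:
            assignment[name] = free.pop()
            active.append((end, name))
            active.sort()
        elif active and active[-1][0] > end:
            # Spill the active interval that ends last and take its register.
            _, victim = active.pop()
            assignment[name] = assignment.pop(victim)
            spilled.add(victim)
            active.append((end, name))
            active.sort()
        else:
            # The current interval ends last (or there are no registers).
            spilled.add(name)

    return assignment, spilled


# Hypothetical intervals: b and c do not overlap, so they can share a register.
example = {"a": (0, 5), "b": (1, 2), "c": (3, 4)}
regs_two, spilled_two = linear_scan(example, 2)
regs_one, spilled_one = linear_scan(example, 1)
```

With two registers nothing is spilled in this example; with one register the long interval `a` is the spill victim because it ends last.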
Register Allocation Deconstructed
2009
Cited by 2 (0 self)
Register allocation is a fundamental part of any optimizing compiler. Effectively managing the limited register resources of the constrained architectures commonly found in embedded systems is essential in order to maximize code quality. In this paper we deconstruct the register allocation problem into distinct components: coalescing, spilling, move insertion, and assignment. Using an optimal register allocation framework, we empirically evaluate the importance of each of the components, the impact of component integration, and the effectiveness of existing heuristics. We evaluate code quality both in terms of code performance and code size and consider four distinct instruction set architectures: ARM, Thumb, x86, and x86-64. The results of our investigation reveal general principles for register allocation design.
Optimistic chordal coloring: a coalescing heuristic for SSA form programs
The interference graph for a procedure in Static Single Assignment (SSA) Form is chordal. Since the k-colorability problem can be solved in polynomial time for chordal graphs, this result has generated interest in SSA-based heuristics for spilling and coalescing. Since copies can be folded during SSA construction, instances of the coalescing problem under SSA have fewer affinities than traditional methods. This paper presents Optimistic Chordal Coloring (OCC), a coalescing heuristic for chordal graphs. OCC was evaluated on interference graphs from embedded/multimedia benchmarks: in all cases, OCC found the optimal solution, and ran, on average, 2.30× faster than Iterated Register Coalescing.
Program Interpolation
Program interpolation is a new type of transformation that, given an input program written in a specially constructed Domain Specific Language (DSL), produces a family of functionally equivalent instruction sequences as output. Each sequence is an “interpolation” between the control flows of implementation strategies supplied in the input program. The purpose of the transformation is to expose behavioural differences (e.g. performance) within the sequences, and thus allow automated optimisation with respect to architectural tradeoffs that are difficult to quantify and model. We present results from a prototype compiler that demonstrate a 63% speedup in the domain of multiprecision integer arithmetic.
An Optimal Linear-Time Algorithm for Interprocedural Register Allocation in High Level Synthesis Using SSA Form
An optimal linear-time algorithm for interprocedural register allocation in high level synthesis is presented. Historically, register allocation has been modeled as a graph coloring problem, which is NP-complete in general. However, converting each procedure to static single assignment (SSA) form ensures a chordal interference graph, which can be colored in O(V+E) time. The interprocedural interference graph (IIG) is not guaranteed to be chordal after this transformation. An extension to SSA form is introduced which ensures that the IIG is chordal, and the conversion process does not increase its chromatic number. The resulting IIG can then be colored in linear time. Index Terms: Chordal graph, graph coloring, high level synthesis, (interprocedural) register allocation, static single assignment (SSA) form.
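The claim that chordal graphs can be colored optimally and efficiently can be illustrated with a short Python sketch. It uses the standard textbook approach (maximum cardinality search followed by greedy coloring in visit order), which is an assumption on our part and not necessarily the algorithm of the paper. For brevity the search below is quadratic rather than the O(V+E) bucket-based version, and the names `mcs_order` and `color_chordal` are hypothetical.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]


def mcs_order(graph: Graph) -> List[Hashable]:
    """Maximum cardinality search: repeatedly visit the unvisited vertex
    that has the most already-visited neighbours."""
    weight = {v: 0 for v in graph}
    order = []
    while weight:
        v = max(weight, key=weight.get)
        del weight[v]
        order.append(v)
        for u in graph[v]:
            if u in weight:
                weight[u] += 1
    return order


def color_chordal(graph: Graph) -> Dict[Hashable, int]:
    """Greedy coloring in MCS visit order. On a chordal graph the already
    colored neighbours of each vertex form a clique, so the number of
    colors used equals the size of the largest clique."""
    color: Dict[Hashable, int] = {}
    for v in mcs_order(graph):
        used = {color[u] for u in graph[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical interference graph: two triangles sharing the edge b-c.
example_graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
coloring = color_chordal(example_graph)
```

In the example the largest clique has three vertices, so three registers suffice, and `a` and `d` end up sharing one because they do not interfere.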