Results 1 -
4 of
4
SSA Elimination after Register Allocation
"... Abstract. Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after r ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Abstract. Compilers such as gcc use static-single-assignment (SSA) form as an intermediate representation and usually perform SSA elimination before register allocation. But the order could as well be the opposite: the recent approach of SSA-based register allocation performs SSA elimination after register allocation. SSA elimination before register allocation is straightforward and standard, while previously described approaches to SSA elimination after register allocation have shortcomings; in particular, they have problems with implementing copies between memory locations. We present spill-free SSA elimination, a simple and efficient algorithm for SSA elimination after register allocation that avoids increasing the number of spilled variables. We also present three optimizations of the core algorithm. Our experiments show that spillfree SSA elimination takes less than five percent of the total compilation time of a JIT compiler. Our optimizations reduce the number of memory accesses by more than 9 % and improve the program execution time by more than 1.8%. 1
Coordinated Resource Optimization in Behavioral Synthesis
"... Abstract—Reducing resource usage is one of the most important optimization objectives in behavioral synthesis due to its direct impact on power, performance and cost. The datapath in a typical design is composed of different kinds of components, including functional units, registers and multiplexers ..."
Abstract
- Add to MetaCart
Abstract—Reducing resource usage is one of the most important optimization objectives in behavioral synthesis due to its direct impact on power, performance and cost. The datapath in a typical design is composed of different kinds of components, including functional units, registers and multiplexers. To optimize the overall resource usage, a behavioral synthesis tool should consider all kinds of components at the same time. However, most previous work on behavioral synthesis has the limitations of (i) not being able to consider all kinds of resources globally, and/or (ii) separating the synthesis process into a sequence of optimization steps without a consistent optimization objective. In this paper we present a behavioral synthesis flow in which all types of components in the datapath are modeled and optimized consistently. The key idea is to feed to the scheduler the intentions for sharing functional units and registers in favor of the global optimization goal (such as total area), so that the scheduler could generate a schedule that makes the sharing intentions feasible. Experiments show that compared to the solution of minimizing functional unit requirements in scheduling and using the least number of functional units and registers in binding, our solution achieves a 24 % reduction in total area; compared to the online tool provided by c-to-verilog.com, our solution achieves a 30% reduction on average. I.
Program Interpolation
"... Program interpolation is a new type of transformation that given an input program written in a specially constructed Domain Specific Language (DSL), produces a family of functionally equivalent instruction sequences as output. Each sequence is an “interpolation” between the control-flows of implemen ..."
Abstract
- Add to MetaCart
Program interpolation is a new type of transformation that given an input program written in a specially constructed Domain Specific Language (DSL), produces a family of functionally equivalent instruction sequences as output. Each sequence is an “interpolation” between the control-flows of implementation strategies supplied in the input program. The purpose of the transformation is to expose behavioural differences (e.g. performance) within the sequences, and thus allow automated optimisation with respect to architectural trade-offs that are difficult to quantify and model. We present results from a prototype compiler that demonstrate a 63 % speedup in the domain of multi-precision integer arithmetic. 1.
An Optimal Linear-Time Algorithm for Interprocedural Register Allocation in High Level Synthesis Using SSA Form
"... Abstract—An optimal linear-time algorithm for interprocedural register allocation in high level synthesis is presented. Historically, register allocation has been modeled as a graph coloring problem, which is nondeterministic polynomial time-complete in general; however, converting each procedure to ..."
Abstract
- Add to MetaCart
Abstract—An optimal linear-time algorithm for interprocedural register allocation in high level synthesis is presented. Historically, register allocation has been modeled as a graph coloring problem, which is nondeterministic polynomial time-complete in general; however, converting each procedure to static single assignment (SSA) form ensures a chordal interference graph, which can be colored in O(|V|+|E|) time; the interprocedural interference graph (IIG) is not guaranteed to be chordal after this transformation. An extension to SSA form is introduced which ensures that the IIG is chordal, and the conversion process does not increase its chromatic number. The resulting IIG can then be colored in linear-time. Index Terms—Chordal graph, graph coloring, high level synthesis, (inteprocedural) register allocation, static single assignment (SSA) form. I.

