## The Meeting Graph: A New Model for Loop Cyclic Register Allocation (1995)

Venue: | In Proc. of the Fifth Workshop on Compilers for Parallel Computers (CPC95) |

Citations: | 31 - 11 self |

### BibTeX

@INPROCEEDINGS{Eisenbeis95themeeting,

author = {Christine Eisenbeis and Sylvain Lelait and Bruno Marmol},

title = {The Meeting Graph: A New Model for Loop Cyclic Register Allocation},

booktitle = {In Proc. of the Fifth Workshop on Compilers for Parallel Computers (CPC95)},

year = {1995},

pages = {264--267},

publisher = {ACM Press}

}

### Abstract

Register allocation is a compiler phase in which the gains can be essential to achieving performance on new architectures that exploit instruction-level parallelism. We focus on loops and improve existing methods by introducing a new kind of graph. We model loop unrolling and register allocation together in a common framework, called the meeting graph. We expect our results to significantly improve loop register allocation while keeping the amount of code replication low. As a byproduct, we present an optimal algorithm for allocating loop variables to a rotating register file, as well as a new heuristic for spilling loop variables.

1 Introduction

The efficiency of register allocation is a crucial problem in modern microprocessors, where the increasing gap between the internal clock cycle and memory latency exacerbates the need to keep variables in registers and to avoid spill code. In this paper, we address the important problem of loop register allocation and spi...

### Citations

2293 | Maintaining Knowledge about Temporal Intervals
- Allen
- 1983
Citation Context ...ters in the loop (cyclic) context. FOR i = 1 to n DO t1 = x(i); t2 = A*t1; t3 = y(i); t4 = t3 + t2; A = t4*A; B = A*B ENDDO. Figure 1: Original loop. [Columns: iteration j-2, iteration j-1, iteration j] t1[1] = x(1); t3[1] = y(1); t2[1] = A[0]*t1[1]; t1[2] = x(2); t3[2] = y(2); t4[1] = t3[1] + t2[1]; A[1] = t4[1]*A[0]; FOR j = 3 to n DO t2[j-1] = A[j-2]*t1[j-1]; t1[j] = x(j); B[j-...

515 | Software pipelining: an effective scheduling technique for VLIW machines
- Lam
- 2004
Citation Context ... interference graphs resulting from loop code are circular interval graphs (CIG) [8], on which usual graph problems are known to be easier than on general graphs [6]. Second, loop software pipelining [10], which is necessary to exploit instruction-level parallelism, generates variable lifetimes that may span more than one iteration, requiring software (loop unrolling [10, 4]) or hardware (rotating ...

417 | Register allocation & spilling via graph coloring
- Chaitin
- 1982
Citation Context ...e value in t3[j-1]), it is impossible to allocate two consecutive instances of t3 to the same register. The usual model for representing register lifetime conflicts is the interference graph [4], where the vertices are the variables and an edge is drawn between two vertices if the lifetimes of the corresponding variables overlap. The interference graph of our software pipelined loop is shown...
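
The interference graph described in this context can be sketched in a few lines. This is only an illustration under stated assumptions: each lifetime is given as a hypothetical `(start, length)` pair on a loop of `period` cycles, lifetimes are shorter than the period, and the function names and sample values are invented here rather than taken from the paper.

```python
from itertools import combinations


def occupied_cycles(start, length, period):
    """Cycles (modulo the loop period) during which a value is live."""
    return {(start + k) % period for k in range(length)}


def interference_graph(lifetimes, period):
    """Vertices are variables; an edge joins two variables whose lifetimes overlap."""
    cycles = {v: occupied_cycles(s, n, period) for v, (s, n) in lifetimes.items()}
    edges = set()
    for a, b in combinations(sorted(cycles), 2):
        if cycles[a] & cycles[b]:  # at least one shared cycle means a conflict
            edges.add((a, b))
    return edges


# Hypothetical lifetimes as (start cycle, length in cycles) on a 4-cycle loop.
lifetimes = {"t1": (0, 2), "t2": (1, 2), "t3": (3, 2)}
edges = interference_graph(lifetimes, period=4)
print(sorted(edges))  # [('t1', 't2'), ('t1', 't3')]
```

Note that `t3` wraps around the end of the loop body, which is what makes the intervals circular; a lifetime longer than the period would additionally conflict with its own next instance, which this sketch does not model.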

237 | Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing
- Rau, Glaeser
- 1981
Citation Context ...nce graphs resulting from loop code are circular interval graphs (CIG) [14, 24], on which usual graph problems are known to be easier than on general graphs [10, 12]. Second, loop software pipelining [20, 16], which is necessary to exploit instruction-level parallelism, generates variable lifetimes that may span more than one iteration, requiring software (loop unrolling [16, 7]) or hardware (rotating ...

142 | Compiling for the Cydra 5
- Dehnert, Towle
- 1993
Citation Context ...ling. 2.3 Rotating register file. A convenient hardware feature for dealing with modulo variable expansion is the rotating register file, which is implemented on the Cydra 5 architecture [6]. At each iteration, a pointer into the register file is shifted cyclically one location ahead. Registers are addressed relative to this pointer. It is also possible to address a r...
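
A toy model may help picture the rotating register file mentioned in this context. It is a sketch, not a description of the Cydra 5 hardware: the class name, the method names and the direction of rotation (the pointer advances by one per iteration) are assumptions made for illustration.

```python
class RotatingRegisterFile:
    """Register file addressed relative to a base pointer that rotates each iteration."""

    def __init__(self, size):
        self.regs = [None] * size
        self.base = 0  # rotating pointer into the physical file

    def _physical(self, offset):
        # Wrap around so that addressing is cyclic.
        return (self.base + offset) % len(self.regs)

    def write(self, offset, value):
        self.regs[self._physical(offset)] = value

    def read(self, offset):
        return self.regs[self._physical(offset)]

    def rotate(self):
        # Assumed direction: the pointer advances by one location per iteration.
        self.base = (self.base + 1) % len(self.regs)


rrf = RotatingRegisterFile(4)
rrf.write(1, "t3 from iteration j")
rrf.rotate()  # the next iteration starts
rrf.write(1, "t3 from iteration j+1")
print(rrf.read(0))  # t3 from iteration j
print(rrf.read(1))  # t3 from iteration j+1
```

The same register name (offset 1) is written in both iterations, yet the two instances land in different physical registers, so a value that lives longer than one iteration is not overwritten and no unrolling is needed.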

132 | Lifetime-sensitive modulo scheduling
- Huff
- 1993
Citation Context ...n a bivaluated graph. We do not consider the problem of loop scheduling, which is beyond the scope of this paper. Relevant work combining scheduling and register allocation can be found in [9, 12, 3, 14]. We first recall the problems of loop register allocation and the software and hardware techniques for handling variables that span more than one iteration. Then we describe what the meetin...

114 | The complexity of coloring circular arcs and chords
- Garey, Johnson, et al.
- 1980
Citation Context ...rk are the following. First, the usual interference graphs resulting from loop code are circular interval graphs (CIG) [8], on which usual graph problems are known to be easier than on general graphs [6]. Second, loop software pipelining [10], which is necessary to exploit instruction-level parallelism, generates variable lifetimes that may span more than one iteration, requiring software (loop un...

107 | Register allocation for software pipelined loops
- Rau, Lee, et al.
- 1992
Citation Context ...sary to exploit instruction-level parallelism, generates variable lifetimes that may span more than one iteration, requiring software (loop unrolling [10, 4]) or hardware (rotating register file [13]) techniques. These two facts have always been treated separately. Our starting point was the following question: what is the effect of loop unrolling on the interference graph? Especiall...

86 | Complexity and algorithms for reasoning about time: a graph-theoretic approach
- Golumbic, Shamir
- 1993
Citation Context ...oop unrolling during register allocation, we introduce a new kind of graph to replace the usual interference graph [2]. The starting point of this new graph is the interval logic algebra described in [7]. The main actor in register allocation that is not present in the interference graph is time. When using the underlying interval structure of a circular interval interference graph, the notion ...

69 | Efficient algorithms for interval graphs and circular-arc graphs
- Gupta, Lee, et al.
- 1982
Citation Context ...e the following. First, the usual interference graphs resulting from loop code are circular interval graphs (CIG) [14, 24], on which usual graph problems are known to be easier than on general graphs [10, 12]. Second, loop software pipelining [20, 16], which is necessary to exploit instruction-level parallelism, generates variable lifetimes that may span more than one iteration, requiring software (loo...

67 | Register allocation with instruction scheduling
- Pinter
- 1993
Citation Context ... by trying to integrate possible register conflicts into the scheduling process, or minimizing a criterion in order to favor the subsequent register allocation phase. The first strategy is used in [19] and [3], in the case of straight-line code. It should be noted that the technique presented in [3] consists in depicting by chains the possible reuse of resources (functional units as well as registe...

61 | A Novel Framework of Register Allocation for Software Pipelining
- Ning, Gao
- 1993
Citation Context ...n a bivaluated graph. We do not consider the problem of loop scheduling, which is beyond the scope of this paper. Relevant work combining scheduling and register allocation can be found in [9, 12, 3, 14]. We first recall the problems of loop register allocation and the software and hardware techniques for handling variables that span more than one iteration. Then we describe what the meetin...

56 | A register allocation framework based on hierarchical cyclic interval graphs
- Hendren, Gao, et al.
- 1992
Citation Context ...p register allocation and spilling. The two main facts that have motivated our work are the following. First, the usual interference graphs resulting from loop code are circular interval graphs (CIG) [8], on which usual graph problems are known to be easier than on general graphs [6]. Second, loop software pipelining [10], which is necessary to exploit instruction-level parallelism, generates vari...

39 | URSA: A Unified ReSource Allocator for Registers and Functional Units in VLIW Architectures
- Berson, Gupta, et al.
- 1993
Citation Context ...g to integrate possible register conflicts into the scheduling process, or minimizing a criterion in order to favor the subsequent register allocation phase. The first strategy is used in [19] and [3], in the case of straight-line code. It should be noted that the technique presented in [3] consists in depicting by chains the possible reuse of resources (functional units as well as registers). Whe...

29 | A polynomial time method for optimal software pipelining
- Dongen, Gao, et al.
- 1992
Citation Context ...re used, at the price of a larger unrolling degree (6 against 2). Because unrolling may be necessary for reasons other than register allocation (functional unit allocation [9] or instruction timings [22, 13]), it is important to be able to control it very precisely. As a matter of fact, two cyclic phenomena with respective periods u and v result in a period of lcm(u, v), which may be very large. 4 5...
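
The remark about combined periods is easy to check numerically. The helper below is a small illustrative sketch (the function name `combined_period` is not from the paper); it shows how two modest periods can force a much larger unrolling degree when they share few factors.

```python
from math import gcd


def combined_period(u, v):
    """Period after which two cyclic patterns of periods u and v realign: lcm(u, v)."""
    return u * v // gcd(u, v)


print(combined_period(2, 3))   # 6
print(combined_period(6, 4))   # 12
print(combined_period(7, 12))  # 84
```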

26 | Register allocation and spilling via graph coloring
- Chaitin
- 1982
Citation Context ...gure 1: Chaotic unrolling. 3 The Meeting graph. In order to better take loop unrolling into account during register allocation, we introduce a new kind of graph to replace the usual interference graph [2]. The starting point of this new graph is the interval logic algebra described in [7]. The main actor in register allocation that is not present in the interference graph is time. When using the...

21 | New clique and independent set algorithms for circle graphs
- Apostolico, Atallah, et al.
- 1992
Citation Context ...; • in each component, determine a Hamiltonian circuit; • for a Hamiltonian circuit: construct the circle graph induced by the chords; • find the maximum independent set in the circle graph [1]. Every step has polynomial complexity [5]. This is only a heuristic because of the choice of the Hamiltonian circuit. To find the best decomposition, every Hamiltonian circuit should be...

19 | Study of a NP-hard cyclic scheduling problem: The recurrent job-shop
- Hanen
- 1994
Citation Context ...re used, at the price of a larger unrolling degree (6 against 2). Because unrolling may be necessary for reasons other than register allocation (functional unit allocation [9] or instruction timings [22, 13]), it is important to be able to control it very precisely. As a matter of fact, two cyclic phenomena with respective periods u and v result in a period of lcm(u, v), which may be very large. 4 5...

15 | Compiler techniques for optimizing memory and register usage on the Cray-2
- Eisenbeis, Jalby, et al.
- 1990
Citation Context ..., loop software pipelining [10], which is necessary to exploit instruction-level parallelism, generates variable lifetimes that may span more than one iteration, requiring software (loop unrolling [10, 4]) or hardware (rotating register file [13]) techniques. These two facts have always been treated separately. Our starting point was the following question: what is the effect of loop unro...

11 | Optimal Software Pipelining in Presence of Resource
- Eisenbeis, Windheiser
- 1993
Citation Context ...y 7 registers (against 8) are used, at the price of a larger unrolling degree (6 against 2). Because unrolling may be necessary for reasons other than register allocation (functional unit allocation [9] or instruction timings [22, 13]), it is important to be able to control it very precisely. As a matter of fact, two cyclic phenomena with respective periods u and v result in a period of lcm(u;...

9 | Optimal cycles in doubly weighted directed linear graphs
- Lawler
- 1966
Citation Context ...emaining register pressure is less than R. The basic step of this heuristic is the search for the circuit with the best gain, which is exactly the critical cycle, for which polynomial algorithms exist [11]. [Figure 4, with panels Hamiltonian Circuit, Chords, Max Stable Set, Graph Decomposition: Heuristic to discover a valid graph decomp...]

4 | An Introduction to Simplex Scheduling
- Dupont de Dinechin
- 1994
Citation Context ...n a bivaluated graph. We do not consider the problem of loop scheduling, which is beyond the scope of this paper. Relevant work combining scheduling and register allocation can be found in [9, 12, 3, 14]. We first recall the problems of loop register allocation and the software and hardware techniques for handling variables that span more than one iteration. Then we describe what the meetin...

4 | Program Structure as a Basis for the Parallelization of Global Compiler Optimizations
- Zobel
- 1992
Citation Context ...p register allocation and spilling. The two main facts that have motivated our work are the following. First, the usual interference graphs resulting from loop code are circular interval graphs (CIG) [14, 24], on which usual graph problems are known to be easier than on general graphs [10, 12]. Second, loop software pipelining [20, 16], which is necessary to exploit instruction-level parallelism, gener...

2 | The meeting graph: a new framework for loop register allocation
- Eisenbeis, Lelait, et al.
- 1995
Citation Context ... $\min_{D_i \in \mathcal{D}} \operatorname{lcm}(\rho_{i_1}, \ldots, \rho_{i_n})$ (1), where $\rho_{i_j}$ is the number of turns of the circuit $C_{i_j}$. (Footnote 2: II stands for Initiation Interval; it is the length of the loop.) The proof can be found in [5]. For instance, in Figure 2, there is only one circuit passing through node t3 and, concerning the second circuit, its length is 8 (2 iterations). This means that to allocate the variables to 3 register...
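
Read as a formula, expression (1) in this context takes the least common multiple of the circuit turn counts within each decomposition and then keeps the smallest value over all decompositions. A hedged sketch of that computation follows; the input encoding (one list of turn counts per decomposition) and the sample numbers are hypothetical, and enumerating the decompositions themselves is outside its scope.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def unrolling_degree(decompositions):
    """Smallest lcm of circuit turn counts over all candidate decompositions."""
    return min(reduce(lcm, turns, 1) for turns in decompositions)


# Hypothetical turn counts: each inner list is one decomposition D_i,
# each number is the number of turns (rho) of one circuit in it.
candidates = [[1, 2], [3], [2, 3]]
print(unrolling_degree(candidates))  # 2
```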

2 | Register requirement for exposing loops' maximal instruction-level parallelism
- Wang, Krall, et al.
- 1994
Citation Context ...g graph could also be used for scheduling purposes. The second strategy, minimizing a criterion, is used in [15, 5, 18, 23]. (Footnote 3: The penalty of the circle is the sum of the penalties of its arcs. Footnote 4: Same as for the penalty.) Roughly speaking, all these works aim at minimizing variable lifetimes, without further consideration of how the final register allocation is actually performed. Our work has shown that t...

1 | Registers Requirement for Exposing Loops
- Wang, Krall, et al.
- 1994