## Software pipelining showdown: Optimal vs. heuristic methods in a production compiler (1996)

Venue: | In Proc. of the ACM SIGPLAN'96 Conf. on Programming Languages Design and Implementation |

Citations: | 58 - 9 self |

### BibTeX

@INPROCEEDINGS{Ruttenberg96softwarepipelining,

author = {John Ruttenberg and G. R. Gao and A. Stoutchinin and W. Lichtenstein},

title = {Software pipelining showdown: Optimal vs. heuristic methods in a production compiler},

booktitle = {In Proc. of the ACM SIGPLAN'96 Conf. on Programming Languages Design and Implementation},

year = {1996},

pages = {1--11}

}

### Abstract

This paper is a scientific comparison of two code generation techniques with identical goals: generation of the best possible software pipelined code for computers with instruction level parallelism. Both are variants of modulo scheduling, a framework for generation of software pipelines pioneered by Rau and Glaser [RaGl81], but are otherwise quite dissimilar. One technique was developed at Silicon Graphics and is used in the MIPSpro compiler. This is the production compiler for SGI's systems which are based on the MIPS R8000 processor [Hsu94]. It is essentially a branch-and-bound enumeration of possible schedules with extensive pruning. This method is heuristic because of the way it prunes and also because of the interaction between register allocation and scheduling. The second technique aims to produce optimal results by formulat-

### Citations

516 | Software pipelining: An effective scheduling technique for VLIW machines - Lam - 1988 |

Citation Context ...binary search is used [Tou84]. Lam pointed out that being able to find a schedule at II does not imply being able to find a schedule at II+1 and used this to explain why her compiler used linear search [Lam88]. ... searching successively: MinII, MinII+1, MinII+2, MinII+4, MinII+8, ... until we either find a schedule or exceed MaxII. If a schedule is found with II ≤ MinII+2, there are no better IIs left to search... |

451 | The Omega test: a fast and practical integer programming algorithm for dependence analysis - Pugh - 1991 |

262 | Conversion of control dependence to data dependence - Allen, Kennedy, et al. - 1983 |

171 | Instruction-Level Parallel Processing: History, Overview and Perspective - Rau, Fisher - 1993 |

Citation Context ...[GaJo79], a fact that has led to a number of heuristic techniques [DeTo93, GaSc91, Huff93, MoEb92, Rau94, Warter92, Lam88, AiNi88] for generation of optimal or near optimal software pipelined code. [RaFi93] contains an introductory survey of these methods. 1.2 The allure of optimal techniques Recently, motivated by the critical role of software pipelining in high performance computing, researchers have ... |

142 | Compiling for the Cydra 5 - Dehnert, Towle - 1993 |

136 | Register allocation via graph coloring - Briggs - 1992 |

134 | Coloring heuristics for register allocation - Briggs, Cooper, et al. - 1989 |

132 | Lifetime-sensitive modulo scheduling - Huff - 1993 |

118 | Optimal loop parallelization - Aiken, Nicolau - 1988 |

86 | A systolic array optimizing compiler - Lam - 1987 |

76 | An efficient resource-constrained global scheduling technique for superscalar and VLIW processors - Moon, Ebcioglu - 1992 |

75 | Minimizing register requirements under resource-constrained rate-optimal software pipelining - Govindarajan, Altman, et al. - 1994 |

71 | Enhanced modulo scheduling for loops with conditional branches - Warter, Haab, et al. - 1992 |

48 | A FORTRAN Compiler for the FPS-164 Scientific Computer - Touzeau - 1984 |

Citation Context ...al backoff from MinII, 1. The use of binary search in this context has a fairly long history in the literature. Touzeau described the AP 120 and FPS 164 compiler and explains how binary search is used [Tou84]. Lam pointed out that being able to find a schedule at II does not imply being able to find a schedule at II+1 and used this to explain why her compiler used linear search [Lam88]. ... searching successiv... |

29 | Optimum modulo schedules for minimum register requirements - Eichenberger, Davidson, et al. - 1995 |

28 | Fine-grain scheduling under resource constraints - Feautrier - 1994 |

12 | Efficient algorithms for cyclic scheduling - Gasperoni, Schwiegelshohn - 1991 |

5 | Loop storage optimization for dataflow machines - Gao, Ning - 1991 |

2 | A Framework for Rate-Optimal Resource-Constrained Software Pipelining - Govindarajan, Altman, et al. - 1994 |

1 | Scheduling and mapping: Software pipelining in the presence of structural hazards - Altman, Govindarajan, et al. - 1995 |

Citation Context ... resource constraints, resulting in a unified ILP formulation for the problem for simple pipelined architectures [NiGa92, GoAlGa94a]. The work was subsequently generalized to more complex architectures [AlGoGa95]. By the spring of 1995, this work was implemented at McGill in MOST, the Modulo Scheduling Toolset, which makes use of any one of several external ILP solvers. MOST was not intended as a component of... |

1 | Optimal Software Pipelining with Functional Unit and Register Constraints - Altman - 1995 |

1 | Automatic data layout using 0-1 integer linear programming - Bixby, Kennedy, et al. - 1994 |

1 | Computers and Intractability: A Guide to the Theory of NP-Completeness - Garey, Johnson - 1979 |

Citation Context ...must have a register allocation for the values used in the loop that is valid for that schedule. Such a schedule is called rate-optimal. The problem of finding rate-optimal schedules is NP-complete [GaJo79], a fact that has led to a number of heuristic techniques [DeTo93, GaSc91, Huff93, MoEb92, Rau94, Warter92, Lam88, AiNi88] for generation of optimal or near optimal software pipelined code. [RaFi93]... |

1 | Designing the TFP - Hsu - 1994 |

Citation Context ...se quite dissimilar. One technique was developed at Silicon Graphics and is used in the MIPSpro compiler. This is the production compiler for SGI's systems which are based on the MIPS R8000 processor [Hsu94]. It is essentially a branch-and-bound enumeration of possible schedules with extensive pruning. This method is heuristic because of the way it prunes and also because of the interaction between regi... |

1 | The age of optimization: Solving large-scale real-world problems. Operations Research, 42(1):5-13, January-February 1994 - Nemhauser - 1994 |

1 | Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing - Rau, Glaser - 1981 |

1 | Iterative modulo scheduling: An algorithm for software pipelining loops - Rau - 1994 |

Citation Context ... modulo scheduling is computationally expensive as well as difficult to implement, have inhibited its incorporation into product compilers. 1.0 Introduction 1.1 Software pipelining - B. Ramakrishna Rau [Rau94] Software pipelining is a coding technique that overlaps operations from various loop iterations in order to exploit instruction level parallelism. In order to be effective software pipelining code mu... |
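The citation contexts for [Lam88] and [Tou84] above describe a search over initiation intervals (II) that probes MinII, MinII+1, MinII+2, MinII+4, MinII+8, ... until a schedule is found or MaxII is exceeded. The sketch below is one possible reading of that excerpt, not the paper's implementation. The names `backoff_candidates`, `find_ii`, and `try_schedule` are hypothetical, and the binary-search refinement between the last failed II and the first successful one is an assumption suggested by the excerpt's mention of binary search.

```python
from typing import Callable, Iterator, Optional


def backoff_candidates(min_ii: int, max_ii: int) -> Iterator[int]:
    """Yield MinII, MinII+1, MinII+2, MinII+4, MinII+8, ... while <= MaxII."""
    if min_ii > max_ii:
        return
    yield min_ii
    offset = 1
    while min_ii + offset <= max_ii:
        yield min_ii + offset
        offset *= 2  # exponential backoff: offsets 1, 2, 4, 8, ...


def find_ii(min_ii: int, max_ii: int,
            try_schedule: Callable[[int], bool]) -> Optional[int]:
    """Return an II at which try_schedule succeeds, or None.

    try_schedule is a hypothetical stand-in for the scheduler proper.
    """
    last_failed: Optional[int] = None
    for ii in backoff_candidates(min_ii, max_ii):
        if try_schedule(ii):
            # Assumed second phase: binary search in the gap between the
            # last failure and this success. This treats schedulability
            # as monotonic in II, which (as Lam notes) need not hold.
            lo = ii if last_failed is None else last_failed + 1
            hi = ii
            while lo < hi:
                mid = (lo + hi) // 2
                if try_schedule(mid):
                    hi = mid
                else:
                    lo = mid + 1
            return hi
        last_failed = ii
    # Backoff stepped past MaxII without success; following the excerpt,
    # the search stops here even if some IIs below MaxII were not probed.
    return None
```

When the first success occurs at II ≤ MinII+2, every smaller II has already been probed, so the refinement loop does no work. This matches the excerpt's remark that there are no better IIs left to search in that case.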