Results 1 - 10
of
14
Improvements to Graph Coloring Register Allocation
- ACM Transactions on Programming Languages and Systems
, 1994
"... This paper describes both the techniques themselves and our experience building and using register allocators that incorporate them. It provides a detailed description of optimistic coloring and rematerialization. It presents experimental data to show the performance of several versions of the regis ..."
Abstract
-
Cited by 158 (8 self)
- Add to MetaCart
This paper describes both the techniques themselves and our experience building and using register allocators that incorporate them. It provides a detailed description of optimistic coloring and rematerialization. It presents experimental data to show the performance of several versions of the register allocator on a suite of FORTRAN programs. It discusses several insights that we discovered only after repeated implementation of these allocators. Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors---compi l ers , optimization General terms: Languages Additional Key Words and Phrases: Register allocation, code generation, graph coloring 1. INTRODUCTION The relationship between run-time performance and e#ective use of a machine's register set is well understood. In a compiler, the process of deciding which values to keep in registers at each point in the generated code is called register allocation. Value
Register Allocation via Graph Coloring
, 1992
"... Chaitin and his colleagues at IBM in Yorktown Heights built the first global register allocator based on graph coloring. This thesis describes a series of improvements and extensions to the Yorktown allocator. There are four primary results: Optimistic coloring Chaitin's coloring heuristic pessimis ..."
Abstract
-
Cited by 133 (4 self)
- Add to MetaCart
Chaitin and his colleagues at IBM in Yorktown Heights built the first global register allocator based on graph coloring. This thesis describes a series of improvements and extensions to the Yorktown allocator. There are four primary results: Optimistic coloring Chaitin's coloring heuristic pessimistically assumes any node of high degree will not be colored and must therefore be spilled. By optimistically assuming that nodes of high degree will receive colors, I often achieve lower spill costs and faster code; my results are never worse. Coloring pairs The pessimism of Chaitin's coloring heuristic is emphasized when trying to color register pairs. My heuristic handles pairs as a natural consequence of its optimism. Rematerialization Chaitin et al. introduced the idea of rematerialization to avoid the expense of spilling and reloading certain simple values. By propagating rematerialization information around the SSA graph using a simple variation of Wegman and Zadeck's constant propag...
A Generalized Algorithm for Graph-Coloring Register Allocation
, 2004
"... Graph-coloring register allocation is an elegant and extremely popular optimization for modern machines. But as currently formulated, it does not handle two characteristics commonly found in commercial architectures. First, a single register name may appear in multiple register classes, where a clas ..."
Abstract
-
Cited by 26 (5 self)
- Add to MetaCart
Graph-coloring register allocation is an elegant and extremely popular optimization for modern machines. But as currently formulated, it does not handle two characteristics commonly found in commercial architectures. First, a single register name may appear in multiple register classes, where a class is a set of register names that are interchangeable in a particular role. Second, multiple register names may be aliases for a single hardware register. We present a generalization of graph-coloring register allocation that handles these problematic characteristics while preserving the elegance and practicality of traditional graph coloring. Our generalization adapts easily to a new target machine, requiring only the sets of names in the register classes and a map of the register aliases. It also drops easily into a well-known graph-coloring allocator, is efficient at compile time, and produces high-quality code.
Graph coloring vs. optimal register allocation for optimizing compilers
- In Joint Modular Languages Conference
, 2003
"... Abstract. Optimizing compilers play an important role for the efficient execution of programs written in high level programming languages. Current microprocessors impose the problem that the gap between processor cycle time and memory latency increases. In order to fully exploit the potential of pro ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
Abstract. Optimizing compilers play an important role for the efficient execution of programs written in high level programming languages. Current microprocessors impose the problem that the gap between processor cycle time and memory latency increases. In order to fully exploit the potential of processors, nearly optimal register allocation is of paramount importance. In the predominance of the x86 architecture and in the increased usage of high-level programming languages for embedded systems peculiarities and irregularities in their register sets have to be handled. These irregularities makes the task of register allocation for optimizing compilers more difficult than for regular architectures and register files. In this article we show how optimistic graph coloring register allocation can be extended to handle these irregularities. Additionally we present an exponential algorithm which in most cases can compute an optimal solution for register allocation and copy elimination. These algorithms are evaluated on a virtual processor architecture modeling two and three operand architectures with different register file sizes. The evaluation demonstrates that the heuristic graph coloring register allocator comes close to the optimal solution for large register files, but performs badly on small register files. For small register files the optimal algorithm is fast enough to replace a heuristic algorithm. 1
A Progressive Register Allocator for Irregular Architectures
"... ... a compiler performs. Conventional graph-coloring based register allocators are fast and do well on regular, RISC-like, architectures, but perform poorly on irregular, CISC-like, architectures with few registers and nonorthogonal instruction sets. At the other extreme, optimal register allocators ..."
Abstract
-
Cited by 7 (4 self)
- Add to MetaCart
... a compiler performs. Conventional graph-coloring based register allocators are fast and do well on regular, RISC-like, architectures, but perform poorly on irregular, CISC-like, architectures with few registers and nonorthogonal instruction sets. At the other extreme, optimal register allocators based on integer linear programming are capable of fully modeling and exploiting the peculiarities of irregular architectures but do not scale well. We introduce the idea of a progressive allocator. A progressive allocator finds an initial allocation of quality comparable to a conventional allocator, but as more time is allowed for computation the quality of the allocation approaches optimal. This paper presents a progressive register allocator which uses a multi-commodity network flow model to elegantly represent the intricacies of irregular architectures. We evaluate our allocator as a substitute for gcc's local register allocation pass.
Graph-Coloring Register Allocation for Irregular Architectures
, 2001
"... The graph-coloring metaphor leads to elegant algorithms for register allocation that have been shown to be quite effective for regular architectures with plenty of registers. Published attempts to make these algorithms applicable to architectures that are irregular in their use of registers have yie ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
The graph-coloring metaphor leads to elegant algorithms for register allocation that have been shown to be quite effective for regular architectures with plenty of registers. Published attempts to make these algorithms applicable to architectures that are irregular in their use of registers have yielded several incompatible extensions that handle only a small subset of the irregularities seen in modern architectures. We propose an approach that is able to extend graphcoloring allocation to a broad class of irregular architectures, and we show that we can do this without destroying the simplicity of the basic algorithm. We have implemented our approach in the Machine-SUIF compiler, and we discuss our initial experiences with it. In particular, we describe how the irregularities of the x86 instruction set architecture are handled in Machine-SUIF's graphcoloring register allocator. 1Introduction In the context of global register allocation, graph coloring is a successful and popular tec...
A combined algorithm for graph-coloring in register allocation
- Proceedings of the Computational Symposium on Graph Coloring and its Generalizations
, 2002
"... Abstract. Our research involves improved algorithms for graph-coloring, in the context of register allocation. We extend the usual algorithm, first proposed by Chaitin, adding two further routines, one a form of semi-randomized greedy allocation of colors, and the second using local search with rand ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Abstract. Our research involves improved algorithms for graph-coloring, in the context of register allocation. We extend the usual algorithm, first proposed by Chaitin, adding two further routines, one a form of semi-randomized greedy allocation of colors, and the second using local search with random restarts, a method developed in the context of logical satisfiability problems. For typical register-set sizes, the extended algorithm can color graphs using significantly less time and spilling significantly smaller numbers of nodes to memory than does the Chaitin algorithm; these advantages become less pronounced, as register-set size decreases, and Chaitin can offer some small measure of better performance for very small sizes. Our algorithm has some interesting additional features. On the one hand, it can be extended to the general graph-coloring problem, outside of the particular application considered. In addition, it makes available adjustable parameters for producing moreor less-optimal executables during compiler runs.
Optimizing Stack Frame Layout for Embedded Systems
, 2000
"... Many modern embedded processors have powerful, yet restricted addressing modes, they might for example allow indirect addressing but with a small oset. The small oset creates a partitioning of the activation frame into two parts, a fast part the can be accessed directly and a slow part that require ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
Many modern embedded processors have powerful, yet restricted addressing modes, they might for example allow indirect addressing but with a small oset. The small oset creates a partitioning of the activation frame into two parts, a fast part the can be accessed directly and a slow part that require arithmetic operations on the stack pointer before the activation frame can be accessed. We propose a greedy algorithm that places frequently accessed variables on the fast activation frame to reduce code size and uses the liveness properties of variables to let variables that not are live at the same time share space on the activation frame and thereby save space. By minimizing the number of variables on the slow part of the activation frame, it is possible to reduce both stack size, code size and execution time. We present experimental results that show that our method uses less stack and generates less code than the algorithm of a leading industrial compiler. Contents 1 Introduction 4 ...
Aliased Register Allocation for Straight-line Programs is NP-complete
"... Abstract. Register allocation is NP-complete in general but can be solved in linear time for straight-line programs where each variable has at most one definition point if the bank of registers is homogeneous. In this paper we study registers which may alias: an aliased register can be used both ind ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract. Register allocation is NP-complete in general but can be solved in linear time for straight-line programs where each variable has at most one definition point if the bank of registers is homogeneous. In this paper we study registers which may alias: an aliased register can be used both independently or in combination with an adjacent register. Such registers are found in commonly-used architectures such as x86, the HP PA-RISC, the Sun SPARC processor, and MIPS floating point. In 2004, Smith, Ramsey, and Holloway presented the best algorithm for aliased register allocation so far; their algorithm is based on a heuristic for coloring of general graphs. Most architectures with register aliasing allow only aligned registers to be combined: for example, the low-address register must have an even number. Open until now is the question of whether working with restricted classes of programs can improve the complexity of aliased register allocation with alignment restrictions. In this paper we show that aliased register allocation with alignment restrictions for straight-line programs is NP-complete. 1
Nearly optimal register allocation with PBQP
- In Proceedings of the 7th Joint Modular Languages Conference (JMLC’06). LNCS
, 2006
"... Abstract. For irregular architectures global register allocation remains a challenging problem, and has received a lot of attention in recent years. The classical graph-colouring analogy used by Chaitin and Briggs is not adequate for irregular architectures featuring non-orthogonal instruction sets ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. For irregular architectures global register allocation remains a challenging problem, and has received a lot of attention in recent years. The classical graph-colouring analogy used by Chaitin and Briggs is not adequate for irregular architectures featuring non-orthogonal instruction sets and irregular register sets. Previous work [1, 2] on register allocation based on partitioned boolean quadratic programming (PBQP) has demonstrated that this approach is effective for highly irregular architectures and small benchmarks. However, experiments have shown that the heuristic used for non-reducible nodes performs poorly for larger benchmarks and more regular architectures. In this paper we present a new heuristic for PBQP, which significantly outperforms the old heuristic, and produces register allocations equal to those of the state-of-the-art graph-colouring approach. We also present a new solver for PBQP which is based on branch-and-bound and is able to solve register allocations optimally. The branch-and-bound solver allows PBQP to be used as a progressive register allocator, where programmers may explicitly trade extra compile time for a better register allocation. Experiments were conducted using the register allocation problems in the SPEC2000 benchmark suite as input, with IA-32 as the target architecture. Using an optimal solver for PBQP we were able to solve 97.4 % of the register allocation problems in SPEC2000 optimally. 1

