Instruction-Level Parallel Processing: History, Overview and Perspective, 1992
Cited by 166 (0 self)
Instruction-level Parallelism (ILP) is a family of processor and compiler design techniques that speed up execution by causing individual machine operations to execute in parallel. Although ILP has appeared in the highest performance uniprocessors for the past 30 years, the 1980s saw it become a much more significant force in computer design. Several systems were built, and sold commercially, which pushed ILP far beyond where it had been before, both in terms of the amount of ILP offered and in the central role ILP played in the design of the system. By the end of the decade, advanced microprocessor design at all major CPU manufacturers had incorporated ILP, and new techniques for ILP have become a popular topic at academic conferences. This article provides an overview and historical perspective of the field of ILP and its development over the past three decades.
Local microcode compaction techniques - ACM Computing Surveys, 1980
Cited by 45 (0 self)
Microcode compaction is an essential tool for the compilation of high-level language microprograms into microinstructions with parallel microoperations. Although guaranteeing minimum execution time is an exponentially complex problem, recent research indicates that it is not difficult to obtain practical results. This paper, which
Software Pipelining, 1995
Cited by 35 (1 self)
Utilizing parallelism at the instruction level is an important way to improve performance. Since the time spent in loop execution dominates total execution time, a large body of optimizations focus on decreasing the time to execute each iteration. Software pipelining is a technique that reforms the loop so that a faster execution rate is realized. Iterations are executed in overlapped fashion to increase parallelism. Let $\{ABC\}^n$ represent a loop containing operations A, B, C that is executed n times. Although the operations of a single iteration can be parallelized, more parallelism may be achievable if the entire loop is considered rather than a single iteration. The software pipelining transformation utilizes the fact that a loop $\{ABC\}^n$ is equivalent to $A\{BCA\}^{n-1}BC$. Although the operations contained in the loop do not change, the operations are from different iterations of the original loop. Various algorithms for software pipelining exist. A comparison of ...
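The loop identity in this abstract can be checked with a small sketch. It assumes each operation is modelled as a (name, iteration) pair. The function names are hypothetical and do not come from the paper. The loop body A, B, C repeated n times is compared against the rewritten form: a prologue A, a kernel B, C, A repeated n-1 times, and an epilogue B, C.

```python
def original_loop(n):
    """Trace of the loop body A, B, C repeated n times, as (operation, iteration) pairs."""
    trace = []
    for i in range(n):
        trace += [("A", i), ("B", i), ("C", i)]
    return trace


def pipelined_loop(n):
    """Trace of the rewritten loop: A, then (B, C, A) repeated n-1 times, then B, C.

    Assumption: n >= 1, since the rewritten form always executes A at least once.
    """
    if n < 1:
        raise ValueError("the rewritten form assumes at least one iteration")
    trace = [("A", 0)]                                   # prologue
    for i in range(n - 1):
        # New loop body: B and C belong to iteration i, A to iteration i + 1.
        trace += [("B", i), ("C", i), ("A", i + 1)]
    trace += [("B", n - 1), ("C", n - 1)]                # epilogue
    return trace


if __name__ == "__main__":
    for n in range(1, 6):
        assert original_loop(n) == pipelined_loop(n)
    print("both forms execute the same operations in the same order")
```

The sketch only confirms that the two forms perform the same operations in the same order. The potential speedup comes from the new loop body mixing operations of iterations i and i + 1, which a scheduler may be able to overlap when dependences allow; that scheduling step is not modelled here.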
Machine-description driven compilers for EPIC and VLIW processors - Design Automation for Embedded Systems, 1999
Cited by 18 (9 self)
Keywords: retargetable compilers, table-driven compilers, machine description, processor description, instruction-level parallelism, EPIC processors, VLIW processors, EPIC compilers, VLIW compilers, code generation, scheduling, register allocation
In the past, due to the restricted gate count available on an inexpensive chip, embedded DSPs have had limited parallelism, few registers and irregular, incomplete interconnectivity. More recently, with increasing levels of integration, embedded VLIW processors have started to appear. Such processors typically have higher levels of instruction-level parallelism, more registers, and a relatively regular interconnect between the registers and the functional units. The central challenges faced by a code generator for an EPIC (Explicitly Parallel Instruction Computing) or VLIW processor are quite different from those for the earlier DSPs and, consequently, so is the structure of a code generator that is designed to be easily retargetable. In this report, we explain the nature of the challenges faced by an EPIC or VLIW compiler and present a strategy for performing code generation in an incremental fashion that is best suited to generating high-quality code efficiently. We also describe the Operation Binding Lattice, a formal model for incrementally binding the opcodes and register assignments in an EPIC code generator. As we show, this reflects the phase structure of the EPIC code generator. It also defines the structure of the machine-description database, which is queried by the code generator for the information that it needs about the target processor. Lastly, we discuss general features of our implementation of these ideas and techniques in Elcor, our EPIC compiler research infrastructure.

