Architecture description languages for retargetable compilation
In The Compiler Design Handbook: Optimizations & Machine Code Generation, 2002
Cited by 12 (4 self)
Retargetable compilation has been the subject of some study over the years. This work is motivated by the need to develop a single compiler infrastructure for a range of possible target architectures. Recent technology trends point to the growth of application specific programmable systems. This trend makes it doubly important that we develop efficient retargetable compilation
UFC: a Global Trade-off Strategy for Loop Unrolling for VLIW Architecture
In Proc. CPC, 2003
Cited by 8 (0 self)
To minimize code size overhead on VLIW architectures, compilers for embedded processors have to pay more attention to code optimization than to compilation time. Thus, the first demand on a compiler for embedded processors is to spend instruction memory space on an optimization only if the associated performance improvement justifies it. In this paper, we propose a novel method based on Integer Linear Programming for computing the unrolling factors for sets of loop nests with control over the code size and over the side effects of the transformation. We define the notion of trade-off between code size and performance. Experiments on the Philips TriMedia show that the method achieves excellent trade-offs.
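The selection problem described in this abstract can be illustrated with a small sketch. It is not the paper's Integer Linear Programming formulation: exhaustive enumeration stands in for an ILP solver, and the loop names, candidate unroll factors, and size/cycle estimates are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical per-loop estimates (not taken from the paper): for each
# candidate unroll factor, (code size in instruction words, estimated cycles).
LOOPS = {
    "loop_a": {1: (10, 1000), 2: (18, 700), 4: (34, 600)},
    "loop_b": {1: (20, 500), 2: (38, 400), 4: (74, 380)},
}


def choose_unroll_factors(loops, size_budget):
    """Pick one unroll factor per loop, minimizing total estimated cycles
    subject to a total code-size budget. Ties on cycles are broken in
    favor of the smaller code size. Returns (factors, cycles, size), or
    None if no combination fits the budget."""
    names = sorted(loops)
    best = None
    # Brute force over every combination of candidate factors; an ILP
    # solver would replace this loop for realistic problem sizes.
    for combo in product(*(sorted(loops[n]) for n in names)):
        size = sum(loops[n][f][0] for n, f in zip(names, combo))
        cycles = sum(loops[n][f][1] for n, f in zip(names, combo))
        if size > size_budget:
            continue
        key = (cycles, size)
        if best is None or key < best[0]:
            best = (key, dict(zip(names, combo)))
    if best is None:
        return None
    (cycles, size), factors = best
    return factors, cycles, size


if __name__ == "__main__":
    print(choose_unroll_factors(LOOPS, size_budget=60))
```

With a budget of 60 words, two combinations reach the same estimated cycle count, and the sketch prefers the one that spends less instruction memory, which mirrors the idea of spending code size only when performance justifies it.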
Reducing code size in VLIW instruction scheduling
Journal of Embedded Computing
Cited by 1 (0 self)
Code size is an important concern in embedded systems. VLIW architectures are popular for embedded systems, but often increase code size, by requiring NOPs to be inserted into the code to satisfy instruction placement constraints. Existing VLIW instruction schedulers target run-time but not code size. Indeed, current schedulers often increase code size, by generating compensation copies of instructions when moving them across basic block boundaries. Our approach, for the first time, uses the power of scheduling instructions across blocks to reduce code size and not just run-time, for a certain class of VLIWs. We therefore show that trace scheduling, previously synonymous with increased code size, can in fact be used to reduce code size on such VLIWs. Our scheduler uses a cost-model driven, backtracking approach that starts with an optimal algorithm for searching the solution space in exponential time, but then also employs branch-and-bound techniques and nonoptimal heuristics to keep the compile time reasonable (within a factor of 2). Our method reduces the code size for our benchmarks by 16% versus the best existing across-block scheduler, while being within 0.8% of its run-time.
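The link between instruction placement and NOP overhead can be illustrated with a minimal sketch. It is not the paper's scheduler: it assumes a hypothetical fixed-width, uncompressed VLIW encoding, unit-latency operations, and a simple greedy packer inside a single basic block. The function names and the example operations are invented for illustration.

```python
def pack_bundles(ops, deps, width):
    """Greedily pack operations into fixed-width VLIW bundles.

    ops   -- operation names in program order (assumed to be a valid
             topological order of the dependence graph)
    deps  -- maps an operation to the set of operations it depends on
    width -- number of issue slots per bundle

    Each operation goes into the earliest bundle that comes after all
    of its producers (unit latency assumed) and still has a free slot.
    """
    bundle_of = {}
    bundles = []
    for op in ops:
        earliest = 0
        for producer in deps.get(op, ()):
            earliest = max(earliest, bundle_of[producer] + 1)
        i = earliest
        while i < len(bundles) and len(bundles[i]) >= width:
            i += 1
        while len(bundles) <= i:
            bundles.append([])
        bundles[i].append(op)
        bundle_of[op] = i
    return bundles


def count_nops(bundles, width):
    """Count empty slots, each of which costs one NOP in this encoding."""
    return sum(width - len(b) for b in bundles)


if __name__ == "__main__":
    ops = ["a", "b", "c", "d"]
    deps = {"b": {"a"}, "c": {"b"}}
    bundles = pack_bundles(ops, deps, width=2)
    print(bundles, count_nops(bundles, width=2))
```

In this example the dependence chain a, b, c forces three bundles, and only the independent operation d can fill an otherwise empty slot, so two slots remain as NOPs. A scheduler aiming at code size would look for more such fillers, for instance from neighboring blocks.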
Spacewalker: Automated Design Space Exploration for
Compiler and Architecture Research Program, Hewlett Packard Laboratories, 2001
This paper addresses the problem of automated design of a computer system for an embedded application. The computer system to be designed consists of a VLIW processor and/or a customized systolic array, along with a cache subsystem comprising a data cache, instruction cache and second-level unified cache. Several algorithms for "walking" the design space are described, and experimental results of custom-designed systems for two applications are presented.

