Results 1 - 6 of 6
Dynamically Managing Processor Temperature and Power
 In 2nd Workshop on Feedback-Directed Optimization, 1999
Cited by 60 (0 self)
Hardware designers are facing the following dilemma: they must ensure that the processor temperature will never exceed a safe maximum, but they also know that this maximum is reached only under unrealistic benchmarks. In other words, the processor could be more efficient for an average workload. Maintaining a safe temperature bound is made difficult because it depends on system statistics as well as external parameters such as the room temperature. We present ...
Energy Estimation and Optimization of Embedded VLIW Processors based on Instruction Clustering
 In Proc. of DAC
Cited by 11 (0 self)
The aim of this paper is to propose a methodology for the definition of an instruction-level energy estimation framework for VLIW (Very Long Instruction Word) processors. The power modeling methodology is the key issue to define an effective energy-aware software optimisation strategy for state-of-the-art ILP (Instruction Level Parallelism) processors. The methodology is based on an energy model for VLIW processors that exploits instruction clustering to achieve an efficient and fine-grained energy estimation. The approach aims at reducing the complexity of the characterization problem for VLIW processors from exponential, with respect to the number of parallel operations in the same very long instruction, to quadratic, with respect to the number of instruction clusters. Furthermore, the paper proposes a spatial scheduling algorithm based on a low-power reordering of the parallel operations within the same long instruction. Experiments have been carried out on the Lx processor, a 4-issue VLIW core jointly designed by HP Labs and STMicroelectronics. The results have shown an average error of 1.9% between the cluster-based estimation model and the reference design, with a standard deviation of 5.8%. For the Lx architecture, the spatial instruction scheduling algorithm provides an average energy saving of 12%.
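The exponential-versus-quadratic claim in this abstract can be made concrete with a small counting sketch. It is only one possible reading, not the paper's model: it assumes that exhaustive characterization needs one measurement per ordered combination of operations in a long instruction, and that the clustered model needs one per ordered pair of clusters. The function names and the figures (50 operation types, 8 clusters) are hypothetical; only the 4-issue width comes from the abstract.

```python
def exhaustive_table_size(num_operation_types: int, issue_width: int) -> int:
    """Entries needed if every combination of parallel operations is measured.

    Assumption: one entry per ordered tuple of operations, so the count
    grows exponentially with the issue width.
    """
    return num_operation_types ** issue_width


def pairwise_cluster_table_size(num_clusters: int) -> int:
    """Entries needed if only ordered pairs of clusters are measured.

    Assumption: operations with similar energy behaviour share a cluster,
    so the count grows quadratically with the number of clusters.
    """
    return num_clusters ** 2


if __name__ == "__main__":
    # Hypothetical figures: 50 operation types, 8 clusters, 4-issue core.
    print(exhaustive_table_size(50, 4))    # 6250000
    print(pairwise_cluster_table_size(8))  # 64
```

Under these assumed figures the characterization effort drops from 6,250,000 entries to 64, which illustrates why clustering makes the measurement campaign tractable.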
Transforming and Parallelizing ANSI C Programs Using Pattern Recognition
 In Proceedings of HPCN Europe’99, 1999
Cited by 8 (2 self)
Code transformations are a very effective method of parallelizing and improving the efficiency of programs. Unfortunately most compiler systems require implementing separate (sub)programs for each transformation. This paper describes a different approach. We designed and implemented a fully programmable transformation engine. It can be programmed by means of a transformation language. This language was especially designed to be easy to use and flexible enough to express most of the common and more advanced transformations.
Data-Reuse Exploration for Low-Power Realization of Multimedia Applications on Embedded Cores
 Proc. of 9th Int. Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS’99), 1999
Cited by 8 (4 self)
Exploitation of data reuse, in combination with a custom memory hierarchy that exploits the temporal locality of data accesses, may introduce significant power savings, especially for data-intensive applications. In this paper the effect of data-reuse decisions on the power dissipation, and also on the area and performance, of multimedia applications realized on embedded cores is explored. Experimental results show that power savings of about 35% can be achieved through the exploitation of data reuse without introducing performance penalties in comparison to reference designs.
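As a rough illustration of what exploiting data reuse means here, the sketch below counts reads for a sliding-window sum computed two ways. It is not the transformation used in the paper: the window kernel, the function names, and the use of a `deque` to stand in for a small on-chip buffer are all assumptions made for the example.

```python
from collections import deque


def windowed_sums_naive(data, width):
    """Each output re-reads its whole window from the large memory."""
    large_reads = 0
    sums = []
    for start in range(len(data) - width + 1):
        total = 0
        for offset in range(width):
            total += data[start + offset]
            large_reads += 1
        sums.append(total)
    return sums, large_reads


def windowed_sums_with_reuse(data, width):
    """Each sample is read from the large memory once, then reused from a small buffer."""
    buffer = deque(maxlen=width)  # stands in for a small on-chip memory
    large_reads = 0
    small_reads = 0
    sums = []
    for sample in data:
        buffer.append(sample)     # one read from the large memory per sample
        large_reads += 1
        if len(buffer) == width:
            sums.append(sum(buffer))
            small_reads += width  # window is served from the small buffer
    return sums, large_reads, small_reads
```

Both versions produce the same sums, but the second shifts most reads to the small buffer. Whether that saves power in practice depends on the relative access costs of the two memories, which this sketch does not model.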
Power Exploration Of Multimedia Applications Realized On
Low-power realization of video applications on embedded cores is described. Code transformations are applied to reduce the data memory power consumption. The transformed code indicates a power-efficient data memory architecture, while the transformations move the main part of the memory accesses from larger memories (possibly off-chip) to smaller ones (on-chip). The effect of the transformations on performance, which is usually the overriding issue in such systems, is evaluated. It is shown that performance is closely related to program memory power consumption, which is in some cases orders of magnitude larger than data memory power consumption. The aim of the proposed research is the development of a methodology for the application of data storage and transfer optimizing transformations that achieve a close-to-optimal balance between power and performance in realizations of multimedia applications on embedded cores.
54.2 Energy Estimation and Optimization of Embedded VLIW Processors based on Instruction Clustering
The aim of this paper is to propose a methodology for the definition of an instruction-level energy estimation framework for VLIW (Very Long Instruction Word) processors. The power modeling methodology is the key issue to define an effective energy-aware software optimisation strategy for state-of-the-art ILP (Instruction Level Parallelism) processors. The methodology is based on an energy model for VLIW processors that exploits instruction clustering to achieve an efficient and fine-grained energy estimation. The approach aims at reducing the complexity of the characterization problem for VLIW processors from exponential, with respect to the number of parallel operations in the same very long instruction, to quadratic, with respect to the number of instruction clusters. Furthermore, the paper proposes a spatial scheduling algorithm based on a low-power reordering of the parallel operations within the same long instruction. Experiments have been carried out on the Lx processor, a 4-issue VLIW core jointly designed by HP Labs and STMicroelectronics. The results have shown an average error of 1.9% between the cluster-based estimation model and the reference design, with a standard deviation of 5.8%. For the Lx architecture, the spatial instruction scheduling algorithm provides an average energy saving of 12%.