## Optimizing Power Using Transformations (1995)

### Other Repositories/Bibliography

Venue: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Citations: 189 (14 self)

### BibTeX

@ARTICLE{Chandrakasan95optimizingpower,
    author = {Anantha P. Chandrakasan and Miodrag Potkonjak and Renu Mehra and Jan Rabaey and Robert W. Brodersen},
    title = {Optimizing Power Using Transformations},
    journal = {IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
    year = {1995},
    volume = {14},
    pages = {12--31}
}

### Abstract

The increasing demand for portable computing has made power consumption one of the most critical design parameters. A high-level synthesis system, HYPER-LP, is presented for minimizing power consumption in application-specific, datapath-intensive CMOS circuits using a variety of architectural and computational transformations. The synthesis environment consists of high-level estimation of power consumption, a library of transformation primitives, and heuristic/probabilistic optimization search mechanisms for fast and efficient scanning of the design space. Examples with varying degrees of computational complexity and structure are optimized and synthesized using the HYPER-LP system. The results indicate that more than an order of magnitude reduction in power can be achieved over current-day design methodologies while maintaining the system throughput; in some cases this can be accomplished while preserving or reducing the implementation area.

### Citations

4457 | Classification and Regression Trees
- Breiman, Friedman, et al.
- 1984
Citation Context ...he implementation area predicted by HYPER. It is widely recognized that the quality of the prediction model is inversely proportional to number of parameters used during the prediction model building =-=[32]-=-. The number of parameters is equal to the sum of the number of input variables in the model, and amount of data needed to describe the model. We built the interconnect capacitance model using only on... |

472 | Low-power CMOS digital design
- Chandrakasan, Sheng, et al.
- 1992
Citation Context ...verified in Figure 1a, which is an experimentally derived plot of the normalized energy vs. $V_{dd}$. This dependence on supply voltage has been verified for a number of logic functions and logic styles =-=[2]-=-. The average capacitance switched, $C_{avg} = \sum p_t C_L$, for a uniformly distributed set of input values has been characterized for each logic and memory element in the cell library. Similarly, the del... |

384 | What every computer scientist should know about floating-point arithmetic
- Goldberg
- 1991
Citation Context ...ed wordlength) varies a lot. While some transformations, for example retiming, pipelining and commutativity, do not affect wordlength, associativity and distributivity often have a dramatic influence =-=[26]-=-. In some cases, it is possible to reduce both the number of power expensive operations and the required wordlength. In other cases, however, a reduction in wordlength comes at the expense of an incre... |

348 | Retiming synchronous circuitry
- Leiserson, Saxe
- 1991
Citation Context ...onal complexity of retiming for power optimization is particularly interesting and somewhat surprising, since retiming for critical path reduction has several algorithms of polynomial-time complexity =-=[34]-=-. The computational complexity of the power minimization problem implies that it is very unlikely, even when the set of applied transformations is restricted, that polynomial time optimal algorithm ca... |

139 | Principles of Compiler Design
- Aho, Ullman
- 1977
Citation Context ...n than other operations. A prime example of transformation which explores this trade-off is strength reduction, often used in software compilers, in which multiplications are substituted by additions =-=[24]-=-. Although this situation is not as common as the ones presented in the previous section, sometimes it is possible to achieve significant savings using this type of tradeoff. Unfortunately, this type ... |
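The strength-reduction idea mentioned in this excerpt can be sketched in a few lines. This is a minimal illustration in Python, not code from the paper or from HYPER-LP; the function names and the loop being transformed are hypothetical. It shows a multiplication by a loop index replaced with a running addition.

```python
def scaled_indices_mul(c, n):
    """Compute [0*c, 1*c, ..., (n-1)*c] with one multiplication per element."""
    return [i * c for i in range(n)]


def scaled_indices_add(c, n):
    """Same sequence after strength reduction: each multiplication is
    replaced by an addition onto a running accumulator."""
    out = []
    acc = 0
    for _ in range(n):
        out.append(acc)
        acc += c  # addition stands in for the multiplication i * c
    return out


if __name__ == "__main__":
    print(scaled_indices_mul(7, 5))
    print(scaled_indices_add(7, 5))
```

Both functions produce the same values; the second trades every multiplication for an addition, which is the kind of trade-off the excerpt describes.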

110 | Fast prototyping of datapath–intensive architectures
- Rabaey, Chu, et al.
- 1991
Citation Context ...porated comprehensive sets of transformations, coupled with powerful optimization strategies. Example systems with elaborate applications of transformations are Flamel [8], SAW [9], SPAID [10], HYPER =-=[11]-=-, and CATHEDRAL [12]. Among the set of transformations used by the Flamel design system are loop transformations, height reduction and constant propagation. SAW uses among other transformations in-lin... |

107 | Transition Density, a Stochastic Measure of Activity
- Najm
Citation Context ...ate estimates, while being the most computationally intensive. Gate-level probabilistic approaches estimate the internal node activities of a network given the distribution of the input signals [13], =-=[14]-=-, [15]. Once the signal probability for each node in the network is determined, the total average capacitance switched is then estimated as $\sum_i p_{i0}(1 - p_{i0})\,C_i$, where $p_{i0}$ is the probability that n... |
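The gate-level estimate quoted in this excerpt is a weighted sum over nodes. A minimal sketch follows, assuming `p_zero[i]` is a per-node signal probability and `cap[i]` the node capacitance; the function name and the sample values are hypothetical, and the excerpt is truncated before it finishes defining the probability term.

```python
def switched_capacitance(p_zero, cap):
    """Estimate average switched capacitance as sum_i p_i0 * (1 - p_i0) * C_i.

    p_zero: per-node signal probabilities (assumed to lie in [0, 1]).
    cap:    per-node capacitances, in any consistent unit.
    """
    if len(p_zero) != len(cap):
        raise ValueError("p_zero and cap must have the same length")
    return sum(p * (1.0 - p) * c for p, c in zip(p_zero, cap))


if __name__ == "__main__":
    # Hypothetical two-node network.
    print(switched_capacitance([0.5, 0.25], [2.0, 4.0]))
```

The term p(1 - p) peaks at p = 0.5, so under this model nodes whose value is least predictable contribute the most switched capacitance.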

81 | Estimation of Average Switching Activity
- Ghosh, Devadas, et al.
Citation Context ...timates, while being the most computationally intensive. Gate-level probabilistic approaches estimate the internal node activities of a network given the distribution of the input signals [13], [14], =-=[15]-=-. Once the signal probability for each node in the network is determined, the total average capacitance switched is then estimated as $\sum_i p_{i0}(1 - p_{i0})\,C_i$, where $p_{i0}$ is the probability that node i ... |

74 | Fast algorithm for the discrete cosine transform
- Feig, Winograd
- 1992
Citation Context ...consider a bigger and more widely used example, the DCT (Discrete Cosine Transform), to illustrate this point further. We will compare two implementation of the DCT: one proposed by Feig and Winograd =-=[20]-=- and the other, the direct maximally fast form. Feig's DCT algorithm can be derived from the direct form using the exceptionally sophisticated application of common subexpression elimination/replicati... |

62 | Algorithm transformation techniques for concurrent processors
- Parhi
- 1989
Citation Context ...mations. To illustrate the application of speed-up transformations to lower power, consider a first order IIR filter, as shown in Figure 2a, with a critical path of 2. Due to the recursive bottleneck =-=[19]-=- imposed by the filter structure, it is impossible to reduce the critical path using retiming or pipelining. Also, the simple structure does not provide opportunities for the application of algebraic ... |

60 | IRSIM: An Incremental MOS Switch-Level Simulator
- Salz, Horowitz
- 1989
Citation Context ... where $N_i$ is the total number of power consuming transitions for node i, N is the number of simulation cycles, and $C_i$ is the physical capacitance of node i. This approach (using the IRSIM simulator =-=[17]-=-) was used to estimate power at the layout level. The results from a few fabricated chips indicate that the predicted power from IRSIM is within 30% of the measured power. At the lowest level, power ... |

58 | Optimizing resource utilization using transformations
- Potkonjak, Rabaey
- 1991
Citation Context ...put at the expense of additional number of operations [21]. Similarly, it has been demonstrated that retiming for throughput often results in designs with exceptionally high interconnect requirements =-=[22]-=-. 4.2 Operation Reduction The most obvious approach to reduce the switched capacitance, is to reduce the number of operations (and hence the number of switching events) in the data control flow graph.... |

53 | Behavioral transformation for algorithmic level IC design
- Walker, Thomas
- 1989
Citation Context ...esis systems have incorporated comprehensive sets of transformations, coupled with powerful optimization strategies. Example systems with elaborate applications of transformations are Flamel [8], SAW =-=[9]-=-, SPAID [10], HYPER [11], and CATHEDRAL [12]. Among the set of transformations used by the Flamel design system are loop transformations, height reduction and constant propagation. SAW uses among othe... |

51 | Power estimation for high level synthesis
- Landman, Rabaey
- 1993
Citation Context ...e estimated efficiently from a high level of abstraction. Recently, techniques to estimate the power consumption at the architecture level (after the flowgraph has been scheduled) have been developed =-=[18]-=-. In this work, power is estimated from an algorithmic level so the design space can be quickly explored. 4.0 Using Transformations to Optimize Power Transformations are changes to the computational s... |

46 | An algorithm for the evaluation of finite trigonometric series
- Goertzel
- 1958
Citation Context ...y common in digital signal processing, and Horner's scheme (the final structure in our examples) is often suggested in filter design and FFT calculations when very few frequency components are needed =-=[23]-=-. First we will analyze the second order polynomial $X^2 + AX + B$. The left side of Figure 4a shows the straightforward implementation which requires two multiplications and two additions and has a cri... |
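The operation counts in this excerpt can be checked with a small sketch. The functions below are illustrative only (hypothetical names, Python rather than a hardware flowgraph): the direct form of X^2 + AX + B uses two multiplications and two additions, while Horner's scheme rewrites it as (X + A)X + B with a single multiplication.

```python
def poly_direct(x, a, b):
    """Direct form: x*x + a*x + b (two multiplications, two additions)."""
    return x * x + a * x + b


def poly_horner(x, a, b):
    """Horner's scheme: (x + a)*x + b (one multiplication, two additions)."""
    return (x + a) * x + b


if __name__ == "__main__":
    for x in range(-3, 4):
        # Both forms evaluate the same polynomial.
        print(x, poly_direct(x, 5, 7), poly_horner(x, 5, 7))
```

The two forms are algebraically identical; which one is preferable for power depends on the critical path and switched capacitance of the resulting implementation, which this sketch does not model.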

42 | Estimating Dynamic Power Consumption of CMOS Circuits
- Cirit
- 1987
Citation Context ... accurate estimates, while being the most computationally intensive. Gate-level probabilistic approaches estimate the internal node activities of a network given the distribution of the input signals =-=[13]-=-, [14], [15]. Once the signal probability for each node in the network is determined, the total average capacitance switched is then estimated as $\sum_i p_{i0}(1 - p_{i0})\,C_i$, where $p_{i0}$ is the probability ... |

38 | Flamel: A high-level hardware compiler
- Trickey
- 1987
Citation Context ...vel synthesis systems have incorporated comprehensive sets of transformations, coupled with powerful optimization strategies. Example systems with elaborate applications of transformations are Flamel =-=[8]-=-, SAW [9], SPAID [10], HYPER [11], and CATHEDRAL [12]. Among the set of transformations used by the Flamel design system are loop transformations, height reduction and constant propagation. SAW uses a... |

35 | Techniques for Area Estimation of VLSI Layouts
- Kurdahi
- 1989
Citation Context ... yield, floorplanning and synthesis considerations for throughput and area optimization, several elaborate prediction models for total chip and interconnect area have been built and successfully used =-=[30]-=-, [31]. However, high-level synthesis adds additional requirements on the prediction tools next to accuracy; during the optimization process in high level synthesis, it is necessary to estimate the fi... |

35 | Efficient simulated annealing on fractal energy landscapes. Algorithmica 6, 367–418.
- Sorkin
- 1991
Citation Context ...ype of problems they are best suited for (for example a deep relationship between simulated annealing and solution space with fractal topology has been verified both experimentally and theoretically =-=[35]-=-), algorithm selection for the task at hand is still mainly an experimental and intuitive art. In order to satisfy all major considerations for the power minimization problem, we decided to use a comb... |

33 | Maximally Fast and Arbitrarily Fast Implementation of Linear Computations
- Potkonjak, Rabaey
- 1992
Citation Context ...nd algebraic rules. The direct form can be derived from the Feig’s DCT, by the simple application of the transformation set using the procedure for maximally fast implementation of linear computation =-=[21]-=-. While Feig’s DCT has a critical path of 11 cycles, the maximally fast DCT has a critical path of only 7 cycles. Therefore the Vdd can be reduced from 5 V to 3.25 V. While the reduction of the supply... |
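As a rough check on what the quoted supply reduction is worth, assume dynamic power scales as $C_{avg} V_{dd}^2 f$ and hold the switched capacitance and throughput fixed. That second assumption is ours: the excerpt is truncated before it discusses how capacitance changes between the two DCT forms, so the figure below isolates the voltage term only.

```latex
\frac{P(3.25\,\mathrm{V})}{P(5\,\mathrm{V})}
  = \left(\frac{3.25}{5}\right)^{2}
  = 0.65^{2}
  \approx 0.42
```

Under those assumptions the voltage term alone would cut power to roughly 42% of its 5 V value; any increase in the number of operations for the maximally fast form would offset part of that saving.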

25 | Architectural synthesis for DSP silicon compiler
- Haroun, Elmasry
- 1986
Citation Context ...s have incorporated comprehensive sets of transformations, coupled with powerful optimization strategies. Example systems with elaborate applications of transformations are Flamel [8], SAW [9], SPAID =-=[10]-=-, HYPER [11], and CATHEDRAL [12]. Among the set of transformations used by the Flamel design system are loop transformations, height reduction and constant propagation. SAW uses among other transforma... |

15 | An Integrated CAD System for Algorithm-Specific IC Design
- Shung
- 1991
Citation Context ...r of 1.5 larger than the tree implementation for a four input addition and 2.5 larger for an eight input addition. The above simulations were done on layouts generated by the LagerIV silicon compiler =-=[25]-=- using the IRSIM [17] switch-level simulator over 1000 uncorrelated random input patterns. The results presented above indicate that increasing the logic depth (through more cascading) will increase t... |

9 | Power-supply voltage impact on circuit performance for half and lower submicrometer CMOS
- Kakumu, Kinugawa
- 1990 |

7 | Transforming Linear Systems for Joint Latency and Throughput Optimization
- Srivastava, Potkonjak
- 1994
Citation Context ...arallel filter is reduced by a factor of four for a given throughput. Note that the direct form can be transformed to the parallel form (or vice-versa) using a specific ordered set of transformations =-=[27]-=-. The results from section 4 can be summarized in the following requirements for the application of transformations for power reduction: ■ Efficient implementation of known and new transformations so ... |

6 | Numerical Recipes in C
- Press
- 1992
Citation Context ... relationship using an accurate and computationally efficient procedure. This relationship (of delay-$V_{dd}$) was modeled using Neville's algorithm for rational function interpolation and extrapolation =-=[33]-=-. Neville's algorithm provides an indirect way for constructing a polynomial of degree N-1 so that all of the used points are exactly matched. Figure 15 also shows the accuracy of the interpolated dat... |
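To make the interpolation step concrete, here is a minimal sketch of Neville's algorithm in its polynomial form: given N sample points it evaluates the unique degree N-1 polynomial through all of them. The excerpt mentions a rational-function variant, which this sketch does not implement, and the delay-versus-supply samples in the usage section are invented for illustration rather than taken from the paper.

```python
def neville(xs, ys, x):
    """Evaluate at x the polynomial of degree len(xs)-1 through (xs[i], ys[i]).

    Uses Neville's recurrence, combining lower-order interpolants in place:
    P[i..i+k](x) = ((x - x[i+k]) * P[i..i+k-1] + (x[i] - x) * P[i+1..i+k])
                   / (x[i] - x[i+k])
    The xs must be pairwise distinct.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    p = [float(y) for y in ys]
    n = len(xs)
    for k in range(1, n):
        for i in range(n - k):
            p[i] = ((x - xs[i + k]) * p[i] + (xs[i] - x) * p[i + 1]) / (xs[i] - xs[i + k])
    return p[0]


if __name__ == "__main__":
    # Hypothetical normalized delay samples at a few supply voltages.
    vdd = [1.5, 2.0, 3.0, 5.0]
    delay = [4.0, 2.5, 1.5, 1.0]
    print(neville(vdd, delay, 2.5))
```

Because the interpolant passes exactly through every sample, evaluating it at one of the sample voltages returns the corresponding sample delay (up to floating-point rounding).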

6 | Transforming linear systems for joint latency and throughput optimization
- Srivastava, Potkonjak
- 1994
Citation Context ...arallel filter is reduced by a factor of four for a given throughput. Note that the direct form can be transformed to the parallel form (or vice-versa) using a specific ordered set of transformations =-=[27]-=-. The results from section 4 can be summarized in the following requirements for the application of transformations for power reduction: ■ Efficient implementation of known and new transformations so ... |

4 | Designing High Performance Systems to Run from 3.3V or Lower Sources
- Dahle
- 1991
Citation Context ... to some extent for a velocity-saturated device with very little penalty in speed performance. This was found to achieve a 60% reduction in power for a 3.3V system when compared to a 5 volt operation =-=[4]-=-. [2] presents an architecture based voltage scaling strategy that results in an optimal voltage for power that is much lower (in the 1-1.5V range) than obtained from the technology based scaling. The... |

4 | Maximally Fast and Arbitrarily Fast Implementation of Linear Computations
- Potkonjak, Rabaey
- 1992
Citation Context ...nd algebraic rules. The direct form can be derived from the Feig's DCT, by the simple application of the transformation set using the procedure for maximally fast implementation of linear computation =-=[21]-=-. While Feig's DCT has a critical path of 11 cycles, the maximally fast DCT has a critical path of only 7 cycles. Therefore the Vdd can be reduced from 5 V to 3.25 V. While the reduction of the supply... |

3 | On Average Power Dissipation and Random
- Shen, Ghosh, et al.
- 1992
Citation Context ...iated parasitic capacitances [2]. To minimize the total switched capacitance in random logic modules, several logic synthesis optimization techniques have been proposed to lower the power dissipation =-=[7]-=-. 3.2 Transformations in High-level Synthesis Over the last few years, several high-level synthesis systems have incorporated comprehensive sets of transformations, coupled with powerful optimization ... |

3 | The Influence of Hardware Mapping on High-Level Synthesis
- Schultz
- 1992
Citation Context ...tion is built on top of an existing estimation routine in HYPER that determines bounds and activity of various execution, register and interconnect components as well as the implementation area [28], =-=[29]-=-. The details of the capacitance estimation routines are described below. 5.1.1 Execution Units The capacitance switched by the execution units is estimated by multiplying (over all types of operation... |

3 | Prediction of wiring space requirements for LSI
- Heller, et al.
- 1977
Citation Context ..., floorplanning and synthesis considerations for throughput and area optimization, several elaborate prediction models for total chip and interconnect area have been built and successfully used [30], =-=[31]-=-. However, high-level synthesis adds additional requirements on the prediction tools next to accuracy; during the optimization process in high level synthesis, it is necessary to estimate the final co... |

2 | A 3.8ns CMOS 16x16 Multiplier Using Complementary Pass Transistor Logic
- Yano
- 1990
Citation Context ...roach and topology for implementing various logic and arithmetic functions. A pass-transistor logic family was found to minimize physical capacitance when compared to a conventional CMOS logic family =-=[5]-=-. At another level, there are various topological choices for implementing a given function. For example, an adder can be implemented using ripple-carry or carry-lookahead approaches. The power trade-o... |

2 | Modeling Multidimensional Data and Control Flow
- Franssen, Swaaij, et al.
- 1993
Citation Context ...e sets of transformations, coupled with powerful optimization strategies. Example systems with elaborate applications of transformations are Flamel [8], SAW [9], SPAID [10], HYPER [11], and CATHEDRAL =-=[12]-=-. Among the set of transformations used by the Flamel design system are loop transformations, height reduction and constant propagation. SAW uses among other transformations in-line expansion, dead co... |

2 | Estimating Implementation Bounds for Real Time Application Specific Circuits
- Rabaey, Potkonjak
- 1994
Citation Context ...estimation is built on top of an existing estimation routine in HYPER that determines bounds and activity of various execution, register and interconnect components as well as the implementation area =-=[28]-=-, [29]. The details of the capacitance estimation routines are described below. 5.1.1 Execution Units The capacitance switched by the execution units is estimated by multiplying (over all types of ope... |

2 | A 3.8ns CMOS 16x16 Multiplier Using Complementary Pass Transistor Logic
- Yano, et al.
- 1990
Citation Context ...roach and topology for implementing various logic and arithmetic functions. A pass-transistor logic family was found to minimize physical capacitance when compared to a conventional CMOS logic family =-=[5]-=-. At another level, there are various topological choices for implementing a given function. For example, an adder can be implemented using ripple-carry or carry-lookahead approaches. The power trade-o... |

2 | Prediction of wiring space requirements for LSI
- Heller, et al.
- 1977
Citation Context ..., floorplanning and synthesis considerations for throughput and area optimization, several elaborate prediction models for total chip and interconnect area have been built and successfully used [30], =-=[31]-=-. However, high-level synthesis adds additional requirements on the prediction tools next to accuracy; during the optimization process in high level synthesis, it is necessary to estimate the final co... |

1 | Optimizing Arithmetic Elements for
- Callaway, Swartzlander
- 1992
Citation Context ...ting a given function. For example, an adder can be implemented using ripple-carry or carry-lookahead approaches. The power trade-offs between various types of adders and multipliers were investigated in =-=[6]-=-, which concluded that a carry-lookahead topology was the "best" after taking into account the speed-capacitance trade-off. Optimizing transistor sizing is yet another degree o... |

1 | Calculation of Total Dynamic Current of VLSI Using a Switch Level Timing Simulator (RSIM-FX)
- Kimura, Tsujimoto
- 1991
Citation Context ...ce associated with node i. The total power is then estimated as $C_{avg} \cdot V_{dd}^2 \cdot f_{clk}$. An approach for estimating the power consumption in CMOS circuits using a switch-level simulator is presented in =-=[16]-=-. The basic idea is to monitor the number of times each node in the circuit transitions during the simulation period. $C_{avg}$ is given by $\sum_i (N_i / N)\,C_i$, where $N_i$ is the total number of power consuming... |
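The switch-level estimate in this excerpt combines per-node transition counts with node capacitances, and the result feeds the power expression quoted at the start of the excerpt. A minimal sketch follows; the function names and sample numbers are hypothetical, and the code does not reproduce any simulator's interface.

```python
def switch_level_c_avg(transitions, cap, n_cycles):
    """Average switched capacitance per cycle: sum_i (N_i / N) * C_i.

    transitions: power-consuming transition counts N_i, one per node.
    cap:         physical node capacitances C_i, in farads.
    n_cycles:    number of simulated cycles N (must be positive).
    """
    if len(transitions) != len(cap):
        raise ValueError("transitions and cap must have the same length")
    if n_cycles <= 0:
        raise ValueError("n_cycles must be positive")
    return sum((n_i / n_cycles) * c_i for n_i, c_i in zip(transitions, cap))


def average_power(c_avg, vdd, f_clk):
    """Total power estimate C_avg * Vdd^2 * f_clk, in watts for SI inputs."""
    return c_avg * vdd ** 2 * f_clk


if __name__ == "__main__":
    # Hypothetical two-node circuit simulated for 100 cycles.
    c_avg = switch_level_c_avg([50, 25], [2e-15, 4e-15], 100)
    print(c_avg, average_power(c_avg, 2.0, 1e6))
```

The count-based estimate reflects the input patterns actually simulated, so its quality depends on how representative those patterns are.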

1 | A Low-power Chipset for Multimedia Applications
- Chandrakasan, Burstein, et al.
- 1994
Citation Context ...s can be used at the system level to communicate between subsystems that are working at different voltages. Voltage level conversion circuitry can be implemented with very low area and power overhead =-=[38]-=-. The use of multiple supply voltages (in which each part of the system operates at its own "optimum" voltage) can result in significant system power reduction compared to a solution which uses a sing... |

1 | Modeling Multidimensional Data and Control Flow
- Franssen, Balasa, et al.
- 1993
Citation Context ...e sets of transformations, coupled with powerful optimization strategies. Example systems with elaborate applications of transformations are Flamel [8], SAW [9], SPAID [10], HYPER [11], and CATHEDRAL =-=[12]-=-. Among the set of transformations used by the Flamel design system are loop transformations, height reduction and constant propagation. SAW uses among other transformations in-line expansion, dead co... |