Uncertainty-Aware Circuit Optimization
- In DAC, 2002
Cited by 17 (1 self)
Almost by definition, well-tuned digital circuits have a large number of equally critical paths, which form a so-called "wall" in the slack histogram. However, by the time the design has been through manufacturing, many uncertainties cause these carefully aligned delays to spread out. Inaccuracies in parasitic predictions, clock slew, model-to-hardware correlation, static timing assumptions and manufacturing variations all cause the performance to vary from prediction. Simple statistical principles tell us that the variation of the limiting slack is larger when the height of the wall is greater. Although the wall may be the optimum solution if the static timing predictions were perfect, in the presence of uncertainty in timing and manufacturing, it may no longer be the best choice. The application of formal mathematical optimization in transistor sizing increases the height of the wall, thus exacerbating the problem. There is also a practical matter that schematic restructuring downstream in the design methodology is easier to conceive when there are fewer equally critical paths. This paper describes a method that gives formal mathematical optimizers the incentive to avoid the wall of equally critical paths, while giving up as little as possible in nominal performance. Surprisingly, such a formulation reduces the degeneracy of the optimization problem and can render the optimizer more effective. This "uncertainty-aware" mode has been implemented and applied to several high-performance microprocessor macros. Numerical results are included.
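The statistical effect described in this abstract can be illustrated with a small Monte Carlo sketch. It assumes, purely for illustration, that every path in the wall has an independent Gaussian delay with the same mean and standard deviation; the abstract makes no such distributional claim, and the function name and parameters are hypothetical. Because the circuit is limited by its slowest path, the expected worst delay drifts further from the nominal prediction as the number of equally critical paths grows.

```python
import random
import statistics


def mean_worst_delay(num_paths, nominal=1.0, sigma=0.05, trials=2000, seed=0):
    """Estimate the mean of the slowest path delay among `num_paths` paths.

    Assumption (not from the paper): path delays are independent Gaussians
    with identical mean `nominal` and standard deviation `sigma`.
    """
    rng = random.Random(seed)
    worst = [
        max(rng.gauss(nominal, sigma) for _ in range(num_paths))
        for _ in range(trials)
    ]
    return statistics.fmean(worst)


if __name__ == "__main__":
    # Taller walls (more equally critical paths) give a slower expected worst path.
    for n in (1, 10, 100):
        print(n, round(mean_worst_delay(n), 4))
```

Under these assumptions a single critical path averages out at its nominal delay, while a tall wall is slower than nominal on average, which is the incentive for trading a little nominal performance for fewer equally critical paths.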
A new method for design of robust digital circuits
- Proceedings International Symposium on Quality Electronic Design (ISQED), 2005
Cited by 10 (1 self)
As technology continues to scale beyond 100nm, there is a significant increase in performance uncertainty of CMOS logic due to process and environmental variations. Traditional circuit optimization methods assuming deterministic gate delays produce a flat “wall” of equally critical paths, resulting in variation-sensitive designs. This paper describes a new method for sizing of digital circuits, with uncertain gate delays, to minimize their performance variation leading to a higher parametric yield. The method is based on adding margins on each gate delay to account for variations and using a new “soft maximum” function to combine path delays at converging nodes. Using analytic models to predict the means and standard deviations of gate delays as posynomial functions of the device sizes, we create a simple, computationally efficient heuristic for uncertainty-aware sizing of digital circuits via Geometric Programming. Monte-Carlo simulations on custom 32-bit adders and ISCAS’85 benchmarks show that about 10% to 20% delay reduction over deterministic sizing methods can be achieved, without any additional cost in area.
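The abstract does not say which soft maximum the paper uses. As a hedged illustration only, the sketch below uses the p-norm, a common smooth upper bound on the maximum that stays compatible with posynomial models; the function name, the parameter `p`, and the sample arrival times are all hypothetical.

```python
def soft_max(values, p=10.0):
    """p-norm soft maximum of positive numbers: (sum v**p) ** (1/p).

    Assumption: this p-norm form is one common smooth upper bound on max();
    the abstract does not state which soft maximum the paper uses.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be a non-empty sequence of positive numbers")
    scale = max(values)  # factor out the largest term for numerical stability
    return scale * sum((v / scale) ** p for v in values) ** (1.0 / p)


if __name__ == "__main__":
    arrivals = [1.00, 0.98, 0.60]  # hypothetical arrival times at a converging node
    for p in (2, 10, 50):
        print(p, soft_max(arrivals, p))
```

With this choice, inputs that are nearly tied push the result above the hard maximum, so an optimizer minimizing it is discouraged from aligning many paths at the same delay. Larger `p` tracks the true maximum more closely.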
Two-Step Algorithms for Nonlinear Optimization with Structured Applications
- SIAM Journal on Optimization, 1999
Cited by 9 (6 self)
In this paper we propose extensions to trust-region algorithms in which the classical step is augmented with a second step that we insist yields a decrease in the value of the objective function. The classical convergence theory for trust-region algorithms is adapted to this class of two-step algorithms. The algorithms can be applied to any problem with variable(s) whose contribution to the objective function is a known functional form. In the nonlinear programming package LANCELOT, they have been applied to update slack variables and variables introduced to solve minimax problems, leading to enhanced optimization efficiency. Extensive numerical results are presented to show the effectiveness of these techniques.
Keywords. Trust regions, line searches, two-step algorithms, spacer steps, slack variables, LANCELOT, minimax problems, expensive function evaluations, circuit optimization.
AMS subject classifications. 49M37, 90C06, 90C30
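The idea of a second step for a variable with known functional form can be sketched on a minimax problem, min z subject to f_i(x) <= z. The sketch rests on assumptions that are not from the paper: a quadratic-penalty merit function with parameter `mu` stands in for whatever merit LANCELOT actually uses, and all names are hypothetical. After any step in x, the auxiliary variable z is re-optimized on its own, which (up to the bisection accuracy) cannot increase the merit.

```python
def merit(z, f_values, mu):
    """Quadratic-penalty merit for: minimize z subject to f_i(x) <= z."""
    return z + 0.5 * mu * sum(max(0.0, f - z) ** 2 for f in f_values)


def second_step(f_values, mu, iterations=100):
    """Minimize merit(z) over z alone, with x (and hence f_values) held fixed.

    The merit is convex in z with derivative 1 - mu * sum(max(0, f_i - z)),
    so the minimizer solves sum(max(0, f_i - z)) = 1 / mu; bisection finds it.
    """
    hi = max(f_values)   # derivative equals +1 here
    lo = hi - 1.0 / mu   # derivative is <= 0 here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(max(0.0, f - mid) for f in f_values) > 1.0 / mu:
            lo = mid     # still descending: minimizer lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    f_at_new_x = [1.0, 0.9, 0.5]  # hypothetical function values after the step in x
    z_old = 2.0                   # hypothetical stale value of the minimax variable
    z_new = second_step(f_at_new_x, mu=10.0)
    print(merit(z_old, f_at_new_x, 10.0), merit(z_new, f_at_new_x, 10.0))
```

The second step costs no new evaluations of the f_i, which matters when, as in circuit optimization, each function evaluation is expensive.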
A heuristic for optimizing stochastic activity networks with applications to statistical digital circuit sizing
- IEEE Transactions on Circuits and Systems-I, 2004
Cited by 8 (4 self)
A deterministic activity network (DAN) is a collection of activities, each with some duration, along with a set of precedence constraints, which specify that activities begin only when certain others have finished. One critical performance measure for an activity network is its makespan, which is the minimum time required to complete all activities. In a stochastic activity network (SAN), the durations of the activities and the makespan are random variables. The analysis of SANs is quite involved, but can be carried out numerically by Monte Carlo analysis. This paper concerns the optimization of a SAN, i.e., the choice of some design variables that affect the probability distributions of the activity durations. We concentrate on the problem of minimizing a quantile (e.g., 95%) of the makespan, subject to constraints on the variables. This problem has many applications, ranging from project management to digital integrated circuit (IC) sizing (the latter being our motivation). While there are effective methods for optimizing DANs, the SAN optimization problem is much more difficult; the few existing methods cannot handle large-scale problems.
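The Monte Carlo analysis of a stochastic activity network mentioned in this abstract can be sketched as follows. This only estimates a makespan quantile for fixed design variables; it is not the paper's optimization heuristic. The network, the uniform duration model, and all function names are assumptions made for illustration.

```python
import random


def makespan(durations, predecessors):
    """Longest-path completion time; activities must be listed in topological order."""
    finish = {}
    for act, dur in durations.items():
        # An activity starts once all of its predecessors have finished.
        start = max((finish[p] for p in predecessors.get(act, ())), default=0.0)
        finish[act] = start + dur
    return max(finish.values())


def makespan_quantile(sample_durations, predecessors, q=0.95, trials=5000, seed=0):
    """Monte Carlo estimate of the q-quantile of the makespan."""
    rng = random.Random(seed)
    samples = sorted(
        makespan(sample_durations(rng), predecessors) for _ in range(trials)
    )
    index = min(trials - 1, int(q * trials))
    return samples[index]


if __name__ == "__main__":
    preds = {"c": ("a", "b"), "d": ("c",)}  # hypothetical 4-activity network

    def sample(rng):
        # Assumed model: independent uniform durations around a nominal value.
        return {name: rng.uniform(0.9, 1.1) for name in ("a", "b", "c", "d")}

    print(makespan_quantile(sample, preds))
```

In a sizing context the activities would correspond to gate delays and the makespan to the circuit delay, so an optimizer would wrap an estimate like this in an outer loop over the design variables.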
A Semi-custom Design Flow in High-performance Microprocessor Design
- Proc. of Design Automation Conference (DAC), 2001
Cited by 5 (0 self)
In this paper we present techniques shown to significantly enhance the custom circuit design process typical of high-performance microprocessors. This methodology combines flexible custom circuit design with automated tuning and physical design tools to provide new opportunities to optimize the design throughout the development cycle.
Robust Energy-Efficient Adder Topologies
Cited by 5 (2 self)
In this paper we explore the relationship between adder topology and energy efficiency. We compare the energy-delay tradeoff curves of selected 32-bit adder topologies to determine how architectural features and design techniques affect energy efficiency. Optimizing different adders for the supply and threshold voltages, and transistor sizing, we show that topologies with the fewest logic stages, an average fanin of two per stage, and the fewest wires are most energy efficient. While a design with fully custom sizes can be extremely tedious to lay out, we show that custom sizing can be used as a guide to group different gates in the design, resulting in a manageable layout overhead without significant loss of energy efficiency.
Efficient and Accurate Gate Sizing with Piecewise Convex Delay Models
- DAC, 2005
Cited by 4 (0 self)
We present an efficient and accurate gate sizing tool that employs a novel piecewise convex delay model, handling both rise and fall delays, for static CMOS gates. The delay model is used in a new version of a gate-sizing tool called Forge, which not only exhibits optimality, but also efficiently produces the area versus delay tradeoff curve for a block in one step. Forge includes a realistic delay propagation scheme that combines arrival times and slew rates. On average, Forge is 6.4X faster than a commercial transistor sizing tool, while achieving better delay targets and using 28% less transistor area for specific delay targets.
Area and Delay Trade-offs in the Circuit and Architecture . . .
- 2008
Cited by 4 (1 self)
Field-programmable gate arrays (FPGAs) are used in a wide range of markets that have differing cost, performance and power consumption requirements. It would be advantageous if a single device family could serve these varied needs but the economics of catering to this wide distribution of market demands suggest more than one family is appropriate. Consequently, FPGA vendors have moved to provide a more diverse set of families that sit at different points in the area-speed-power design space. In this work, our goal is to understand the circuit and architectural design attributes of an FPGA that enable tradeoffs between area and speed, and to determine the magnitude of the possible trade-offs. This will be useful for architects seeking to determine the number of device families in a suite of offerings, as well as the changes to make between families. We have found that varying both architecture and transistor sizing of an FPGA allows the effective area to change by a factor of 3.6 from largest to smallest and the speed to change by a factor of 2.6 from fastest to slowest. It is interesting to observe that the range of area and delay tradeoffs possible by varying only the transistor sizing of a single architecture is larger than the ranges observed in past architectural experiments. In addition to transistor size, we note that LUT size is one of the most useful parameters for trading off area and delay.
A 5GHz+ 128-bit Binary Floating-Point Adder for the POWER6 Processor
- Proc. of ESSCIRC, 2006
Cited by 3 (0 self)
A fast 128-bit end-around carry adder is designed and fabricated as part of the POWER6 floating-point unit in a 65nm SOI process technology. Efficient use of static circuits and careful balance of the look-ahead tree enable our floating-point design to operate beyond 5GHz with 1.1V supply.
Large-Scale Nonlinear Optimization in Circuit Tuning
- 2003
Cited by 3 (1 self)
Circuit tuning is an important task in the design of custom digital integrated circuits such as high-performance microprocessors. The goal is to improve certain aspects of the circuit, such as speed, area, or power, by optimally choosing the sizes of the transistors. This task can be formulated as a large-scale nonlinear, nonconvex optimization problem, where function values and derivatives are obtained by simulation of individual gates. This application offers an excellent example of a nonlinear optimization problem, for which it is very desirable to increase the size of the problems that can be solved in a reasonable amount of time. In this paper we describe the mathematical formulation of this problem and the implementation of a circuit tuning tool. We demonstrate how the integration of a novel state-of-the-art interior point algorithm for nonlinear programming led to considerable improvement in efficiency and robustness. Particularly, as will be demonstrated with numerical results, the new approach has great potential for parallel and distributed computing.

