The Elmore Delay as a Bound for RC Trees with Generalized Input Signals
 the IEEE Transactions on CAD. (Available
Cited by 38 (0 self)
The Elmore delay is an extremely popular delay metric, particularly for RC tree analysis. The widespread usage of this metric is mainly attributable to it being the most accurate delay measure that is a simple analytical function of the circuit parameters. The only drawbacks to this delay metric are the uncertainty as to whether it is an optimistic or a pessimistic estimate, and the restriction to step response delay estimation. In this paper, we prove that the Elmore delay is an absolute upper bound on the 50% delay of an RC tree response. Moreover, we prove that this bound holds for input signals other than steps, and that the actual delay asymptotically approaches the Elmore delay as the input signal rise time increases. A lower bound on the delay is also developed using the Elmore delay and the second moment of the impulse response. The utility of this bound is for understanding the accuracy and the limitations of the Elmore delay metric as we use it for design automation.
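As a rough illustration of why the Elmore delay counts as a simple analytical function of the circuit parameters, the sketch below computes it for every node of an RC tree using the standard rule: each branch contributes its resistance times the total capacitance downstream of it. The function name, the dictionary-based tree encoding, and the component values are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def elmore_delays(parent, resistance, capacitance, root="src"):
    """Return the Elmore delay (seconds) at every node of an RC tree.

    parent[n]      -> upstream node of n (the root has no entry)
    resistance[n]  -> resistance of the branch parent[n] -> n, in ohms
    capacitance[n] -> grounded capacitance at node n, in farads
    """
    children = defaultdict(list)
    for node, up in parent.items():
        children[up].append(node)

    # Pass 1: capacitance at or below each node.
    downstream = {}

    def accumulate(node):
        total = capacitance.get(node, 0.0)
        for child in children[node]:
            total += accumulate(child)
        downstream[node] = total
        return total

    accumulate(root)

    # Pass 2: delay(n) = delay(parent) + R(branch into n) * downstream C of n.
    delay = {root: 0.0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            delay[child] = delay[node] + resistance[child] * downstream[child]
            stack.append(child)
    return delay


# Hypothetical three-node tree: src -> a, then a -> b and a -> c.
example = elmore_delays(
    parent={"a": "src", "b": "a", "c": "a"},
    resistance={"a": 100.0, "b": 200.0, "c": 300.0},
    capacitance={"a": 1e-12, "b": 2e-12, "c": 3e-12},
)
print(example)
```

For a single RC stage the step-response 50% delay is RC ln 2, roughly 0.69 RC, which lies below the Elmore value RC, consistent with the upper bound stated in the abstract.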
Gradient-Based Optimization of Custom Circuits Using a Static-Timing Formulation
, 1999
Cited by 26 (4 self)
This paper describes a method of optimally sizing digital circuits on a static-timing basis. All paths through the logic are considered simultaneously and no input patterns need be specified by the user. The method is unique in that it is based on gradient-based, nonlinear optimization and can accommodate transistor-level schematics without the need for pre-characterization. It employs efficient time-domain simulation and gradient computation for each channel-connected component. A large-scale, general-purpose, nonlinear optimization package is used to solve the tuning problem. A prototype tuner has been developed that accommodates combinational circuits consisting of parameterized library cells. Numerical results are presented.
Performance benefits of monolithically stacked 3D FPGA
 IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
, 2007
Cited by 18 (6 self)
The performance benefits of a monolithically stacked three-dimensional (3D) field-programmable gate array (FPGA), whereby the programming overhead of an FPGA is stacked on top of a standard CMOS layer containing logic blocks (LBs) and interconnects, are investigated. A Virtex-II-style two-dimensional (2D) FPGA fabric is used as a baseline architecture to quantify the relative improvements in logic density, delay, and power consumption achieved by such a 3D FPGA. It is assumed that only the switch transistor and configuration memory cells can be moved to the top layers and that the 3D FPGA employs the same LB and programmable interconnect architecture as the baseline 2D FPGA. Assuming that these configuration memory cells are at most 0.7 times the area of a static random-access memory cell and that switch transistors having the same characteristics as n-channel metal–oxide–semiconductor devices in the CMOS layer are used, it is shown that a monolithically stacked 3D FPGA can achieve 3.2 times higher logic density, 1.7 times lower critical path delay, and 1.7 times lower total dynamic power consumption than the baseline 2D FPGA fabricated in the same 65-nm technology node. Index Terms: Field-programmable gate arrays (FPGAs), monolithically stacked, performance, three-dimensional (3D).
A Probabilistic Approach to Buffer Insertion
 IN PROC. INT. CONF. COMPUTER-AIDED DESIGN
, 2003
Cited by 17 (5 self)
This work presents a formal probabilistic approach for solving optimization problems in design automation. Prediction accuracy is very low, especially at high levels of the design flow. This can be attributed mainly to unawareness of low-level layout information and variability in the fabrication process. Hence a traditional deterministic design automation approach, where each cost function is represented as a fixed value, becomes obsolete. A new approach is gaining attention [15, 5, 2, 4, 12] in which the cost functions are represented as probability distributions and the optimization criterion is probabilistic, too. This design optimization philosophy is demonstrated through the classic buffer insertion problem [13]. Formally, we capture wirelengths as probability distributions (as compared to the traditional approach, which considers wirelengths as fixed values) and present several strategies for optimizing the probabilistic criteria. During the course of this work many problems are proved to be NP-complete. Comparisons are made with the van Ginneken "optimal under fixed wirelength" algorithm. Results show that the van Ginneken approach generated delay distributions at the root of the fanout wiring tree which had a large probability (0.91 in the worst case and 0.55 on average) of violating the delay constraint. Our algorithms could achieve 100% probability of satisfying the delay constraint with similar buffer penalty. Although this work considers wirelength prediction inaccuracies, our probabilistic strategy could be extended trivially to consider fabrication variability in wire parasitics.
Addressing the Timing Closure Problem by Integrating Logic Optimization and Placement
 Proc. ICCAD '01
, 2001
Cited by 13 (0 self)
Timing closure problems occur when timing estimates computed during logic synthesis do not match with timing estimates computed from the layout of the circuit. In such a situation, logic synthesis and layout synthesis are iterated until the estimates match. The number of such iterations is becoming larger as technology scales. Timing closure problems occur mainly due to the difficulty in accurately predicting interconnect delay during logic synthesis.
Optimization of Custom MOS Circuits by Transistor Sizing
 IEEE INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN
, 1996
Cited by 9 (4 self)
Optimization of a circuit by transistor sizing is often a slow, tedious and iterative manual process which relies on designer intuition. Circuit simulation is carried out in the inner loop of this tuning procedure. Automating the transistor sizing process is an important step towards being able to rapidly design high-performance, custom circuits. JiffyTune is a new circuit optimization tool that automates the tuning task. Delay, rise/fall time, area and power targets are accommodated. Each (weighted) target can be either a constraint or an objective function. Minimax optimization is supported. Transistors can be ratioed and similar structures grouped to ensure regular layouts. Bounds on transistor widths are supported. JiffyTune uses
Circuit Optimization via Adjoint Lagrangians
 IEEE INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN
, 1997
Cited by 6 (3 self)
The circuit tuning problem is best approached by means of gradient-based nonlinear optimization algorithms. For large circuits, gradient computation can be the bottleneck in the optimization procedure. Traditionally, when the number of measurements is large relative to the number of tunable parameters, the direct method [2] is used to repeatedly solve the associated sensitivity circuit to obtain all the necessary gradients. Likewise, when the parameters outnumber the measurements, the adjoint method [1] is employed to solve the adjoint circuit repeatedly for each measurement to compute the sensitivities. In this paper, we propose the adjoint Lagrangian method, which computes all the gradients necessary for augmented-Lagrangian-based optimization in a single adjoint analysis. After the nominal simulation of the circuit has been carried out, the gradients of the merit function are expressed as the gradients of a weighted sum of circuit measurements. The weights are dependent on the nominal solution and on optimizer quantities such as Lagrange multipliers. By suitably choosing the excitations of the adjoint circuit, the gradients of the merit function are computed via a single adjoint analysis, irrespective of the number of measurements and the number of parameters of the optimization. This procedure requires close integration between the nonlinear optimization software and the circuit simulation program. The adjoint
A practical repeater insertion method in high speed VLSI circuits
 Proc. 35th ACM/IEEE Design Automation Conference
, 1998
Cited by 5 (0 self)
In today’s design of VLSI high speed circuits, frequency has a major impact on the number of repeaters that need to be inserted. A microprocessor operating at less than 200 MHz might require several hundred repeaters, while one operating at greater than 500 MHz may require a number in the thousands. The following paper describes an efficient and simple way to automatically determine buffer placement based on maintaining equal transition time for all gate input signals across the net. A maximum allowable transition time is determined (limited by the frequency of the circuit), and correlated with the interconnect Elmore delay. A SPICE RC model having nodes with physical locations (X, Y coordinates) can be obtained by extraction tools providing standard parasitic format (SPF). This can then be used with the results of the algorithm for repeater placement to determine the exact physical location desired for each repeater.
Efficient and Accurate Gate Sizing with Piecewise Convex Delay Models
 DAC 2005
, 2005
Cited by 4 (0 self)
We present an efficient and accurate gate sizing tool that employs a novel piecewise convex delay model, handling both rise and fall delays, for static CMOS gates. The delay model is used in a new version of a gate-sizing tool called Forge, which not only exhibits optimality, but also efficiently produces the area versus delay tradeoff curve for a block in one step. Forge includes a realistic delay propagation scheme that combines arrival times and slew rates. Forge is 6.4X faster than a commercial transistor sizing tool, while achieving better delay targets and using 28% less transistor area for specific delay targets, on average.
Empirical Models for Net-Length Probability Distribution and Applications
 IN IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION SYSTEMS
, 2004
Cited by 4 (4 self)
In this paper, we propose a novel, empirical, and parameterizable model for estimating the probability distribution of wire length for each net in a placed netlist. The model is simple and fast to compute. We did extensive experimentation with state-of-the-art commercial (Cadence) and academic (Parquet and Labyrinth) tools and validated our model. Our distribution model was around three times more accurate than assuming half-perimeter bounding box as the fixed net-length estimate. Since the model is parameterizable it can be easily tailored for different routing tools and benchmarks. This model would be very useful in defining a full-fledged probabilistic design automation methodology in which various design metrics are optimized from a probabilistic point of view. We also discuss the application of our model in a novel probabilistic approach to the buffer insertion problem.
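For readers unfamiliar with the baseline named in this abstract, the half-perimeter bounding box estimate takes a net's length to be the width plus the height of the smallest axis-aligned rectangle enclosing its pins. The snippet below is a minimal sketch of that fixed estimate only; the function name and the pin coordinates are assumptions for illustration, and it does not reproduce the paper's probabilistic model.

```python
def half_perimeter_wirelength(pins):
    """Half-perimeter of the bounding box around a net's pins.

    pins is a non-empty list of (x, y) coordinates in any consistent unit.
    """
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical three-pin net on a placement grid.
net = [(0, 0), (3, 4), (1, 2)]
print(half_perimeter_wirelength(net))
```

This gives one fixed number per net, which is the kind of point estimate the abstract contrasts with a per-net probability distribution.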