Clock Distribution Networks in Synchronous Digital Integrated Circuits
 Proc. IEEE
, 2001
Cited by 77 (5 self)
... this paper, bears separate focus. The paper is organized as follows. In Section II, an overview of the operation of a synchronous system is provided. In Section III, fundamental definitions and the timing characteristics of clock skew are discussed. The timing relationships between a local data path and the clock skew of that path are described in Section IV. The interplay among the aforementioned three subsystems making up a synchronous digital system is described in Section V; particularly, how the timing characteristics of the memory and logic elements constrain the design and synthesis of clock distribution networks. Different forms of clock distribution networks, such as buffered trees and H-trees, are discussed. The automated layout and synthesis of clock distribution networks are described in Section VI. Techniques for making clock distribution networks less sensitive to process parameter variations are discussed in Section VII. Localized scheduling of the clock delays is useful in optimizing the performance of high-speed synchronous circuits. The process for determining the optimal timing characteristics of a clock distribution network is reviewed in Section VIII. The application of clock distribution networks to high-speed circuits has existed for many years. The design of the clock distribution network of certain important VLSI-based systems has been described in the literature, and some examples of these circuits are described in Section IX. In an effort to provide some insight into future and evolving areas of research relevant to high-performance clock distribution networks, some potentially important topics for future research are discussed in Section X. Finally, a summary of this paper with some concluding remarks is provided in Section XI.
Skew-Tolerant Circuit Design
, 1999
Cited by 30 (2 self)
As cycle times in high-performance digital systems shrink faster than simple process improvement allows, sequencing overhead consumes an increasing fraction of the clock period. In particular, the overhead of traditional domino pipelines can consume 25% or more of the cycle time in aggressive systems. Fortunately, the designer can hide much of this overhead through better design techniques. The key to skew-tolerant design is avoiding hard edges in which data must set up before a clock edge but will not continue propagating until after the clock edge. Skew-tolerant domino circuits use multiple overlapping clocks to eliminate latches, removing hard edges and hiding the sequencing overhead.
Interleaving buffer insertion and transistor sizing into a single optimization
 IEEE Transactions on VLSI
, 1998
Cited by 16 (0 self)
Buffer insertion is a technique that is used either to increase the driving power of a path in a circuit, or to isolate large capacitive loads that lie on noncritical or less critical paths. Gate sizing sets the sizes of gates within a circuit to achieve a given timing specification. Traditional design techniques perform gate sizing and buffer insertion as two separate and independent steps during synthesis. However, until sizing is performed, any information on capacitive loads is incomplete, and therefore a buffer insertion algorithm must operate with incomplete information, leading to suboptimal results. Moreover, the insertion of buffers can change the structure of the circuit sufficiently that it may lead to a different sizing solution from the unbuffered circuit. Therefore, these techniques of buffer insertion and sizing are intimately linked and it makes a lot of sense to integrate them into a single optimization. This work presents strategies to insert buffers in a circuit, combined with gate sizing, to achieve better power-delay and area-delay tradeoffs. The purpose of this work is to examine how combining a sizing algorithm with buffer insertion will help us achieve better area-delay or power-delay tradeoffs, and to determine where and when to insert buffers in a circuit. The delay model incorporates placement-based information and the effect of input slew rates on gate delays. The results obtained by using the new method are significantly better than the results ...
Speeding up Pipelined Circuits through a Combination of Gate Sizing and Clock Skew Optimization
 Proc. Int'l Conf. on Computer-Aided Design
, 1995
Cited by 12 (0 self)
An algorithm for unifying the techniques of gate sizing and clock skew optimization for acyclic pipelines is presented in this paper. In the design of circuits under very tight timing specifications, the area overhead of gate sizing can be considerable. The procedure utilizes the idea of cycle-borrowing using clock skew optimization to relax the stringency of the timing specification on the critical stages of the pipeline. Experimental results verify that cycle-borrowing using sizing+skew results in a better overall area-delay tradeoff than with sizing alone.
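The cycle-borrowing idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it leaves out gate sizing entirely, ignores hold (short-path) constraints, setup times, and clock-to-output delays, and assumes the first and last registers of the pipeline are pinned to zero skew. The function names and the example stage delays are hypothetical.

```python
def zero_skew_period(stage_delays):
    """With identical clock arrival times, the slowest stage sets the period."""
    return max(stage_delays)


def borrowed_schedule(stage_delays):
    """Return (period, skews) for an acyclic pipeline using clock skew.

    Assumed setup constraint per stage i (hypothetical simplified model):
        skew[i] + delay[i] <= skew[i + 1] + period
    With the end registers pinned to zero skew, summing the constraints
    gives period >= mean(stage_delays), which this schedule achieves.
    """
    period = sum(stage_delays) / len(stage_delays)
    skews = [0.0]  # clock arrival offset at the first register
    for delay in stage_delays:
        # Shift the next register's clock so the stage exactly meets setup.
        skews.append(skews[-1] + delay - period)
    return period, skews


if __name__ == "__main__":
    delays = [4.0, 8.0, 6.0]  # hypothetical stage delays, arbitrary time units
    print(zero_skew_period(delays))
    print(borrowed_schedule(delays))
```

In this example the 8-unit stage borrows 2 units from the 4-unit stage before it, because the register between them is clocked 2 units early. A real flow would also have to check hold constraints, which limit how much skew can be applied.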
Power-Delay Optimizations in Gate Sizing
, 2000
Cited by 9 (0 self)
The problem of power-delay tradeoffs in transistor sizing is examined using a nonlinear optimization formulation. Both the dynamic and the short-circuit power are considered, and a new modeling technique is used to calculate the short-circuit power. The notion of transition density is used, with an enhancement that considers the effect of gate delays on the transition density. When the short-circuit power is neglected, the minimum power circuit is identical to the minimum area circuit. However, under our more realistic models, our experimental results on several circuits show that the minimum power circuit is not necessarily the same as the minimum area circuit.
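As a rough illustration of the dynamic-power term mentioned in this abstract, the sketch below evaluates the usual transition-density expression for average dynamic power, P = 0.5 * Vdd^2 * sum(C_i * D_i). It is an assumption that the paper uses this form; the short-circuit model and the delay-dependent enhancement of the transition density are not modeled here, and the function name and numeric values are hypothetical.

```python
def dynamic_power(vdd, node_caps, transition_densities):
    """Average dynamic power from per-node capacitance and transition density.

    vdd                  -- supply voltage in volts
    node_caps            -- capacitance of each node in farads
    transition_densities -- average transitions per second at each node
    """
    if len(node_caps) != len(transition_densities):
        raise ValueError("one transition density is needed per node")
    switched = sum(c * d for c, d in zip(node_caps, transition_densities))
    return 0.5 * vdd ** 2 * switched


if __name__ == "__main__":
    # Two hypothetical nodes: 1 fF at 1e8 transitions/s, 2 fF at 5e7 transitions/s.
    print(dynamic_power(2.0, [1e-15, 2e-15], [1e8, 5e7]))
```

Under this expression alone, power grows with node capacitance, which is consistent with the abstract's remark that the minimum-power and minimum-area circuits coincide once short-circuit power is neglected.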
Timing Analysis Including Clock Skew
 IEEE Trans. Computer-Aided Design
, 1999
Cited by 8 (1 self)
Clock skew is an increasing concern for high-speed circuit designers. Circuit designers use transparent latches and skew-tolerant domino circuits to hide clock skew from the critical path and take advantage of shared portions of the clock network to budget less skew between nearby elements than across the entire die, but current timing analysis algorithms do not handle correlated clock skews. This paper extends the Sakallah-Mudge-Olukotun (SMO) latch-based timing analysis to include different amounts of clock skew between different elements. The key change is that departure times from each latch must be defined with respect to launching clocks so that the skew between the launching and receiving clocks can be determined at each receiver. The exact analysis leads to an explosion in the number of timing constraints, but most constraints are not tight in practical situations and a modified version of the Szymanski-Shenoy relaxation algorithm gives exact results with only a small incre...
Performance Optimization of Single-Phase Level-Sensitive Circuits Using Time . . .
 In ACM/IEEE International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems
, 2002
Cited by 6 (2 self)
This paper describes a linear programming (LP) formulation for performance optimization of large-scale, synchronous circuits with level-sensitive latches. The proposed formulation permits circuits to operate at a higher clock frequency, that is, with a lower clock period, by the application of both nonzero clock skew scheduling [7] and time borrowing [9]. This LP formulation is computationally efficient and demonstrates significant circuit performance improvement. Unlike the approach documented in [2], the LP model of the clock period minimization problem presented here is standalone and independent of the specific LP solver (solution algorithm) used. The modified big M (MBM) method is introduced and applied to the linearization of the nonlinear timing constraints of level-sensitive circuits into a solvable set of fully linear constraints. Clock period improvements as large as 63% are demonstrated over conventional flip-flop based circuits with zero clock skew. These improvements are shown on the ISCAS'89 benchmark circuits by using the industrial linear solver CPLEX [1].
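To make the clock period minimization problem more concrete, here is a minimal sketch of the simpler edge-triggered case: nonzero clock skew scheduling for flip-flops only. It does not implement the paper's LP model, time borrowing, level-sensitive latches, or the modified big M method, and it swaps the LP solver for a binary search over the period with a Bellman-Ford feasibility check on the difference constraints. Setup and hold times are assumed to be zero, and all names and example delays are hypothetical.

```python
def feasible_skews(num_regs, paths, period):
    """Return clock skews meeting all constraints at this period, or None.

    paths holds tuples (i, j, dmin, dmax): a data path from register i to
    register j with minimum and maximum delay. Assumed constraints:
        setup: skew[i] + dmax <= skew[j] + period
        hold:  skew[i] + dmin >= skew[j]
    Both are difference constraints x_v - x_u <= w, stored as edges (u, v, w).
    """
    edges = []
    for i, j, dmin, dmax in paths:
        edges.append((j, i, period - dmax))  # setup: skew[i] - skew[j] <= period - dmax
        edges.append((i, j, dmin))           # hold:  skew[j] - skew[i] <= dmin
    # Bellman-Ford from a virtual source tied to every register with weight 0.
    dist = [0.0] * num_regs
    for _ in range(num_regs):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return dist  # converged: dist is a valid skew assignment
    return None  # still relaxing after num_regs passes: negative cycle


def min_clock_period(num_regs, paths, tol=1e-6):
    """Binary-search the smallest feasible period; return (period, skews)."""
    lo = 0.0
    hi = max(dmax for _, _, _, dmax in paths)  # zero skew is feasible here
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible_skews(num_regs, paths, mid) is not None:
            hi = mid
        else:
            lo = mid
    return hi, feasible_skews(num_regs, paths, hi)


if __name__ == "__main__":
    # Hypothetical three-register loop: (from, to, min delay, max delay).
    example = [(0, 1, 2.0, 4.0), (1, 2, 5.0, 8.0), (2, 0, 3.0, 6.0)]
    print(min_clock_period(3, example))
```

For the three-register loop in the example, zero skew requires a period of 8 (the largest maximum delay), while skew scheduling brings the period down to about 6, the average maximum delay around the loop.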
RESTA: A Robust and Extendable Symbolic Timing Analysis Tool
 In Proc. of Great Lakes Symposium on VLSI (GLSVLSI)
, 2004
Cited by 5 (1 self)
Successful timing analysis for high-speed integrated circuits requires accurate delay computation. However, full-custom circuits popular in today’s CPU designs make this difficult. A good circuit-level static timing analysis tool should 1) consider both internally and externally specified input constraints; 2) handle a wide range of circuit structures; and 3) have a robust underlying framework that can be applied independent of the actual device model. In this paper, we present RESTA, a Robust and Extendable Symbolic Timing Analysis tool that aims to address these three goals. RESTA estimates the delay for all valid input assignments, while naturally handling input constraints. We start with a simple linear resistor model for transistors and from there apply various heuristics to improve the delay estimation for the circuits without altering the symbolic algorithms. Our worst-case delay estimates are within 10% of SPICE for over 90% of the circuits we simulated.
Body-voltage estimation in digital PD-SOI circuits and its application to static timing analysis
, 1999
Cited by 5 (1 self)
We describe a technique for estimating the floating body potentials of partially-depleted silicon-on-insulator (PD-SOI) circuits under steady switching activity and under initial activity after a long period of quiescence. The approach is based on a unique state diagram abstraction of the PD-SOI FET that captures all of the essential device physics. This picture yields a simple analytic model of the body voltage which is used within the context of a prototype transistor-level static timing analysis engine. Results are presented that demonstrate the accuracy of the analytic body-voltage model and the reduction in delay uncertainty possible with this technique.
1 Introduction
Silicon-on-insulator (SOI) technology has long found niche applications for radiation-hardened or high-voltage integrated circuits. Recently, SOI has emerged as a technology for high-performance, low-power deep-submicron digital integrated circuits [1, 2]. For digital applications, fully-depleted devices have been ...
Timing Verification of Sequential Domino Circuits
, 1996
Cited by 5 (1 self)
Two methods are presented for static timing verification of sequential circuits implemented as a mix of static and domino logic. Constraints for proper operation of domino gates are derived. An important observation is that input signals to domino gates may start changing near the end of the evaluate phase. The first method models domino gates explicitly, similar to latches. The second method treats domino gates only during pre- and post-processing steps. This method is shown to be more conservative, but easier to compute.
1 Introduction
Domino logic is popular for high-performance microprocessors where high clock frequencies are required. Domino logic, a form of dynamic logic, has the advantage of small area, fast operation, and low power [1]. However, the use of domino logic has been restricted mainly to full custom designs, in part because of the difficulty of verification. Not only do electrical effects such as charge sharing need to be verified, the timing of the circuits is also...