Results 1-10 of 18
Timing-Driven Placement for FPGAs, 2000
Cited by 66 (2 self)
In this paper we introduce a new Simulated Annealing-based timing-driven placement algorithm for FPGAs. This paper has three main contributions. First, our algorithm employs a novel method of determining source-sink connection delays during placement. Second, we introduce a new cost function that trades off between wire-use and critical path delay, resulting in significant reductions in critical path delay without significant increases in wire-use. Finally, we combine connection-based and path-based timing-analysis to obtain an algorithm that has the low time-complexity of connection-based timing-driven placement, while obtaining the quality of path-based timing-driven placement. A comparison of our new algorithm to a well-known non-timing-driven placement algorithm demonstrates that our algorithm is able to increase the post-place-and-route speed (using a full path-based timing-driven router and a realistic routing architecture) of 20 MCNC benchmark circuits by an average of 42%, whil...
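The abstract does not give the cost function itself, so the following is only a sketch of one plausible form: a weighted sum of the normalized change in timing cost and the normalized change in wiring cost, fed into a standard Metropolis acceptance test. The names `delta_cost`, `accept`, and the trade-off parameter `lam` are assumptions made for illustration, not taken from the paper.

```python
import math
import random


def delta_cost(d_timing, d_wiring, prev_timing, prev_wiring, lam=0.5):
    """Normalized cost change for one proposed move.

    Assumed form (not stated in the abstract): each term is divided by the
    previous total so the two costs are comparable. lam = 1.0 considers
    timing only; lam = 0.0 considers wiring only.
    """
    return lam * d_timing / prev_timing + (1.0 - lam) * d_wiring / prev_wiring


def accept(delta, temperature, rng=random):
    """Metropolis rule: always accept improvements; accept a worsening
    move with probability exp(-delta / temperature)."""
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / temperature)
```

With `lam=0.5`, a move that improves timing by 10% and worsens wiring by 10% has a net cost change of zero, which shows how the parameter balances the two objectives.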
Timing Driven Placement for Large Standard Cell Circuits
Cited by 64 (0 self)
We present an algorithm for accurately controlling delays during the placement of large standard cell integrated circuits. Previous approaches to timing driven placement could not handle circuits containing 20,000 or more cells and yielded placement qualities which were well short of the state of the art. Our timing optimization algorithm has been added to the placement algorithm which has yielded the best results ever reported on the full set of MCNC benchmark circuits, including a circuit containing more than 100,000 cells. A novel pin-pair algorithm controls the delay without the need for user path specification. The timing algorithm is generally applicable to hierarchical, iterative placement methods. Using this algorithm, we present results for the only MCNC standard cell benchmark circuits (fract, struct, and avq.small) for which timing information is available. We decreased the delay of the longest path of circuit fract by 36% at an area cost of only 2.5%. For circuit struct, the delay of the longest path was decreased by 50% at an area cost of 6%. Finally, for the large (21,000 cell) circuit avq.small, the longest path delay was decreased by 28% at an area cost of 6%.
Faster Minimization of Linear Wirelength for Global Placement - IEEE Transactions on Computer-Aided Design, 1997
Cited by 31 (8 self)
A linear wirelength objective more effectively captures timing, congestion, and other global placement considerations than a squared wirelength objective. The GORDIAN-L cell placement tool [16] minimizes linear wirelength by first approximating the linear wirelength objective by a modified squared wirelength objective, then executing the following loop until the solution converges: (1) minimize the current objective to yield some approximate solution, and (2) use the resulting solution to construct a more accurate objective. In this paper, we first show that the GORDIAN-L loop can be viewed as a special case of a new algorithm that generalizes a 1937 iteration due to Weiszfeld [19]. Specifically, we formulate the Weiszfeld iteration using a regularization parameter to control the tradeoff between convergence and solution accuracy; the GORDIAN-L iteration is equivalent to setting this regularization parameter to zero. Other novel numerical methods described in the paper, the Primal Newton it...
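The loop described above can be illustrated with a one-dimensional toy problem: a single movable cell connected to fixed pins, where the linear wirelength is the sum of absolute distances. This is a minimal sketch, not the paper's formulation. The function name `weiszfeld_1d` is hypothetical, and adding the regularization parameter `beta` to each distance in the weight denominator is an assumption about where the parameter enters.

```python
def weiszfeld_1d(pins, beta=1e-6, iterations=200):
    """Approximately minimize sum(|x - p| for p in pins).

    Each pass minimizes a weighted squared objective whose weights come
    from the previous solution, so the squared objective tracks the
    linear one more closely as the iteration proceeds.
    """
    # Start from the minimizer of the plain squared objective (the mean).
    x = sum(pins) / len(pins)
    for _ in range(iterations):
        # beta > 0 keeps each weight finite when x lands on a pin;
        # with beta == 0 that case would divide by zero.
        weights = [1.0 / (abs(x - p) + beta) for p in pins]
        x = sum(w * p for w, p in zip(weights, pins)) / sum(weights)
    return x
```

For pins at 0, 1, and 10, the iteration moves from the mean (about 3.67) toward the median at 1, which is the minimizer of the linear objective in one dimension.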
Min-Max Placement for Large-Scale Timing Optimization - In ACM International Symposium on Physical Design, 2002
Cited by 21 (8 self)
With feature-sizes below 0.25µm, interconnect delays account for over 40% of worst delays [12]. Transitions to 0.18µm and 0.13µm further increase this figure, and thus the relative importance of timing-driven placement for VLSI. Our work introduces a novel minimization of maximal path delay that improves upon previously known algorithms for timing-driven placement. Our placement algorithms have provable properties and are fast in practice. Our empirical validation is based on extending a scalable min-cut placer with proven empirical record in wirelength- and congestion-driven placement [4]. The overhead of timing-driven placement was within 50% CPU time. We placed industrial circuits and evaluated the layouts with a commercial static timing analyzer.
FastRoute: A step to integrate global routing into placement - IEEE/ACM Intl. Conf. Computer-Aided Design, 2006
Cited by 16 (2 self)
Because of the increasing dominance of interconnect issues in advanced IC technology, placement has become a critical step in the IC design flow. To get accurate interconnect information during the placement process, it is desirable to incorporate global routing into it. However, previous global routers are computationally expensive, so it is impractical to perform global routing repeatedly during placement. In this paper, we present an extremely fast and high-quality global router called FastRoute. In traditional global routing approaches, congestion is not considered during Steiner tree construction, so they have to rely on the time-consuming maze routing technique to eliminate routing congestion. Different from traditional approaches, we proposed a congestion-driven Steiner tree topology generation technique and an edge shifting technique to determine good Steiner tree topologies and Steiner node positions. Based on the congestion-driven Steiner trees, we only need to apply maze routing to a small percentage of the two-pin nets once to obtain high quality global routing solutions. We also proposed a new cost function based on the logistic function to direct the maze routing. Experimental results show that FastRoute generates less congested solutions with 132× and 64× faster runtimes than the state-of-the-art academic global routers Labyrinth [1] and Chi Dispersion router [2], respectively. It is even faster than the highly-efficient congestion estimator FaDGloR [3]. The promising results make it possible to incorporate global routing directly into the placement process without much runtime penalty. This could dramatically improve the placement solution quality. We believe this work will fundamentally change the way the EDA community looks at and makes use of global routing in the whole design flow.
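The abstract mentions a maze-routing cost function based on the logistic function but does not give it. The sketch below shows one way such a cost could look, assuming a base cost of 1 per routing edge plus a logistic penalty centered on the edge capacity. The function name `edge_cost` and the parameters `h` (penalty height) and `k` (steepness) are hypothetical.

```python
import math


def edge_cost(usage, capacity, h=8.0, k=1.0):
    """Cost of routing through one edge of the global routing grid.

    Assumed form: 1 + h / (1 + exp(-k * (usage - capacity))). The cost
    stays near 1 while the edge is underused, rises smoothly around the
    capacity, and saturates near 1 + h once the edge is overfull.
    """
    # Clamp the exponent so math.exp cannot overflow for extreme inputs.
    z = max(-50.0, min(50.0, k * (usage - capacity)))
    return 1.0 + h / (1.0 + math.exp(-z))
```

A smooth, bounded penalty of this kind lets a shortest-path search prefer uncongested edges without making congested edges infinitely expensive.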
Fast Post-placement Rewiring Using Easily Detectable Functional Symmetries - In Design Automation Conference, 2000
Cited by 15 (3 self)
The timing convergence problem arises when the estimations made during logic synthesis cannot be met during physical design. In this paper, an efficient rewiring engine is proposed to explore maximal freedom after placement. The most important feature of this approach is that the existing placement solution is left intact throughout the optimization. A linear time algorithm is proposed to detect functional symmetries in the Boolean network and is used as the basis for rewiring. Integration with an existing gate sizing algorithm further proves the effectiveness of our technique. Experimental results are very promising.
Sensitivity guided net weighting for placement driven synthesis - in Proc. Int. Symp. on Physical Design, 2004
Cited by 15 (4 self)
Net weighting is a key technique in large scale timing driven placement, which plays a crucial role for deep submicron physical synthesis and timing closure. A popular way to assign net weight is based on the slack of the nets, trying to minimize the worst negative slack (WNS) for the entire circuit. While WNS is an important optimization metric, another figure of merit (FOM), defined as the total slack difference compared to a certain slack threshold for all timing end points, is of equivalent importance to measure the overall timing closure result for highly complex modern ASIC and microprocessor designs. In this paper, we perform a comprehensive analysis of the slack and FOM sensitivities to the net weight, and propose a new net weighting scheme based on the slack and FOM sensitivities. Such sensitivity analysis implicitly takes potential physical synthesis effects into consideration. Experimental results on a set of industrial circuits are promising for both stand-alone timing driven placement and physical synthesis afterwards.
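The two metrics named in the abstract can be computed from a list of endpoint slacks as in the sketch below. Two points are assumptions rather than statements from the paper: that only endpoints whose slack falls below the threshold contribute to the FOM, and that the threshold defaults to zero. The function names are hypothetical.

```python
def worst_negative_slack(slacks):
    """WNS: the smallest slack over all timing end points."""
    return min(slacks)


def figure_of_merit(slacks, threshold=0.0):
    """FOM: total slack difference relative to the threshold.

    Assumption: end points at or above the threshold contribute nothing,
    so the result is zero or negative, and closer to zero is better.
    """
    return sum(s - threshold for s in slacks if s < threshold)
```

For slacks of -0.5, -0.25, 0.25, and 1.0, the WNS is -0.5 and the FOM at a zero threshold is -0.75. Two designs with the same WNS can differ widely in FOM, which is why the abstract treats them as separate metrics.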
Simultaneous Gate Sizing and Placement - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2000
Cited by 9 (1 self)
In this paper, we present an algorithm for gate sizing with controlled displacement to improve the overall circuit timing. We use a path-based delay model to capture the timing constraints in the circuit. To reduce the problem size and improve the solution convergence, we iteratively identify and optimize the k-most critical paths in the circuit and their neighboring cells. More precisely, in each iteration we perform three operations: a) reposition the immediate fan-outs of the gates on the k-most critical paths; b) size down the immediate fan-outs of the gates on the k-most critical paths; c) simultaneously reposition and resize the gates on the k-most critical paths. Each of these operations is formulated and solved as a mathematical program by using efficient solution techniques. Experimental results on a set of benchmark circuits demonstrate the effectiveness of our approach compared to the conventional approaches which separate gate sizing from gate placement.
Force Directed Mongrel with Physical Net Constraints - in Proc. ACM/IEEE Design Automation Conf, 2003
Cited by 7 (0 self)
This paper describes a new force directed global placement algorithm that exploits and extends techniques from two leading placers, Force-directed [12] [26] and Mongrel [22]. It combines the strengths of force directed global placement with Mongrel’s cell congestion removal to significantly improve the quality of placement during the difficult overlap removal stage of global placement. This is accomplished by using the spreading force in [12] to direct and control Mongrel’s ripple move optimization. This new placer is called Force Directed Mongrel (FD-Mongrel). FD-Mongrel also incorporates physical net constraints [26], and improves the congestion model for sparse placements. We propose a new placement flow that uses a limited number of the spreading iterations of [12] to form a preliminary global placement. We then use the new FD-Mongrel described in this paper to remove cell overlaps, while meeting net constraints and optimizing wirelength. We present results on wirelength as well as timing driven placement flows.
Cluster-Based Architecture, Timing-Driven Packing, and Timing-Driven Placement for FPGAs - Master’s Thesis, 1999
Cited by 5 (1 self)
As process geometries shrink into the deep-submicron region, interconnect resistance and capacitance account for an increasingly significant portion of the delay of circuits implemented in Field-Programmable Gate Arrays (FPGAs). One way to improve FPGA speed is to employ logic-cluster-based architectures which have high-speed local connections among groups of logic elements. In this work we show what size logic-cluster results in the best area-speed trade-off. To obtain the best choices for a cluster-based architecture, we use computer-aided design (CAD) tools to experimentally evaluate architectures with different sized logic clusters. As part of this CAD flow, we develop a timing-driven algorithm that packs logic elements into these clusters. In addition, we develop a timing-driven placement algorithm that results in significant improvements in FPGA speed over existing non-timing-driven algorithms.

