Physical Planning with Retiming
, 2000
Cited by 35 (13 self)
In this paper, we propose a unified approach to partitioning, floorplanning, and retiming for effective and efficient performance optimization. The integration enables the partitioner to exploit the more realistic geometric delay model provided by the underlying floorplan. Simultaneous consideration of partitioning and retiming under the geometric delay model enables us to hide global interconnect latency effectively by repositioning flip-flops (FFs) along long wires. Under the proposed geometric embedding-based, performance-driven partitioning problem, our GEO algorithm performs multi-level top-down partitioning while determining the location of the partitions. We adopt the concept of sequential arrival time [14] and develop sequential required time in our retiming-based timing analysis engine. GEO performs cluster-move based iterative improvement on top of a multi-level cluster hierarchy [4], where the gain function obtained from the timing analysis is based on the minimization of cutsize, wirelength, a...
Physical hierarchy generation with routing congestion control
- In Proc. Int. Symp. on Physical Design
, 2002
An Enhanced Multilevel Routing System.
, 2002
Cited by 21 (3 self)
In this paper, we present several novel techniques that make the recently published multilevel routing scheme [19] more effective and complete. Our contributions include: (1) resource reservation for local nets during the coarsening process, (2) congestion-driven, graph-based Steiner tree construction during the initial routing and the refinement process, and (3) multi-iteration refinement considering the congestion history. The experiments show that each of these techniques helps to improve the completion rate considerably. Compared to [19], the new routing system reduces the number of failed nets by 2x to 18x, with less than 50% increase in runtime in most cases.
Multilevel Global Placement with Congestion Control
- IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems
, 2003
Cited by 18 (6 self)
In this paper, we develop a multilevel global placement algorithm (MGP) integrated with fast incremental global routing for directly updating and optimizing congestion cost during physical hierarchy generation. Fast global routing is achieved using a fast two-bend routing and incremental A-tree algorithm. The routing congestion is modeled by the wire usage estimated by the fast global router. A hierarchical area density control is developed for placing objects with significant size variations. Experimental results show that, compared to GORDIAN-L, the wire length-driven MGP is 4 to 6.7 times faster and generates slightly better wire length for test circuits larger than 100 000 cells. Moreover, the congestion-driven MGP improves wiring overflow by 45% to 74% with 5% larger bounding box wire length but 3% to 7% shorter routing wire length measured by a graph-based A-tree global router.
Multilevel Global Placement with Retiming
, 2003
Cited by 14 (2 self)
Multiple clock cycles are needed to cross the global interconnects for multi-gigahertz designs in nanometer technologies. For synchronous designs, this requires retiming and pipelining on global interconnects. In this paper, we present a practical solution for simultaneous retiming and multilevel global placement for performance optimization, based on the theory and algorithms of sequential timing analysis (Seq-TA). We extend the Seq-TA to handle gates/clusters with multiple outputs and integrate it into a multilevel optimization framework for simultaneous retiming and placement. We also develop two speed-up techniques which enable the Seq-TA to be efficiently integrated into a simulated annealing-based multilevel coarse placement for large-scale designs. Experimental results show that (i) retiming can improve the performance (delay) by 14% on average when it is applied after placement; (ii) our approach for simultaneous retiming and placement can outperform the two-step approach (placement followed by retiming) by 10% on average in terms of delay minimization.
MR: A new framework for multilevel full-chip routing
- IEEE Trans. CAD
, 2004
Cited by 13 (8 self)
In this paper, we propose a novel framework for multilevel full-chip routing considering both routability and performance, called MR. The two-stage multilevel framework consists of coarsening, followed by uncoarsening. Unlike the previous multilevel routing, MR integrates global routing, detailed routing, and resource estimation together at each level of the framework, leading to more accurate routing resource estimation during coarsening and thus facilitating the solution refinement during uncoarsening. Further, the exact routing information obtained at each level makes MR more flexible in dealing with various routing objectives (such as crosstalk, power, etc.). Experimental results show that MR obtains significantly better routing solutions than previous works. For example, for a set of 11 commonly used benchmark circuits, MR achieves 100% routing completion for all circuits, while the previous multilevel routing, the three-level routing, and the hierarchical routing can complete routing for only 2, 0, and 2 circuits, respectively. In particular, the number of routing layers used by MR is even smaller. We also have performed experiments on timing-driven routing. The results are also very promising.
MARS–A Multilevel Full-Chip Gridless Routing System
- IEEE TCAD
, 2005
Cited by 11 (0 self)
This paper presents MARS, a novel multilevel full-chip gridless routing system. The multilevel framework with recursive coarsening and refinement allows for scaling of our gridless routing system to very large designs. The downward pass of recursive coarsening builds the representations of routing regions at different levels while the upward pass of iterative refinement allows a gradually improved solution. We introduced a number of efficient techniques in the multilevel routing scheme, including resource reservation, graph-based Steiner tree heuristic and history-based iterative refinement. We compared our multilevel framework with a recently published three-level routing flow [1]. Experimental results show that MARS helps to improve the completion rate by over 10%, and the runtime by II U. Index Terms: Design automation, routing optimization methods, very large scale integration (VLSI).
Retiming with interconnect and gate delay
- In Proc. Intl. Conf. on Computer-Aided Design
, 2003
Cited by 10 (0 self)
In this paper, we study the problem of retiming of sequential circuits with both interconnect and gate delay. Most retiming algorithms have assumed ideal conditions for the non-logical portions of the data paths, which are not sufficiently accurate to be used in high performance circuits today. In our modeling, we assume that the delay of a wire is directly proportional to its length. This assumption is reasonable since the quadratic component of a wire delay is significantly smaller than its linear component when the more accurate Elmore delay model is used. A simple experiment is conducted to illustrate the validity of this assumption. We present two approaches to solve this problem, both of which have polynomial time complexity. The first one can compute the optimal clock period while the second one is an improvement over the first one in terms of practical applicability. The second approach gives solutions very close to the optimal (0.13% more than the optimal on average) but in a much shorter runtime. A circuit with more than 22K gates and 32K wires can be optimally retimed in 83.56 seconds on a PC with a 1.8 GHz Intel Xeon processor.
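The linear-versus-quadratic claim in the abstract above can be illustrated with the standard Elmore delay of a driver, a uniform RC wire, and a lumped load: R_d(cL + C_L) + rL(cL/2 + C_L). The sketch below only splits that expression into its constant, linear, and quadratic terms in the wire length L. The function name and all parameter values are hypothetical and chosen for illustration; they are not taken from the paper, and the balance between the terms depends on the technology and on the wire length.

```python
def elmore_terms(length_um, r_per_um, c_per_um, r_drv, c_load):
    """Split the Elmore delay of driver + uniform wire + load into terms.

    Units (assumed): resistance in ohm, capacitance in fF, so delays are in fs.
    Total delay = r_drv*(c*L + c_load) + r*L*(c*L/2 + c_load).
    """
    constant = r_drv * c_load                                  # independent of L
    linear = (r_drv * c_per_um + r_per_um * c_load) * length_um  # grows with L
    quadratic = 0.5 * r_per_um * c_per_um * length_um ** 2       # grows with L^2
    return constant, linear, quadratic


if __name__ == "__main__":
    # Hypothetical values: 1 mm wire, 0.1 ohm/um, 0.2 fF/um,
    # 1 kohm driver resistance, 10 fF load.
    const, lin, quad = elmore_terms(1000.0, 0.1, 0.2, 1000.0, 10.0)
    print(f"constant={const:.0f} fs, linear={lin:.0f} fs, quadratic={quad:.0f} fs")
    print(f"quadratic/linear ratio = {quad / lin:.3f}")
```

With these assumed values the linear term dominates because the driver resistance is large relative to the total wire resistance; for much longer or more resistive wires the quadratic term would grow in importance.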
Edge Separability-Based Circuit Clustering with Application to Multilevel Circuit Partitioning
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
, 2004
Cited by 7 (0 self)
In this paper, we propose a new efficient O(n log n) connectivity-based bottom-up clustering algorithm called edge separability-based clustering (ESC). Unlike existing bottom-up algorithms that are based on local connectivity information of the netlist, ESC exploits more global connectivity information using edge separability to guide the clustering process, while carefully monitoring cluster area balance. Exact computation of the edge separability λ(e) for a given edge e = (x, y) in an edge-weighted undirected graph G is equivalent to finding the maximum flow between x and y. Since the currently best known time bound for solving the maximum flow problem is O(mn log(n²/m)), due to Goldberg and Tarjan (Goldberg and Tarjan, 1988), the computation of λ(e) for all edges in G requires O(m²n log(n²/m)) time. However, we show that a simple and efficient algorithm CAPFOREST (Nagamochi and Ibaraki, 1992) can be used to provide a good approximation of edge separability (within 9.1% empirical error bound) for all edges in G without using any network flow computation in O(n log n) time. Our experimental results based on large-scale benchmark circuits demonstrate the effectiveness of using edge separability in the context of a multilevel partitioning framework for cutsize minimization. We observe that exploiting edge separability yields better quality partitioning solutions compared to existing clustering algorithms (Sun and Sechen, 1993), (Cong and Smith, 1993), (Huang and Kahng, 1995), (Ng et al., 1987), (Wei and Cheng, 1991), (Shin and Kim, 1993), (Schuler and Ulrich, 1972), (Karypis et al., 1997), that rely on local connectivity information. In addition, our ESC-based iterative-improvement multilevel partitioning algorithm LR/ESC-PM provides comparable results to state-of-the-art hMetis package (Karyp...
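The exact definition used in the abstract above, edge separability as the maximum flow between the two endpoints of an edge, can be sketched on a toy graph. This is not CAPFOREST and not the Goldberg-Tarjan algorithm: it is a plain Edmonds-Karp (BFS augmenting path) maximum flow, swapped in because it is short. The function name `edge_separability` and the example graph are hypothetical.

```python
from collections import defaultdict, deque


def edge_separability(edges, x, y):
    """Exact edge separability of (x, y): max flow between x and y.

    `edges` is a list of (u, v, weight) for an undirected weighted graph.
    Uses Edmonds-Karp, chosen here only for brevity.
    """
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, w in edges:
        cap[u][v] += w  # undirected edge: capacity in both directions
        cap[v][u] += w
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {x: None}
        queue = deque([x])
        while queue and y not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if y not in parent:
            return flow  # no augmenting path left
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float("inf"), y
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = y
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


if __name__ == "__main__":
    # Hypothetical 4-node netlist graph with integer edge weights.
    g = [("a", "b", 3), ("a", "c", 1), ("b", "c", 1), ("b", "d", 1), ("c", "d", 1)]
    for u, v, _ in g:
        print(u, v, edge_separability(g, u, v))
```

Running one maximum-flow computation per edge, as in the loop above, is what makes the exact approach expensive on large netlists, which is the cost the abstract says the approximation avoids.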
A Novel Framework for Multilevel Full-Chip Gridless Routing
- Proc. ASP-DAC
, 2006
Cited by 7 (3 self)
Due to its great flexibility, gridless routing is desirable for nanometer circuit designs that use variable wire widths and spacings. Nevertheless, it is much more difficult than grid-based routing because of its larger solution space. In this paper, we present a novel “V-shaped” multilevel framework (called VMF) for full-chip gridless routing. Unlike the traditional “Λ-shaped” multilevel framework (inaccurately called the “V-cycle” framework in the literature), our VMF works in the V-shaped manner: top-down uncoarsening followed by bottom-up coarsening. Based on the novel framework, we develop a multilevel full-chip gridless router (called VMGR) for large-scale circuit designs. The top-down uncoarsening stage of VMGR starts from the coarsest regions and then processes down to the finest ones level by level; at each level, it performs global pattern routing and detailed routing for local nets and then estimates the routing resource for the next level. Then, the bottom-up coarsening stage performs global maze routing and detailed routing to reroute failed connections and refine the solution level by level from the finest level to the coarsest one. We employ a dynamic congestion map to guide the global routing at all stages and propose a new cost function for congestion control. Experimental results show that VMGR achieves the best routability among all published gridless routers based on a set of commonly used MCNC benchmarks. Besides, VMGR can obtain significantly less wirelength, smaller critical path delay, and smaller average net delay than the previous works. In particular, VMF is general and thus can readily apply to other problems.

