An Interconnect-Centric Design Flow for Nanometer Technologies
- Proceedings of the IEEE
, 1999
Cited by 80 (26 self)
As IC devices are scaled into nanometer dimensions and operate at gigahertz frequencies, interconnect design and optimization have become critical in determining system performance and reliability.
Efficient Circuit Clustering for Area and Power Reduction in FPGAs
- In Proceedings of ACM/SIGDA international symposium on Field-programmable gate arrays
, 2002
Cited by 57 (0 self)
We present a routability-driven bottom-up clustering technique for area and power reduction in clustered FPGAs. This technique uses a cell connectivity metric to identify seeds for efficient clustering. Effective seed selection, coupled with interconnect-resource-aware clustering and placement, can have a favorable impact on circuit routability. It leads to better device utilization, savings in area, and reduction in power consumption. A routing area reduction of 35% is achieved over previously published results. Power dissipation simulations using a buffered pass-transistor-based FPGA interconnect model are presented. They show that our clustering technique can reduce the overall device power dissipation by an average of 13%.
Physical Planning with Retiming
- ICCAD2000, PAGES 2-7
, 2000
Cited by 41 (15 self)
In this paper, we propose a unified approach to partitioning, floorplanning, and retiming for effective and efficient performance optimization. The integration enables the partitioner to exploit a more realistic geometric delay model provided by the underlying floorplan. Simultaneous consideration of partitioning and retiming under the geometric delay model enables us to hide global interconnect latency effectively by repositioning flip-flops (FFs) along long wires. Under the proposed geometric-embedding-based performance-driven partitioning problem, our GEO algorithm performs multi-level top-down partitioning while determining the location of the partitions. We adopt the concept of sequential arrival time [14] and develop sequential required time in our retiming-based timing analysis engine. GEO performs cluster-move-based iterative improvement on top of a multi-level cluster hierarchy [4], where the gain function obtained from the timing analysis is based on the minimization of cutsize, wirelength, and sequential slack. In our comparison to (i) the state-of-the-art partitioner hMetis [9] followed by retiming [11] and simulated-annealing-based slicing floorplanning [15], and (ii) the state-of-the-art simultaneous partitioning with retiming HPM [7] followed by floorplanning [15], GEO obtains 35% and 23% better delay results, respectively, while maintaining comparable cutsize, wirelength, and runtime results.
Performance Driven Multi-level and Multiway Partitioning with Retiming
- IN PROC. DESIGN AUTOMATION CONF
, 2000
Cited by 27 (14 self)
In this paper, we study the performance-driven multiway circuit partitioning problem, taking into account the significant difference between the local and global interconnect delay induced by the partitioning. We develop an efficient algorithm, HPM (Hierarchical Performance driven Multi-level partitioning), that simultaneously considers cutsize and delay minimization with retiming. HPM builds a multi-level cluster hierarchy and performs various refinements while gradually decomposing the clusters for simultaneous cutsize and delay minimization. We provide comprehensive experimental justification for each step involved in HPM and an in-depth analysis of the cutsize and delay tradeoff in the performance-driven partitioning problem. HPM obtains (i) 7% to 23% better delay compared to the state-of-the-art cutsize-driven hMetis [11] at the expense of a 19% increase in cutsize, and (ii) 81% better cutsize compared to the state-of-the-art delay-driven PRIME [2] at the expense of a 6% increase in delay.
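The cutsize metric that this abstract trades off against delay can be illustrated with a minimal sketch. It is not taken from the paper: the netlist representation (a list of nets, each a list of cell names), the `block_of` mapping, and the function name `cutsize` are all assumptions made for illustration.

```python
def cutsize(nets, block_of):
    """Count the nets whose cells are spread over more than one block.

    nets:     list of nets, each a list of cell names (hypothetical format)
    block_of: dict mapping each cell name to its partition block
    """
    return sum(1 for net in nets if len({block_of[c] for c in net}) > 1)


# Toy netlist with five cells and three nets (illustrative data only).
nets = [["a", "b"], ["b", "c", "d"], ["d", "e"]]
block_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 1}

print(cutsize(nets, block_of))  # only the net b-c-d crosses the cut
```

A performance-driven partitioner would weigh this count against a delay estimate rather than minimize it alone, which is the tradeoff the abstract reports.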
Multilevel Global Placement with Congestion Control
- IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems
, 2003
Cited by 21 (6 self)
In this paper, we develop a multilevel global placement algorithm (MGP) integrated with fast incremental global routing for directly updating and optimizing congestion cost during physical hierarchy generation. Fast global routing is achieved using a fast two-bend routing and incremental A-tree algorithm. The routing congestion is modeled by the wire usage estimated by the fast global router. A hierarchical area density control is developed for placing objects with significant size variations. Experimental results show that, compared to GORDIAN-L, the wire-length-driven MGP is 4 to 6.7 times faster and generates slightly better wire length for test circuits larger than 100,000 cells. Moreover, the congestion-driven MGP improves wiring overflow by 45% to 74% with 5% larger bounding-box wire length but 3% to 7% shorter routing wire length, as measured by a graph-based A-tree global router.
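The idea of modeling congestion by estimated wire usage can be sketched as follows. This is a simplified stand-in, not the paper's router: it uses one-bend (L-shaped) routes instead of two-bend routing and the A-tree algorithm, and the grid model, the uniform `capacity`, and the function names are assumptions made for illustration.

```python
from collections import Counter


def l_route_edges(src, dst):
    """Grid edges of a one-bend route: horizontal segment first, then vertical."""
    (x0, y0), (x1, y1) = src, dst
    edges = []
    step = 1 if x1 >= x0 else -1
    for x in range(x0, x1, step):
        edges.append(frozenset({(x, y0), (x + step, y0)}))
    step = 1 if y1 >= y0 else -1
    for y in range(y0, y1, step):
        edges.append(frozenset({(x1, y), (x1, y + step)}))
    return edges


def total_overflow(nets, capacity):
    """Sum, over all grid edges, the wire usage that exceeds the edge capacity."""
    usage = Counter(e for src, dst in nets for e in l_route_edges(src, dst))
    return sum(max(0, u - capacity) for u in usage.values())


# Three two-pin nets on a small grid (illustrative data only).
nets = [((0, 0), (2, 1)), ((0, 0), (2, 0)), ((1, 0), (2, 0))]
print(total_overflow(nets, capacity=1))
```

A congestion-driven placer would recompute such an estimate incrementally as cells move, rather than rerouting every net from scratch as this sketch does.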
Congestion aware layout driven logic synthesis
- in Proc. Int. Conf. on Computer Aided Design
, 2001
Cited by 20 (0 self)
In this paper, we present novel algorithms that effectively combine physical layout and early logic synthesis to improve overall design quality. In addition, we employ partitioning and clustering algorithms to achieve faster turnaround times. With the increasing complexity of designs, the traditional separation of logic and physical design leads to sub-optimal results, as the cost functions employed during logic synthesis do not accurately represent physical design information. While this problem has been addressed extensively, the existing solutions apply only simple synthesis transforms during physical layout and are generally unable to reverse decisions made during logic minimization and technology mapping that have a major negative impact on circuit structure. In our novel approach, we propose congestion-aware algorithms for layout-driven decomposition and technology mapping, two of the steps that affect congestion the most during logic synthesis, to effectively decrease wire length and improve congestion. In addition, to improve design turnaround time and handle large designs, we present an approach in which synthesis partitioning and placement clustering co-exist, reflecting the different characteristics of the logical and physical domains.
SPFD-based Global Rewiring. In
, 2001
Cited by 17 (2 self)
This paper presents the theory and algorithm for SPFD-based global rewiring (SPFD-GR). SPFD-GR allows us to globally replace a target wire with some alternative wire possibly far away from the target. It successfully overcomes the limitations of the existing SPFD-based local rewiring algorithm (SPFD-LR), which can only replace a wire with another wire that has the same destination node. In order to perform SPFD-based global rewiring, we developed the theory and algorithm for solving a fundamental problem in SPFD-based rewiring: given the in-pin functions of a node and the SPFD at the node's out-pin, is there a way to modify the node's internal function so that the SPFD at the node's out-pin can be satisfied? Combined with a state-of-the-art partitioning algorithm, SPFD-GR scales well to large circuits with good synthesis quality. Our SPFD-based rewiring algorithm is ideal for LUT-based FPGAs, where the node's internal function can be changed freely without any area or delay penalty. Extensive experimental results show that for LUT-based FPGAs, the rewiring ability of SPFD-GR (in terms of the number of wires that have alternative wires) is 1.45 and 3 times that of SPFD-LR and an ATPG-based rewiring algorithm (with a preliminary experimental flow), respectively, while the run time is quite acceptable. When applied to post-mapping area reduction for large LUT-based FPGAs under a circuit depth restriction, SPFD-GR achieves 17.1% average area reduction, with little or no delay increase.
Optimality, Scalability and Stability Study of Partitioning and Placement Algorithms
, 2003
Cited by 15 (4 self)
state-of-the-art partitioning and placement algorithms. We present algorithms to construct two classes of benchmarks, one for partitioning and the other for placement, which have known upper bounds of their optimal solutions and can match any given net distribution vector. Using these partitioning and placement benchmarks, we studied the optimality of state-of-the-art algorithms by comparing their solutions with the upper bounds of the optimal solutions, and their scalability and stability by varying the sizes and characteristics of the benchmarks. The conclusions from this study are: 1) State-of-the-art multilevel two-way partitioning algorithms scale very well and are able to find solutions very close to the upper bounds of the optimal solutions of our benchmarks. This suggests that existing circuit partitioning techniques are fairly mature. There is not much room for improvement in cutsize minimization for problems of the current sizes. Multiway partitioning algorithms, on the other hand, do not perform as well. Their results can be up to 18% worse than our estimated upper bounds. 2) The state-of-the-art placement algorithms produce significantly inferior results compared with the estimated optimal solutions. There is still significant room for improvement in circuit placement. 3) Existing placement algorithms are not stable. Their effectiveness varies considerably depending on the characteristics of the benchmarks. New hybrid techniques are probably needed for future-generation placement engines that are more scalable and stable.
Performance Driven Multiway Partitioning
- In Proc. Asia and South Pacific Design Automation Conf
, 2000
Cited by 10 (7 self)
Under the interconnect-centric design paradigm, partitioning is seen as the crucial step that defines the interconnect [1]. To meet the performance requirements of today's complex designs, performance-driven partitioners must consider the amount of interconnect induced by partitioning as well as its impact on performance. In this paper, we provide new performance-driven formulations for cell-move-based top-down multiway partitioning algorithms with consideration of the local and global interconnect delay. In our "constrained acyclic partitioning" formulation, cell moves are restricted to maintain acyclicity in the partitioning solution, preventing cyclic dependency among cells in different partitions. In our "relaxed acyclic partitioning" formulation, acyclic constraints are relaxed to give partitioners the capability of minimizing cutsize and delay. Our new acyclic-constraint-based performance-driven multiway partitioning algorithm FLARE obtains (i) 4% to 13% better delay compared to the state-of...
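The acyclicity constraint described in this abstract can be illustrated with a minimal sketch, which is not the FLARE algorithm itself. It assumes the circuit is given as directed cell-to-cell edges and the partition as a `block_of` mapping; both representations and the function name are hypothetical. The check relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def partition_is_acyclic(edges, block_of):
    """Return True if the block-level dependency graph has no cycle.

    edges:    directed (driver, sink) pairs between cells (hypothetical format)
    block_of: dict mapping each cell name to its partition block
    """
    preds = {}  # block -> set of blocks that drive it
    for u, v in edges:
        bu, bv = block_of[u], block_of[v]
        if bu != bv:
            preds.setdefault(bv, set()).add(bu)
            preds.setdefault(bu, set())
    try:
        TopologicalSorter(preds).prepare()  # raises CycleError on a cycle
    except CycleError:
        return False
    return True


# A four-cell chain a -> b -> c -> d (illustrative data only).
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(partition_is_acyclic(edges, {"a": 0, "b": 0, "c": 1, "d": 1}))
print(partition_is_acyclic(edges, {"a": 0, "b": 1, "c": 0, "d": 1}))
```

Under a constrained formulation, a move-based partitioner would reject any cell move after which such a check fails; a relaxed formulation would allow some of those moves in exchange for better cutsize or delay.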