Active Leakage Power Optimization for FPGAs
- FPGA'04
, 2004
Cited by 30 (2 self)
We consider active leakage power dissipation in FPGAs and present a "no cost" approach for active leakage reduction. It is well-known that the leakage power consumed by a digital CMOS circuit depends strongly on the state of its inputs. Our leakage reduction technique leverages a fundamental property of basic FPGA logic elements (look-up tables) that allows a logic signal in an FPGA design to be interchanged with its complemented form without any area or delay penalty. We apply this property to select polarities for logic signals so that FPGA hardware structures spend the majority of time in low leakage states. In an experimental study, we optimize active leakage power in circuits mapped into a state-of-the-art 90 nm commercial FPGA. Results show that the proposed approach reduces active leakage by 25%, on average.
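The look-up-table property mentioned in this abstract can be illustrated with a small sketch. It is not the paper's polarity-selection algorithm; it only shows the general fact that a LUT can absorb a complemented signal by rewriting its truth table, so no extra inverter is needed. The function names and the bit ordering (input 0 is the least significant address bit) are assumptions made for this illustration.

```python
def lut_eval(table, inputs):
    """Evaluate a k-input LUT. inputs[0] is the least significant address bit."""
    addr = sum(bit << i for i, bit in enumerate(inputs))
    return table[addr]


def absorb_input_inversion(table, i):
    """Return a truth table that computes the same function when input i
    arrives in complemented form (a fanout LUT of the inverted signal)."""
    return [table[addr ^ (1 << i)] for addr in range(len(table))]


def absorb_output_inversion(table):
    """Return a truth table whose output is the complement of the original
    (the LUT that drives the inverted signal)."""
    return [1 - bit for bit in table]


# Hypothetical example: a 2-input AND whose input 0 is delivered inverted.
and2 = [0, 0, 0, 1]
and2_inv0 = absorb_input_inversion(and2, 0)
```

Both rewrites keep the table the same size, which is why the polarity change carries no area or delay cost at the LUT level.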
Performance benefits of monolithically stacked 3-D FPGA
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
, 2007
Cited by 14 (6 self)
The performance benefits of a monolithically stacked three-dimensional (3-D) field-programmable gate array (FPGA), whereby the programming overhead of an FPGA is stacked on top of a standard CMOS layer containing logic blocks (LBs) and interconnects, are investigated. A Virtex-II-style two-dimensional (2-D) FPGA fabric is used as a baseline architecture to quantify the relative improvements in logic density, delay, and power consumption achieved by such a 3-D FPGA. It is assumed that only the switch transistors and configuration memory cells can be moved to the top layers and that the 3-D FPGA employs the same LB and programmable interconnect architecture as the baseline 2-D FPGA. Assuming that the configuration memory cells are ≤ 0.7 times the area of a static random-access memory cell and that switch transistors having the same characteristics as n-channel metal–oxide–semiconductor devices in the CMOS layer are used, it is shown that a monolithically stacked 3-D FPGA can achieve 3.2 times higher logic density, 1.7 times lower critical path delay, and 1.7 times lower total dynamic power consumption than the baseline 2-D FPGA fabricated in the same 65-nm technology node. Index Terms: Field-programmable gate arrays (FPGAs), monolithically stacked, performance, three-dimensional (3-D).
Interconnect Driver Design for Long Wires in Field-Programmable Gate Arrays
, 2006
Cited by 4 (0 self)
Designers of field-programmable gate arrays (FPGAs) are always striving to improve the performance of their designs. As they migrate to newer process technologies in search of higher speeds, the challenge of interconnect delay grows larger. For an FPGA, this challenge is crucial since most FPGA implementations use many long wires. A common technique used to reduce interconnect delay is repeater insertion. Recent work has shown that FPGA interconnect delay can be improved by using unidirectional wires with a single driver at only one end of a wire. With this change, it is now possible to consider interconnect optimization techniques such as repeater insertion. In this work, a technique to construct switch driver circuit designs is developed. Using this method, it is possible to determine the driver sizing, spacing and the number of stages of the circuit design. A computer-aided design model of the new circuit designs is developed to assess the impact they have on the delay performance of FPGAs. Results indicate that, by using the presented circuit design technique, the critical path can be reduced by 19% for short wires, and up to 40% for longer wires.
A Routing Fabric for Monolithically Stacked 3D-FPGA
- in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th international
, 2007
Cited by 4 (1 self)
A previous study on the benefits of monolithically stacked 3D-FPGA has estimated a 3.2x improvement in logic density, a 1.7x improvement in delay, and a 1.7x improvement in dynamic power consumption over a baseline 2D-FPGA with no change in architecture. This paper describes a new routing fabric and shows that a 3D-FPGA using this fabric can achieve a 3.3x improvement in logic density, a 2.35x improvement in delay, and a 2.82x improvement in dynamic power consumption over the same baseline 2D-FPGA. The additional improvements in delay and power consumption are achieved by reducing net loading in several ways: (i) Only Single and Double interconnect segments are used. This reduces the total interconnect length used to implement each net. (ii) The routing fabric is hierarchical. Each logic block’s inputs and outputs connect first to local segments. These segments can be then programmably connected to local segments in neighboring routing blocks via programmable buffers and/or to interconnect segments in routing channels via muxes with buffered outputs. (iii) Interconnect segments can be directly connected to form longer segments using programmable buffers without going through routing blocks. (iv) The routing block provides switching capability beyond that of a conventional switch box. A 3D-FPGA using this new routing fabric can be realized by stacking two configuration memory layers and a switch layer on top of a standard CMOS layer with a total of 12 metal layers interspersed between them. A CAD flow based on VPR with appropriate modifications to the routing graph generation and routing algorithm is developed and used in the performance analysis.
Analytical Framework for Switch Block Design
- INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS
, 2002
Cited by 3 (0 self)
One popular FPGA interconnection network is based on the island-style model, where rows and columns of logic blocks are separated by channels containing routing wires. Switch blocks are placed at the intersections of the horizontal and vertical channels to allow the wires to be connected together. Previous switch block design has focused on the analysis of individual switch blocks or the use of ad hoc design with experimental evaluation. This paper presents an analytical framework which considers the design of a continuous fabric of switch blocks containing wire segments of any length. The framework is used to design new switch blocks which are experimentally shown to be as effective as the best ones known to date. With this framework, we hope to inspire new ways of looking at switch block design.
TORCH: A Design Tool for Routing Channel Segmentation in FPGAs
Cited by 2 (2 self)
A design tool for routing channel segmentation in island-style FPGAs is presented. Given the FPGA architecture parameters and a set of benchmark designs, the tool optimizes routing channel segmentation using the average interconnect power-delay product as a performance metric, which is estimated from placed and routed designs. A simulated-annealing procedure is used, whereby segmentation is incrementally changed in each iteration, the benchmark designs are mapped using VPR, and the performance metric is computed to decide whether to accept or reject the new segmentation. Run time is significantly reduced by using incremental routing in each iteration and parallelizing the metric evaluation. Experimental results using the MCNC benchmark designs demonstrate an average of 22% and 15% reduction in delay and power relative to a baseline segmentation. The results also show that average segment length should decrease with technology scaling. Finally, we demonstrate how the tool can be used to optimize other aspects of programmable routing in an FPGA.
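The accept-or-reject loop described in this abstract follows the usual simulated-annealing pattern, which can be sketched generically as below. This is not TORCH's implementation: the cooling schedule, the parameter names, and in particular `toy_cost` (a stand-in for the power-delay product that the tool estimates from designs placed and routed with VPR) and `toy_neighbor` are hypothetical choices made for illustration.

```python
import math
import random


def anneal(initial, cost, neighbor, t_start=1.0, t_end=1e-3,
           alpha=0.95, moves_per_t=20, seed=0):
    """Generic simulated annealing: perturb, evaluate, accept or reject."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            candidate = neighbor(current, rng)
            delta = cost(candidate) - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha  # geometric cooling (an assumed schedule)
    return best, best_cost


def toy_cost(segmentation):
    """Hypothetical metric: penalize segment lengths far from 2."""
    return sum((length - 2) ** 2 for length in segmentation)


def toy_neighbor(segmentation, rng):
    """Incrementally change one segment length by +/-1, keeping it >= 1."""
    i = rng.randrange(len(segmentation))
    changed = list(segmentation)
    changed[i] = max(1, changed[i] + rng.choice((-1, 1)))
    return tuple(changed)


start = (8, 8, 8, 8)
best, best_cost = anneal(start, toy_cost, toy_neighbor)
```

In a real flow the cost call dominates run time, which is consistent with the abstract's emphasis on incremental routing and parallel metric evaluation.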
A Low-Power Field-Programmable Gate Array Routing Fabric
Cited by 1 (0 self)
This paper describes a new programmable routing fabric for field-programmable gate arrays (FPGAs). Our results show that an FPGA using this fabric can achieve 1.57 times lower dynamic power consumption and 1.35 times lower average net delays with only 9% reduction in logic density over a baseline island-style FPGA implemented in the same 65-nm CMOS technology. These improvements in power and delay are achieved by 1) using only short interconnect segments to reduce routed net lengths, and 2) reducing interconnect segment loading due to programming overhead relative to the baseline FPGA without compromising routability. The new routing fabric is also well-suited to monolithically stacked 3-D-IC implementation. It is shown that a 3-D-FPGA using this fabric can achieve a 3.3 times improvement in logic density, a 2.51 times improvement in delay, and a 2.93 times improvement in dynamic power consumption over the same baseline 2-D-FPGA. Index Terms: Field-programmable gate arrays (FPGAs), low-power, performance analysis, routing architecture/fabric.
ENERGY-PERFORMANCE TUNABLE DIGITAL CIRCUITS
The continued scaling of CMOS technology has enabled incredible computing devices to be created, but also pushed these devices to their energy dissipation limits. Process, temperature and workload variation change the power and performance of a chip, making it impossible to create a single energy optimal design. As a result, adaptive energy and performance adjustment methods have emerged as attractive methods to improve the effective power efficiency of a circuit. Dynamic voltage scaling and adaptive transistor body biasing have been employed to control the dynamic and leakage power of the circuits, respectively. Unfortunately, in modern technologies body bias is not very effective in controlling leakage current. In this thesis, we propose an alternative approach to the in situ adjustment of the energy-performance point of the design. We first show how the effective threshold of the transistors can be adjusted over a wide range using skewed supplies. Leveraging this method for pure static logic is difficult, so we create a new circuit architecture that is intrinsically faster than static circuits, and more importantly, can use skewed
mrFPGA: A Novel FPGA Architecture with Memristor-Based Reconfiguration
In this paper, we introduce a novel FPGA architecture with memristor-based reconfiguration (mrFPGA). The proposed architecture is based on the existing CMOS-compatible memristor fabrication process. The programmable interconnects of mrFPGA use only memristors and metal wires so that the interconnects can be fabricated over logic blocks, resulting in significant reduction of overall area and interconnect delay but without using a 3D die-stacking process. Using memristors to build up the interconnects can also provide capacitance shielding from unused routing paths and reduce interconnect delay further. Moreover, we propose an improved architecture that allows adaptive buffer insertion in interconnects to achieve more speedup. Compared to the fixed buffer pattern in conventional FPGAs, the positions of inserted buffers in mrFPGA are optimized on demand. A complete CAD flow is provided for mrFPGA, with an advanced P&R tool named mrVPR that was developed for mrFPGA. The tool can deal with the novel routing structure of mrFPGA, the memristor shielding effect, and the algorithm for optimal buffer insertion. We evaluate the area, performance and power consumption of mrFPGA based on the 20 largest MCNC benchmark circuits. Results show that mrFPGA achieves 5.18x area savings, 2.28x speedup and 1.63x power savings. Further improvement is expected with combination of 3D technologies and mrFPGA. Keywords: FPGA; memristor; ASIC; reconfiguration.
Exploring FPGA Routing Architecture Stochastically
This paper proposes a systematic strategy to efficiently explore the design space of field-programmable gate array (FPGA) routing architectures. The key idea is to use stochastic methods to quickly locate near-optimal solutions in designing FPGA routing architectures without exhaustively enumerating all design points. The main objective of this paper is not as much about the specific numerical results obtained, as it is to show the applicability and effectiveness of the proposed optimization approach. To demonstrate the utility of the proposed stochastic approach, we developed the tool for optimizing routing architecture (TORCH) software based on the versatile place and route tool [1]. Given FPGA architecture parameters and a set of benchmark designs, TORCH simultaneously optimizes the routing channel segmentation and switch box patterns using the performance metric of average interconnect power-delay product estimated from placed and routed benchmark designs. Special techniques (such as incremental routing, infrequent placement, multi-modal move selection, and parallelized metric evaluation) are developed to reduce the overall run time and improve the quality of results. Our experimental results have shown that the stochastic design strategy is quite effective in co-optimizing both routing channel segmentation and switch patterns. With the optimized routing architecture, relative to the performance of our chosen architecture baseline, TORCH can achieve average improvements of 24% and 15% in delay and power consumption for the 20 largest Microelectronics Center of North Carolina benchmark designs, and 27% and 21% for the eight benchmark designs synthesized with the Altera Quartus II University Interface Program tool. Additionally, we found that the average segment length in an FPGA routing channel should decrease with technology scaling. Finally, we demonstrate the versatility of TORCH by illustrating how TORCH can be used to optimize other aspects of the routing architecture in an FPGA. Index Terms: Design exploration, FPGA, routing architecture, stochastic.

