A Routing Fabric for Monolithically Stacked 3D-FPGA
- in FPGA ’07: Proceedings of the 2007 ACM/SIGDA 15th international, 2007
Cited by 4 (1 self)
A previous study on the benefits of monolithically stacked 3D-FPGA has estimated a 3.2x improvement in logic density, a 1.7x improvement in delay, and a 1.7x improvement in dynamic power consumption over a baseline 2D-FPGA with no change in architecture. This paper describes a new routing fabric and shows that a 3D-FPGA using this fabric can achieve a 3.3x improvement in logic density, a 2.35x improvement in delay, and a 2.82x improvement in dynamic power consumption over the same baseline 2D-FPGA. The additional improvements in delay and power consumption are achieved by reducing net loading in several ways: (i) Only Single and Double interconnect segments are used, which reduces the total interconnect length used to implement each net. (ii) The routing fabric is hierarchical: each logic block’s inputs and outputs connect first to local segments, which can then be programmably connected to local segments in neighboring routing blocks via programmable buffers and/or to interconnect segments in routing channels via muxes with buffered outputs. (iii) Interconnect segments can be directly connected to form longer segments using programmable buffers without going through routing blocks. (iv) The routing block provides switching capability beyond that of a conventional switch box. A 3D-FPGA using this new routing fabric can be realized by stacking two configuration memory layers and a switch layer on top of a standard CMOS layer, with a total of 12 metal layers interspersed between them. A CAD flow based on VPR, with appropriate modifications to the routing graph generation and routing algorithm, is developed and used in the performance analysis.
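Because both sets of figures in this abstract are quoted against the same 2D baseline, the gain attributable to the new fabric over the earlier estimate can be approximated by dividing the factors. The short Python sketch below does that arithmetic; the dictionary and function names are illustrative, and treating the factors as simply divisible is an assumption rather than something the abstract states.

```python
# Improvement factors quoted in the abstract, all relative to the same 2D baseline.
prior_study = {"logic_density": 3.2, "delay": 1.7, "dynamic_power": 1.7}
new_fabric = {"logic_density": 3.3, "delay": 2.35, "dynamic_power": 2.82}


def relative_gain(new, old):
    """Ratio of the new improvement factor to the old one, per metric.

    Assumption: both factors share one baseline, so their ratio is the
    extra improvement contributed by the new routing fabric.
    """
    return {metric: new[metric] / old[metric] for metric in old}


gains = relative_gain(new_fabric, prior_study)
for metric, gain in gains.items():
    print(f"{metric}: {gain:.2f}x beyond the prior estimate")
```

Under that assumption, the fabric adds roughly 1.38x in delay and 1.66x in dynamic power, while logic density is nearly unchanged at about 1.03x.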
TORCH: A Design Tool for Routing Channel Segmentation in FPGAs
Cited by 2 (2 self)
A design tool for routing channel segmentation in island-style FPGAs is presented. Given the FPGA architecture parameters and a set of benchmark designs, the tool optimizes routing channel segmentation using the average interconnect power-delay product as a performance metric, which is estimated from placed and routed designs. A simulated-annealing procedure is used, whereby segmentation is incrementally changed in each iteration, the benchmark designs are mapped using VPR, and the performance metric is computed to decide whether to accept or reject the new segmentation. Run time is significantly reduced by using incremental routing in each iteration and parallelizing the metric evaluation. Experimental results using the MCNC benchmark designs demonstrate average reductions of 22% in delay and 15% in power relative to a baseline segmentation. The results also show that average segment length should decrease with technology scaling. Finally, we demonstrate how the tool can be used to optimize other aspects of programmable routing in an FPGA.
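The accept/reject loop described in this abstract follows the general shape of simulated annealing. The Python sketch below illustrates that shape only; it is not the tool's implementation. The cost function `toy_power_delay_product`, its target mix, the cooling schedule, and the step size are all hypothetical stand-ins. In the real flow the cost would come from mapping benchmarks with VPR and measuring the average interconnect power-delay product.

```python
import math
import random


def toy_power_delay_product(segmentation):
    """Hypothetical stand-in for the measured power-delay product.

    segmentation maps segment length -> fraction of tracks in a channel.
    The target mix below is an arbitrary assumption for illustration.
    """
    target = {1: 0.5, 2: 0.3, 4: 0.2}
    return sum((segmentation[k] - target[k]) ** 2 for k in segmentation)


def perturb(segmentation, rng, step=0.05):
    """Incremental change: move a few tracks from one length to another."""
    src, dst = rng.sample(sorted(segmentation), 2)
    delta = min(step, segmentation[src])  # never drive a fraction negative
    new = dict(segmentation)
    new[src] -= delta
    new[dst] += delta
    return new


def anneal(initial, cost_fn, iterations=2000, t_start=1.0, t_end=1e-4, seed=0):
    rng = random.Random(seed)
    current, current_cost = initial, cost_fn(initial)
    best, best_cost = current, current_cost
    for i in range(iterations):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (i / (iterations - 1))
        candidate = perturb(current, rng)
        cand_cost = cost_fn(candidate)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cand_cost <= current_cost or rng.random() < math.exp(
            (current_cost - cand_cost) / t
        ):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


baseline = {1: 0.0, 2: 0.0, 4: 1.0}  # all length-4 segments, chosen arbitrarily
best, best_cost = anneal(baseline, toy_power_delay_product)
print(best, best_cost)
```

Swapping `toy_power_delay_product` for a function that actually places and routes the benchmarks is where the run-time cost lies, which is why the abstract emphasizes incremental routing and parallel metric evaluation.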
A Detailed Delay Path Model for FPGAs
Cited by 1 (1 self)
A complete circuit-level description of a representative FPGA is presented in this paper, from which a simple RC delay model as a function of architectural and technology parameters is derived. Using this model, the expression for the optimal delay of any path through the FPGA can be formulated. We distill our model into a purely architecture-dependent form and use it to gain new insight into how FPGA parameters directly affect delay. Applications of this model include: (1) gaining better intuition of how architecture and process parameters affect the delay path in an FPGA, (2) initial studies into new circuit designs and integrated circuit technologies, and (3) use in CAD tools for optimisation and sensitivity analysis. The technique described can be applied to arbitrary circuits, and simulations show that our closed-form equations give delay values within approximately 10% of HSPICE simulation.
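The abstract does not give the paper's closed-form equations. As a generic illustration of what a simple RC delay model looks like, the Python sketch below computes the Elmore delay of an RC ladder, a standard first-order approximation that is swapped in here and is not claimed to be the authors' model. The function name and the resistance and capacitance values are made up for the example.

```python
def elmore_delay(stages):
    """Elmore delay of an RC ladder.

    stages: list of (resistance_ohms, capacitance_farads) pairs ordered from
    the driver to the load. Each capacitor is charged through the sum of all
    resistances between it and the driver.
    """
    delay = 0.0
    upstream_resistance = 0.0
    for resistance, capacitance in stages:
        upstream_resistance += resistance
        delay += upstream_resistance * capacitance
    return delay


# Hypothetical two-stage path: a switch followed by a wire segment.
path = [(1000.0, 1e-15), (500.0, 2e-15)]
delay_seconds = elmore_delay(path)
print(f"{delay_seconds * 1e12:.1f} ps")
```

For this path the delay is 1000 Ω × 1 fF plus 1500 Ω × 2 fF, or 4 ps. A model of this kind makes the dependence on each stage's resistance and capacitance explicit, which is what enables the sensitivity analysis mentioned in the abstract.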
A Low-Power Field-Programmable Gate Array Routing Fabric
Cited by 1 (0 self)
This paper describes a new programmable routing fabric for field-programmable gate arrays (FPGAs). Our results show that an FPGA using this fabric can achieve 1.57 times lower dynamic power consumption and 1.35 times lower average net delays with only a 9% reduction in logic density over a baseline island-style FPGA implemented in the same 65-nm CMOS technology. These improvements in power and delay are achieved by 1) using only short interconnect segments to reduce routed net lengths, and 2) reducing interconnect segment loading due to programming overhead relative to the baseline FPGA without compromising routability. The new routing fabric is also well-suited to monolithically stacked 3-D-IC implementation. It is shown that a 3-D-FPGA using this fabric can achieve a 3.3 times improvement in logic density, a 2.51 times improvement in delay, and a 2.93 times improvement in dynamic power consumption over the same baseline 2-D-FPGA. Index Terms: Field-programmable gate arrays (FPGAs), low-power, performance analysis, routing architecture/fabric.
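This abstract reports some results as "N times lower" and others as a percentage reduction, so it can help to put them on one scale. The Python sketch below converts a "times lower" factor into a percentage reduction, reading "N times lower" as new = old / N. That reading is an assumption, and the function name is illustrative.

```python
def times_lower_to_percent_reduction(factor):
    """Convert 'factor times lower' into a percentage reduction.

    Assumes 'N times lower' means new_value = old_value / N.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1.0 - 1.0 / factor) * 100.0


power_reduction = times_lower_to_percent_reduction(1.57)
delay_reduction = times_lower_to_percent_reduction(1.35)
print(f"dynamic power: {power_reduction:.1f}% lower")
print(f"average net delay: {delay_reduction:.1f}% lower")
```

On that reading, the 2D figures correspond to about a 36.3% reduction in dynamic power and a 25.9% reduction in average net delay, against the 9% loss in logic density.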
A Three-Tier Asynchronous FPGA
Field-programmable gate arrays (FPGAs) are widely used for their versatility and programmability in place of custom-designed circuits. Their flexibility comes at a cost of density: supporting programmable logic incurs a significant overhead in configuration logic and interconnect, relative to custom logic. The dominance and criticality of interconnect overhead in FPGAs make a strong case for potential benefit from multi-layer integration. Migrating designs to new technologies often depends on good process characterization for static timing analysis and verification in synchronous designs. However, the asynchronous (delay-insensitive) design methodology eliminates the dependence on speculative timing analysis by tolerating arbitrary variation of gate delays. Our proposed 3D asynchronous FPGA (AFPGA) architecture is based on an existing 2D AFPGA. Pipelined AFPGAs have demonstrated a 3x improvement in performance over their synchronous counterparts. In this paper, we present the design of a 3D AFPGA, fabricated in MIT-LL’s 3D (3-tier) 0.18 µm SOI technology. The logical resources for the 3D AFPGA were kept the same as the original 2D design, while the switch boxes were expanded with inter-layer channels for tier-to-tier routing. Our test chip demonstrates the viability and competitiveness of multi-layer asynchronous FPGA designs.
ENERGY-PERFORMANCE TUNABLE DIGITAL CIRCUITS
The continued scaling of CMOS technology has enabled incredible computing devices to be created, but also pushed these devices to their energy dissipation limits. Process, temperature and workload variation change the power and performance of a chip, making it impossible to create a single energy optimal design. As a result, adaptive energy and performance adjustment methods have emerged as attractive methods to improve the effective power efficiency of a circuit. Dynamic voltage scaling and adaptive transistor body biasing have been employed to control the dynamic and leakage power of the circuits, respectively. Unfortunately, in modern technologies body bias is not very effective in controlling leakage current. In this thesis, we propose an alternative approach to the in situ adjustment of the energy-performance point of the design. We first show how the effective threshold of the transistors can be adjusted over a wide range using skewed supplies. Leveraging this method for pure static logic is difficult, so we create a new circuit architecture that is intrinsically faster than static circuits, and more importantly, can use skewed
A Low-Power Monolithically Stacked 3D-TCAM
This paper presents three techniques to reduce the power consumption in ternary content-addressable memories (TCAMs). The first technique is to use newly developed monolithically stacked 3D-IC technology for the implementation, because vertical stacking can drastically reduce interconnect length in both matchlines and searchlines, hence reducing signal path delay and power consumption. The second technique is to replace the conventional SRAM memory in a TCAM with an array of programmable vias (or electrolyte non-volatile memory). Special programming circuitry is designed to read/write memory bits from/to the programmable via array because they do not simply store data in the form of low and high voltage levels. We also devised a new TCAM cell design to further reduce power consumption in TCAMs by taking full advantage of 3D-IC technology. A 1024×144-bit TCAM using the proposed schemes is implemented with 1.0-V 65-nm CMOS technology. Our analysis and simulations have shown that the proposed monolithically stacked 3D-TCAM can reduce the total dynamic power consumption by almost 3.5 times and increase TCAM cell density by about 4 times in comparison with a conventional 2D-TCAM chip of the same capacity.
The Prospect of 3D-IC
- in IEEE 2009 Custom Integrated Circuits Conference (CICC)
This paper illustrates the performance advantages of 3D integrated circuits with two specific examples, namely 3D-FPGA and 3D-SRAM. Through strategic modification of the architectures to take advantage of 3D, significant improvement in speed and reduction in power consumption can be achieved.
mrFPGA: A Novel FPGA Architecture with Memristor-Based Reconfiguration
In this paper, we introduce a novel FPGA architecture with memristor-based reconfiguration (mrFPGA). The proposed architecture is based on the existing CMOS-compatible memristor fabrication process. The programmable interconnects of mrFPGA use only memristors and metal wires so that the interconnects can be fabricated over logic blocks, resulting in a significant reduction of overall area and interconnect delay without using a 3D die-stacking process. Using memristors to build the interconnects can also provide capacitance shielding from unused routing paths and reduce interconnect delay further. Moreover, we propose an improved architecture that allows adaptive buffer insertion in interconnects to achieve more speedup. Compared to the fixed buffer pattern in conventional FPGAs, the positions of inserted buffers in mrFPGA are optimized on demand. A complete CAD flow is provided for mrFPGA, with an advanced P&R tool named mrVPR that was developed for mrFPGA. The tool can deal with the novel routing structure of mrFPGA, the memristor shielding effect, and the algorithm for optimal buffer insertion. We evaluate the area, performance and power consumption of mrFPGA based on the 20 largest MCNC benchmark circuits. Results show that mrFPGA achieves 5.18x area savings, 2.28x speedup and 1.63x power savings. Further improvement is expected from combining 3D technologies with mrFPGA. Keywords: FPGA; memristor; ASIC; reconfiguration.
Exploring FPGA Routing Architecture Stochastically
This paper proposes a systematic strategy to efficiently explore the design space of field-programmable gate array (FPGA) routing architectures. The key idea is to use stochastic methods to quickly locate near-optimal solutions in designing FPGA routing architectures without exhaustively enumerating all design points. The main objective of this paper is less to report specific numerical results than to show the applicability and effectiveness of the proposed optimization approach. To demonstrate the utility of the proposed stochastic approach, we developed the tool for optimizing routing architecture (TORCH) software based on the versatile place and route tool [1]. Given FPGA architecture parameters and a set of benchmark designs, TORCH simultaneously optimizes the routing channel segmentation and switch box patterns using the performance metric of average interconnect power-delay product estimated from placed and routed benchmark designs. Special techniques, such as incremental routing, infrequent placement, multi-modal move selection, and parallelized metric evaluation, are developed to reduce the overall run time and improve the quality of results. Our experimental results have shown that the stochastic design strategy is quite effective in co-optimizing both routing channel segmentation and switch patterns. With the optimized routing architecture, relative to the performance of our chosen architecture baseline, TORCH can achieve average improvements of 24% and 15% in delay and power consumption for the 20 largest Microelectronics Center of North Carolina benchmark designs, and 27% and 21% for the eight benchmark designs synthesized with the Altera Quartus II University Interface Program tool. Additionally, we found that the average segment length in an FPGA routing channel should decrease with technology scaling. Finally, we demonstrate the versatility of TORCH by illustrating how TORCH can be used to optimize other aspects of the routing architecture in an FPGA. Index Terms: Design exploration, FPGA, routing architecture, stochastic.

