A Quick Safari Through the Reconfiguration Jungle
- In Design Automation Conference, 2001
Cited by 29 (3 self)
Cost-effective systems use specialization to optimize factors such as power consumption, processing throughput, flexibility, or combinations thereof. Reconfigurable systems obtain this specialization at run-time. System reconfiguration has a vertical, a horizontal, and a time dimension. We organize this design space as the reconfiguration hierarchy, and discuss the design methods that deal with it. Finally, we survey existing commercial platforms that support reconfiguration and situate them in the reconfiguration jungle.
Interfacing a High Speed Crypto Accelerator to an Embedded CPU
- In Proceedings of the 38th Asilomar Conference on Signals, Systems, and Computers, 2004
Cited by 10 (0 self)
Crypto co-processors are needed to accelerate encryption functions, but the performance gain depends critically on selecting an adequate interface. This paper presents AES acceleration for two interface options to the LEON CPU core: the CPI interface and the memory-mapped interface. The complete system, including the LEON core and the loosely coupled AES accelerators, is implemented on an FPGA, and the software programs that control the AES accelerators are tested. The cycle count, throughput, LUT usage, and energy cost of running a complete AES program with these accelerators are compared with a pure software implementation and with a tightly coupled instruction set extension option.
Crisp: A template for reconfigurable instruction set processors
- International Conference on Field Programmable Logic (FPL 2002), 2001
Cited by 9 (3 self)
A template for reconfigurable instruction set processors is described. This template defines a design space that enables the exploration of processors potentially suitable for flexible, power- and cost-efficient implementations of embedded multimedia applications, such as video compression in a handheld device. The template is based on a VLIW processor with a reconfigurable instruction set. In the future, this template will be used for design space exploration, compiler retargeting, and automatic hardware synthesis. Several existing reconfigurable and non-reconfigurable processors were mapped onto the template to assess its expressiveness.
Exploiting mixed-mode parallelism for matrix operations on the HERA architecture through reconfiguration
- IEE Proc. Computers Digital Techniques, 2006
Cited by 4 (2 self)
Recent advances in multi-million-gate platform FPGAs have made it possible to design and implement complex parallel systems on a programmable chip (PSOPCs) that also incorporate hardware floating-point units (FPUs). These options take advantage of resource reconfiguration. In contrast to the majority of the FPGA community, which still employs reconfigurable logic to develop algorithm-specific circuitry, our FPGA-based mixed-mode reconfigurable computing machine can simultaneously implement a variety of parallel execution modes and is also user programmable. Our HERA (HEterogeneous Reconfigurable Architecture) machine can implement the SIMD (Single-Instruction, Multiple-Data), MIMD (Multiple-Instruction, Multiple-Data) and M-SIMD (Multiple-SIMD) execution modes. Each processing element (PE) is centered on a single-precision IEEE 754 FPU with tightly coupled local memory, and supports dynamic switching between SIMD and MIMD at runtime. Mixed-mode parallelism has the potential to best match the characteristics of all subtasks in applications, thus resulting in sustained high performance. We evaluate HERA's performance with two common computation-intensive testbenches: matrix-matrix multiplication (MMM) and LU factorization of sparse Doubly-Bordered-Block-Diagonal (DBBD) matrices. Experimental results with electrical power network matrices show that mixed-mode scheduling for LU factorization can result in speedups of about 19% and 15.5% compared to the SIMD and MIMD implementations, respectively.
Eric Conquet and Jean-Luc Marty. "Formal Design for Automatic Coding and Testing: The ESSI/SPACES Project"
- Proceedings of World Congress on Formal Methods in the Development of Computing Systems (FM'99), LNCS 1708, 1999
Cited by 2 (0 self)
An architecture for a reconfigurable superscalar processor is described in which some of its execution units are implemented in reconfigurable hardware. The overall configuration of the processor is defined according to how its reconfigurable execution units are configured. An efficient micro-architectural solution to configuration management is presented that effectively steers the current processor configuration toward a configuration that is well matched with the execution unit requirements of instructions being scheduled for execution. The approach first selects the best matched among four steering configurations based on the number and type of execution units required by the instructions. One of the steering configurations is dynamically defined as the current configuration; the other three are statically predefined. Once a steering configuration is selected, portions of it begin loading on corresponding reconfigurable execution units that are not busy. The active configuration of the processor is generally the overlap of two or more steering configurations.
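The selection step in this abstract (choosing the best match among four steering configurations from the number and type of execution units the pending instructions need) can be illustrated with a minimal sketch. The abstract does not give the matching metric, so the "unmet demand" score, the unit type names, and the unit counts below are all assumptions made for illustration, not the paper's method.

```python
from collections import Counter


def unmet_demand(required, config):
    """Count execution units that are required but not provided by config."""
    return sum(max(0, n - config.get(unit, 0)) for unit, n in required.items())


def select_steering_config(required, candidates):
    """Return the name of the candidate with the smallest unmet demand.

    Ties go to the candidate that appears first in the dict.
    """
    return min(candidates, key=lambda name: unmet_demand(required, candidates[name]))


# Hypothetical unit types and counts: one "current" configuration plus
# three statically predefined ones, mirroring the four-way choice.
candidates = {
    "current": {"int_alu": 2, "fp_add": 1, "load_store": 1},
    "static_int": {"int_alu": 4},
    "static_fp": {"fp_add": 2, "fp_mul": 2},
    "static_mem": {"int_alu": 1, "load_store": 3},
}

# Execution units needed by the instructions waiting to be scheduled.
required = Counter(["fp_mul", "fp_add", "fp_mul", "int_alu"])

best = select_steering_config(required, candidates)
print(best)
```

With these made-up inputs the floating-point configuration leaves only one required unit unserved, so it is selected. A real design would also have to account for which reconfigurable units are busy, which this sketch ignores.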
Software pipelining for coarse-grained reconfigurable instruction set processors
- In Proc. ASP-DAC, 2002
Cited by 1 (1 self)
This paper shows that software pipelining can be an effective technique for code generation for coarse-grained reconfigurable instruction set processors. The paper describes a technique based on software pipelining that performs reconfigurable instruction generation and instruction scheduling in a combined algorithm. Although typical compilers for reconfigurable processors perform these steps separately, results show that the combination allows successful use of the reconfigurable resources. The technique presented is also able to exploit spatial computation inside the reconfigurable functional unit, in which the output of a processing element is directly connected to the input of another processing element without the need for an intermediate register. Results show that it is possible to reduce the cycle count by using spatial computation.
Exploiting Mixed-Mode Parallelism for Matrix Operations
Recent advances in multi-million-gate platform FPGAs have made it possible to design and implement complex parallel systems on a programmable chip (PSOPCs) that also incorporate hardware floating-point units (FPUs). These options take advantage of resource reconfiguration. In contrast to the majority of the FPGA community, which still employs reconfigurable logic to develop algorithm-specific circuitry, our FPGA-based mixed-mode reconfigurable computing machine can simultaneously implement a variety of parallel execution modes and is also user programmable. Our HERA (HEterogeneous Reconfigurable Architecture) machine can implement the SIMD (Single-Instruction, Multiple-Data), MIMD (Multiple-Instruction, Multiple-Data) and M-SIMD (Multiple-SIMD) execution modes. Each processing element (PE) is centered on a single-precision IEEE 754 FPU with tightly coupled local memory, and supports dynamic switching between SIMD and MIMD at runtime. Mixed-mode parallelism has the potential to best match the characteristics of all subtasks in applications, thus resulting in sustained high performance. We evaluate HERA's performance with two common computation-intensive testbenches: matrix-matrix multiplication (MMM) and LU factorization of sparse Doubly-Bordered-Block-Diagonal (DBBD) matrices. Experimental results with electrical power network matrices show that mixed-mode scheduling for LU factorization can result in speedups of about 19% and 15.5% compared to the SIMD and MIMD implementations, respectively.
Current Trends in Resource Management of Reconfigurable Systems
When multiple applications execute concurrently on a system, mechanisms and policies are needed to manage their competition for resources and to resolve conflicts. In a traditional system, these management activities can be summarized as storage management for saving the required data and I/O management for interacting with the outside world. The theoretical foundations of these activities have been fully explored in the literature. Reconfigurable systems impose additional management tasks, including FPGA logic area allocation, placement, routing, and network-on-chip management. This paper presents those management activities. Index Terms: Operating systems, Reconfigurable architectures, Resource management, Scheduling
Customized Kernel Execution on Reconfigurable Hardware for Embedded Applications
To conserve space and power as well as to harness high performance in embedded systems, high utilization of the hardware is required. This can be facilitated through dynamic adaptation of the silicon resources in reconfigurable systems in order to realize various customized kernels as execution proceeds. Fortunately, the reconfiguration overheads encountered can be estimated. Therefore, if the scheduling of time-consuming kernels also considers the reconfiguration overheads, an overall performance gain can be obtained. We present our policy, experiments, and performance results of customizing and reconfiguring Field-Programmable Gate Arrays (FPGAs) for embedded kernels. Experiments involving EEMBC (EDN Embedded Microprocessor Benchmarking Consortium) and MiBench embedded benchmark kernels show high performance using our main policy when reconfiguration overheads are considered. Our policy reduces the required reconfigurations by more than 50% compared to brute-force solutions, and performs within 25% of the ideal execution time while conserving 60% of the FPGA resources. Alternative strategies to reduce the reconfiguration overhead are also presented and evaluated.
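The central idea in this abstract, that a kernel should be moved to hardware only when the gain outweighs the estimated reconfiguration overhead, can be shown with a small sketch. This is not the paper's policy: the single reconfigurable slot, the kernel names, and all timing values are hypothetical and chosen only to make the trade-off visible.

```python
def should_offload(sw_time, hw_time, reconfig_time, already_loaded):
    """Offload only if hardware time plus any reconfiguration cost beats software."""
    overhead = 0.0 if already_loaded else reconfig_time
    return hw_time + overhead < sw_time


def run_schedule(kernels, reconfig_time):
    """Walk a kernel sequence with one reconfigurable slot (an assumption).

    Each kernel is (name, software_time, hardware_time).
    Returns (total_time, number_of_reconfigurations).
    """
    loaded = None  # kernel currently configured in the slot
    total = 0.0
    reconfigs = 0
    for name, sw_time, hw_time in kernels:
        if should_offload(sw_time, hw_time, reconfig_time, loaded == name):
            if loaded != name:
                total += reconfig_time  # pay the overhead once per load
                reconfigs += 1
                loaded = name
            total += hw_time
        else:
            total += sw_time  # not worth reconfiguring; run in software
    return total, reconfigs


# Hypothetical kernel trace: (name, software time, hardware time).
kernels = [
    ("fft", 10.0, 2.0),
    ("fft", 10.0, 2.0),
    ("crc", 3.0, 1.0),
    ("fft", 10.0, 2.0),
]
total, reconfigs = run_schedule(kernels, reconfig_time=5.0)
print(total, reconfigs)
```

With these made-up numbers the short `crc` kernel stays in software because loading it would cost more than it saves, so the `fft` configuration remains resident and only one reconfiguration is needed. Always reconfiguring for every kernel change would need three.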

