Results 1 - 10 of 25
Hardware/Software Co-Design
- IEEE Micro, 1997
Cited by 70 (0 self)
... This paper introduces the reader to various aspects of co-design. We highlight the commonalities and point out the differences in various co-design problems in some application areas. Co-design issues and their relationship to classical system implementation tasks are discussed to help the reader develop a perspective on modern digital system design that relies on computer-aided design (CAD) tools and methods.
Improving Functional Density Through Run-Time Circuit Reconfiguration
1997
Cited by 42 (2 self)
orting a C compiler to the DISC processor. Justin Diether assisted in the design, hand-layout, and testing of many partially reconfigured circuits. I would also like to thank Paul Graham for his generous assistance and support of our many mutual activities, classes, and projects at BYU. Other graduate students assisting me with this work include Russel Peterson, Mike Rencher, Richard Ross, and Peter Bellows. My advisor, Brad Hutchings, provided essential assistance and encouragement in all of the projects, ideas, and results presented within this work. My decision to complete this degree and write this dissertation was influenced largely by his advice and positive encouragement. Brent Nelson and other faculty members within the Electrical and Computer Engineering department at BYU have provided critical feedback on a wide variety of topics relating to this work. I would also like to acknowledge the insight and assistance of many collaborators researching closely related subjects. For
Sequencing Run-Time Reconfigured Hardware with Software
- ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 1996
Cited by 31 (1 self)
Run-Time Reconfigured systems offer additional hardware resources to systems based on reconfigurable FPGAs. These systems, however, are often difficult to build and must tolerate substantial reconfiguration times. A processor-based architecture has been built to simplify the development of these systems by providing programmable control of hardware sequencing while retaining the performance of hardware. Configuration overhead of this system is reduced by "caching" hardware on the reconfigurable resource. An image processing application was developed on this system to demonstrate both the performance improvements of custom hardware and the ease of software development.
1 Introduction
The high data bandwidth and computational load of digital signal processing algorithms generally overwhelm even the highest-performance general-purpose processors. Achieving real-time execution rates typically requires custom hardware. SRAM-based Field-Programmable Gate Arrays (FPGAs) are often used to ...
Implementation Approaches for Reconfigurable Logic Applications
- International Workshop on Field-Programmable Logic and Applications, 1995
Cited by 29 (3 self)
Reconfigurable FPGAs provide designers with new implementation approaches for designing high-performance applications. This paper discusses two basic implementation approaches with FPGAs: compile-time reconfiguration and run-time reconfiguration. Compile-time reconfiguration is a static implementation strategy where each application consists of one configuration. Run-time reconfiguration is a dynamic implementation strategy where each application consists of multiple cooperating configurations. This paper introduces these strategies and discusses the implementation approaches for each strategy. Existing applications for each strategy are also discussed.
1 Overview
Reconfigurable logic is an emerging branch of computer architecture that seeks to build flexible computing systems that can achieve very high levels of performance, much higher than is possible with the highest-performance microprocessors or, in many cases, even supercomputers. At the heart of these computing s...
A Survey of Boolean Matching Techniques for Library Binding
- ACM Transactions on Design Automation of Electronic Systems, 1997
Cited by 23 (1 self)
When binding a logic network to a set of cells, a fundamental problem is recognizing whether a cell can implement a portion of the network. Boolean...
RPM: A rapid prototyping engine for multiprocessor systems
- IEEE Computer, 1995
Cited by 17 (5 self)
In multiprocessor systems, processing nodes contain a processor, some cache and a share of the system memory, and are connected through a scalable interconnect. The system memory partitions may be shared (shared-memory systems) or disjoint (message-passing systems). Within each class of systems many architectural variations are possible. Fair comparisons among systems are difficult because of the lack of a common hardware platform to implement the different architectures. RPM (Rapid Prototyping engine for Multiprocessors) is a hardware emulator for the rapid prototyping of various multiprocessor architectures. In RPM, the hardware of the target machine is emulated by reprogrammable controllers implemented with Field-Programmable Gate Arrays (FPGAs). The processors, memories and interconnect are off-the-shelf and their relative speeds can be modified to emulate various component technologies. Every emulation is an actual incarnation of the target machine and therefore software written for the target machine can be easily ported to it with little modification and without instrumentation of the code. In this paper, we describe the architecture of RPM, its performance and the prototyping methodology. We also compare our approach with simulation and breadboard prototyping.
Keywords: Field-Programmable Gate Arrays (FPGAs), message-passing multicomputers, shared-memory multiprocessors, design verification, performance evaluation, simulation.
Automated Target Recognition on SPLASH 2
- Proceedings of IEEE Workshop on FPGAs for Custom Computing Machines, 1997
Cited by 17 (2 self)
Automated target recognition is an application area that requires special-purpose hardware to achieve reasonable performance. FPGA-based platforms can provide a high level of performance for ATR systems if the implementation can be adapted to the limited FPGA and routing resources of these architectures. This paper discusses a mapping experiment where a linear-systolic implementation of an ATR algorithm is mapped to the Splash 2 platform. Simple column-oriented processors were used throughout the design to achieve high performance with limited nearest-neighbor communication. The distributed Splash 2 memories are also exploited to achieve a high degree of parallelism. The resulting design is scalable and can be spread across multiple Splash 2 boards with a linear increase in performance.
The Design of RPM: An FPGA-based Multiprocessor Emulator
- Proceedings of the 3rd ACM International Symposium on Field-Programmable Gate Arrays, 1995
Cited by 14 (3 self)
Recent advances in Field-Programmable Gate Arrays (FPGAs) and programmable interconnects have made it possible to build efficient hardware emulation engines. In addition, improvements in Computer-Aided Design (CAD) tools, mainly in synthesis tools, greatly simplify the design of large circuits. The RPM (Rapid Prototype Engine for Multiprocessors) Project leverages these two technological advances. Its goal is to develop a common hardware platform for the emulation of multiprocessor systems with different architectures. For cost reasons, the use of FPGAs in RPM is limited to the memory controllers, while the rest of the emulator, including the processors, memories and interconnect, is built with off-the-shelf components. A flexible non-intrusive event logging mechanism is included at all levels of the memory hierarchy, making it possible to monitor the emulation in very fine detail. This paper presents the hardware design of RPM.
Keywords: Field-Programmable Gate Arrays (FPGAs), message-passing multicomputers, shared-memory multiprocessors, rapid prototyping, logic emulation.
Sequential circuit fault simulation using logic emulation
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 1998
Cited by 13 (0 self)
A fast fault simulation approach based on ordinary logic emulation is proposed. The circuit configured into our system, which emulates the faulty circuit's behavior, is synthesized from the good circuit and the given fault list in a novel way. Faults are injected either by shifting the content of a fault injection scan chain or by selecting the output of a parallel fault injection selector, which eliminates the time-consuming bit-stream regeneration process. Experimental results for ISCAS-89 benchmark circuits show that our serial fault emulator is about 20 times faster than HOPE. By our analysis, the speedup grows with circuit size. Two hybrid fault emulation approaches are also proposed. The first reduces the number of faults actually emulated by screening off faults that are not activated or that have short propagation distances before emulation, and by collapsing nonstem faults into their equivalent stem faults. The second reduces the hardware requirement of the fault emulator by incorporating an ordinary fault simulator.
Index Terms: CLB, fault emulation, fault injection, fault simulation, FPGA, logic emulation, logic testing.
Computer Vision Algorithms on Reconfigurable Logic Arrays
- IEEE Trans. on Parallel and Distributed Systems, 1999
Cited by 11 (1 self)
Computer vision algorithms are natural candidates for high performance computing due to their inherent parallelism and intense computational demands. For example, a simple 3 x 3 convolution on a 512 x 512 gray scale image at 30 frames per second requires 67.5 million multiplications and 60 million additions to be performed in one second. Computer vision tasks can be classified into three categories based on their computational complexity and communication complexity: low-level, intermediate-level and high-level. Special-purpose hardware provides better performance than general-purpose hardware for all three levels of vision tasks. With recent advances in very large scale integration (VLSI) technology, an application specific integrated circuit (ASIC) can provide the best performance in terms of total execution time. However, long design cycle time, high development cost and the inflexibility of dedicated hardware deter the design of ASICs. In contrast, field programmable gate arrays (FPGAs) offer shorter design verification time and easier design adaptability at a lower cost. Hence, FPGAs with an array of reconfigurable logic blocks can be very useful compute elements. FPGA-based custom computing machines are
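The convolution figures quoted in the abstract above can be checked with a short calculation. The sketch below is not from the paper; the function name and the unit constant `MI` are illustrative. It assumes every output pixel, including border pixels, costs a full 3 x 3 window (9 multiplications, 8 additions), and that "million" in the abstract means 2^20, which is the reading under which the quoted numbers come out exactly.

```python
def conv_ops_per_second(width, height, kernel, fps):
    """Count multiplications and additions per second for a kernel x kernel
    convolution applied to every pixel of a width x height image stream.

    Assumption (not stated in the abstract): border pixels are counted
    the same as interior pixels.
    """
    pixels_per_second = width * height * fps
    taps = kernel * kernel
    mults = pixels_per_second * taps        # one multiply per kernel tap
    adds = pixels_per_second * (taps - 1)   # summing `taps` products
    return mults, adds


MI = 2 ** 20  # assumed unit: the abstract's "million" read as 2^20

mults, adds = conv_ops_per_second(512, 512, 3, 30)
print(mults, adds)            # 70778880 62914560
print(mults / MI, adds / MI)  # 67.5 60.0
```

In decimal millions the same counts are roughly 70.8 million multiplications and 62.9 million additions per second, so the quoted 67.5 and 60 are consistent only under the 2^20 interpretation.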

