Improving Functional Density Through Run-Time Circuit Reconfiguration
, 1997
Cited by 42 (2 self)
orting a C compiler to the DISC processor. Justin Diether assisted in the design, hand-layout, and testing of many partially reconfigured circuits. I would also like to thank Paul Graham for his generous assistance and support of our many mutual activities, classes, and projects at BYU. Other graduate students assisting me with this work include Russel Peterson, Mike Rencher, Richard Ross, and Peter Bellows. My advisor, Brad Hutchings, provided essential assistance and encouragement in all of the projects, ideas, and results presented within this work. My decision to complete this degree and write this dissertation was influenced largely by his advice and positive encouragement. Brent Nelson and other faculty members within the Electrical and Computer Engineering department at BYU have provided critical feedback on a wide variety of topics relating to this work. I would also like to acknowledge the insight and assistance of many collaborators researching closely related subjects. For
Artificial Neural Network Implementation on a Fine-Grained FPGA
- in Field Programmable Logic and Applications
, 1994
Cited by 19 (4 self)
This paper reports on the implementation of an Artificial Neural Network (ANN) on an Atmel AT6005 Field Programmable Gate Array (FPGA). The work was carried out as an experiment in mapping a bit-level, logically intensive application onto the specific logic resources of a fine-grained FPGA. By exploiting the reconfiguration capabilities of the Atmel FPGA, individual layers of the network are time multiplexed onto the logic array. This allows a larger ANN to be implemented on a single FPGA at the expense of slower overall system operation.
Run-Time Reconfiguration: A Method for Enhancing the Functional Density of SRAM-based FPGAs
- Journal of VLSI Signal Processing
, 1996
Cited by 17 (2 self)
One way to further exploit the reconfigurable resources of SRAM FPGAs and increase functional density is to reconfigure them during system operation. This process is referred to as Run-Time Reconfiguration (RTR). RTR is an approach to system implementation that divides an application or algorithm into time-exclusive operations that are implemented as separate configurations. The Run-Time Reconfiguration Artificial Neural Network (RRANN) is a proof-of-concept system that demonstrates the effectiveness of RTR for implementing neural networks. It implements the popular backpropagation training algorithm as three distinct time-exclusive FPGA configurations: feed-forward, backpropagation and update. System operation consists of sequencing through these three configurations at run-time, one configuration at a time. RRANN has been fully implemented with Xilinx FPGAs, tested and shown to increase the functional density of a network by up to 500% when compared to FPGA-based implementations tha...
Computer Vision Algorithms on Reconfigurable Logic Arrays
- IEEE Trans. on Parallel and Distributed Systems
, 1999
Cited by 11 (1 self)
Computer vision algorithms are natural candidates for high performance computing due to their inherent parallelism and intense computational demands. For example, a simple 3 x 3 convolution on a 512 x 512 gray scale image at 30 frames per second requires 67.5 million multiplications and 60 million additions to be performed in one second. Computer vision tasks can be classified into three categories based on their computational complexity and communication complexity: low-level, intermediate-level and high-level. Special-purpose hardware provides better performance than general-purpose hardware for all three levels of vision tasks. With recent advances in very large scale integration (VLSI) technology, an application specific integrated circuit (ASIC) can provide the best performance in terms of total execution time. However, the long design cycle, high development cost and inflexibility of dedicated hardware deter the design of ASICs. In contrast, field programmable gate arrays (FPGAs) support lower design verification time and easier design adaptability at a lower cost. Hence, FPGAs with an array of reconfigurable logic blocks can be very useful compute elements. FPGA-based custom computing machines are
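The operation counts quoted in this abstract can be reproduced with a short calculation. The sketch below assumes one multiply per kernel tap, eight additions to sum the nine products, every pixel (including borders) being computed, and "million" meaning 2^20; none of these assumptions is stated in the abstract, and the function name is illustrative only.

```python
# Operation count for a k x k convolution over a width x height image
# at a given frame rate. All parameter names here are illustrative.
def conv_ops_per_second(width=512, height=512, kernel=3, fps=30):
    pixels_per_second = width * height * fps
    taps = kernel * kernel
    mults = pixels_per_second * taps        # one multiply per kernel tap
    adds = pixels_per_second * (taps - 1)   # summing 9 products takes 8 additions
    return mults, adds

MEBI = 2 ** 20  # assumption: the abstract's "million" is 2**20

mults, adds = conv_ops_per_second()
print(mults, mults / MEBI)  # 70778880 67.5
print(adds, adds / MEBI)    # 62914560 60.0
```

Under these assumptions the counts match the quoted 67.5 million and 60 million figures; with a decimal million they would be roughly 70.8 and 62.9 million.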
Design Patterns for Reconfigurable Computing
- In Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines
, 2004
Cited by 7 (1 self)
It is valuable to identify and catalog design patterns for reconfigurable computing. These design patterns are canonical solutions to common and recurring design challenges which arise in reconfigurable systems and applications. The catalog can form the basis for creating designs, for educating new designers, for understanding the needs of tools and languages, and for discussing reconfigurable design. Tying application and implementation lessons to the expansion and refinement of this catalog will make those lessons more relevant to the design community. In this paper, we articulate this role for design patterns in reconfigurable computing, provide a few example patterns, offer a starting point for the contents of the catalog, and discuss the potential benefits of this effort.
Learning in Stochastic Bit Stream Neural Networks
Cited by 5 (1 self)
This paper presents learning techniques for a novel feedforward stochastic neural network. The model uses stochastic weights and the `bit stream' data representation. It has a clean, analysable functionality and is very attractive because of its great potential for implementation in hardware using standard digital VLSI technology. The design allows simulation at three different levels, and learning techniques are described for each level. The lowest level corresponds to on-chip learning. Simulation results on three benchmark MONK's problems and handwritten digit recognition with a clean set of 500 16 × 16 pixel digits demonstrate that the new model is powerful enough for real world applications.
Keywords: Stochastic Computing, Bit Stream, Stochastic Neural Networks, Gradient Descent Learning.
1. INTRODUCTION
The stochastic neural network is a very promising model for global optimization with its ability to escape from local minima. In a stochastic neural network the information is us...
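As a rough illustration of the bit stream data representation mentioned in this abstract, the sketch below encodes a value in [0, 1] as a random bit stream whose fraction of ones approximates the value, and multiplies two independent streams with a bitwise AND. This is the standard unipolar scheme from stochastic computing and is an assumption here; the abstract does not specify the paper's exact encoding or circuits, and all function names are hypothetical.

```python
import random

def encode(p, n, rng):
    """Encode p in [0, 1] as n random bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def decode(stream):
    """Recover the encoded value as the fraction of ones in the stream."""
    return sum(stream) / len(stream)

def multiply(a, b):
    """AND two independent streams; the result encodes the product."""
    return [x & y for x, y in zip(a, b)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 100_000              # longer streams give a more accurate estimate
a = encode(0.6, n, rng)
b = encode(0.5, n, rng)
product = decode(multiply(a, b))
print(product)           # close to 0.6 * 0.5 = 0.3
```

The appeal for hardware is that the multiplier is a single AND gate; the cost is that accuracy improves only slowly with stream length.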
A Fast FPGA Implementation of a General Purpose Neuron
- In Proc. of the Fourth International Workshop on Field Programmable Logic and Applications
, 1994
Cited by 5 (2 self)
The implementation of larger digital neural networks has not been possible due to the real-estate requirements of single neurons. We present an expandable digital architecture which allows fast and space-efficient computation of the sum of weighted inputs, providing an efficient implementation base for large neural networks. The actual digital circuitry is simple and highly regular, thus allowing very efficient space usage of fine grained FPGAs. We take advantage of the re-programmability of the devices to automatically generate new custom hardware for each topology of the neural network.
Space Efficient Neural Net Implementation
- Proc. of the Second ACM Workshop on Field-Programmable Gate Arrays
, 1994
Cited by 4 (2 self)
We show how field-programmable gate arrays can be used to efficiently implement neural nets. By implementing the training phase in software and the actual application in hardware, conflicting demands can be met: training benefits from a fast edit-debug cycle, and once the design has stabilized, a hardware implementation results in higher performance. While neural nets have been implemented in hardware in the past, larger digital nets have not been possible due to the real-estate requirements of single neurons. We present a bit-serial encoding scheme and computation model, which allows space-efficient computation of the sum of weighted inputs, thereby facilitating the implementation of complex neural networks.
1 Introduction
Conventional computer hardware is not optimized for simulating neural networks. Therefore, several hardware implementations for neural nets have been suggested ([MS88], [Sal90], [CB92], [vDJST93]). While the functions of neural networks are comparatively s...
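To make the idea of bit-serial computation concrete, the sketch below adds two numbers one bit per step using a single one-bit full adder and a carry register, then uses that adder to accumulate a sum of weighted inputs. It is a generic textbook bit-serial adder, not the encoding scheme of the paper, which the abstract does not describe; the helper names, the unsigned 8-bit width, and the use of ordinary integer multiplication for each product are all assumptions made for illustration.

```python
def to_bits(value, width):
    """Unsigned integer to a list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """List of bits (least significant first) back to an unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))

def bit_serial_add(a_bits, b_bits):
    """Add two equal-length bit lists, one bit position per step."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)               # sum bit of the full adder
        carry = (a & b) | (carry & (a ^ b))     # carry into the next step
    out.append(carry)                           # final carry-out bit
    return out

def weighted_sum(inputs, weights, width=8):
    """Accumulate x * w terms through the bit-serial adder (mod 2**width)."""
    acc = to_bits(0, width)
    for x, w in zip(inputs, weights):
        term = to_bits(x * w, width)            # product formed directly here
        acc = bit_serial_add(acc, term)[:width] # drop carry-out: wraps around
    return from_bits(acc)

print(weighted_sum([1, 2, 3], [4, 5, 6]))  # 32
```

The space saving comes from reusing one adder cell across all bit positions, at the cost of one clock step per bit.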
Neural Networks Using Bit Stream Arithmetic: a Space Efficient Implementation
- In Proceedings of the IEEE International Symposium on Circuits and Systems
, 1994
Cited by 3 (2 self)
In this paper, an expandable digital architecture that provides an efficient implementation base for large neural networks is presented. The architecture uses the circuit for arithmetic operations on delta encoded signals to carry out the large number of required parallel synaptic calculations. All real valued quantities are encoded on delta bit streams. The actual digital circuitry is simple and highly regular, thus allowing very efficient space usage of fine grained FPGAs.
Real Time Output Derivatives For On Chip Learning Using Digital Stochastic Bit Stream Neurons
- Electronics Letters
, 1994
Cited by 2 (0 self)
In this paper we present the hardware design of an extremely compact and novel digital stochastic neuron that has the ability to generate the derivative of its output with respect to an arbitrary input. These derivatives may be used to form the basis of an on chip gradient descent learning algorithm.
Introduction
An artificial neuron is required to calculate a single output value by applying an `activation function' to the weighted sum of its inputs. Such neurons are intended to operate in massively parallel networks, often processing real time data. Conventionally, feedforward networks containing neurons of this type are trained off line using learning algorithms such as back propagation, but recently some research has focused on building the learning algorithms directly into the neural hardware [1, 2]. In this paper we present the design of an enhanced stochastic bit stream neuron that contains additional circuitry that allows the real time calculation of the neuron's output deriva...

