An Approach for Quantitative Analysis of Application-Specific Dataflow Architectures
- IN PROC. ASAP'97
, 1997
Cited by 67 (13 self)
In this paper we present an approach for quantitative analysis of application-specific dataflow architectures. The approach allows the designer to rate design alternatives in a quantitative way and therefore supports the designer in finding better-performing architectures. The context of our work is Video Signal Processing algorithms which are mapped onto weakly programmable, coarse-grain dataflow architectures. The algorithms are represented as Kahn graphs with the functionality of the nodes being coarse-grain functions. We have implemented an architecture simulation environment that permits the definition of dataflow architectures as a composition of architecture elements, such as functional units, buffer elements and communication structures. The abstract, clock-cycle-accurate simulator has been built using a multi-threading package and employs object-oriented principles. This results in a configurable and efficient simulator. Algorithms can subsequently be executed on the architecture model, producing quantitative information for selected performance metrics. Results are presented for the simulation of a realistic application on several dataflow architecture alternatives, showing that many different architectures can be simulated in modest time on a modern workstation.
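As a rough illustration of the Kahn-graph representation mentioned in the abstract above, the sketch below runs three nodes as threads connected by bounded FIFO buffers with blocking reads and writes. It is not the simulator described in the paper: the node functions (`producer`, `scale`, `collect`), the `SENTINEL` end-of-stream marker, and the buffer depth are all assumptions made for this example.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, an assumption of this sketch


def producer(out_q, values):
    """Source node: writes each token to its output FIFO."""
    for v in values:
        out_q.put(v)  # blocks while the bounded FIFO is full
    out_q.put(SENTINEL)


def scale(in_q, out_q, factor):
    """Coarse-grain function node: multiplies every token by a constant."""
    while True:
        v = in_q.get()  # blocking read, the defining rule of a Kahn process
        if v is SENTINEL:
            out_q.put(SENTINEL)
            return
        out_q.put(v * factor)


def collect(in_q, result):
    """Sink node: appends tokens to a list until the stream ends."""
    while True:
        v = in_q.get()
        if v is SENTINEL:
            return
        result.append(v)


def run_network(values, factor=2, depth=2):
    """Wire producer -> scale -> collect and run each node in its own thread."""
    a = queue.Queue(maxsize=depth)
    b = queue.Queue(maxsize=depth)
    result = []
    threads = [
        threading.Thread(target=producer, args=(a, values)),
        threading.Thread(target=scale, args=(a, b, factor)),
        threading.Thread(target=collect, args=(b, result)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Because every node only blocks on its own FIFOs, the output is the same regardless of how the threads are scheduled, which is the property that makes this model attractive for architecture exploration.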
Design Considerations for Distributed Microsensor Systems
, 1999
Cited by 47 (0 self)
Wireless distributed microsensor systems will enable the reliable monitoring and control of a variety of applications that range from medical and home security to machine diagnosis, chemical/biological detection and other military applications. The sensors have to be designed in a highly integrated fashion, optimizing across all levels of system abstraction, with the goal of minimizing energy dissipation. This paper addresses some of the key design considerations for future microsensor systems including the network protocols required for collaborative sensing and information distribution, system partitioning considering computation and communication costs, low energy electronics, power system design and energy harvesting techniques. 1. Introduction Over the last few years, the design of micropower wireless sensor systems has gained increasing importance for a variety of civil and military applications. The Low Power Wireless Integrated Microsensors (LWIM) project has made major advan...
Compaan: Deriving Process Networks from Matlab for Embedded Signal Processing Architectures
- IN PROCEEDINGS OF THE 8TH INTERNATIONAL WORKSHOP ON HARDWARE/SOFTWARE CODESIGN (CODES
, 2000
Cited by 45 (11 self)
This paper presents the Compaan tool that automatically transforms a nested loop program written in Matlab into a process network specification. The process
Mobile Multimedia Systems
- In Proc. of PROGRESS workshop 2000
, 2000
Cited by 31 (7 self)
Stream Computations Organized for Reconfigurable Execution (SCORE): Introduction and Tutorial
- in Proceedings of the International Conference on Field-Programmable Logic and Applications
, 2000
Cited by 30 (8 self)
A primary impediment to widespread exploitation of reconfigurable computing is the lack of a unifying computational model which allows application portability and longevity without sacrificing a substantial fraction of the raw capabilities. We introduce SCORE (Stream Computation Organized for Reconfigurable Execution), a stream-based compute model which virtualizes reconfigurable computing resources (compute, storage, and communication) by dividing a computation up into fixed-size "pages" and time-multiplexing the virtual pages on available physical hardware. Consequently, SCORE applications can scale up or down automatically to exploit a wide range of hardware sizes. We hypothesize that the SCORE model will ease development and deployment of reconfigurable applications and expand the range of applications which can benefit from reconfigurable execution. Further, we believe that a well-engineered SCORE implementation can be efficient, wasting little of the capabilities of the raw hardw...
Totem: Custom Reconfigurable Array Generation
, 2001
Cited by 29 (13 self)
Reconfigurable hardware has been shown to provide an efficient compromise between the flexibility of software and the performance of hardware. However, even coarse-grained reconfigurable architectures target the general case, and miss optimization opportunities present if characteristics of the desired application set are known. We can therefore increase efficiency by restricting the structure to support a class or a specific set of algorithms, while still providing flexibility within that set. By generating a custom array for a given computation domain, we explore the design space between an ASIC and an FPGA.
Synthesis of Custom Processors based on Extensible Platforms
- In ICCAD
, 2002
Cited by 25 (2 self)
Efficiency and flexibility are critical, but often conflicting, design goals in embedded system design. The recent emergence of extensible processors promises a favorable tradeoff between efficiency and flexibility, while keeping design turnaround times short. Current extensible processor design flows automate several tedious tasks, but typically require designers to manually select the parts of the program that are to be implemented as custom instructions. In this work, we describe an automatic methodology to select custom instructions to augment an extensible processor, in order to maximize its efficiency for a given application program. We demonstrate that the number of custom instruction candidates grows rapidly with program size, leading to a large design space, and that the quality (speedup) of custom instructions varies significantly across this space, motivating the need for the proposed flow. Our methodology features cost functions to guide the custom instruction selection process, as well as static and dynamic pruning techniques to eliminate inferior parts of the design space from consideration. Further, we employ a two-stage process, wherein a limited number of promising instruction candidates are first selected, and then evaluated in more detail through cycle-accurate instruction set simulation and synthesis of the corresponding hardware, to identify the custom instruction combinations that result in the highest program speedup or maximize speedup under a given area constraint.
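The final step in the abstract above, choosing the combination of custom instructions that maximizes speedup under an area constraint, can be loosely pictured as a 0/1 knapsack problem. The sketch below is not the authors' methodology (which relies on cost functions, pruning, and cycle-accurate evaluation); it is a textbook dynamic program under the simplifying assumptions that each candidate has a known integer area cost and that cycle savings add up independently. The candidate names and numbers are hypothetical.

```python
def select_instructions(candidates, area_budget):
    """Pick a subset of candidates maximizing total cycles saved within area_budget.

    candidates: list of (name, cycles_saved, area) tuples with integer area.
    Returns (total_cycles_saved, [chosen names]).
    """
    # best[a] holds the best (cycles saved, chosen names) using at most area a.
    best = [(0, [])] * (area_budget + 1)
    for name, saved, area in candidates:
        # Walk the budget downward so each candidate is chosen at most once.
        for a in range(area_budget, area - 1, -1):
            prev_saved, prev_names = best[a - area]
            if prev_saved + saved > best[a][0]:
                best[a] = (prev_saved + saved, prev_names + [name])
    return best[area_budget]


# Hypothetical candidates: (name, cycles saved, area units).
example = [("mac", 60, 10), ("sad", 100, 20), ("clip", 120, 30)]
total, chosen = select_instructions(example, 50)
```

In practice the savings of overlapping custom instructions are not independent, which is one reason a detailed second-stage evaluation such as the one described in the abstract would still be needed.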
Design Methodology of a Low-Energy Reconfigurable Single-Chip DSP System
- Journal of VLSI Signal Processing
, 2000
Cited by 22 (2 self)
In this paper, we first present a reconfigurable architecture template for low-power digital signal processing, and then an energy-conscious design methodology to bridge the algorithm-to-architecture gap. The energy efficiency of such an architecture and the effectiveness of the methodology are demonstrated in case study implementations targeting baseband voice processing and digital signal processing.
Automatic Layout of Domain-Specific Reconfigurable Subsystems for System-on-a-Chip
- ACM/SIGDA Symposium on Field-Programmable Gate Arrays
, 2002
Cited by 20 (5 self)
When designing SoCs, a unique opportunity exists to generate custom FPGA architectures that are specific to the application domain in which the device will be used. The inclusion of such a device will provide an efficient compromise between the flexibility of software and the performance of hardware, while at the same time allowing for post-fabrication modification of circuits. To automate the layout of reconfigurable subsystems for system-on-a-chip we present template reduction, standard cell, and circuit generator methods. We explore the standard cell method, as well as the creation of FPGA-specific standard cells. Compared to full custom circuits, we achieve designs that are 46% smaller and 36% faster when the application domain is well known in advance. In cases where no reduction from the full functionality is possible, the standard cell approach is 42% larger and 64% slower than full-custom circuits. Standard cells can thus provide competitive implementations, with significantly greater opportunity for adaptation to new domains.
A Streaming Multi-Threaded Model
- In Proceedings of the Third Workshop on Media and Stream Processors
, 2001
Cited by 15 (0 self)
We present SCORE (Stream Computations Organized for Reconfigurable Execution), a multi-threaded model that relies on streams to expose thread parallelism and to enable efficient scheduling, low-overhead communication, and scalability. We present work to date on SCORE for scalable reconfigurable logic, as well as implementation ideas for SCORE for processor architectures. We demonstrate that streams can be exposed as a clean architectural feature that supports forward compatibility to larger, more parallel hardware. 1. OVERVIEW For the past several decades, the predominant architectural abstraction for programmable computation systems has been the instruction set architecture (ISA). An ISA defines an instruction set and semantics for executing it. A key benefit of the ISA model is that those semantics decouple software from hardware development. A piece of software, written and compiled once, is guaranteed to run on any ISA-compatible device. This guarantee allows hardware to evolve...

