Recent Developments in High-Level Synthesis
 ACM Transactions on Design Automation of Electronic Systems
, 1997
Cited by 45 (0 self)
Recent Development in High Level Synthesis. Youn-Long Lin, Department of Computer Science, Tsing Hua University, Hsin-Chu, Taiwan 30043, R.O.C. Abstract: We survey recent developments in high-level synthesis technology for VLSI design. The need for higher-level design automation tools is discussed first. We then describe some basic techniques for various subtasks of high-level synthesis. Techniques that have been proposed in the past few years (since 1994) for these subtasks are surveyed. We also survey some new synthesis objectives, including testability, power efficiency, and reliability. Keywords: High ...
Automata-Based Symbolic Scheduling
, 2000
Cited by 15 (0 self)
This dissertation presents a set of techniques for representing the high-level behavior of a digital subsystem as a collection of nondeterministic finite automata (NFA). Desired behavioral and implementation dynamics (dependencies, repetition, bounded resources, sequential character, and control state) can be modeled similarly. All possible system execution sequences that obey the imposed constraints are encapsulated in a composed NFA. Technology similar to that used in symbolic model checking enables implicit exploration and extraction of best-possible execution sequences. This provides a very general, systematic procedure to perform exact high-level synthesis of cyclic, control-dominated behaviors constrained by arbitrary sequential constraints. This dissertation further demonstrates that these techniques are scalable to practical problem sizes and complexities. Exact scheduling solutions are constructed for a variety of academic and industrial problems, including a pipelined RISC processor. The ability to represent and schedule sequential models with hundreds of tasks and one-half million control cases substantially raises the bar as to what is believed possible for exact scheduling models. Keywords: Scheduling; Binary Decision Diagrams; High-Level Synthesis; Nondeterminism; Automata; Symbolic Model.
A Mathematical Formulation of the Loop Pipelining Problem
 XI Design of Integrated Circuits and Systems Conference (DCIS'96)
, 1995
Cited by 9 (2 self)
A mathematical model for the loop pipelining problem is presented. The model considers several parameters for optimization and supports any combination of resource and timing constraints. The unrolling degree of the loop is one of the variables explored by the model. By using Farey's series, an optimal exploration of the unrolling degree is performed, and optimal solutions not considered by other methods are obtained. The problem of finding an optimal schedule that minimizes resource requirements (including registers) is solved by an ILP model. A novel paradigm called branch and prune is proposed to converge efficiently towards the optimal schedule and to prune the search tree for integer solutions, thus drastically reducing the running time. This is the first formulation that combines the unrolling degree of the loop with timing and resource constraints in a mathematical model that guarantees optimal solutions. 1 Introduction. It is well known that loops account for most of the execution time of programs. I...
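The abstract above does not say how Farey's series drives the search over unrolling degrees, so the following is only a sketch of the series itself, not of the paper's model. It generates the Farey sequence of order `n`, that is, every reduced fraction in [0, 1] whose denominator is at most `n`, in increasing order. The function name `farey` is a hypothetical choice for illustration.

```python
from fractions import Fraction


def farey(n):
    """Return the Farey sequence of order n as a list of Fractions.

    Uses the standard next-term recurrence: given two consecutive terms
    a/b and c/d, the following term is (k*c - a)/(k*d - b) with
    k = (n + b) // d.
    """
    a, b, c, d = 0, 1, 1, n
    seq = [Fraction(a, b)]
    while c <= n:
        k = (n + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append(Fraction(a, b))
    return seq


if __name__ == "__main__":
    # Fractions with denominator <= 5, in increasing order.
    print([str(f) for f in farey(5)])
```

Because the terms come out already sorted and reduced, a search that ranks candidate ratios with a bounded denominator can walk this list in order instead of enumerating and sorting all numerator/denominator pairs.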
Resource-Constrained Software Pipelining for High-Level Synthesis of DSP Systems
, 1994
Cited by 6 (4 self)
This paper presents UNRET (Unrolling and Retiming), a new approach for software pipelining with resource constraints which is suitable for high-level synthesis of DSP systems. UNRET works with the dataflow graph which describes the loop body. Two graph transformations are considered: loop unrolling and retiming. The target architecture is composed of a limited number of different (possibly pipelined) functional units. The goal of UNRET is to find a schedule that maximizes the resource utilization of the architecture. UNRET improves the results obtained by other systems, achieving optimal solutions in most cases. KEYWORDS. Software Pipelining, Loop Pipelining, Unrolling, Retiming, Scheduling, Resource Utilization. 1 INTRODUCTION Software Pipelining techniques [11] attempt to overlap the execution of different loop iterations by searching for a new loop body (steady state) with more execution parallelism. In general, in order to maintain the semantics of the original loop, a pie...
A Genetic Approach to the Overlapped Scheduling of Iterative Data-Flow Graphs for Target Architectures with Communication Delays
 ProRISC Workshop on Circuits, Systems and Signal Processing
, 1997
Cited by 6 (2 self)
This paper presents a method to solve the overlapped fully-static multiprocessor scheduling problem. An iterative dataflow graph (IDFG) is mapped on a target architecture that allows fine-grain parallelism. The goal is the minimization of the iteration period. The method can deal with nonzero delay times to communicate data between processors as well as with link capacities in the interconnection network. Excellent results for benchmark IDFGs have been obtained by the method, which consists of three layers, each concentrating on a different aspect of the optimization problem. I. Introduction An algorithm that contains computations that can be executed simultaneously offers possibilities of exploiting the parallelism present by implementing it on appropriate hardware such as a multiprocessor system. The class of algorithms considered in this paper is limited to algorithms that can be represented by homogeneous synchronous dataflow graphs [1], also called iterative dataflow graphs (ID...
Design space exploration in application-specific hardware synthesis for multiple communicating nested loops
 in SAMOS 2012
, 2012
Cited by 5 (2 self)
Abstract: Application-specific MPSoCs are often used to implement high-performance data-intensive applications. MPSoC design requires a rapid and efficient exploration of the hardware architecture possibilities to adequately orchestrate the data distribution and architecture of parallel MPSoC computing resources. Behavioral specifications of data-intensive applications are usually given in the form of loop-based sequential code, which requires parallelization and task scheduling for an efficient MPSoC implementation. Existing approaches in application-specific hardware synthesis use loop transformations to efficiently parallelize single nested loops and use Synchronous Data Flows to statically schedule and balance the data production and consumption of multiple communicating loops. This creates a separation between data and task parallelism analyses, which can reduce the possibilities for throughput optimization in high-performance data-intensive applications. This paper proposes a method for a concurrent exploration of data and task parallelism when using loop transformations to optimize data transfer and storage mechanisms for both single and multiple communicating nested loops. This method provides orchestrated application-specific decisions on communication architecture, memory hierarchy, and computing resource parallelism. It is computationally efficient and produces high-performance architectures.
An integer linear programming approach to the overlapped scheduling of iterative dataflow graphs for target architectures with communication delays
 In PROGRESS 2000 Workshop on Embedded Systems
, 2000
Cited by 4 (0 self)
Abstract: This paper considers the scheduling of homogeneous synchronous dataflow graphs, also called iterative dataflow graphs (IDFGs), on a multiprocessor system. Algorithms described by such graphs consist of a core computation that is iterated “infinitely often”. The computation does not contain data-dependent decisions. All scheduling decisions for such algorithms can be taken at compile time. Fine-grain parallelism is assumed, where the basic tasks are primitive operations (such as additions) and the interprocessor communication times are just a few clock cycles. Scheduling methods for such a model have recently been presented by several authors. These approaches assign operations to processors and data transfers to links at appropriate times. The work presented here extends the one reported in [16] based on integer linear programming. Optimal results for problems of reasonable size were found after acceptable computation times.
Low power pipelining of linear systems: A common operand centric approach
 in Proc. of the IEEE/ACM Int. Symp. on Low Power Design
, 2001
Cited by 4 (1 self)
In this paper, we propose a systematic pipelining method for a linear system to minimize power and maximize throughput, given a constraint on the number of pipeline stages and a set of resource constraints. The method first retimes operations such that as many operations as possible take common operands as their inputs, and then performs operand sharing based on list scheduling. Experimental results show that the proposed approach reduces the power consumption of the functional units by more than 20% in the best case, compared to the state-of-the-art pipelining and operand sharing techniques.
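The abstract above names list scheduling as the basis for its operand-sharing step but gives no details. As a rough illustration only, here is a minimal sketch of plain resource-constrained list scheduling, not the paper's operand-sharing method. It assumes unit-latency operations, an acyclic dependency graph, and at least one functional unit of every type used. The names `list_schedule`, `ops`, `deps`, and `units` are hypothetical, and alphabetical order stands in for a real priority function such as critical-path length.

```python
def list_schedule(ops, deps, units):
    """Assign a start cycle to every operation.

    ops:   dict mapping operation name -> functional-unit type
    deps:  dict mapping operation name -> iterable of predecessor names
    units: dict mapping functional-unit type -> number of units available
    Assumes unit latency, an acyclic graph, and units[type] >= 1.
    """
    start = {}
    cycle = 0
    while len(start) < len(ops):
        free = dict(units)  # units still available in this cycle
        # An operation is ready once all predecessors ran in an earlier cycle.
        ready = sorted(
            op for op in ops
            if op not in start
            and all(start.get(p, cycle) < cycle for p in deps.get(op, ()))
        )
        for op in ready:  # placeholder priority: alphabetical order
            kind = ops[op]
            if free.get(kind, 0) > 0:
                free[kind] -= 1
                start[op] = cycle
        cycle += 1
    return start


if __name__ == "__main__":
    ops = {"a": "mul", "b": "mul", "c": "add", "d": "add"}
    deps = {"c": ["a", "b"], "d": ["c"]}
    print(list_schedule(ops, deps, {"mul": 1, "add": 1}))
    print(list_schedule(ops, deps, {"mul": 2, "add": 1}))
```

With a single multiplier the two multiplications are serialized, so the chain takes four cycles; a second multiplier lets them share cycle 0 and shortens the schedule to three cycles.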
MGTP: a model generation theorem prover - its advanced features and applications
 ACM Transactions on Design Automation of Electronic Systems
, 1997
Cited by 3 (0 self)
We survey recent developments in high-level synthesis technology for VLSI design. The need for higher-level design automation tools is discussed first. We then describe some basic techniques for various subtasks of high-level synthesis. Techniques that have been proposed in the past few years (since 1994) for various subtasks of high-level synthesis are surveyed. We also survey some new synthesis objectives including testability, power efficiency, and reliability.
A High-Level Synthesis Tool for the Assignment of Storage Values to Sequential Read-Write Memories
 In IFIP TC 10 WG 10.5 International Workshop on Logic and Architecture Synthesis
, 1995
Cited by 3 (3 self)
Sequential read-write memories (SRWMs) could be used as an alternative to register files in datapath synthesis. They can be considered RAMs without an address decoder. A shift register is used instead to sequentially point at memory locations for read or write. Although SRWMs behave similarly to the memory structures recently proposed by Aloqeely and Chen, they are more interesting because of their lower power consumption. Algorithms are presented to check whether a set of storage values fits in a single SRWM (exactly, by means of branch-and-bound) and to automatically map storage values to as few SRWMs as possible (by means of heuristics). In contrast to the work of Aloqeely and Chen, good benchmark results have also been obtained for applications with a low degree of "regularity". 1 Introduction High-level synthesis, the automatic mapping of an algorithmic description of some computation to a description at the register-transfer level, is normally tackled by dividing the problem into a number of subpr...