OSCAR: Optimum Simultaneous Scheduling, Allocation and Resource Binding Based on Integer Programming
, 1994
Cited by 42 (8 self)
This paper presents an approach to high-level synthesis which is based upon a 0/1 integer programming model. In contrast to other approaches, this model allows solving all three subtasks of high-level synthesis (scheduling, allocation and binding) simultaneously. As a result, designs which are optimal with respect to the cost function are generated. The model is able to exploit large component libraries with multi-functional units and complex components such as multiplier-accumulators. Furthermore, the model is capable of handling mixed speeds and chaining in its general form. 1 Introduction During the recent years, there has been an ever-increasing demand to speed up the design cycles for the design of electronic systems. This demand is caused by time-to-market requirements for products in this area. At the same time, there has been an increasing need to achieve correctness by construction. Due to these driving forces, synthesis techniques are now being used for the design of ...
Time-constrained Code Compaction for DSPs
- IEEE Trans. on VLSI Systems
, 1995
Cited by 38 (14 self)
DSP algorithms in most cases are subject to hard real-time constraints. In case of programmable DSP processors, meeting those constraints must be ensured by appropriate code generation techniques. For processors offering instruction-level parallelism, the task of code generation includes code compaction. The exact timing behavior of a DSP program is only known after compaction. Therefore, real-time constraints should be taken into account during the compaction phase. While most known DSP code generators rely on rigid heuristics for that phase, this paper proposes a novel approach to local code compaction based on an Integer Programming model, which obeys exact timing constraints. Due to a general problem formulation, the model also obeys encoding restrictions and possible side effects. 1 Introduction & related work Design requirements for embedded systems including DSP functionality strongly differ from those for interactive environments such as workstations. While in the latter ca...
SALSA: A New Approach to Scheduling with Timing Constraints
, 1993
Cited by 25 (3 self)
This paper describes a new approach to the scheduling problem in high-level synthesis that meets timing constraints while attempting to minimize hardware resource costs. The approach is based on a modified control/data flow graph (CDFG) representation called SALSA. SALSA provides a simple move set that allows alternative schedules to be quickly explored while maintaining timing constraints. It is shown that this move set is complete in that any legal schedule can be reached using some sequence of move applications. In addition, SALSA provides support for scheduling with conditionals, loops, and subroutines. Scheduling with SALSA is performed in two steps. First, an initial schedule that meets timing constraints is generated using a constraint solution algorithm adapted from layout compaction. Second, the schedule is improved using the SALSA move set under control of a simulated annealing algorithm. Results show the scheduler's ability to find good schedules which meet timing constraint...
Bitwidth Cognizant Architecture Synthesis of Custom Hardware Accelerators
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
, 2001
Cited by 22 (2 self)
Keywords: application-specific design, architecture synthesis, bitwidth, clustering, embedded system, hardware accelerator, operation scheduling, resource allocation
PICO is a system for automatically synthesizing embedded hardware accelerators from loop nests specified in the C programming language. A key issue confronted when designing such accelerators is the optimization of hardware by exploiting information that is known about the varying number of bits required to represent and process operands. In this paper, we describe the handling and exploitation of integer bitwidth in PICO. A bitwidth analysis procedure is used to determine bitwidth requirements for all integer variables and operations in a C application. Given known bitwidths for all variables, complex problems arise when determining a program schedule that specifies on which function unit and at what time each operation executes. If operations are assigned to function units with no knowledge of bitwidth, bitwidth-related cost benefit is lost when each unit is built to accommodate the widest operation assigned. By carefully placing operations of similar width on the same unit, hardware costs are decreased. This problem is addressed using a preliminary clustering of operations that is based jointly on width and implementation cost. These clusters are then honored during resource allocation and operation scheduling to create an efficient width-conscious design. Experimental results show that exploiting integer bitwidth substantially reduces the gate count of PICO-synthesized hardware accelerators across a range of applications.
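The cost argument in this abstract (a unit must be as wide as its widest operation, so grouping operations of similar width is cheaper) can be illustrated with a toy calculation. The sketch below is not PICO's clustering procedure: it assumes a hypothetical cost model in which a unit's cost equals its widest assigned operation in bits, ignores scheduling conflicts, and simply compares a width-blind round-robin assignment with one that groups operations after sorting by width. All function names are made up for the illustration.

```python
def total_cost(assignment):
    """Sum of unit costs; each unit is sized for its widest operation (assumed model)."""
    return sum(max(widths) for widths in assignment if widths)


def width_blind(widths, n_units):
    """Assign operations to units round-robin, ignoring bitwidth."""
    units = [[] for _ in range(n_units)]
    for i, w in enumerate(widths):
        units[i % n_units].append(w)
    return units


def width_aware(widths, n_units):
    """Sort by bitwidth, then cut into equally sized groups so similar widths share a unit."""
    ordered = sorted(widths)
    size = -(-len(ordered) // n_units)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    op_widths = [8, 32, 8, 32, 16, 16]  # hypothetical operand widths in bits
    print(total_cost(width_blind(op_widths, 3)))  # 32 + 32 + 16 = 80
    print(total_cost(width_aware(op_widths, 3)))  # 8 + 16 + 32 = 56
```

Both assignments place two operations on each of three units, so under this simplified model the saving comes from grouping alone.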
Optimal Selection of Supply Voltages and Level Conversions During Data Path Scheduling under Resource Constraints
- In Proc. Int. Conf. Computer Design
, 1996
Cited by 16 (2 self)
In this paper we will consider how to select an optimal set of supply voltages and account for level conversion costs when optimizing the schedule of a resource dominated data path for minimum energy dissipation. An integer linear program (ILP) is presented for minimum energy schedules under latency, supply voltage, and resource constraints. The supply voltage assignment for each resource is modeled as fixed for all time. Schedules were generated for a variety of data path structures, resource and latency constraints. Resource constraints tended to limit the use of reduced supply voltages. With latency constraints loosened to 1.5× minimum latency, unlimited resources, and two power supplies, energy savings ranged from 53% to 70% compared to 5V operation. When resource constraints were applied, savings dropped to a range of 46% to 58%. Loosened latency constraints resulted in increased use of lower supply voltages. With resource constraints unchanged and latency constraints of 2\T...
PROPAN: A Retargetable System for Postpass Optimisations and Analyses
, 2000
Cited by 16 (4 self)
Propan is a system that allows for the generation of machine-dependent postpass optimisations and analyses on assembly level. It has been especially designed to perform high-quality optimisations for irregular architectures. All information about the target architecture is specified in the machine description language Tdl. For each target architecture a phase-coupled code optimiser is generated which can perform integrated global instruction scheduling, register reassignment, and resource allocation by integer linear programming (ILP). All relevant hardware characteristics of the target processor are precisely incorporated in the generated integer linear programs. Two different ILP models are available so that the most appropriate modelling can be selected individually for each target architecture. The integer linear programs can be solved either exactly or by the use of ILP-based approximations. This allows for high quality solutions to be calculated in acceptable time. A set of practic...
High-Level Synthesis Scheduling and Allocation using Genetic Algorithms based on Constructive Topological Scheduling Techniques
- Proceedings of the ASP-DAC95/CHDL95/VLSI95. Asia and South Pacific Design Automation Conference. IFIP International conference on Computer Hardware Description Languages and their Applications. IFIP International Conference on Very Large Scale Integration
, 1995
Cited by 15 (1 self)
In this article constructive scheduling methods combined with genetic algorithms are used to search for a suitable order to schedule the operations. The method is extended with an encoding capable of allocating supplementary resources during scheduling. This makes it very suitable in high-level synthesis strategies based on lower bound estimations techniques. Experiments and comparisons show high quality results and fast run times that outperform results produced by other heuristic scheduling methods. 1 Introduction During high-level synthesis a behavioral description of a chip is translated into a digital network structure [McFa90]. The behavioral description consists of calculations (like additions, multiplications, logical operations etc.) and control structures (like conditionals, loops and procedure calls) which are used to transform input data into output data. The digital network structure consists of functional modules (adders, multipliers, ALUs, logical gates), storage (like r...
A Fast Approach to Computing Exact Solutions to the Resource-Constrained Scheduling Problem
- ACM Trans. Design Automation of Electronic Systems
, 1997
Cited by 15 (3 self)
This paper presents an algorithm that substantially reduces the computational effort required to obtain the exact solution to the Resource Constrained Scheduling (RCS) problem. The reduction is obtained by (a) using a branch-and-bound search technique, which computes both upper and lower bounds, and (b) using efficient techniques to accurately estimate the possible time-steps at which each operation can be scheduled and using this to prune the search space. Results on several benchmarks with varying resource constraints indicate the clear superiority of the algorithm presented here over traditional approaches using integer linear programming, with speed-ups of several orders of magnitude.
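As a rough illustration of the idea of exact scheduling by branch-and-bound with pruning, here is a minimal sketch. It is not the algorithm of the paper: it assumes unit-latency operations, uses the best schedule found so far as the upper bound, and uses a simple critical-path estimate (the `tail` table) as the lower bound. The function name `rcs_branch_and_bound` and its input format are hypothetical.

```python
from collections import defaultdict


def rcs_branch_and_bound(ops, deps, limits):
    """Minimum-length schedule for unit-latency operations (illustrative sketch).

    ops:    {operation: resource type}
    deps:   list of (predecessor, successor) pairs forming a DAG
    limits: {resource type: units available per time step}, each at least 1
    Returns (schedule length, {operation: time step}).
    """
    preds, succs = defaultdict(list), defaultdict(list)
    for p, s in deps:
        preds[s].append(p)
        succs[p].append(s)

    # Topological order (Kahn's algorithm); the list grows while it is scanned.
    indeg = {o: len(preds[o]) for o in ops}
    order = [o for o in ops if indeg[o] == 0]
    for o in order:
        for s in succs[o]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)

    # tail[o]: number of operations on the longest path from o to a sink, o included.
    tail = {}
    for o in reversed(order):
        tail[o] = 1 + max((tail[s] for s in succs[o]), default=0)

    best_len, best = len(ops) + 1, None  # a fully serial schedule always fits below this
    usage = defaultdict(int)             # (time step, resource type) -> units in use
    step = {}

    def search(i, cur_len):
        nonlocal best_len, best
        if i == len(order):
            if cur_len < best_len:
                best_len, best = cur_len, dict(step)
            return
        o = order[i]
        earliest = max((step[p] + 1 for p in preds[o]), default=0)
        for t in range(earliest, best_len):
            if t + tail[o] >= best_len:  # lower bound cannot beat the incumbent: prune
                break
            if usage[(t, ops[o])] < limits[ops[o]]:
                usage[(t, ops[o])] += 1
                step[o] = t
                search(i + 1, max(cur_len, t + 1))
                usage[(t, ops[o])] -= 1
                del step[o]

    search(0, 0)
    return best_len, best


if __name__ == "__main__":
    ops = {"m1": "mul", "m2": "mul", "m3": "mul", "a1": "add", "a2": "add"}
    deps = [("m1", "a1"), ("m2", "a1"), ("a1", "a2"), ("m3", "a2")]
    print(rcs_branch_and_bound(ops, deps, {"mul": 1, "add": 1})[0])  # 4
    print(rcs_branch_and_bound(ops, deps, {"mul": 2, "add": 1})[0])  # 3
```

Because operations are placed in topological order and every feasible time step below the incumbent is tried, the search is exhaustive up to pruning; tighter bounds, as the abstract describes, would cut the search space further.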
Data Path Allocation using an Extended Binding Model
- in Proceedings of the Design Automation Conference
, 1992
Cited by 14 (0 self)
Existing approaches to data path allocation in high-level synthesis use a binding model in which values are assigned to the same register for their entire lifetimes. This paper describes an extended binding model in which segments of a value's lifetime may reside in different registers if there is a cost advantage in doing so. In addition, the model supports multiple copies of values and the use of functional units to "pass through" unmodified values to reduce interconnect. This model is exploited in an allocation tool that uses iterative improvement to search for low-cost designs. Results show that allocation costs can be substantially reduced using this model. 1. Introduction Data path allocation [1] is the problem of assigning hardware to a scheduled control/data flow graph (CDFG) to implement a specified behavior while meeting performance and timing constraints and minimizing implementation cost. The CDFG specifies operators that manipulate data, data values that require storage, ...
ILP-based Instruction Scheduling for IA-64
- In Proceedings of the Workshop on Languages, Compilers
, 2001
Cited by 11 (1 self)
The IA-64 architecture has been designed as a synthesis of VLIW and superscalar design principles. It incorporates typical functionality known from embedded processors as multiply/accumulate units and SIMD operations for 3D graphics operations. In this paper we present an ILP formulation for the problem of instruction scheduling for IA-64. In order to obtain a feasible schedule it is necessary to model the data dependences, resource constraints as well as additional encoding restrictions -- the bundling mechanism. These different aspects represent subproblems that are closely coupled which gives the motivation for a modeling based on integer linear programming. The presented approach is divided into two phases which allows us to compute mostly optimal solutions with acceptable computation time.

