Results 1 - 10 of 35
Some efficient solutions to the affine scheduling problem -- Part I One-dimensional Time
, 1996
Cited by 192 (18 self)
Programs and systems of recurrence equations may be represented as sets of actions which are to be executed subject to precedence constraints. In many cases, actions may be labelled by integral vectors in some iteration domain, and precedence constraints may be described by affine relations. A schedule for such a program is a function which assigns an execution date to each action. Knowledge of such a schedule allows one to estimate the intrinsic degree of parallelism of the program and to compile a parallel version for multiprocessor architectures or systolic arrays. This paper deals with the problem of finding closed form schedules as affine or piecewise affine functions of the iteration vector. An efficient algorithm is presented which reduces the scheduling problem to a parametric linear program of small size, which can be readily solved by an efficient algorithm.
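To make the notion of a schedule concrete, here is a minimal sketch that only checks a candidate affine schedule by enumeration; it does not reproduce the paper's parametric linear programming method. The example domain, the dependence vectors, and all function names are assumptions chosen for illustration.

```python
from itertools import product

def theta(point, coeffs, const=0):
    """Affine schedule: theta(x) = coeffs . x + const (the execution date of x)."""
    return sum(c * x for c, x in zip(coeffs, point)) + const

def is_valid_schedule(domain, deps, coeffs, const=0):
    """True if every action runs at least one step after each action it depends on."""
    points = set(domain)
    for p in points:
        for d in deps:
            # Hypothetical uniform dependence: p needs the result computed at p - d.
            src = tuple(x - dx for x, dx in zip(p, d))
            if src in points and theta(p, coeffs, const) < theta(src, coeffs, const) + 1:
                return False
    return True

def latency(domain, coeffs, const=0):
    """Number of time steps between the first and the last execution date."""
    dates = [theta(p, coeffs, const) for p in domain]
    return max(dates) - min(dates) + 1

# Assumed example: a[i][j] = a[i-1][j] + a[i][j-1] on a 4 x 4 iteration domain.
N = 4
domain = list(product(range(1, N + 1), repeat=2))
deps = [(1, 0), (0, 1)]

wavefront_ok = is_valid_schedule(domain, deps, (1, 1))  # theta(i, j) = i + j
row_only_ok = is_valid_schedule(domain, deps, (1, 0))   # theta(i, j) = i
wavefront_latency = latency(domain, (1, 1))
```

In this sketch the schedule i + j respects both dependences and finishes in 2N - 1 steps, while the schedule i violates the dependence along j.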
Loop Parallelization in the Polytope Model
- CONCUR '93, Lecture Notes in Computer Science 715
, 1993
Cited by 87 (23 self)
During the course of the last decade, a mathematical model for the parallelization of FOR-loops has become increasingly popular. In this model, a (perfect) nest of r FOR-loops is represented by a convex polytope in $\mathbb{Z}^r$. The boundaries of each loop specify the extent of the polytope in a distinct dimension. Various ways of slicing and segmenting the polytope yield a multitude of guaranteed correct mappings of the loops' operations in space-time. These transformations have a very intuitive interpretation and can be easily quantified and automated due to their mathematical foundation in linear programming and linear algebra. With the recent availability of massively parallel computers, the idea of loop parallelization is gaining significance, since it promises execution speed-ups of orders of magnitude. The polytope model for loop parallelization has its origin in systolic design, but it applies in more general settings and methods based on it will become a part of futur...
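The idea of slicing a polytope into a space-time mapping can be sketched as follows. This is an illustration under assumptions, not the paper's procedure: the 4 x 4 loop bounds, the mapping (time, processor) = (i + j, i), and the helper names are all hypothetical.

```python
from itertools import product

def polytope_points(bounds):
    """Integer points of a rectangular loop nest given inclusive (lower, upper) bounds."""
    return list(product(*(range(lo, hi + 1) for lo, hi in bounds)))

def space_time(point):
    """Assumed mapping for a 2-deep nest: time = i + j, processor = i."""
    i, j = point
    return (i + j, i)

points = polytope_points([(0, 3), (0, 3)])
mapped = [space_time(p) for p in points]

# A usable mapping never places two operations on one processor at the same time.
is_injective = len(set(mapped)) == len(points)

# Group processors by time step: each slice of the polytope runs in parallel.
wavefronts = {}
for p in points:
    t, proc = space_time(p)
    wavefronts.setdefault(t, []).append(proc)
```

Each key of `wavefronts` is one time step, and the list under it names the processors that are busy during that step.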
Generation of Efficient Nested Loops from Polyhedra
- International Journal of Parallel Programming
, 2000
Cited by 62 (1 self)
Automatic parallelization in the polyhedral model is based on affine transformations from an original computation domain (iteration space) to a target space-time domain, often with a different transformation for each variable. Code generation is an often ignored step in this process that has a significant impact on the quality of the final code. It involves making a trade-off between code size and control code simplification/optimization. Previous code generation methods are based on loop splitting; however, they behave non-optimally on parameterized programs. We present a general parameterized method for code generation based on dual representation of polyhedra. Our algorithm uses a simple recursion on the dimensions of the domains, and enables fine control over the tradeoff between code size and control overhead.
Constructive Methods for Scheduling Uniform Loop Nests
- IEEE Transactions on Parallel and Distributed Systems
, 1994
Cited by 60 (3 self)
This paper surveys scheduling techniques for loop nests with uniform dependences. First we introduce the hyperplane method and related variants. Then we extend it by using a different affine scheduling for each statement within the nest. In both cases we present a new, constructive and efficient method to determine optimal solutions, i.e. schedules whose total execution time is minimum. 1 Introduction Loop nests lie at the heart of supercompilers-parallelizers for supercomputers. On one hand their importance in terms of applications is evident: in many scientific programs, the time spent in the execution of a small number of loops represents a large fraction of the total execution time, while the potential parallelism of these loops is very important. On the other hand, the regular and repetitive structure of loop nests greatly facilitates the use of dependence analysis techniques and of scheduling and allocation strategies. The general problem of finding the optimal scheduling for a ...
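The hyperplane method mentioned above looks for a vector pi such that pi . d >= 1 for every uniform dependence vector d, and prefers the pi with the smallest total execution time. The sketch below finds such a vector by brute-force search over small integer coefficients; this search is a stand-in chosen for brevity, not the constructive method of the paper. The box-shaped domain, the search bound, and the function names are assumptions.

```python
from itertools import product

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def latency(pi, extents):
    """Time steps used by the schedule x -> pi . x on the box 0 <= x_k < extents[k]."""
    return sum(abs(c) * (n - 1) for c, n in zip(pi, extents)) + 1

def best_hyperplane(deps, extents, bound=3):
    """Search integer vectors in [-bound, bound]^n for a valid schedule of least latency.

    Returns (latency, pi), or None if no valid vector lies inside the search box.
    """
    best = None
    for pi in product(range(-bound, bound + 1), repeat=len(extents)):
        if all(dot(pi, d) >= 1 for d in deps):
            cand = (latency(pi, extents), pi)
            if best is None or cand < best:
                best = cand
    return best

# Assumed example: two uniform dependences on a 5 x 5 domain.
result = best_hyperplane([(1, -1), (0, 1)], (5, 5))
# Contradictory dependences admit no hyperplane schedule at all.
no_result = best_hyperplane([(1, 0), (-1, 0)], (5, 5))
```

For the assumed dependences (1, -1) and (0, 1), the constraints force pi_2 >= 1 and pi_1 >= pi_2 + 1, so the search settles on pi = (2, 1).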
Memory Reuse Analysis in the Polyhedral Model
- Parallel Processing Letters
, 1996
Cited by 14 (1 self)
In the context of developing a compiler for Alpha, a functional data-parallel language based on systems of affine recurrence equations (SAREs), we address the problem of transforming scheduled single-assignment code to multiple assignment code. We show how the polyhedral model allows us to statically compute the lifetimes of program variables, and thus enables us to derive necessary and sufficient conditions for reusing memory. 1. Introduction The methodology of automatic systolic array synthesis from Systems of Affine Recurrence Equations (SAREs) has a close bearing on parallelizing compilers and on efficient implementation of functional languages. To study this relationship, we are currently developing a compiler for Alpha [9], a functional, data parallel language based on SAREs defined over polyhedral index domains. The language semantics directly lead to sequential code based on demand driven evaluation. However, the resulting context switches can be avoided if the program is tra...
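The link between lifetimes and memory reuse can be illustrated with a small enumeration. This is a sketch under assumptions and not the paper's polyhedral analysis: it takes explicit write and read dates for each value (hypothetical data for a recurrence A[i] = f(A[i-1]) scheduled at time i) and counts how many values are live at once.

```python
def lifetimes(write_time, reads):
    """Map each value to (write date, last read date); unread values die when written."""
    return {
        v: (write_time[v], max([write_time[v]] + reads.get(v, [])))
        for v in write_time
    }

def max_live(intervals):
    """Largest number of values that are live at the same date."""
    events = []
    for start, end in intervals.values():
        events.append((start, 1))
        events.append((end + 1, -1))  # the cell is free right after the last read
    live = peak = 0
    # Tuples sort frees (-1) before allocations (+1) at equal dates.
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak

# Assumed example: value i is written at date i and read once, at date i + 1.
n = 6
write_time = {i: i for i in range(n)}
reads = {i: [i + 1] for i in range(n - 1)}

intervals = lifetimes(write_time, reads)
cells_needed = max_live(intervals)
```

Under these assumed dates at most two values are live together, which suggests that two memory cells used in rotation would be enough for this recurrence.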
The ALPHA Language
- INRIA research report
, 1994
Cited by 14 (1 self)
This report is a formal description of the Alpha language, as it is currently implemented. Alpha is a strongly typed, functional language which embodies the formalism of systems of affine recurrence equations. In this report, Alpha language constructs are described, and denotational and type semantics are given. The theorems which are the basis for doing transformations on an Alpha program are stated. Finally, the syntax and semantics of Alpha are given.
Scheduling Uniform Loop Nests
- In Proceedings of ISMN International Conference on Parallel and Distributed Computer Systems
, 1992
Cited by 11 (2 self)
This paper surveys scheduling techniques for uniform loop nests. First we introduce the hyperplane method and related variants. Then we extend it by using a different affine scheduling for each statement within the nest. In both cases we present a new, constructive and efficient method to determine optimal solutions. 1 Introduction Loop nests lie at the heart of supercompilers-parallelizers for supercomputers. On one hand their importance in terms of applications is evident: in many scientific programs, the time spent in the execution of a couple of loops represents a large fraction of the total execution time, while the potential parallelism of these loops is often very important. On the other hand, the regular and repetitive structure of loop nests greatly facilitates the use of dependence analysis techniques and of scheduling and allocation strategies. The general problem of finding the optimal scheduling for a task system on a parallel machine is NP-hard (due to the communications...
Affine-by-Statement Scheduling of Uniform Loop Nests over Parametric Domains
- J. Parallel and Distributed Computing
, 1993
Cited by 11 (1 self)
In this report we deal with affine-by-statement scheduling, a high-level technique for the parallelization of loop nests with uniform dependences. Affine-by-statement scheduling can be viewed as a natural extension of Lamport's hyperplane method [4], which has been proposed by many authors, including Rao [10], Quinton [7] and Rajopadhye [9]. For the sake of self-completeness, we briefly review affine-by-statement scheduling in the next section. Given a loop nest, a method for determining the best affine-by-statement scheduling has been proposed by Darte and Robert [3]. The underlying idea is to use the duality theorem of linear programming and to solve an optimization problem, e.g. via the simplex method, to find the optimal parameters. Although constructive, the method in [3] suffers from several drawbacks. Roughly speaking, solving the optimization problem is very costly, since it depends upon the total number of dependences between statements. Moreover, for parametric computation domains (think of a matrix-matrix product of size N, where N is the parameter), the solution is only possible at
The Synthesis of Control Signals for One-Dimensional Systolic Arrays
- Integration
, 1992
Cited by 7 (5 self)
This paper presents a method for the synthesis of control signals for one-dimensional systolic arrays from a program expressed as a set of uniform recurrence equations (source UREs). The basic idea underlying the synthesis of control signals is to distinguish different types of computation prescribed by the source UREs with another set of uniform recurrence equations (control UREs). To obtain one-dimensional systolic arrays with a description of both data and control signals, one simply applies the standard space-time mapping technique to the source and control UREs.
An Optimal Algo-Tech-Cuit for the Knapsack Problem
- In Proc. International Conference on Application-Specific Array Processors (ASAP'93)
, 1994
Cited by 7 (6 self)
We present a formal derivation and proof of correctness of a systolic array for the knapsack problem, a well-known NP-complete problem. The dependency graph of the algorithm is not completely known statically, so the derivation also serves as a case study for systolic synthesis for this class of programs. The array is itself important since it achieves optimal performance on a model much weaker than a PRAM (ring of PE's with a fixed size memory and only nearest neighbor interconnections). We show how the memory size of each PE can be chosen so that the running time is minimized by formulating and solving a nonlinear optimization problem. For this, we use the expected running time as the cost function and a register level model of VLSI. The original array has an intricate tag-based control mechanism which is difficult to implement. We show how this can be reduced to two simple counters and a few flip-flops. Coefficient loading is done with a multi-rate clock which avoids the need ...

