## Automatic Parallelization in the Polytope Model (1996)

Venue: Laboratoire PRiSM, Université des Versailles St-Quentin en Yvelines, 45, avenue des États-Unis, F-78035 Versailles Cedex

Citations: 52 (3 self)

### BibTeX

@INPROCEEDINGS{Feautrier96automaticparallelization,
    author = {Paul Feautrier},
    title = {Automatic Parallelization in the Polytope Model},
    booktitle = {Laboratoire PRiSM, Université des Versailles St-Quentin en Yvelines, 45, avenue des États-Unis, F-78035 Versailles Cedex},
    year = {1996},
    pages = {79--103},
    publisher = {Springer-Verlag}
}

### Abstract

The aim of this paper is to explain the importance of polytopes and polyhedra in automatic parallelization. We show that the semantics of parallel programs is best described geometrically, as properties of sets of integral points in n-dimensional spaces, where n is related to the maximum nesting depth of DO loops. The needed properties translate nicely to properties of polyhedra, for which many algorithms have been designed for the needs of optimization and operations research. We show how these ideas apply to scheduling, placement and parallel code generation.

Résumé (translated from the French): The aim of this article is to explain the role played by polyhedra and polytopes in automatic parallelization. We show that the semantics of a parallel program is naturally described in geometric form, the properties of the program being formalized as properties of sets of points in an n-dimensional space, where n is related to the maximum nesting depth of DO loops. ...
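As a rough illustration of the geometric view described in the abstract, the following Python sketch (not taken from the paper) treats the iteration domain of a small triangular DO loop nest as the set of integer points of a polyhedron written as A x + a >= 0. The helper names `integer_points`, `triangular_domain` and `loop_nest`, and the choice of a triangular nest, are assumptions made only for this example.

```python
from itertools import product


def integer_points(A, a, box):
    """Enumerate the integer points x inside `box` that satisfy A x + a >= 0.

    A is a list of rows, a is a list of constants, and box gives inclusive
    (low, high) bounds per dimension. Brute-force enumeration is used only
    to keep the illustration short; it is not an efficient algorithm.
    """
    points = []
    for x in product(*(range(lo, hi + 1) for lo, hi in box)):
        if all(sum(c * xi for c, xi in zip(row, x)) + a_k >= 0
               for row, a_k in zip(A, a)):
            points.append(x)
    return points


def triangular_domain(n):
    """Iteration domain of:  DO i = 1, n / DO j = 1, i  as a polyhedron."""
    # Constraints, one per row:  i - 1 >= 0,  n - i >= 0,  j - 1 >= 0,  i - j >= 0
    A = [(1, 0), (-1, 0), (0, 1), (1, -1)]
    a = [-1, n, -1, 0]
    return integer_points(A, a, box=[(1, n), (1, n)])


def loop_nest(n):
    """The same iterations, listed in the order the loop nest executes them."""
    return [(i, j) for i in range(1, n + 1) for j in range(1, i + 1)]


if __name__ == "__main__":
    print(triangular_domain(3))
    print(triangular_domain(3) == loop_nest(3))
```

Each loop bound contributes one linear inequality, and the nesting depth (two here) gives the dimension of the space, which is the sense in which n relates to the loop depth.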

### Citations

1626 | The Definition of Standard ML
- Milner, Tofte, et al.
- 1990
Citation Context ...or modifications. A limited amount of knowledge was then added to improve the final result, for instance in the form of a type system. In the case of a fully developed type system, like the one in ML [MTH90], knowledge about operators in the language is given to the compiler in the form of typing rules, which are essentially Horn clauses. A program is correct if, for each of its expressions, one can prov...

1546 | Theory of Linear and Integer Programming
- Schrijver
- 1986
Citation Context ...next section the tools which are needed for such calculations. 3 Basic tools for handling polyhedra and Z-polyhedra The basic reference on linear inequalities in rationals or integers is the treatise [Sch86]. 3.1 Polyhedra and polytopes There are two ways of defining a polyhedron. The simplest one is to give a set of linear inequalities: $Ax + a \geq 0$. The polyhedron is the set of all x which satisfy these ...

904 | Linear Programming and Extensions
- Dantzig
- 1963
Citation Context ...intersections, while vertices are better for convex unions. The basic algorithms for handling polyhedra are feasibility tests: the Fourier-Motzkin cross-elimination method [Fou90] and the Simplex [Dan63]. The interested reader is referred to the above quoted treatise of Schrijver for details. Both algorithms prove that the object polyhedron is empty, or exhibit a point which belongs to it. For defini...

477 | The Omega test: A fast and practical integer programming algorithm for dependence analysis
- Pugh
- 1992
Citation Context ... question of their emptiness or not. For canonical Z-polyhedra, this is the linear integer programming question [Sch86, Min83]. I will briefly sketch two integer programming algorithms: the Omega test [Pug91a], which is an extension of Fourier-Motzkin, and the Gomory cut method, which is an extension of the Simplex [Gom63]. Recall that in the Fourier-Motzkin method, we start by extracting lower and upper bo...

390 | A loop transformation theory and an algorithm to maximize parallelism
- Wolf, Lam
- 1991
Citation Context ...s first q components are zero. Selection of a transformation The only case in which it has been possible to devise an algorithm for finding T in one step is the one of uniform perfect loop nests, see [WL91]. Another possibility is to search for a good transformation among a finite -- albeit very large -- set of possible candidates, see [KP94]. Other researchers use methods which find only parts of T. T...

234 | An algorithm for integer solutions to linear programs
- Gomory
- 1963
Citation Context ...h86, Min83]. I will briefly sketch two integer programming algorithms: the Omega test [Pug91a], which is an extension of Fourier-Motzkin, and the Gomory cut method, which is an extension of the Simplex [Gom63]. Recall that in the Fourier-Motzkin method, we start by extracting lower and upper bounds for the selected variable, and then write that each lower bound is not greater than each upper bound. This co...

228 | Some efficient solutions to the affine scheduling problem, part I, multidimensional time - Feautrier - 1992

225 | The parallel execution of DO loops
- Lamport
- 1974
Citation Context ...esearchers use methods which find only parts of T. The problem is then to extend T to a one-to-one transformation, or to fit the parts together. Scheduling Since the pioneering papers of [KMW67] and [Lam74], there have been a large number of papers on scheduling, mainly from the "systolic" community. The basic observation is that for any function $\theta$ from the set of operations to any totally ordered set, ...

203 | Scanning polyhedra with do loops - Ancourt, Irigoin - 1991

180 | Parametric integer programming
- Feautrier
- 1988
Citation Context ...ponential. However, it is very easy to program, and experiments have shown that it is very fast for small problems, say of the order of 10 inequalities at most. Our implementation of the Simplex, PIP [Fea88b], is a geometrical method which can be explained in the following way. Let n be the number of unknowns and m be the number of inequalities in the problem to be solved. One obtains a vertex of a polyhed...

170 | SUPERB: a tool for semi-automatic MIMD/SIMD parallelization
- Zima, Bast, et al.
- 1988
Citation Context ...the calculations and communications. Let us suppose that distribution is specified by a placement function T, and let q be the current processor number. Operation u is replaced by the following code [ZBG88]: $\forall a \in R(u)$: if $T(u) \neq q \wedge T(a) = q$ then Send(a) to T(u); if $T(u) = q \wedge T(a) \neq q$ then Receive(a) from T(a); if $T(u) = q$ then $c = f(R(u))$. This code is highly inefficient, due to the numerous guards...

169 | The organization of computations for uniform recurrence equations
- Karp, Miller, et al.
- 1967
Citation Context ...94]. Other researchers use methods which find only parts of T. The problem is then to extend T to a one-to-one transformation, or to fit the parts together. Scheduling Since the pioneering papers of [KMW67] and [Lam74], there have been a large number of papers on scheduling, mainly from the "systolic" community. The basic observation is that for any function $\theta$ from the set of operations to any totally o...

100 | Analysis of programs for parallel processing
- Bernstein
- 1966
Citation Context ...dition for determinism: we have already sacrificed some parallelism for simplicity. Computing $\delta$ may be of arbitrary complexity. However, a sufficient condition for commutation is easily constructed [Ber66]: let R(u) (resp. M(u)) be the set of memory cells which are read (resp. modified) by u. u and v commute if: $M(u) \cap R(v) = \emptyset$, $R(u) \cap M(v) = \emptyset$, $M(u) \cap M(v) = \emptyset$. The three terms in that formula appe...

99 | Dataflow Analysis of Scalar and Array References
- Feautrier
- 1991
Citation Context ...ension of x and y. -- $N_{RS}$ is the number of loops surrounding both R and S. Accordingly, the number of loops surrounding S should be written $N_{SS}$. It will be abbreviated to $N_S$ here. We have shown in [Fea91] that: $\langle R, x\rangle \prec \langle S, y\rangle \equiv x[1..N_{RS}] \ll y[1..N_{RS}] \vee (x[1..N_{RS}] = y[1..N_{RS}] \wedge R \triangleleft S) \quad (3)$. The predicate $\prec$ is not convex, hence it cannot be represented as a polyhedron. However, $\prec$ can be split into N ...

90 | Array expansion - Feautrier - 1988

83 | An exact method for analysis of value-based array data dependences
- Pugh, Wonnacott
- 1993
Citation Context ...and g are the subscripting functions; they have the same number of components, namely the rank of array X. The operations in dependence at depth p are the members of the following dependence relation [PW93]: $\{\langle R, x\rangle, \langle S, y\rangle \mid Q^p_{RS}(x, y)\}$ where $Q^p_{RS}$ is the following polytope: $Q^p_{RS}(x, y) \equiv f(x) = g(y) \wedge \langle R, x\rangle \prec_p \langle S, y\rangle \wedge x \in D_R \wedge y \in D_S \quad (4)$. The union of all dependence relations is a symbolic descr...

75 | Fuzzy array dataflow analysis
- Barthou, Collard, et al.
- 1997
Citation Context ...complicated bounds than is necessary, unless one programs a redundancy eliminator. Another solution is to use PIP for computing maxima and minima, in which case redundancy is automatically eliminated [CBF95]. Non-unimodular transformations In case T is not unimodular, the solution is to build its Hermite normal form T = HU [Dar93, Xue94]. One builds, according to the above method, a loop nest which scans...

53 | Toward automatic distribution - Feautrier - 1994

48 | An overview of a compiler for scalable parallel machines
- Amarasinghe, Anderson, et al.
- 1994
Citation Context ...ost level. There are various devices for improving the results. One may move invariant calculations up through the loop hierarchy, split loops according to the value of a guard, peel loops, and so on [AALL93]. 5.3 Communication code If the parallel program is to run on a distributed memory machine, one has to insert code for the residual communications. This depends in a complicated way on the architectur...

42 | Programmation mathématique, théorie et algorithmes: Tome1. C.N.E.T et E.N.S.T - Minoux - 1983

42 | Uniform techniques for loop optimization - Pugh - 1991

36 | Linear scheduling is nearly optimal - Darte, Khachiyan, et al. - 1991

29 | The systematic design of systolic arrays
- Quinton
- 1987
Citation Context ...inear allows one to construct a finite summary for this potentially infinite problem. One way of obtaining a summary is to notice that (12) is true everywhere iff it is true at the vertices of $Q^p_{RS}$ [Qui87]. We obtain in this way as many linear constraints as the $Q^p_{RS}$'s have vertices. Another solution is to use the affine version of Farkas lemma [Sch86, Fea92a]: the general solution of $(Ax + b \geq 0) \Rightarrow$ (...

25 | Automating non-unimodular loop transformations for massive parallelism - Xue - 1994

22 | The importance of direct dependences for automatic parallelization
- Brandes
- 1988
Citation Context ...and also in the sense that the value written into x by W1 never reaches R since it is killed by W2. The set of flow dependences which give rise to a real flow of data constitutes the direct dependences [Bra88] or the value-based dependences [PW93]. There are several methods for computing this set: I will describe here the original solution of [Fea88a, Fea91]. Suppose that in the dependence polytope (4), st...

20 | A unified framework for systematic loop transformations - Lu - 1991

18 | Mapping affine loop nests: New results - Dion, Robert - 1995

18 | Automatic parallelization based on multi-dimensional scheduling
- Darte, Vivien
- 1994
Citation Context ... i.e. simpler polytopes which still enclose $Q^p_{RS}$. One possibility is to ignore the dependence on both x and y and to consider only dependence distances, i.e. to project on the difference $y - x$ [DV94]. This has meaning only when statements R and S have the same iteration space, i.e. belong to the same loop nest. The set of dependence distances can be enclosed in a cone which can be represented by ...

14 | Computing dependence direction vectors and dependence cones with linear systems
- Irigoin, Triolet
- 1987
Citation Context ... only when statements R and S have the same iteration space, i.e. belong to the same loop nest. The set of dependence distances can be enclosed in a cone which can be represented by its extremal rays [IT87]. Another possibility is to note only the signs of the components of the extremal rays of the dependence cone, giving the dependence directions. The usual solution is to test each $Q^p_{RS}$ for emptiness...

9 | Partitionnement de boucles imbriquées, une technique d'optimisation pour les programmes scientifiques - Irigoin - 1995

9 | Detection of reductions in sequential programs with loops
- Redon, Feautrier
- 1993
Citation Context ...or for worse -- the sensitivity of the algorithm to rounding errors, and have to be used with caution. The study of that kind of transformation is just beginning; the interested reader is referred to [RF93]. 4.1 Reordering Transformations Introduction One of the earliest discoveries in the field was that most "old style" reordering transformations were in fact linear or affine transformations of iteration...

8 | Techniques de parallélisation automatique de nids de boucles - Darte - 1993

7 | Selecting affine mappings based on performance estimation
- Kelly, Pugh
- 1994
Citation Context ...g T in one step is the one of uniform perfect loop nests, see [WL91]. Another possibility is to search for a good transformation among a finite -- albeit very large -- set of possible candidates, see [KP94]. Other researchers use methods which find only parts of T. The problem is then to extend T to a one-to-one transformation, or to fit the parts together. Scheduling Since the pioneering papers of [KM...

4 | Œuvres de Fourier, t
- Fourier
Citation Context ...better for constructing intersections, while vertices are better for convex unions. The basic algorithms for handling polyhedra are feasibility tests: the Fourier-Motzkin cross-elimination method [Fou90] and the Simplex [Dan63]. The interested reader is referred to the above quoted treatise of Schrijver for details. Both algorithms prove that the object polyhedron is empty, or exhibit a point which b...

4 | Analyses interprocédurales du flot des données
- Leservot
- 1996
Citation Context ... control, with the exception of toy examples and small library subroutines. However, it is possible to isolate static control kernels in large programs and have them parallelized by the above methods [Les96]. If it so happens that these kernels represent a large fraction of the total running time, our job is done. Some programs have irregular control and/or irregular data accesses. It is still possible t...

2 | Contribution à l'étude des problèmes d'ordonnancement cyclique multidimensionnel - Le Goueslier d'Argence - 1996
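The sufficient condition for commutation quoted in the [Ber66] citation context above (the read set R and modified set M of two operations must not conflict) can be sketched in a few lines of Python. This is only an illustration under simple assumptions: memory cells are represented as strings, and the function names `bernstein_conflicts` and `commute` are hypothetical and do not come from the paper.

```python
def bernstein_conflicts(read_u, mod_u, read_v, mod_v):
    """Return the three intersections that appear in Bernstein's conditions.

    Each argument is a set of memory cells (plain strings here, which is an
    assumption of this sketch). An empty set means that term is satisfied.
    """
    return {
        "M(u) & R(v)": mod_u & read_v,
        "R(u) & M(v)": read_u & mod_v,
        "M(u) & M(v)": mod_u & mod_v,
    }


def commute(read_u, mod_u, read_v, mod_v):
    """Sufficient (not necessary) test: all three intersections are empty."""
    conflicts = bernstein_conflicts(read_u, mod_u, read_v, mod_v)
    return not any(conflicts.values())


if __name__ == "__main__":
    # u: a = b + c      v: d = b * 2      w: e = a + 1
    u = ({"b", "c"}, {"a"})   # (read set, modified set)
    v = ({"b"}, {"d"})
    w = ({"a"}, {"e"})
    print(commute(*u, *v))    # both only read b, so no term conflicts
    print(commute(*u, *w))    # w reads the cell a that u modifies
```

Because the test is only sufficient, two operations that fail it may still commute; the check errs on the side of keeping the original order.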