Results 1 - 10 of 138
Some efficient solutions to the affine scheduling problem. Part I: One-dimensional Time
1996
Cited by 216 (18 self)
Programs and systems of recurrence equations may be represented as sets of actions which are to be executed subject to precedence constraints. In many cases, actions may be labelled by integral vectors in some iteration domain, and precedence constraints may be described by affine relations. A schedule for such a program is a function which assigns an execution date to each action. Knowledge of such a schedule allows one to estimate the intrinsic degree of parallelism of the program and to compile a parallel version for multiprocessor architectures or systolic arrays. This paper deals with the problem of finding closed form schedules as affine or piecewise affine functions of the iteration vector. An efficient algorithm is presented which reduces the scheduling problem to a parametric linear program of small size, which can be readily solved by an efficient algorithm.
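To make the notion of an affine schedule concrete, here is a minimal sketch that checks by brute force whether a candidate affine function of the iteration vector respects a set of precedence constraints. It is not the paper's algorithm (which the abstract describes as a reduction to a parametric linear program); the example loop, the dependences, and the names `theta` and `is_valid_schedule` are assumptions made for illustration.

```python
from itertools import product


def theta(coeffs, const, point):
    """Affine schedule: execution date = coeffs . point + const."""
    return sum(c * x for c, x in zip(coeffs, point)) + const


def is_valid_schedule(coeffs, const, domain, deps):
    """Return True if every source action is dated strictly before its sink.

    deps(point) returns the list of points that `point` depends on;
    sources that fall outside the iteration domain are ignored.
    """
    for sink in domain:
        for src in deps(sink):
            if src in domain and theta(coeffs, const, src) >= theta(coeffs, const, sink):
                return False
    return True


# Hypothetical example: a[i][j] = a[i-1][j] + a[i][j-1] on a 4 x 4 domain.
N = 4
domain = set(product(range(N), range(N)))
deps = lambda p: [(p[0] - 1, p[1]), (p[0], p[1] - 1)]

valid = is_valid_schedule((1, 1), 0, domain, deps)    # wavefront schedule i + j
invalid = is_valid_schedule((1, 0), 0, domain, deps)  # ignores the dependence along j
latency = max(theta((1, 1), 0, p) for p in domain) + 1  # number of time steps used
```

With the wavefront schedule, all actions sharing the same date i + j are independent of each other, which is the sense in which a schedule exposes the parallelism of the program.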
Dataflow Analysis of Array and Scalar References
International Journal of Parallel Programming, 1991
Cited by 209 (2 self)
Given a program written in a simple imperative language (assignment statements, for loops, affine indices and loop limits), this paper presents an algorithm for analyzing the patterns along which values flow as the execution proceeds. For each array or scalar reference, the result is the name and iteration vector of the source statement as a function of the iteration vector of the referencing statement. The paper discusses several applications of the method: conversion of a program to a set of recurrence equations, array and scalar expansion, program verification and parallel program construction. Keywords: dataflow analysis, semantics analysis, array expansion. 1 Introduction. It is a well-known fact that scientific programs spend most of their running time in executing loops operating on arrays. Hence if a restructuring or optimizing compiler is to do a good job, it must be able to do a thorough analysis of the addressing patterns in such loops. If taken in full generality, ...
Practical Dependence Testing
1991
Cited by 138 (16 self)
Precise and efficient dependence tests are essential to the effectiveness of a parallelizing compiler. This paper proposes a dependence testing scheme based on classifying pairs of subscripted variable references. Exact yet fast dependence tests are presented for certain classes of array references, as well as empirical results showing that these references dominate scientific Fortran codes. These dependence tests are being implemented at Rice University in both PFC, a parallelizing compiler, and ParaScope, a parallel programming environment.
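The abstract does not spell out the tests themselves. As an assumed example of an exact yet cheap test for one simple class of subscript pair, the sketch below handles two references A[a*i + c1] and A[a*i + c2] in a single loop, a case often called a strong SIV test in the dependence-testing literature. The function name and parameters are hypothetical.

```python
def strong_siv_distance(a, c1, c2, lower, upper):
    """Dependence distance between A[a*i + c1] and A[a*i' + c2], or None.

    Both references sit in one loop with bounds lower..upper (inclusive).
    They touch the same element when a*i + c1 == a*i' + c2, that is when
    i' - i == (c1 - c2) / a. A dependence exists only if that distance is
    an integer and fits inside the iteration range.
    """
    if a == 0:
        raise ValueError("coefficient must be nonzero for this test")
    diff = c1 - c2
    if diff % a != 0:
        return None          # non-integer distance: independent
    distance = diff // a
    if abs(distance) > upper - lower:
        return None          # distance exceeds the loop range: independent
    return distance


d_carried = strong_siv_distance(1, 1, 0, 1, 10)    # A[i+1] vs A[i]
d_parity = strong_siv_distance(2, 0, 1, 1, 10)     # A[2i] vs A[2i+1]
d_far = strong_siv_distance(1, 20, 0, 1, 10)       # A[i+20] vs A[i]
```

The test is exact for this class because the single equation has at most one solution for the distance, so no approximation is needed.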
Code generation in the polyhedral model is easier than you think
In IEEE Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT’04), 2004
Cited by 109 (16 self)
Many advances in automatic parallelization and optimization have been achieved through the polyhedral model. It has been extensively shown that this computational model provides convenient abstractions to reason about and apply program transformations. Nevertheless, the complexity of code generation has long been a deterrent for using polyhedral representation in optimizing compilers. First, code generators have a hard time coping with generated code size and control overhead that may spoil theoretical benefits achieved by the transformations. Second, this step is usually time consuming, hampering the integration of the polyhedral framework in production compilers or feedback-directed, iterative optimization schemes. Moreover, current code generation algorithms only cover a restrictive set of possible transformation functions. This paper discusses a general transformation framework able to deal with non-unimodular, non-invertible, non-integral or even non-uniform functions. It presents several improvements to a state-of-the-art code generation algorithm. Two directions are explored: generated code size and code generator efficiency. Experimental evidence proves the ability of the improved method to handle real-life problems.
Symbolic Analysis for Parallelizing Compilers
1994
Cited by 105 (4 self)
Symbolic Domain. The objects in our abstract symbolic domain are canonical symbolic expressions. A canonical symbolic expression is a lexicographically ordered sequence of symbolic terms. Each symbolic term is in turn a pair of an integer coefficient and a sequence of pairs of pointers to program variables in the program symbol table and their exponents. The latter sequence is also lexicographically ordered. For example, the abstract value of the symbolic expression 2ij+3jk in an environment where i is bound to (1, ((↑i, 1))), j is bound to (1, ((↑j, 1))), and k is bound to (1, ((↑k, 1))) is ((2, ((↑i, 1), (↑j, 1))), (3, ((↑j, 1), (↑k, 1)))). In our framework, the environment is the abstract analogue of the state concept; an environment is a function from program variables to abstract symbolic values. Each environment e associates a canonical symbolic value e_x with each variable x ∈ V; x is said to be bound to e_x. An environment might be represented by...
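The canonical form described above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' implementation: variable names (strings) stand in for symbol-table pointers, terms are ordered by their variable sequence, and the function name `canonical` is hypothetical.

```python
from collections import defaultdict


def canonical(terms):
    """Build a canonical symbolic expression.

    terms: iterable of (coefficient, {variable: exponent}) pairs.
    Returns a tuple of (coefficient, ((variable, exponent), ...)) terms in
    which both the variable pairs and the terms themselves are sorted
    lexicographically, like terms are merged, and zero terms are dropped.
    """
    merged = defaultdict(int)
    for coeff, powers in terms:
        key = tuple(sorted((v, e) for v, e in powers.items() if e != 0))
        merged[key] += coeff
    ordered = sorted((key, c) for key, c in merged.items() if c != 0)
    return tuple((c, key) for key, c in ordered)


# 2ij + 3jk, given in a deliberately scrambled order.
expr = canonical([(3, {"k": 1, "j": 1}), (2, {"i": 1, "j": 1})])

# i + 4i + 2j^2 - 2j^2 collapses to 5i.
combined = canonical([(1, {"i": 1}), (4, {"i": 1}), (2, {"j": 2}), (-2, {"j": 2})])
```

Because every expression has exactly one canonical form under these rules, two symbolic expressions can be compared for equality by comparing their tuples directly.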
Loop Parallelization in the Polytope Model
CONCUR '93, Lecture Notes in Computer Science 715, 1993
Cited by 94 (23 self)
During the course of the last decade, a mathematical model for the parallelization of FOR loops has become increasingly popular. In this model, a (perfect) nest of r FOR loops is represented by a convex polytope in Z^r. The boundaries of each loop specify the extent of the polytope in a distinct dimension. Various ways of slicing and segmenting the polytope yield a multitude of guaranteed correct mappings of the loops' operations in space-time. These transformations have a very intuitive interpretation and can be easily quantified and automated due to their mathematical foundation in linear programming and linear algebra. With the recent availability of massively parallel computers, the idea of loop parallelization is gaining significance, since it promises execution speedups of orders of magnitude. The polytope model for loop parallelization has its origin in systolic design, but it applies in more general settings and methods based on it will become a part of futur...
Array Expansion
In ACM Int. Conf. on Supercomputing, 1988
Cited by 88 (10 self)
A common problem in restructuring programs for vector or parallel execution is the suppression of false dependencies which originate in the reuse of the same memory cell for unrelated values. The method is simple and well understood in the case of scalars. This paper gives the general solution for the case of arrays. The expansion is done in two steps: first, modify all definitions of the offending array in order to obtain the single assignment property. Then, reconstruct the original data flow by adapting all uses of the array. This is done with the help of a new algorithm for solving parametric integer programs. The technique is quite general and may be used for other purposes, including program checking, collecting array predicates, etc. 1 Introduction. 1.1 Motivation. One of the most striking trends in today's computer architecture is the development of special purpose machines for numerical computations. The idea behind this effort is that by capitalizing on the pecul...
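The scalar case that the abstract calls simple and well understood can be illustrated as follows. This sketch shows only that scalar case, not the paper's general solution for arrays; the loop body and the function names are assumptions. The temporary `t` is reused on every iteration, which creates false dependencies between iterations; giving each iteration its own cell yields the single assignment property and lets the loop be split.

```python
def original(a, b):
    """One scalar temporary t is overwritten on every iteration."""
    out = [0] * len(a)
    for i in range(len(a)):
        t = a[i] + b[i]          # same memory cell reused for unrelated values
        out[i] = t * t
    return out


def expanded(a, b):
    """Same computation after expanding t into one cell per iteration."""
    n = len(a)
    t = [None] * n               # step 1: each definition writes its own cell
    out = [0] * n
    for i in range(n):
        t[i] = a[i] + b[i]
    for i in range(n):           # step 2: each use reads the matching cell
        out[i] = t[i] * t[i]
    return out


a, b = [1, 2, 3], [4, 5, 6]
same = original(a, b) == expanded(a, b)
result = expanded(a, b)
```

After expansion, no iteration writes a cell that another iteration reads or writes, so both loops could run their iterations in any order.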
An Exact Method for Analysis of Value-based Array Data Dependences
In Sixth Annual Workshop on Programming Languages and Compilers for Parallel Computing, 1993
Cited by 84 (14 self)
Standard array data dependence testing algorithms give information about the aliasing of array references. If statement 1 writes a[5], and statement 2 later reads a[5], standard techniques describe this as a flow dependence, even if there was an intervening write. We call a dependence between two references to the same memory location a memory-based dependence. In contrast, if there are no intervening writes, the references touch the same value and we call the dependence a value-based dependence. There has been a surge of recent work on value-based array data dependence analysis (also referred to as computation of array dataflow dependence information). In this paper, we describe a technique that is exact over programs without control flow (other than loops) and nonlinear references. We compare our proposal with the technique proposed by Paul Feautrier, which is the other technique that is complete over the same domain as ours. We also compare our work with that of Tu and Padua, a ...
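The distinction between memory-based and value-based dependences can be shown on an execution trace. The sketch below is a dynamic illustration of the two definitions only, not the static analysis the paper describes; the trace format, the statement label S3 for the intervening write, and the function name are assumptions.

```python
def flow_dependences(trace):
    """Classify flow (write-then-read) dependences in an execution trace.

    trace: list of (statement, kind, location) in execution order, where
    kind is "W" for a write and "R" for a read.
    Returns (memory_based, value_based), each a set of (writer, reader).
    """
    memory_based, value_based = set(), set()
    writers = {}  # location -> statements that have written it so far
    for stmt, kind, loc in trace:
        if kind == "W":
            writers.setdefault(loc, []).append(stmt)
        else:
            earlier = writers.get(loc, [])
            # Memory-based: any earlier write to the same location counts.
            memory_based.update((w, stmt) for w in earlier)
            # Value-based: only the most recent write supplies the value.
            if earlier:
                value_based.add((earlier[-1], stmt))
    return memory_based, value_based


# Statement 1 writes a[5], a hypothetical S3 overwrites it, statement 2 reads it.
trace = [("S1", "W", ("a", 5)), ("S3", "W", ("a", 5)), ("S2", "R", ("a", 5))]
mem, val = flow_dependences(trace)
```

Here the pair (S1, S2) is a memory-based dependence but not a value-based one, because the value S1 wrote never reaches S2.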
A Linear Algebra Framework for Static HPF Code Distribution
1995
Cited by 75 (7 self)
High Performance Fortran (HPF) was developed to support data parallel programming for SIMD and MIMD machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, overlap analysis... The systematic use of an affine framework makes it possible to prove the compilation scheme correct. An early version of this paper was presented at the Fourth International Workshop on Comp...
Nonlinear Array Dependence Analysis
1991
Cited by 74 (6 self)
Standard array data dependence techniques can only reason about linear constraints. There has also been work on analyzing some dependences involving polynomial constraints. Analyzing array data dependences in real-world programs requires handling many "unanalyzable" terms: subscript arrays, run-time tests, function calls. The standard approach to analyzing such programs has been to omit and ignore any constraints that cannot be reasoned about. This is unsound when reasoning about value-based dependences and whether privatization is legal. Also, this prevents us from determining the conditions that must be true to disprove the dependence. These conditions could be checked by a run-time test or verified by a programmer or aggressive, demand-driven interprocedural analysis. We describe a solution to these problems. Our solution makes our system sound and more accurate for analyzing value-based dependences and derives conditions that can be used to disprove dependences. We also give some p...