## Computation of Uniform Recurrence Equations over Finite Domains (1993)

Citations: 1 (1 self)

### BibTeX

@TECHREPORT{Aldrich93computationof,
  author = {William J. Aldrich and P. R. Kumar},
  title = {Computation of Uniform Recurrence Equations over Finite Domains},
  institution = {},
  year = {1993}
}

### Abstract

Uniform recurrence equations arise frequently in scientific problems. We examine the problem of evaluating a given set of recurrence functions on an arbitrary finite set of index points. We provide a necessary and sufficient condition to determine when the equations can be grouped by index points into batch sets, for all finite domains, such that the batch sets can be sequentially evaluated. We show how the batch sets can be ordered using the hyperplane method. Next we provide a necessary and sufficient condition to determine when one can "index shift" a set of recurrence equations so that it is batch computable for all finite domains. Finally, we provide an algorithm for performing an index shifting that enables all of the computations within a batch to be performed in parallel.

Index terms: Finite Domains, Hyperplane Method, Parallel Computation, Scientific Computation, Uniform Recurrence Equations

The research reported here has been supported in part by the U.S. Army Research Of...
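The batch ordering mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single recurrence variable (the paper treats systems of equations), a scheduling vector `q` that is already known and satisfies q·d > 0 for every dependence vector d, and a user-supplied boundary value for points outside the domain. All names (`hyperplane_batches`, `evaluate`, `boundary`) are hypothetical.

```python
from collections import defaultdict


def dot(a, b):
    """Inner product of two integer vectors of equal length."""
    return sum(ak * bk for ak, bk in zip(a, b))


def hyperplane_batches(domain, q):
    """Group index points by t = q . p and return the groups in increasing t."""
    groups = defaultdict(list)
    for p in domain:
        groups[dot(q, p)].append(p)
    return [sorted(groups[t]) for t in sorted(groups)]


def evaluate(domain, q, dependences, f, boundary):
    """Evaluate x(p) = f(x(p - d_1), ..., x(p - d_k)) on a finite domain.

    Assumption: q . d > 0 for every dependence d, so every argument of a
    point in one batch lies in a strictly earlier batch (or outside the
    domain, where the hypothetical `boundary` function supplies a value).
    """
    if any(dot(q, d) <= 0 for d in dependences):
        raise ValueError("q must satisfy q . d > 0 for every dependence d")
    domain = set(domain)
    x = {}
    for batch in hyperplane_batches(domain, q):
        # Points in one batch do not depend on each other, so this inner
        # loop could run in parallel; it is sequential here for clarity.
        new_values = {}
        for p in batch:
            args = []
            for d in dependences:
                src = tuple(pk - dk for pk, dk in zip(p, d))
                args.append(x[src] if src in domain else boundary(src))
            new_values[p] = f(*args)
        x.update(new_values)
    return x


# Example: x(i, j) = x(i - 1, j) + x(i, j - 1) on an irregular finite domain.
domain = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
deps = [(1, 0), (0, 1)]
q = (1, 1)
batches = hyperplane_batches(domain, q)
values = evaluate(domain, q, deps, lambda a, b: a + b, lambda p: 1)
```

Here the domain is deliberately not a rectangle: the point (2, 0) is missing, so the value at (2, 1) draws one argument from the boundary function. Grouping by q·p still yields a valid sequential order of batches.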

### Citations

442 |
Optimizing Supercompilers for Supercomputers
- Wolfe
- 1989
Citation Context ...les within the body of the loop. The method has been shown to be extendable to nested loops by separately testing each loop dimension for dependence cycles starting with the innermost loop; see Wolfe [4]. The loop dimensions are successively vectorized until an index is found with a dependence cycle. Loop vectorization is particularly useful because it can be applied to nonuniform recurrences; howeve...

379 |
A loop transformation theory and an algorithm to maximize parallelism
- Wolf, Lam
- 1991
Citation Context ... Both of these transformations have been represented as unimodular transformations on the set of index points. Methods for finding optimal unimodular transformations are presented by Wolf and Lam in [7]. When loops cannot be completely vectorized, some partial parallelism can often be realized through cycle shrinking. This is a technique to perform a certain number of loop iterations simultaneousl...

302 |
Advanced Compiler Optimizations for Supercomputers
- Padua, Wolfe
- 1986
Citation Context ...), are both lexicographically negative. One of the simplest techniques developed for extracting parallelism from a loop has been loop vectorization, discussed by Muraoka in [2] and Padua and Wolfe in [3]. In this technique, all iterations of the loop are executed simultaneously provided that there are no dependence cycles within the body of the loop. The method has been shown to be extendable to nest...

215 |
The parallel execution of do loops
- Lamport
- 1974
Citation Context ... which a number of different approaches are described for nested loops. The more specific problem of extracting parallelism from nested loops with uniform recurrences was initially studied by Lamport [9]. His hyperplane method, referred to in Section 4, accomplishes the same result as skewing followed by vectorization [4]. More recently, authors have developed techniques to optimize the hyperplane me...

162 |
The organization of computations for uniform recurrence equations
- Karp, Miller, et al.
- 1967
Citation Context ...ins. Finally, we provide an algorithm for performing such index shifting in such a way that one can also perform all the computations within a batch in parallel. Section 2, History: Karp, Miller and Winograd [1] have considered systems of uniform recurrence equations, deriving tests for computability and parallelism. The most notable difference in the work presented here is the domain of computation, D. In [...

105 |
Automatic Program Parallelization
- Banerjee, Eigenmann, et al.
- 1993
Citation Context ...a loop iteration can be performed concurrently. For an excellent survey of parallelization techniques, including the more general case of nonuniform recurrences, we refer the reader to Banerjee et al. [14]. Section 3, Graph Representations: First we introduce some notation. For convenience we denote the RHS of (1) by $f_i(p)$. We will refer to a function, or the value of a variable, at an index point, as an insta...

72 |
Graphs and Digraphs
- Chartrand, Lesniak
- 1986
Citation Context ... $\cdots, x_j(p - o_j - \delta), \cdots)$, then a directed arc is drawn from vertex $x_j$ to $x_i$ and labeled with the vector $\delta$. Here we extend the definition of a digraph in [15] to allow for multiple arcs from one vertex to another. We also allow multiple loops (i.e., arcs directed towards the same vertex from which they originated). The recurrence graph for Example 2 is sho...

66 |
Time optimal linear schedules for algorithms with uniform dependencies
- Shang, Fortes
- 1991
Citation Context ...More recently, authors have developed techniques to optimize the hyperplane method. Algorithms for finding the optimal scheduling vector, denoted by q in Section 4, were developed by Shang and Fortes [10]. These algorithms find the q vector that enables the hyperplane method to be used with the smallest number of steps. Methods for index shifting to enhance the hyperplane method with a given q vector ...

51 |
Compiler optimizations for enhancing parallelism and their impact on architecture design
- Polychronopoulos
- 1988
Citation Context ... is a technique to perform a certain number of loop iterations simultaneously, provided that there are no dependences between adjacent iterations. This technique is described by Polychronopoulos in [8], in which a number of different approaches are described for nested loops. The more specific problem of extracting parallelism from nested loops with uniform recurrences was initially studied by Lamp...

39 |
Speedup of Ordinary Programs
- Banerjee
- 1988
Citation Context ...arly useful because it can be applied to nonuniform recurrences; however, more elaborate dependence analysis is required. Methods for nonuniform dependence analysis were developed by Banerjee in [5], [6]. Loop vectorization can be enhanced by several transformation techniques. Loop interchange is the process of interchanging the order of loops. This can enable vectorization if the innermost loop is n...

29 |
Data dependence in ordinary programs
- Banerjee
- 1976
Citation Context ...ticularly useful because it can be applied to nonuniform recurrences; however, more elaborate dependence analysis is required. Methods for nonuniform dependence analysis were developed by Banerjee in [5], [6]. Loop vectorization can be enhanced by several transformation techniques. Loop interchange is the process of interchanging the order of loops. This can enable vectorization if the innermost loop...

21 |
Parallelism Exposure and Exploitation in Programs
- Muraoka
- 1971
Citation Context ...-1, 0) and (0, -1), are both lexicographically negative. One of the simplest techniques developed for extracting parallelism from a loop has been loop vectorization, discussed by Muraoka in [2] and Padua and Wolfe in [3]. In this technique, all iterations of the loop are executed simultaneously provided that there are no dependence cycles within the body of the loop. The method has been sho...

16 |
Loop nest scheduling and transformations
- Darte, Risset, et al.
- 1993
Citation Context ...for $t = \min$ to $\max$, Do: Simultaneously compute all $x_i(p)$ such that $q_i^T p + t_i = t$: $x_1(p) = f_1(\cdots)$, $x_2(p) = f_2(\cdots)$, $\ldots$ end. Darte, Risset and Robert [13] show how the optimal affine by statement parameters can be found from the solution of a linear program. The shifting algorithm presented here extends the amount of parallelism using the hyperplane m...

9 |
Revisiting cycle shrinking
- Robert, Song
- 1994
Citation Context ...ndex shifting to enhance the hyperplane method with a given q vector were developed by Liu, Ho and Sheu in [11]. Methods to optimally combine these two techniques were developed by Robert and Song in [12]. In all these methods, viz., loop vectorization, cycle shrinking, and hyperplane scheduling, the body of the loop nest is executed sequentially. An alternative is affine by statement scheduling. This...

4 |
On the parallelism of nested for-loops using index shift method
- Liu, Ho, et al.
- 1990
Citation Context ...hat enables the hyperplane method to be used with the smallest number of steps. Methods for index shifting to enhance the hyperplane method with a given q vector were developed by Liu, Ho and Sheu in [11]. Methods to optimally combine these two techniques were developed by Robert and Song in [12]. In all these methods, viz., loop vectorization, cycle shrinking, and hyperplane scheduling, the body of t...

2 |
Automating the simulation of complex discrete-time control systems: A mathematical framework, algorithms and a software package
- Ellis, Ravikanth, et al.
- 1994
Citation Context ...shown constructively by an algorithm for index dimensions of 2 or greater. Sufficiency for the special case of a one-dimensional system of recurrence equations is proven in Ellis, Ravikanth and Kumar [17], also constructively. We use a single matrix C to represent all of the cyclic sums in the recurrence graph, $C = [c_1, c_2, \cdots]$ (27). We start with a vector q that has a strictly p...