## Solving Linear Recurrences with Loop Raking (1992)

Venue: Proceedings Sixth International Parallel Processing Symposium

Citations: 7 (4 self)

### BibTeX

@INPROCEEDINGS{Blelloch92solvinglinear,

author = {Guy E. Blelloch and Siddhartha Chatterjee and Marco Zagha},

title = {Solving Linear Recurrences with Loop Raking},

booktitle = {Proceedings Sixth International Parallel Processing Symposium},

year = {1992},

pages = {416--424}

}

### Abstract

We present a variation of the partition method for solving $m$th-order linear recurrences that is well-suited to vector multiprocessors. The algorithm fully utilizes both vector and multiprocessor capabilities, and reduces the number of memory accesses as compared to the more commonly used version of the partition method. Our variation uses a general loop restructuring technique called loop raking. We describe an implementation of this technique on the CRAY Y-MP, and present performance results on first- and second-order linear recurrences, as well as on Livermore loops 5, 11 and 19, which are based on linear recurrences. On a single processor of the Y-MP our implementations run between 1.5 and 4 times faster than the corresponding optimized library routines in SCILIB [7]. On 4 processors, we gain an additional speedup of at least 3.7. We also use the first-order recurrence to implement the PACK operation without use of the CRAY compress-index instruction. For long vectors, our vers...
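The partition idea named in the abstract can be illustrated for the first-order case $x_i = a_i x_{i-1} + b_i$. The Python sketch below is only an illustration under stated assumptions, not the paper's CRAY Y-MP implementation: the function names, the `lanes` parameter (standing in for the vector length), and the use of plain lists are all hypothetical. Each lane owns one contiguous block, so every step of the inner loop touches elements a fixed stride `s` apart, which is the raked access order.

```python
def recurrence_serial(a, b, x0=0.0):
    """Reference loop: x[i] = a[i] * x[i-1] + b[i], with x[-1] = x0."""
    x, prev = [], x0
    for ai, bi in zip(a, b):
        prev = ai * prev + bi
        x.append(prev)
    return x


def recurrence_raked(a, b, x0=0.0, lanes=4):
    """Three-phase partition sketch; `lanes` is an assumed vector length."""
    n = len(a)
    if n == 0:
        return []
    s = -(-n // lanes)             # block length, ceil(n / lanes)
    starts = list(range(0, n, s))  # first index of each block (one per lane)
    m = len(starts)

    # Phase 1: each lane folds its block into x_out = A * x_in + B.
    # The inner loop over k plays the role of one strided vector operation.
    A = [1.0] * m
    B = [0.0] * m
    for j in range(s):
        for k in range(m):
            i = starts[k] + j
            if i < n:              # only the last block can be short
                A[k], B[k] = a[i] * A[k], a[i] * B[k] + b[i]

    # Phase 2: short serial recurrence over the per-block summaries.
    carry = [x0] * m
    for k in range(1, m):
        carry[k] = A[k - 1] * carry[k - 1] + B[k - 1]

    # Phase 3: every lane replays its block from its known incoming value.
    x = [0.0] * n
    cur = carry[:]
    for j in range(s):
        for k in range(m):
            i = starts[k] + j
            if i < n:
                cur[k] = a[i] * cur[k] + b[i]
                x[i] = cur[k]
    return x
```

Phases 1 and 3 are independent across lanes, which is what makes them candidates for vector execution; only phase 2, whose length is the number of lanes, stays serial.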

### Citations

302 | Advanced Compiler Optimizations for Supercomputers
- Padua, Wolfe
- 1986
Citation Context ...e was placed over the vector and shifted on each vector load. On each load a total of L elements are loaded into a vector register. See Figure 4. Loop raking can be viewed as the dual of strip mining [27]. Both are loop transformations that define orders in which to execute the iterations of a loop, VL elements at a time. [Figure residue: a row of vector elements $x_0, x_s, \ldots, x_{(l-2)s}, x_{(l-1)s}$] ... |

272 | Parallel prefix computation
- Ladner, Fischer
- 1980
Citation Context ...l algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into producing fast implementations of these algorithms on parallel and vector machines [24, 21, 33, 32, 13, 26, 30]. Some supercomputer manufacturers have considere... |

167 | A Parallel Algorithm for The Efficient Solution of a General Class of Recurrence Equations
- Kogge, Stone
- 1973
Citation Context .... Such linear recurrences frequently appear in scientific applications [20], are very useful in the design of parallel algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into produ... |

161 | The organization of computations for uniform recurrence equations
- Karp, Miller, et al.
- 1967
Citation Context ...l algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into producing fast implementations of these algorithms on parallel and vector machines [24, 21, 33, 32, 13, 26, 30]. Some supercomputer manufacturers have considere... |

157 | Scans as primitive parallel operations
- Blelloch
- 1989
Citation Context ...$\otimes$ are binary associative operators, and $\otimes$ distributes over $\oplus$. Such linear recurrences frequently appear in scientific applications [20], are very useful in the design of parallel algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 1... |

144 | The Livermore Fortran Kernels: A Computer Test Of The Numerical Performance Range
- McMahon
- 1986
Citation Context ...st and second order recurrences, we show how some other loops can be converted into linear recurrences and give timings for them. The loops we discuss are loops 5 and 19 out of the 24 Livermore Loops [25], and the PACK operation. The PACK operation takes a set of data and a set of flags and packs the data where the flags are true (1) into contiguous locations. The PACK operation is one of the Fortran ... |
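The PACK description in the context above lends itself to a small illustration. The sketch below is an assumption about how a running count can drive PACK, not the formulation used in the paper (which the excerpt does not show): the destination index of each kept element comes from the first-order recurrence `pos[i] = pos[i-1] + flags[i]`, that is, a recurrence whose multiplicative coefficient is always 1. The name `pack` and its list-based interface are hypothetical.

```python
def pack(data, flags):
    """Pack the elements of `data` whose flag is true into contiguous slots."""
    # Running count: pos[i] = pos[i-1] + flags[i] (first-order recurrence, a[i] = 1).
    pos, count = [], 0
    for f in flags:
        count += 1 if f else 0
        pos.append(count)

    # Scatter: a kept element at position i lands in slot pos[i] - 1.
    out = [None] * count
    for d, f, p in zip(data, flags, pos):
        if f:
            out[p - 1] = d
    return out
```

For example, `pack([10, 20, 30, 40], [1, 0, 1, 1])` keeps the first, third, and fourth elements in their original order.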

95 | Prefix Sums and Their Applications
- Blelloch
- 1990
Citation Context ...:k+i:s] = Vsum; 4.2 Algorithm for R2 We now consider the second-order linear recurrence `x[i] = a[i]*x[i-1] + b[i]*x[i-2]`. The computations in this recurrence involve multiplying $2 \times 2$ matrices [2]. The registers V11, V12, V21, and V22 hold the elements from the corresponding positions in the matrices. The first phase of this algorithm is as follows. V11 = 1.0; V12 = 0.0; V21 = 0.0; V22 = 1... |
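The link between the second-order recurrence and $2 \times 2$ matrix products can be sketched as follows. This is a serial illustration under assumptions, not the vectorized phase from the paper: the function names are hypothetical, and only the identity initialization of `v11, v12, v21, v22` mirrors the quoted text. Each step left-multiplies the accumulated product by the matrix with rows `(a[i], b[i])` and `(1, 0)`, which maps the pair `(x[i-1], x[i-2])` to `(x[i], x[i-1])`.

```python
def second_order_direct(a, b, x0, x1):
    """Reference loop: x[i] = a[i] * x[i-1] + b[i] * x[i-2] for i >= 2."""
    x = [x0, x1]
    for i in range(2, len(a)):
        x.append(a[i] * x[i - 1] + b[i] * x[i - 2])
    return x


def second_order_matrix(a, b, x0, x1):
    """Same values, obtained from a running product of 2x2 matrices."""
    v11, v12, v21, v22 = 1, 0, 0, 1  # start from the identity matrix
    x = [x0, x1]
    for i in range(2, len(a)):
        # Left-multiply the accumulated product by [[a[i], b[i]], [1, 0]].
        v11, v12, v21, v22 = (
            a[i] * v11 + b[i] * v21,
            a[i] * v12 + b[i] * v22,
            v11,
            v12,
        )
        # The top row of the product applied to (x1, x0) yields x[i].
        x.append(v11 * x1 + v12 * x0)
    return x
```

Because matrix multiplication is associative, the products for separate blocks can be formed independently and combined afterwards, which is the property the partition method relies on.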

91 | On direct methods of solving Poisson’s equation
- Buzbee, Golub, et al.
- 1970
Citation Context ...l algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into producing fast implementations of these algorithms on parallel and vector machines [24, 21, 33, 32, 13, 26, 30]. Some supercomputer manufacturers have considere... |

81 | A fast direct solution of Poisson’s equation using Fourier analysis
- Hockney
- 1965
Citation Context ...re to execute recurrences efficiently [34]. Three major classes of algorithms have emerged out of the research on algorithms to solve linear recurrences: recursive doubling [18, 31], cyclic reduction [14, 3, 15], and the partition method [5, 11, 35]. These methods take advantage of the associativity of $\oplus$ and $\otimes$. With a variation of cyclic reduction, Chen and Kuck showed how to solve an $m$th-order lin... |

53 | The power of parallel prefix
- Kruskal, Rudolph, et al.
- 1985
Citation Context ...$\otimes$ are binary associative operators, and $\otimes$ distributes over $\oplus$. Such linear recurrences frequently appear in scientific applications [20], are very useful in the design of parallel algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 1... |

43 | Radix sort for vector multiprocessors
- Zagha, Blelloch
- 1991
Citation Context ...raking can be used to vectorize linear recurrences, which cannot be vectorized with strip mining. Elsewhere it has been shown how loop raking can be used to ensure stability in a step of a radix sort [36]. Choosing parameters It is important to choose the parameters of a raked loop so as to avoid memory conflicts. For the purposes of loop raking, we consider a vector of n elements as being organized i... |

38 | Scan primitives for vector computers
- Chatterjee, Blelloch, et al.
- 1990
Citation Context ...ing organized in a rectangular fashion, characterized by three parameters l, s, and f, as shown in Figure 5. These three parameters characterize the vector completely and are called its shape factors [4]. From Figure 5, $n = (l - 1)s + f$. This equation is underspecified for the purposes of determining the shape factors given n. We use the following additional constraints for choosing the paramet... |

37 | A parallel method for tridiagonal equations
- Wang
- 1981

36 | Program optimization and parallelization using idioms
- Pinter, Pinter
- 1991
Citation Context ...t intended as a comprehensive formula for converting loops to recurrences, but rather as a set of case studies. Work on identifying scans, reductions, and recurrences in loop code is currently sparse [32, 28]. LL5 The inner loop of Livermore loop 5 is as follows. `DO 5 I = 2,N` `5 X(I) = Z(I) * (Y(I) - X(I-1))` By multiplying out the quantities on the right-hand side of the assignment, and some precomputation... |
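Multiplying out the right-hand side of the Livermore loop 5 statement gives `X(I) = -Z(I) * X(I-1) + Z(I) * Y(I)`, which has the first-order form $x_i = a_i x_{i-1} + b_i$ with $a_i = -z_i$ and $b_i = z_i y_i$. The Python sketch below checks that rewriting; it uses 0-based lists in place of the Fortran arrays, and the function names are hypothetical rather than taken from the paper.

```python
def ll5_direct(x0, y, z):
    """Livermore loop 5 as written: x[i] = z[i] * (y[i] - x[i-1])."""
    x = [x0]
    for i in range(1, len(y)):
        x.append(z[i] * (y[i] - x[i - 1]))
    return x


def ll5_as_recurrence(x0, y, z):
    """Same loop after precomputing first-order recurrence coefficients."""
    a = [-zi for zi in z]                    # multiplies x[i-1]
    b = [zi * yi for zi, yi in zip(z, y)]    # additive term
    x = [x0]
    for i in range(1, len(y)):
        x.append(a[i] * x[i - 1] + b[i])
    return x
```

The coefficient arrays `a` and `b` depend only on the inputs, so they can be computed in a fully parallel pass before any recurrence solver is applied.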

33 | Parallel tridiagonal equation solvers
- Stone
- 1975

28 | An analysis of the computational and parallel complexity of the Livermore loops
- Feo
- 1988
Citation Context ... and pattern recognition (to recognize recurrences). These are Livermore loops 5 and 19, and the PACK operation. Feo previously described how loops 5 and 19 can be implemented with recursive doubling [10]. Here we transform the loops directly into linear recurrences. In the code that follows, function names conform to those used in Table 1, and Fortran 90 constructs are used where appropriate. This di... |

18 | Practical parallel band triangular system solvers
- Chen, Kuck, et al.
- 1978
Citation Context ...[34]. Three major classes of algorithms have emerged out of the research on algorithms to solve linear recurrences: recursive doubling [18, 31], cyclic reduction [14, 3, 15], and the partition method [5, 11, 35]. These methods take advantage of the associativity of $\oplus$ and $\otimes$. With a variation of cyclic reduction, Chen and Kuck showed how to solve an $m$th-order linear recurrence in $O(\lg n \lg m)$ time [6]... |

18 | Parallel computers: architecture, programming and algorithms
- Hockney, Jesshope
- 1981
Citation Context ...re to execute recurrences efficiently [34]. Three major classes of algorithms have emerged out of the research on algorithms to solve linear recurrences: recursive doubling [18, 31], cyclic reduction [14, 3, 15], and the partition method [5, 11, 35]. These methods take advantage of the associativity of $\oplus$ and $\otimes$. With a variation of cyclic reduction, Chen and Kuck showed how to solve an $m$th-order lin... |

16 | Solving triangular systems on a parallel computer
- Sameh, Brent
- 1977

14 | Time and parallel processor bounds for linear recurrence systems
- Chen, Kuck
- 1988

14 | The complexity of parallel evaluation of linear recurrence
- Hyafil, Kung
- 1975

9 | An algorithm for solving linear recurrence systems on parallel and pipelined machines
- Gajski
- 1981
Citation Context .... Such linear recurrences frequently appear in scientific applications [20], are very useful in the design of parallel algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to solve linear recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into produ... |

9 | Compiling techniques for first-order linear recurrences on a vector computer
- Tanaka, Iwasawa, et al.
- 1988
Citation Context ...r recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into producing fast implementations of these algorithms on parallel and vector machines [24, 21, 33, 32, 13, 26, 30]. Some supercomputer manufacturers have considered the solution of linear recurrences important enough to warrant the addition of special hardware to execute recurrences efficiently [34]. Three major ... |

6 | Vectorization of linear recurrence relations
- Vorst, Dekker
- 1989
Citation Context ...r recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into producing fast implementations of these algorithms on parallel and vector machines [24, 21, 33, 32, 13, 26, 30]. Some supercomputer manufacturers have considered the solution of linear recurrences important enough to warrant the addition of special hardware to execute recurrences efficiently [34]. Three major ... |

5 | Efficient parallel algorithms for linear recurrence computation
- Greenberg, Ladner, et al.
- 1982

5 | Measurements of Parallelism in Ordinary FORTRAN Programs
- Kuck, Budnik, et al.
- 1974
Citation Context ... the Livermore Loops are for Fortran code. where $\oplus$ and $\otimes$ are binary associative operators, and $\otimes$ distributes over $\oplus$. Such linear recurrences frequently appear in scientific applications [20], are very useful in the design of parallel algorithms [19, 1], and can be used to solve a much broader class of recurrences [18, 11]. Researchers have been studying parallel and vector algorithms to ... |

4 | High-speed processing schemes for summation type and iteration type vector instructions on HITACHI supercomputer S-820 system
- Wada, Ishii, et al.
- 1988
Citation Context ...3, 32, 13, 26, 30]. Some supercomputer manufacturers have considered the solution of linear recurrences important enough to warrant the addition of special hardware to execute recurrences efficiently [34]. Three major classes of algorithms have emerged out of the research on algorithms to solve linear recurrences: recursive doubling [18, 31], cyclic reduction [14, 3, 15], and the partition method [5, ... |

4 | Investigation of different algorithms for the first order recurrence - Häfner, Schönauer - 1990 |

4 | Parallel recurrence solvers for vector and SIMD supercomputers - Conn, Podrazik - 1992 |

3 | Solving linear recurrence problems on supercomputers
- Shimizu, Kanada
- 1991
Citation Context ...r recurrences since the 1960s [17, 3, 18, 31, 6, 16, 29, 22, 11, 35, 12, 23], and considerable effort has gone into producing fast implementations of these algorithms on parallel and vector machines [24, 21, 33, 32, 13, 26, 30]. Some supercomputer manufacturers have considered the solution of linear recurrences important enough to warrant the addition of special hardware to execute recurrences efficiently [34]. Three major ... |

2 | Solving linear recurrences on pipelined computers
- Kunkel, Smith
- 1987

2 | New class of parallel algorithms for solving first-order and certain classes of second-order linear recurrences
- Lakshmivarahan, Dhall
- 1985

2 | The solution of tridiagonal linear systems on the CDC STAR-100 computer
- Lambiotte, Voigt
- 1975

2 | On the efficient vectorization of the general first-order linear recurrence relation
- Overill
- 1991

1 | Investigation of different algorithms for the first order recurrence
- Häfner, Schönauer
- 1990