## Systematic Efficient Parallelization of Scan and Other List Homomorphisms (1996)


### Download Links

- [brahms.fmi.uni-passau.de]
- [wwwmath.uni-muenster.de]

### Other Repositories/Bibliography

- DBLP

Venue: Annual European Conference on Parallel Processing, LNCS 1124

Citations: 27 (7 self)

### BibTeX

@INPROCEEDINGS{Gorlatch96systematicefficient,

author = {Sergei Gorlatch},

title = {Systematic Efficient Parallelization of Scan and Other List Homomorphisms},

booktitle = {Annual European Conference on Parallel Processing, LNCS 1124},

year = {1996},

pages = {401--408},

publisher = {Springer-Verlag}

}


### Abstract

Homomorphisms are functions which can be parallelized by the divide-and-conquer paradigm. A class of distributable homomorphisms (DH) is introduced and an efficient parallel implementation schema for all functions of the class is derived by transformations in the Bird-Meertens formalism. The schema can be directly mapped on the hypercube with an unlimited or an arbitrary fixed number of processors, providing provable correctness and predictable performance. The popular scan-function (parallel prefix) illustrates the presentation: the systematically derived implementation for scan coincides with the practically used "folklore" algorithm for distributed-memory machines.
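The hypercube scan mentioned in the abstract can be illustrated with a small sequential simulation. This is a minimal sketch of the commonly described algorithm, not the paper's derived schema: it assumes one list element per virtual processor, and the function name `hypercube_scan` and the variables `prefix` and `total` are hypothetical names chosen for illustration. In each dimension, partner processors exchange the reduction of their current subcube, and the processor in the upper half folds the received value into its prefix.

```python
from itertools import accumulate
from operator import add


def hypercube_scan(op, xs):
    """Simulate an inclusive scan on a hypercube with len(xs) = 2**k processors.

    op must be associative; it need not be commutative.
    """
    n = len(xs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    prefix = list(xs)  # scan result accumulated so far by each processor
    total = list(xs)   # reduction over the subcube each processor belongs to
    d = 1
    while d < n:  # one communication step per hypercube dimension
        new_prefix, new_total = prefix[:], total[:]
        for i in range(n):
            partner = i ^ d  # neighbour along the current dimension
            if i & d:
                # Upper half: the partner's block precedes ours in list order.
                new_prefix[i] = op(total[partner], prefix[i])
                new_total[i] = op(total[partner], total[i])
            else:
                # Lower half: only the subcube total changes.
                new_total[i] = op(total[i], total[partner])
        prefix, total = new_prefix, new_total
        d *= 2
    return prefix


if __name__ == "__main__":
    # String concatenation is associative but not commutative,
    # so it also checks that operands are combined in list order.
    print(hypercube_scan(add, ["a", "b", "c", "d"]))
    print(hypercube_scan(add, [1, 2, 3, 4, 5, 6, 7, 8])
          == list(accumulate([1, 2, 3, 4, 5, 6, 7, 8])))
```

The simulation performs log2(n) rounds, one per dimension, which is the step count usually quoted for this algorithm when the number of processors equals the list length.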

### Citations

1308 | Monads for functional programming
- Wadler
- 1995
Citation Context ...es of correctness and performance should and can be addressed during the design/derivation process, rather than as an afterthought. As a derivational calculus we use the Bird-Meertens Formalism (BMF) =-=[1]-=-. Computations are specified using a set of higher-order functions over lists and other data structures; the specification is refined into an executable form by semantically sound transformation rules... |

207 | The cube-connected cycles: A versatile network for parallel computation
- Preparata, Vuillemin
- 1981
Citation Context ...tion yields the parallel algorithm for scan, which is nowadays considered to be the best in practice [8]. The structure of the parallel implementation for DH is similar to the ascending algorithms of =-=[10]-=-, which provides confidence that other important application algorithms can be derived in a similar way. Because of the lack of space, we only mention the related work on divide-and-conquer [11], f... |

157 | Scans as primitive parallel operations
- Blelloch
- 1989
Citation Context ...ntation schema for all functions of the class. As an illustrating example, we use the scan (parallel prefix) function, which encapsulates a computational pattern common for many parallel applications =-=[2]-=-. The specialization of our parallel implementation schema for the case of scan yields the known parallel algorithms for scan on a hypercube with either a linear or an arbitrary fixed number of proces... |

69 | The design of a standard message passing interface for distributed memory concurrent computers
- Walker
- 1994
Citation Context ...: $R_1 \ominus R_2 = \mathit{zip}\,(\odot)\,(R_1, R_2) \mathbin{+\!\!+} \mathit{zip}\,(\odot)\,(R_1, R_2)$ (6) This fits format (4), thus redd is a DH: $\mathit{redd}\,(\odot) = \odot \updownarrow \odot$ (7) Function redd is implemented as ReduceAll in the recent MPI standard =-=[9]-=-. Scan: Adjusting to DH. Let us try to express the right-hand side of (3) in format (4), i.e., with both arguments of ++ in zipped form. This is easy for the left argument but requires an additional f... |

59 | Powerlist: A structure for parallel recursion
- Misra
- 1994
Citation Context ..., and by the Project INTAS-93-1702. 2 BMF, Homomorphisms and Scan We use a variant of the Bird-Meertens Formalism (BMF) with non-empty lists of length $2^k$, $k = 0, 1, \cdots$ (powerlists =-=[3]-=-). Function length yields the length of a list. The constructors are: (i) $[\,.\,]$ yielding the singleton list and (ii) balanced concatenation ++, where x ++ y is defined iff $\mathit{length}\ x = \mathit{length}\ y = 2^k$. We... |

58 | A Cost Calculus for Parallel Functional Programming
- Skillicorn, Cai
- 1995
Citation Context ...t algorithm. However, when a homomorphism yields a list, i.e., its combine operator contains ++, then the tree implementation requires linear execution time, independently of the number of processors =-=[6]-=-. Scan as a Homomorphism. Our illustrating example is the scan-function which, for associative $\odot$ and a list, computes "prefix sums", e.g.: $\mathit{scan}\,(\odot)\,[a, b, c, d] = [a,\ (a \odot b),\ (a \odot b \odot c),\ (a \odot$... |

31 | List ranking and list scan on the Cray C-90
- Reid-Miller
- 1994
Citation Context ...3) where $S_1 = \mathit{scan}\,(\odot)\,x$, $S_2 = \mathit{scan}\,(\odot)\,y$. Here, so-called sectioning $(a\,\odot)$ is used: $(a\,\odot)\,b = a \odot b$. Despite the fact that $\oslash$ contains ++, there exist efficient parallel algorithms for scan =-=[7, 8]-=-, which, rather than producing a monolithic output list, distribute it between processors. Our goal is to derive such algorithms systematically. 3 Distributable Homomorphisms We introduce a specific c... |

27 | Upwards and downwards accumulations on trees
- Gibbons
- 1993
Citation Context ...er important application algorithms can be derived in a similar way. Because of the lack of space, we only mention the related work on divide-and-conquer [11], formal derivation of scan algorithms =-=[12, 13, 14]-=-, parallelizing transformations in BMF [6] and transition from functional to parallel imperative representations [15]. An extended version of the paper with the full comparison to the related work and... |

23 | Parallel Computing
- Quinn
- 1994
Citation Context ...3) where $S_1 = \mathit{scan}\,(\odot)\,x$, $S_2 = \mathit{scan}\,(\odot)\,y$. Here, so-called sectioning $(a\,\odot)$ is used: $(a\,\odot)\,b = a \odot b$. Despite the fact that $\oslash$ contains ++, there exist efficient parallel algorithms for scan =-=[7, 8]-=-, which, rather than producing a monolithic output list, distribute it between processors. Our goal is to derive such algorithms systematically. 3 Distributable Homomorphisms We introduce a specific c... |

12 | Constructing list homomorphisms
- Gorlatch
- 1995
Citation Context ...e, with $\oslash$ applied in the nodes. There are two problems for a given function: first, how to find the combine operator $\oslash$ of (2) and, second, how to implement the red stage efficiently in parallel. In =-=[5]-=-, we described a systematic approach to constructing the combine operator, starting from two sequential representations of the given function. The present paper deals with the second problem, the impl... |

12 | Divacon: A parallel language for scientific computing based on divide-and-conquer
- Mou
- 1990
Citation Context ...of [10], which provides confidence that other important application algorithms can be derived in a similar way. Because of the lack of space, we only mention the related work on divide-and-conquer =-=[11]-=-, formal derivation of scan algorithms [12, 13, 14], parallelizing transformations in BMF [6] and transition from functional to parallel imperative representations [15]. An extended version of the pap... |

12 | A correctness proof of parallel scan
- O'Donnell
- 1994
Citation Context ...er important application algorithms can be derived in a similar way. Because of the lack of space, we only mention the related work on divide-and-conquer [11], formal derivation of scan algorithms =-=[12, 13, 14]-=-, parallelizing transformations in BMF [6] and transition from functional to parallel imperative representations [15]. An extended version of the paper with the full comparison to the related work and... |

8 | Stages and transformations in parallel programming
- Gorlatch
- 1996
Citation Context ...ction h is a homomorphism iff: $h = \mathit{red}\,(\oslash) \circ \mathit{map}\,(f)$ (2) where $\oslash$ is from (1) and $f(a) = h([a])$. Theorem 2 provides a common parallelization for all homomorphisms as a composition of two stages =-=[4]-=-: the first, map, is totally parallel, the second, red, can be parallelized on a tree-like structure, with $\oslash$ applied in the nodes. There are two problems for a given function: first, how to find the ... |
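
The two-stage form quoted in the citation context above, a fully parallel map followed by a tree-shaped reduction with the combine operator, can be sketched sequentially as follows. The names `hom` and `scan_combine` are hypothetical, and the combine operator used for scan is the standard textbook formulation (append the second partial scan after folding in the last element of the first); it is an assumption here, since the excerpt truncates the paper's own equation.

```python
from operator import add


def hom(combine, f, xs):
    """Evaluate red(combine) . map(f) on a list of length 2**k.

    The reduction is arranged as a balanced binary tree, so each
    iteration of the loop corresponds to one parallel tree level.
    """
    n = len(xs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    level = [f(a) for a in xs]  # map stage: independent per element
    while len(level) > 1:       # red stage: combine neighbouring pairs
        level = [combine(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def scan_combine(op):
    """Assumed combine operator for scan: s1 ++ map (last(s1) op) s2."""
    return lambda s1, s2: s1 + [op(s1[-1], v) for v in s2]


if __name__ == "__main__":
    # scan expressed as a homomorphism: f wraps each element in a singleton list
    print(hom(scan_combine(add), lambda a: [a], [1, 2, 3, 4]))
    # length expressed as a homomorphism: f maps every element to 1
    print(hom(add, lambda a: 1, list("abcdefgh")))
```

Because the scan combine operator builds a list whose size doubles at each tree level, the root combination alone touches half of the output, which matches the excerpts' remark that a tree implementation of a list-valued homomorphism takes linear time regardless of the number of processors.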