## Architecture Independent Massive Parallelization of Divide-and-Conquer Algorithms (1995)

Venue: | Mathematics of Program Construction, Lecture Notes in Computer Science 947 |

Citations: | 9 - 1 self |

### BibTeX

@INPROCEEDINGS{Achatz95architectureindependent,

author = {Klaus Achatz and Wolfram Schulte},

title = {Architecture Independent Massive Parallelization of Divide-and-Conquer Algorithms},

booktitle = {Mathematics of Program Construction, Lecture Notes in Computer Science 947},

year = {1995},

pages = {97--127},

publisher = {Springer-Verlag}

}

### Abstract

We present a strategy to develop, in a functional setting, correct, efficient and portable Divide-and-Conquer (DC) programs for massively parallel architectures. Starting from an operational DC program, mapping sequences to sequences, we apply a set of semantics-preserving transformation rules, which transform the parallel control structure of DC into a sequential control flow, thereby making the implicit data parallelism in a DC scheme explicit. In the next phase of our strategy, the parallel architecture is fully expressed, where 'architecture dependent' higher-order functions are introduced. Then -- due to the rising communication complexities on particular architectures -- topology-dependent communication patterns are optimized in order to reduce the overall communication costs. The advantages of this approach are manifold and are demonstrated with a set of non-trivial examples.
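The central step the abstract describes, turning the recursive control structure of a DC program into a sequential loop whose body is one data-parallel operation over the whole sequence, can be illustrated with bitonic sort, one of the examples the paper treats. The following is only a minimal Python sketch under stated assumptions: it is not the paper's derivation (which proceeds by transformation rules in a functional setting), the function names are hypothetical, and the input length is assumed to be a power of two.

```python
from typing import List


def bitonic_merge_rec(s: List[int]) -> List[int]:
    """Recursive (DC) form: sort a bitonic sequence of length 2^k ascending."""
    n = len(s)
    if n == 1:
        return s
    half = n // 2
    # Divide: element-wise min/max of the two halves yields two bitonic
    # halves, with every element of `lo` <= every element of `hi`.
    lo = [min(s[i], s[i + half]) for i in range(half)]
    hi = [max(s[i], s[i + half]) for i in range(half)]
    return bitonic_merge_rec(lo) + bitonic_merge_rec(hi)


def bitonic_sort_rec(s: List[int]) -> List[int]:
    """Recursive (DC) bitonic sort; len(s) is assumed to be a power of two."""
    n = len(s)
    if n == 1:
        return s
    half = n // 2
    left = bitonic_sort_rec(s[:half])          # ascending
    right = bitonic_sort_rec(s[half:])[::-1]   # descending
    return bitonic_merge_rec(left + right)     # ascending ++ descending is bitonic


def step(a: List[int], i: int, j: int, k: int) -> int:
    """New value at position i for one compare-exchange step (distance j, block size k)."""
    partner = i ^ j
    ascending = (i & k) == 0
    keep_min = (i < partner) == ascending
    return min(a[i], a[partner]) if keep_min else max(a[i], a[partner])


def bitonic_sort_flat(s: List[int]) -> List[int]:
    """Same algorithm with sequential control flow: each loop iteration is a
    single data-parallel step in which every position is computed
    independently from the previous sequence."""
    a = list(s)
    n = len(a)
    k = 2
    while k <= n:
        j = k // 2
        while j >= 1:
            a = [step(a, i, j, k) for i in range(n)]
            j //= 2
        k *= 2
    return a
```

In the flat version the list comprehension stands in for one processing element per sequence position; how the `i ^ j` partner exchange maps onto a concrete topology is exactly the architecture-dependent question the abstract defers to the later phases.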

### Citations

1309 | Monads for Functional Programming - Wadler - 1995 |

635 | An Introduction to Parallel Algorithms - JaJa - 1992 |

501 | Sorting networks and their applications - Batcher - 1968
Citation Context ...roblem in computational geometry, namely the construction of a convex hull. 5.1 Bitonic Sort The well-known bitonic sort algorithm was proposed by K. E. Batcher in 1968 for so-called sorting networks [Bat68] and later adapted to parallel computers [NS79]. Preliminaries and Operational Specifications The bitonic sort algorithm is based on the central notion of the bitonic sequence. A sequence s is said... |

415 | Algorithmic Skeletons: Structured Management of Parallel Computation - Cole - 1989
Citation Context ... by Axford and Joy. Aside from this, no calculation nor interesting distributed implementation is presented. Among the first, who used the skeleton approach in a functional setting, initiated by Cole [Col89], was a group at Imperial College [DFH+93]. Their skeletons are rather high-level, e.g., they distinguish farming, pipelining, DC and other high-level skeletons, but do not tackle massive parallelism... |

171 | The Design and Analysis of Parallel Algorithms - Akl - 1989
Citation Context ..., $s_{2n}\}$ of points in the plane, the convex hull of S is the smallest convex polygon P, for which each point in S is either on the boundary of P or in its interior. The following analogy given in [Akl89] might be useful: Assume that the points of S are nails driven halfway into a wooden board. A rubber band is now stretched around the set of nails and then released. When the band settles, it has the ... |

95 | NESL: A Nested Data-Parallel Language (version 2.6) - Blelloch - 1993
Citation Context ...rounding text. 2 The Balanced Sequence Model Sequences in general can be used to express data parallelism in an abstract way, where parallelism is achieved exclusively through operations on sequences [Ble92]. In this section we explore this approach, present the traditional operations on sequences and its data parallel view (Sect. 2.1), introduce communication oriented operations (Sect. 2.2), and define ... |

95 | Prefix Sums and Their Applications - Blelloch - 1990 |

69 | Specification and Transformation of Programs - Partsch - 1990
Citation Context ...o develop parallel programs where transformational programming summarizes a methodology for constructing correct and efficient programs from formal specifications by applying meaning-preserving rules [Par90]. Starting with a functional specification, we derive programs for the massively data parallel model, which assumes a large data collection that needs to be processed and that there is a single proce... |

44 | A Survey and Classification of Some Program Transformation Techniques - Feather - 1986
Citation Context ...d very efficient algorithms can be derived. -- The presented transformations can be automated using an extended compilation approach, where the user may give hints in the form of laws to the compiler [Fea87]. -- Architecture independent data parallelism is distinguished from architecture dependent one. Correspondingly we operate on different levels of abstraction (sequences vs. skeletons) and supply diff... |

42 | Bitonic sort on a mesh-connected parallel computer - Nassimi, Sahni - 1979
Citation Context ...struction of a convex hull. 5.1 Bitonic Sort The well-known bitonic sort algorithm was proposed by K. E. Batcher in 1968 for so-called sorting networks [Bat68] and later adapted to parallel computers [NS79]. Preliminaries and Operational Specifications The bitonic sort algorithm is based on the central notion of the bitonic sequence. A sequence s is said to be bitonic if it either monotonically incre... |

19 | Parallel computing comes of age: supercomputer level parallel computations at Caltech. Concurrency: Practice and Experience - Fox - 1989 |

17 | Deductive Derivation of Parallel Programs - Pepper - 1993 |

16 | An Algebraic Model for Divide-and-Conquer Algorithms and its Parallelism - Mou, Hudak
Citation Context ...arallelization on a particular architecture. Thus, our work can be seen as a completion of Smith's work towards data parallel execution. Mou and Hudak describe DC in an algebraic model called Divacon [MH88]. They recognize that the original DC model is too restrictive with respect to decomposition and communication. For the latter, they introduce so-called pre- and postmorphisms, which correspond with our ... |

10 | Compile-time transformations and optimization of parallel divide-and-conquer algorithms - Carpentieri, Mou - 1991
Citation Context ...nge of examples. However, they only sketch the mapping of the model on parallel computers. This algebraic model was later picked up by Carpentieri and Mou, who study communication issues in the model [CM91]. They present hypercube specific rules to optimize communication by introducing new storage levels. These rules are expressed in Divacon, whereas our approach takes the architecture explicitly int... |

9 | List homomorphic parallel algorithms for bracket matching - Cole - 1993
Citation Context ...ckle massive parallelism, as it is understood by us. Still more abstract is the work on investigating parallelism within the Bird-Meertens formalism, which recently has gained much attention (cf. e.g. [Col93]). However, all these different approaches have in common that they stop on the level of DC algorithms or homomorphisms, whereas our approach proceeds down to an architecture specific target program. ... |

7 | Functional development of massively parallel programs - Pepper, Exner, et al. - 1993
Citation Context ...number of real processors. However, not all massively parallel machines support virtual processors. Therefore, data distribution is still a major problem, which is tackled by a group around Pepper [PES93]. 7 Conclusion and Future Research In this paper, we have presented a transformation strategy to develop correct, efficient, data parallel DC algorithms, and showed how such derivation is guided. The ... |

6 | List processing primitives for parallel computation - Axford, Joy - 1993 |

5 | Some experiments in transforming towards parallel executability - Partsch - 1993 |

3 | Two examples of parallel program derivation: Parallel prefix and matrix multiplication - Geerling - 1992 |

3 | Parallelization of divide-and-conquer in the Bird-Meertens formalism - Gorlatch, Lengauer - 1993
Citation Context ...y stop on the level of DC algorithms or homomorphisms, whereas our approach proceeds down to an architecture specific target program. An exception to these works is presented by Gorlatch and Lengauer [GL93]. They develop a DC function, using mainly the control parallelism. In particular, they do not require that there is a single PE for each member in the sequence, but assume that there is a single PE f... |

2 | Transformational derivation of (parallel) programs using skeletons (available by ftp from ftp.win.tue.nl) - Boiten, Geerling, et al. |

2 | Domain morphisms: A new construct for parallel programming and formalizing program optimization - Chen, Choo - 1990 |

2 | Formal derivation of SIMD parallelism from non-linear recursive specifications - Geerling - 1994 |

1 | The divide-and-conquer paradigm as a basis for parallel language design - Crystal - 1992 |