Results 1 - 9 of 9
Powerlist: a structure for parallel recursion
- ACM Transactions on Programming Languages and Systems, 1994
Cited by 55 (2 self)
Many data parallel algorithms – Fast Fourier Transform, Batcher’s sorting schemes and prefix sum – exhibit recursive structure. We propose a data structure, powerlist, that permits succinct descriptions of such algorithms, highlighting the roles of both parallelism and recursion. Simple algebraic properties of this data structure can be exploited to derive properties of these algorithms and establish equivalence of different algorithms that solve the same problem.
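As a rough illustration of the recursive structure this abstract refers to, here is a minimal Python sketch of a prefix sum over a list whose length is a power of two. It splits the list into even- and odd-indexed halves, recurses on their pairwise sums, and interleaves the results. The function name `prefix_sum` and the use of plain Python lists are assumptions made for illustration; the abstract does not give the paper's notation, and this is not claimed to be its formulation.

```python
def prefix_sum(xs):
    """Inclusive prefix sum of a list whose length is a power of two.

    Illustrative sketch only: split by even/odd index, recurse on the
    pairwise sums, then interleave the two result halves.
    """
    n = len(xs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(xs)
    evens, odds = xs[0::2], xs[1::2]
    # t[i] is the inclusive prefix sum up to original position 2*i + 1.
    t = prefix_sum([a + b for a, b in zip(evens, odds)])
    # Shift t right by one, filling with the additive identity.
    shifted = [0] + t[:-1]
    # Prefix sums at even positions: everything before the pair plus the even element.
    at_even = [s + a for s, a in zip(shifted, evens)]
    out = [None] * n
    out[0::2] = at_even
    out[1::2] = t
    return out


print(prefix_sum([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

The two list comprehensions at each level are elementwise and could run in parallel; only the recursion depth, which is logarithmic in the list length, is inherently sequential.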
Systematic Efficient Parallelization of Scan and Other List Homomorphisms
- In Annual European Conference on Parallel Processing, LNCS 1124, 1996
Cited by 25 (7 self)
Homomorphisms are functions which can be parallelized by the divide-and-conquer paradigm. A class of distributable homomorphisms (DH) is introduced and an efficient parallel implementation schema for all functions of the class is derived by transformations in the Bird-Meertens formalism. The schema can be directly mapped on the hypercube with an unlimited or an arbitrary fixed number of processors, providing provable correctness and predictable performance. The popular scan-function (parallel prefix) illustrates the presentation: the systematically derived implementation for scan coincides with the practically used "folklore" algorithm for distributed-memory machines.
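The abstract does not spell out the "folklore" scan algorithm. The sketch below simulates one commonly described hypercube scheme, on the assumption that it is the kind of algorithm meant. Each simulated processor keeps a running prefix and a running total and, in each dimension, exchanges totals with the partner whose index differs in that bit. The function name `hypercube_scan` and the sequential simulation are illustrative assumptions.

```python
def hypercube_scan(values):
    """Simulate an inclusive +-scan on a hypercube of len(values) processors.

    Assumption: one value per processor, processor count a power of two.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("processor count must be a power of two")
    prefix = list(values)  # running inclusive prefix held by each processor
    total = list(values)   # running sum over each processor's current subcube
    d = 1
    while d < p:
        new_prefix, new_total = prefix[:], total[:]
        for i in range(p):
            partner = i ^ d  # neighbour along the current dimension
            if partner < i:
                # The partner's subcube precedes ours: fold its total in on the left.
                new_prefix[i] = total[partner] + prefix[i]
                new_total[i] = total[partner] + total[i]
            else:
                new_total[i] = total[i] + total[partner]
        prefix, total = new_prefix, new_total
        d <<= 1
    return prefix


print(hypercube_scan([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Keeping the operand order (lower subcube on the left) means the same scheme also works for associative operators that are not commutative, provided `+` is replaced accordingly.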
Parallelization of Divide-and-Conquer by Translation to Nested Loops
- J. Functional Programming, 1997
Cited by 12 (6 self)
We propose a sequence of equational transformations and specializations which turns a divide-and-conquer skeleton in Haskell into a parallel loop nest in C. Our initial skeleton is often viewed as general divide-and-conquer. The specializations impose a balanced call tree, a fixed degree of the problem division, and elementwise operations. Our goal is to select parallel implementations of divide-and-conquer via a space-time mapping, which can be determined at compile time. The correctness of our transformations is proved by equational reasoning in Haskell; recursion and iteration are handled by induction. Finally, we demonstrate the practicality of the skeleton by expressing Strassen's matrix multiplication in it.
Extracting and Implementing List Homomorphisms in Parallel Program Development
- Science of Computer Programming, 1997
Cited by 12 (0 self)
In this paper, we study functions called list homomorphisms, which represent a particular pattern of parallelism.
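For readers unfamiliar with the term, a list homomorphism is usually defined as a function h such that h(xs ++ ys) = h(xs) ⊗ h(ys) for some associative operator ⊗. The sketch below evaluates such a function by divide-and-conquer. The helper name `hom` and its parameters `f` and `combine` are hypothetical, chosen only for this illustration.

```python
def hom(f, combine, xs):
    """Evaluate a list homomorphism h on a non-empty list.

    h([x]) = f(x) and h(a + b) = combine(h(a), h(b)), where `combine`
    is assumed to be associative. The two recursive calls are independent,
    so they could be evaluated in parallel.
    """
    if not xs:
        raise ValueError("this sketch handles non-empty lists only")
    if len(xs) == 1:
        return f(xs[0])
    mid = len(xs) // 2
    return combine(hom(f, combine, xs[:mid]), hom(f, combine, xs[mid:]))


# Sum and length are both homomorphisms over list concatenation.
print(hom(lambda x: x, lambda a, b: a + b, [1, 2, 3, 4, 5]))  # 15
print(hom(lambda _: 1, lambda a, b: a + b, ["a", "b", "c"]))  # 3
```

Because `combine` is associative, the result does not depend on where the list is split, which leaves an implementation free to choose split points to suit the number of processors.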
Derivation of Efficient Data Parallel Programs
- In 17th Australasian Computer Science Conference, 1993
Cited by 6 (0 self)
This paper considers the expression and derivation of efficient data parallel programs for SIMD and MIMD machines. It is shown that efficient parallel programs must utilise both sequential and parallel computation; these are termed hybrid programs. The Bird-Meertens formalism, a calculus of higher order functions, is used to derive and express programs. Our goal is to derive efficient parallel programs for a variety of machines by: starting with an abstract specification, deriving an abstract algorithm and successively refining this to more efficient and machine dependent algorithms incorporating greater implementation detail. Nested data structures are used to express hybrid algorithms. Using this technique efficient accumulate (scan/parallel prefix) algorithms are derived for SIMD and MIMD machines. 1 Introduction The main reason for parallel programming is to achieve high performance. Unfortunately designing and writing efficient parallel programs, especially for MIMD machines, i...
List Processing Primitives for Parallel Computation
- Computer Languages, 1993
Cited by 6 (0 self)
A new model of list processing is proposed which is more suitable as a basic data structure for architecture-independent programming languages than the traditional model of lists. Its main primitive functions are: concatenate, which concatenates two lists; split, which partitions a list into two parts; and length, which gives the number of elements in a list. This model contains a degree of non-determinism which allows greater freedom to the implementation to achieve high performance on both parallel and serial architectures. Keywords: data structures, functional programming, list processing, parallel programming. 1 Introduction Lists have been used as basic data structures within programming languages since the 1950s. The most elegant and successful formulation was in Lisp [9] with its primitive functions car, cdr and cons, often now referred to by the more meaningful names of head, tail and cons respectively. Lisp and its model of list processing based on the head, tail and cons ...
Formal Derivation and Implementation of Divide-and-Conquer on a Transputer Network
- Transputer Applications and Systems '94, 1994
Cited by 2 (2 self)
This paper considers parallel program development based on functional mutually recursive specifications. The development yields a communication structure linking an arbitrary fixed number of processors and an SPMD program executable on the structure. There are two steps in the development process: first, a parallel functional implementation is obtained through formal transformations in the Bird-Meertens formalism; it is then systematically transformed into an imperative target program with message passing. The approach is illustrated with a divide-and-conquer algorithm for numerical two-dimensional sparse grid integration. The optimization of the target program and the results of experimental performance measurements on a 64-transputer network under OS Parix are presented. 1 Introduction We take the following approach to parallelization: we try to identify certain standard patterns of high-level functional specifications and to associate equivalent parallel programs to them...
Notes on the Space-Time Mapping of Divide-and-Conquer Recursions
- In GI/ITG FG PARS'95, number 14 in PARS Mitteilungen, 1995
Cited by 2 (2 self)
We propose a functional program skeleton for balanced fixed-degree divide-and-conquer and a method for its parallel implementation on message-passing multiprocessors. In the method, the operations of the skeleton are first mapped to a geometric computational model which is then mapped to space-time in order to expose the inherent parallelism. This approach is inspired by the method of parallelizing nested loops in the polytope model. Keywords: divide-and-conquer, functional programming, parallelization, polytope model, skeleton, space-time mapping. 1 Introduction The divide-and-conquer (D&C) paradigm is a special case of cascading recursion which enables efficient solutions to many practical problems like the multiplication of matrices or large integers, fast Fourier transform, sorting, etc. We are interested in the parallelization of D&C recursions with the goal of sublinear execution times on a mesh. Sublinearity can only be achieved if input data are read in parallel and pro...
A personal, historical perspective of parallel programming for high performance
- Communication-Based Systems (CBS 2000), 2000

