Systematic Extraction and Implementation of Divide-and-Conquer Parallelism
Programming Languages: Implementations, Logics, and Programs, Lecture Notes in Computer Science 1140, 1996
Abstract

Cited by 22 (5 self)
Homomorphisms are functions that match the divide-and-conquer paradigm and thus can be computed in parallel. Two problems are studied for homomorphisms on lists: (1) parallelism extraction: finding a homomorphic representation of a given function; (2) parallelism implementation: deriving an efficient parallel program that computes the function. A systematic approach to parallelism extraction proceeds by generalization of two sequential representations based on traditional cons lists and dual snoc lists. For some non-homomorphic functions, e.g., the maximum segment sum problem, our method provides an embedding into a homomorphism. The implementation is addressed by introducing a subclass of distributable homomorphisms and deriving for them a parallel program schema, which is time optimal on the hypercube architecture. The derivation is based on equational reasoning in the Bird-Meertens formalism, which guarantees the correctness of the parallel target program. The approach is illustrated with function...
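As a concrete illustration of the embedding mentioned in this abstract (our own sketch, not code from the paper): the maximum segment sum is not itself a list homomorphism, but tupling it with the best prefix sum, best suffix sum and total sum yields one, which can then be evaluated divide-and-conquer style.

```haskell
-- A list homomorphism h satisfies h (xs ++ ys) = h xs `odot` h ys for an
-- associative odot, so it can be evaluated by balanced splitting.
hom :: (b -> b -> b) -> (a -> b) -> b -> [a] -> b
hom _    _ e []  = e
hom _    f _ [x] = f x
hom odot f e xs  = hom odot f e ls `odot` hom odot f e rs
  where (ls, rs) = splitAt (length xs `div` 2) xs

-- mss alone is not homomorphic, but the 4-tuple
-- (best segment, best prefix, best suffix, total sum) is:
tuple :: Int -> (Int, Int, Int, Int)
tuple x = (max x 0, max x 0, max x 0, x)

combine :: (Int, Int, Int, Int) -> (Int, Int, Int, Int) -> (Int, Int, Int, Int)
combine (m1, p1, s1, t1) (m2, p2, s2, t2) =
  ( maximum [m1, m2, s1 + p2]  -- best segment: in a half, or spanning the seam
  , max p1 (t1 + p2)           -- best prefix of the concatenation
  , max s2 (s1 + t2)           -- best suffix of the concatenation
  , t1 + t2 )                  -- total sum

mss :: [Int] -> Int
mss xs = let (m, _, _, _) = hom combine tuple (0, 0, 0, 0) xs in m
```

On the classic example, `mss [31,-41,59,26,-53,58,97,-93,-23,84]` evaluates to 187 (the segment 59+26-53+58+97).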
Parallelization of Divide-and-Conquer by Translation to Nested Loops
J. Functional Programming, 1997
Abstract

Cited by 13 (7 self)
We propose a sequence of equational transformations and specializations which turns a divide-and-conquer skeleton in Haskell into a parallel loop nest in C. Our initial skeleton is often viewed as general divide-and-conquer. The specializations impose a balanced call tree, a fixed degree of the problem division, and element-wise operations. Our goal is to select parallel implementations of divide-and-conquer via a space-time mapping, which can be determined at compile time. The correctness of our transformations is proved by equational reasoning in Haskell; recursion and iteration are handled by induction. Finally, we demonstrate the practicality of the skeleton by expressing Strassen's matrix multiplication in it.
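A minimal Haskell rendering of such a general divide-and-conquer skeleton (our own sketch and naming; the paper's skeleton may differ), specialized here to division degree 2 with mergesort as the instance:

```haskell
-- General divide-and-conquer: p tests for the base case, b solves it,
-- d splits a problem into subproblems, c combines the subresults.
dc :: (a -> Bool) -> (a -> b) -> (a -> [a]) -> (a -> [b] -> b) -> a -> b
dc p b d c x
  | p x       = b x
  | otherwise = c x (map (dc p b d c) (d x))

-- Mergesort as an instance with fixed division degree 2:
msort :: [Int] -> [Int]
msort = dc (\xs -> length xs <= 1) id divide conquer
  where
    divide xs = let (l, r) = splitAt (length xs `div` 2) xs in [l, r]
    conquer _ [l, r] = merge l r
    conquer _ rs     = concat rs   -- fallback, unused at degree 2
    merge [] ys = ys
    merge xs [] = xs
    merge (x:xs) (y:ys)
      | x <= y    = x : merge xs (y:ys)
      | otherwise = y : merge (x:xs) ys
```

The balanced-call-tree and fixed-degree specializations the abstract mentions correspond to `divide` always producing two halves of near-equal size.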
Architecture-Independent Massive Parallelization of Divide-and-Conquer Algorithms
Mathematics of Program Construction, Lecture Notes in Computer Science 947, 1995
Abstract

Cited by 9 (1 self)
We present a strategy to develop, in a functional setting, correct, efficient and portable Divide-and-Conquer (DC) programs for massively parallel architectures. Starting from an operational DC program, mapping sequences to sequences, we apply a set of semantics-preserving transformation rules, which transform the parallel control structure of DC into a sequential control flow, thereby making the implicit data parallelism in a DC scheme explicit. In the next phase of our strategy, the parallel architecture is fully expressed, where `architecture-dependent' higher-order functions are introduced. Then, due to the rising communication complexities on particular architectures, topology-dependent communication patterns are optimized in order to reduce the overall communication costs. The advantages of this approach are manifold and are demonstrated with a set of non-trivial examples.
1 Introduction
It is well-known that the main problems in exploiting the power of modern parallel sys...
Practical Parallel Divide-and-Conquer Algorithms
1997
Abstract

Cited by 6 (2 self)
Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested data-parallel languages have relied on a vector PRAM implementation layer that cannot be efficiently mapped to MPPs with high interprocessor latency. This thesis shows that by restricting the problem set to that of data-parallel divide-and-conquer algorithms I can maintain the expressibility of full nested data-parallel languages while achieving good efficiency on current distributed-memory machines. Specifically, I define
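To illustrate what a data-parallel divide-and-conquer algorithm looks like (our own sketch, not from the thesis): in quicksort, both the partitioning steps and the two recursive calls are conceptually parallel maps over collections, i.e. nested data parallelism.

```haskell
-- Quicksort in nested data-parallel style: the recursion is expressed
-- as a map over a collection of subproblems, which a nested
-- data-parallel implementation may evaluate concurrently.
qsort :: [Int] -> [Int]
qsort []       = []
qsort (p : xs) =
  let subproblems = [filter (< p) xs, filter (>= p) xs]
      sorted      = map qsort subproblems   -- nested parallelism here
  in head sorted ++ [p] ++ last sorted
```

The divide-and-conquer restriction the abstract describes keeps exactly this shape: a split, independent recursive calls, and a combine, which maps well onto distributed-memory machines.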
Massive parallelization of divide-and-conquer algorithms over powerlists. Science of Computer Programming, 26:59-78
In 4th Principles and Practice of Parallel Programming, 1996
Abstract

Cited by 4 (0 self)
It contains all proofs of the introduced transformation rules as well as programming examples on a SIMD computer.
Parallelization of Divide-and-Conquer in the Bird-Meertens Formalism
1995
Abstract

Cited by 4 (0 self)
An SPMD parallel implementation schema for divide-and-conquer specifications is proposed and derived by formal refinement (transformation) of the specification. The specification is in the form of a mutually recursive functional definition. In a first phase, a parallel functional program schema is constructed which consists of a communication tree and a functional program that is shared by all nodes of the tree. The fact that this phase proceeds by semantics-preserving transformations in the Bird-Meertens formalism of higher-order functions guarantees the correctness of the resulting functional implementation. A second phase yields an imperative distributed message-passing implementation of this schema. The derivation process is illustrated with an example: a two-dimensional numerical integration algorithm.
1. Introduction
One of the main problems in exploiting modern multiprocessor systems is how to develop correct and efficient programs for them. We address this problem using the ap...
Formal Derivation and Implementation of Divide-and-Conquer on a Transputer Network
Transputer Applications and Systems '94, 1994
Abstract

Cited by 2 (2 self)
This paper considers parallel program development based on functional mutually recursive specifications. The development yields a communication structure linking an arbitrary fixed number of processors and an SPMD program executable on the structure. There are two steps in the development process: first, a parallel functional implementation is obtained through formal transformations in the Bird-Meertens formalism; it is then systematically transformed into an imperative target program with message passing. The approach is illustrated with a divide-and-conquer algorithm for numerical two-dimensional sparse grid integration. The optimization of the target program and the results of experimental performance measurements on a 64-transputer network under OS Parix are presented.
1 Introduction
We take the following approach to parallelization: we try to identify certain standard patterns of high-level functional specifications and to associate equivalent parallel programs to them...
Mapping Adl to the Bird-Meertens Formalism
1994
Abstract

Cited by 1 (1 self)
Bulk data operations such as map and reduce are an elegant medium for expressing repetitive computation over aggregate data structures. They also serve as a tool for abstraction: not all details of the computation, such as the exact ordering of the constituent operations, need to be specified by the programmer. A precise description of the behaviour of the bulk data operator is the preserve of the language implementor. If the implementation of these operators is parallel then they become a medium for expressing implicit data parallelism. There is a large body of work formally relating bulk data operators to each other and to their underlying data types. Much of this research stems from Category Theory, where a number of general properties of types and operators have been established. One theoretical framework in particular, the Bird-Meertens Formalism (BMF), has proved to be extremely useful. The BMF theory of a type provides a set of operators on that type and a set of algebraic id...
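One example of the kind of BMF identity this abstract refers to is the promotion of map and reduce through concat, which licenses a per-block, then global, parallel evaluation. A sketch under our own naming (the paper's operator notation differs):

```haskell
-- reduce op e is a bulk reduction; map f a bulk element-wise operation.
-- The BMF promotion law, for associative op with unit e:
--   reduce op e . map f . concat = reduce op e . map (reduce op e . map f)
reduce :: (b -> b -> b) -> b -> [b] -> b
reduce op e = foldr op e

-- Direct form of "sum of squares": flatten, square, sum.
direct :: [[Int]] -> Int
direct = reduce (+) 0 . map (^ 2) . concat

-- Promoted form: each block is reduced independently (e.g. one block
-- per processor), then the partial results are combined.
blockwise :: [[Int]] -> Int
blockwise = reduce (+) 0 . map (reduce (+) 0 . map (^ 2))
```

The two sides compute the same value; the promoted form exposes the implicit data parallelism the abstract describes.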
Notes on the Classification of Parallel Implementations of Linearly Recursive Programs
Abstract

Cited by 1 (0 self)
We propose a classification of the best parallel implementations of different kinds of functional linearly recursive programs and present examples for different classes.
Keywords: functional programming, linear recursion, parallelization, skeleton
1 Introduction
Functional programming offers a very high-level way of specifying executable problem solutions. For example, the paradigm of linear recursion can be expressed very concisely as a functional "skeleton" [Col89], which we call the source skeleton. However, looking at a specific linear recursion at this level of abstraction, it is far from clear how it can be implemented efficiently on a given processor network. In this paper, we propose a method of selecting an appropriate parallel implementation. We proceed as follows. We transform the source specification such that it matches a different skeleton, the derived skeleton, which exposes the possible parallelism more clearly. In general, the programmer will have to propose the tran...
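A source skeleton for linear recursion of the kind discussed here can be rendered in Haskell roughly as follows (our own parameter names, not necessarily the paper's):

```haskell
-- Linear recursion: each unfolding makes exactly one recursive call.
linrec :: (a -> Bool)   -- base-case predicate
       -> (a -> b)      -- base-case solver
       -> (a -> c)      -- work done before the recursive call
       -> (c -> b -> b) -- combination after the recursive call
       -> (a -> a)      -- problem reduction
       -> a -> b
linrec p base pre post next x
  | p x       = base x
  | otherwise = post (pre x) (linrec p base pre post next (next x))

-- Factorial as a linearly recursive instance:
fact :: Integer -> Integer
fact = linrec (== 0) (const 1) id (*) (subtract 1)
```

At this level of abstraction, nothing reveals whether an instance parallelizes well; that is exactly the gap the proposed classification addresses.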
From a Tabular Classification to Parallel Implementations of Linearly Recursive Functions
, 1997
Abstract
We propose a classification for a set of linearly recursive functions, which can be expressed as instances of a skeleton for parallel linear recursion, and present new parallel implementations for them. This set includes well-known higher-order functions, like Broadcast, Reduction and Scan, which we call basic components. Many compositions of these basic components are also linearly recursive functions; we present transformation rules from compositions of up to three basic components to instances of our skeleton. The advantage of this approach is that these instances have better parallel implementations than the compositions of the individual implementations of the corresponding basic components.
Keywords: functional programming, linear recursion, parallelization, skeletons
1 Introduction
Functional programming offers a very high-level approach to specifying executable problem solutions. For example, the scheme of linear recursion can be expressed concisely as a higher-order function. I...
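The basic components named in this abstract can each be written as linearly recursive functions, one recursive call per unfolding. A Haskell sketch (our own definitions, for illustration only):

```haskell
-- Broadcast: replicate one value to n positions.
broadcast :: a -> Int -> [a]
broadcast _ 0 = []
broadcast v n = v : broadcast v (n - 1)

-- Reduction: combine all elements with an (assumed associative) op.
reduce :: (a -> a -> a) -> a -> [a] -> a
reduce _  e []       = e
reduce op e (x : xs) = x `op` reduce op e xs

-- Scan: inclusive prefix reduction (op assumed associative).
scan :: (a -> a -> a) -> [a] -> [a]
scan _  []       = []
scan op (x : xs) = x : map (x `op`) (scan op xs)
```

A composition such as `reduce (+) 0 . scan (+)` is again linearly recursive; the paper's point is that such compositions admit better direct parallel implementations than composing the components' individual parallel implementations.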