Results 1 - 7 of 7
Architecture-Independent Massive Parallelization of Divide-and-Conquer Algorithms
 Mathematics of Program Construction, Lecture Notes in Computer Science 947
, 1995
Abstract

Cited by 9 (1 self)
We present a strategy to develop, in a functional setting, correct, efficient and portable Divide-and-Conquer (DC) programs for massively parallel architectures. Starting from an operational DC program, mapping sequences to sequences, we apply a set of semantics-preserving transformation rules, which transform the parallel control structure of DC into a sequential control flow, thereby making the implicit data parallelism in a DC scheme explicit. In the next phase of our strategy, the parallel architecture is fully expressed, where `architecture-dependent' higher-order functions are introduced. Then, due to the rising communication complexities on particular architectures, topology-dependent communication patterns are optimized in order to reduce the overall communication costs. The advantages of this approach are manifold and are demonstrated with a set of non-trivial examples. 1 Introduction It is well-known that the main problems in exploiting the power of modern parallel sys...
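The DC scheme the abstract refers to can be read as a single higher-order function. The sketch below is an assumed, minimal rendering of that idea (the names `dc`, `is_trivial`, `divide`, and `combine` are illustrative, not taken from the paper); the independent recursive calls are exactly where the implicit data parallelism lives.

```python
# Generic divide-and-conquer skeleton as a higher-order function.
# All names here are illustrative, not taken from the paper.

def dc(is_trivial, solve, divide, combine, problem):
    """Solve `problem` by DC: the recursive calls over the subproblems
    are independent, so each could run on its own processor."""
    if is_trivial(problem):
        return solve(problem)
    subresults = [dc(is_trivial, solve, divide, combine, p)
                  for p in divide(problem)]
    return combine(subresults)

def mergesort(xs):
    """Example instantiation of the skeleton: mergesort on lists."""
    def merge(parts):
        a, b = list(parts[0]), list(parts[1])
        out = []
        while a and b:
            out.append(a.pop(0) if a[0] <= b[0] else b.pop(0))
        return out + a + b
    return dc(lambda s: len(s) <= 1,                         # trivial case
              lambda s: list(s),                             # solve directly
              lambda s: [s[:len(s) // 2], s[len(s) // 2:]],  # divide
              merge,                                         # combine
              xs)
```

Replacing the list comprehension in `dc` by one process per subproblem is precisely the kind of step a transformation from parallel control structure to explicit data parallelism would make visible.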
Data Distribution Algebras - A Formal Basis for Programming Using Skeletons
, 1994
Abstract

Cited by 8 (1 self)
In this paper, functional languages are proposed as such a methodology, using an extension of the concept of skeletons - higher-order functions coupled with parallel implementation templates. An essential part of the proposed methodology is the use of data distribution algebras.
Deriving Parallel Numerical Algorithms using Data Distribution Algebras: Wang's Algorithm
, 1996
Abstract

Cited by 4 (1 self)
Parallel and distributed programming are much more difficult than the development of sequential algorithms due to data distribution issues and communication requirements. This paper presents a methodology that enables the abstract description of the distribution of data structures by means of overlapping covers that form data distribution algebras. Algorithms are formulated and derived by transformation in a functional environment using skeletons, i.e., higher-order functions with specific parallel implementations. Communication is specified implicitly through the access to overlapping parts of covers. Such specifications enable the derivation of explicit lower-level communication statements. We illustrate the concepts by a complete derivation of Wang's partition algorithm for the solution of tridiagonal systems of linear equations.
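As a hedged illustration of the "overlapping cover" idea (an assumed reading of the abstract, not the paper's formalism or Wang's algorithm itself), the sketch below splits a sequence into blocks that share boundary elements, so a 3-point stencil can be computed block-locally; the overlap stands in for explicit neighbour communication.

```python
# Assumed sketch of an overlapping cover: blocks share one boundary
# element with each neighbour, so a 3-point stencil needs no explicit
# neighbour communication. All names are illustrative.

def cover(xs, nblocks, overlap=1):
    """Split xs into nblocks contiguous blocks, each extended by
    `overlap` elements into its neighbours."""
    n, size = len(xs), len(xs) // nblocks
    return [xs[max(0, i * size - overlap):
               min(n, (i + 1) * size + overlap)]
            for i in range(nblocks)]

def stencil_on_cover(xs, nblocks, f):
    """Apply the 3-point stencil f(left, centre, right) to every
    interior point, block by block; boundary points are copied."""
    n, size = len(xs), len(xs) // nblocks
    result = [xs[0]]
    for i, block in enumerate(cover(xs, nblocks)):
        off = 0 if i == 0 else 1           # first owned element in block
        owned = size if i < nblocks - 1 else n - i * size
        for j in range(owned):
            k = i * size + j               # global index
            if 0 < k < n - 1:
                result.append(f(block[off + j - 1],
                                block[off + j],
                                block[off + j + 1]))
    result.append(xs[-1])
    return result
```

In a distributed implementation, refreshing the overlap elements after each step is exactly the kind of explicit lower-level communication such a specification lets one derive.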
From Transformations to Methodology in Parallel Program Development: A Case Study
 Microprocessing and Microprogramming
, 1996
Abstract

Cited by 4 (1 self)
The Bird-Meertens formalism (BMF) of higher-order functions over lists is a mathematical framework supporting the formal derivation of algorithms from functional specifications. This paper reports results of a case study on the systematic use of BMF in the process of parallel program development. We develop a parallel program for polynomial multiplication, starting with a straightforward mathematical specification and arriving at the target processor topology together with a program for each of its processors. The development process is based on formal transformations; design decisions concerning data partitioning, processor interconnections, etc. are governed by formal type analysis and performance estimation rather than made ad hoc. The parallel target implementation is parameterized for an arbitrary number of processors; for the particular number, the target program is both time- and cost-optimal. We compare our results with systolic solutions to polynomial multiplication.
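The paper's actual BMF derivation is not reproduced here, but as an assumed illustration of the starting point, polynomial multiplication over coefficient lists can be phrased with the BMF workhorses map and reduce (all names below are illustrative):

```python
# Assumed illustration (not the paper's derivation): polynomial product
# on coefficient lists, lowest degree first, via map and reduce.
from functools import reduce

def padd(u, v):
    """Pointwise addition of coefficient lists of possibly unequal length."""
    n = max(len(u), len(v))
    u = u + [0] * (n - len(u))
    v = v + [0] * (n - len(v))
    return [x + y for x, y in zip(u, v)]

def polymul(p, q):
    """result[k] = sum over i of p[i] * q[k - i]: each shifted partial
    product is independent (the map); summing them is an associative
    reduce, hence parallelizable as a tree."""
    shifted = [[0] * i + [a * b for b in q] for i, a in enumerate(p)]
    return reduce(padd, shifted, [0])

# polymul([1, 2], [3, 4]) represents (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
```

The associativity of `padd` is what licenses distributing the reduction over a processor topology; choosing that topology is the design decision the formal development governs.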
Transformational Derivation of (parallel) Programs Using Skeletons
 Katholieke Universiteit Nijmegen
Abstract

Cited by 2 (0 self)
We describe a framework for the derivation of programs for arbitrary (in particular, parallel) architectures, motivated by a generalization of the derivation process for sequential algorithms. The central concept in this approach is that of a skeleton: on the one hand, a higher-order function at which transformational derivations are targeted; on the other hand, a representation of an elementary computation on the architecture aimed at. Skeletons thus form a basis for intermediate languages that can be implemented once and for all, as a process separate from individual program developments. The available knowledge on the derivation of (higher-order) functional programs can be used for deriving parallel ones. This paper presents an overview of the method, illustrated with an example (trapezoidal rule on a SIMD processor array), and ideas for future research. 1 Introduction and overview The introduction of various computer networks and parallel computers in recent years has led to a large increase in...
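The worked example named here is the trapezoidal rule on a SIMD processor array. A minimal data-parallel reading (assumed, not the paper's code) expresses it with the two standard skeletons, map and reduce:

```python
# Assumed data-parallel phrasing of the composite trapezoidal rule:
# map over panels (one per virtual processor), then an associative reduce.
from functools import reduce
import operator

def trapezoid(f, a, b, n):
    """Integrate f over [a, b] with n trapezoidal panels."""
    h = (b - a) / n
    # map skeleton: each panel's area is computed independently
    panels = (h * (f(a + i * h) + f(a + (i + 1) * h)) / 2 for i in range(n))
    # reduce skeleton: '+' is associative, so this combines as a tree
    return reduce(operator.add, panels)
```

On a SIMD array, the map runs one panel per processing element and the reduce becomes a logarithmic-depth combining tree; the skeleton pair is the reusable, once-implemented part.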
Solving Large Systems of Differential Equations in Parallel Using Covers and Skeletons
, 1997
Abstract

Cited by 1 (0 self)
The design and implementation of parallel algorithms for distributed-memory architectures is much harder than the development of sequential algorithms. This is mainly due to the communication and synchronization that is necessary to manage distributed data correctly. This paper applies a methodology for the transformational derivation of parallel programs using data distribution algebras, which enable an abstract description of data distribution issues. Algorithms are formulated using skeletons, that is, specialized higher-order functions with particular parallel implementations. The methodology is applied to the solution of a system of ordinary differential equations where convolutions can be computed using the Fast Fourier transform. The example illustrates the practical optimization problems for a development model of the visual system that involves large-scale neural network simulations. Finally, this algorithm is compared to an implementation of the same system of equations i...
Efficient Functional Programming Communication Functions on the AP1000
, 1994
Abstract
One problem of parallel computing is that parallel computers vary greatly in architecture, so a program written to run efficiently on a particular architecture would often need to be changed and adapted substantially in order to run with reasonable performance when ported to a different architecture. Porting with performance is, hence, labour-intensive and costly. A method of parallel programming using the Bird-Meertens Formalism, where programs are formulated as compositions of (mainly) higher-order functions on some data type in the data-parallel functional style, has been proposed as a solution. The library of (mainly) higher-order functions, in which all communication and parallelism in a program is embedded, could (it is argued) be implemented efficiently on different parallel architectures. This gives the advantage of portability between different architectures with reasonable and predictable performance without change in program source. ...