Heterogeneous Concurrent Modeling and Design in Java (Volume 1: Introduction to Ptolemy II)
, 2005
Towards Parallel Programming by Transformation: The FAN Skeleton Framework
, 2001
Cited by 20
A Functional Abstract Notation (FAN) is proposed for the specification and design of parallel algorithms by means of skeletons, high-level patterns with parallel semantics. The main weakness of the current programming systems based on skeletons is that the user is still responsible for finding the most appropriate skeleton composition for a given application and a given parallel architecture. We describe a transformational framework for the development of skeletal programs which is aimed at filling this gap. The framework makes use of transformation rules which are semantic equivalences among skeleton compositions. For a given problem, an initial, possibly inefficient skeleton specification is refined by applying a sequence of transformations. Transformations are guided by a set of performance prediction models which forecast the behavior of each skeleton and the performance benefits of different rules. The design process is supported by a graphical tool which locates applicable transformations and provides performance estimates, thereby helping the programmer in navigating through the program refinement space. We give an overview of the FAN framework and exemplify its use with performance-directed program derivations for simple case studies. Our experience can be viewed as a first feasibility study of methods and tools for transformational, performance-directed parallel programming using skeletons.
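As an illustration of what a transformation rule between skeleton compositions can look like, the sketch below checks the well-known map fusion equivalence, map f . map g = map (f . g), on a small input. It is not taken from the FAN paper: the names `map_skel`, `compose`, `inc` and `double` are hypothetical, and the "skeleton" is modelled as an ordinary sequential list comprehension.

```python
def compose(f, g):
    """Function composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: f(g(x))


def map_skel(f):
    """Hypothetical map skeleton: apply f to every element.

    Modelled sequentially here; a real skeleton library would
    distribute the elements over processors.
    """
    return lambda xs: [f(x) for x in xs]


def inc(x):
    return x + 1


def double(x):
    return 2 * x


# Left-hand side of the rule: two separate passes over the data.
two_pass = compose(map_skel(inc), map_skel(double))

# Right-hand side of the rule: one fused pass, no intermediate list.
fused = map_skel(compose(inc, double))

data = list(range(5))
result_two_pass = two_pass(data)
result_fused = fused(data)
```

Both sides compute the same list, so a refinement tool may replace one with the other. Which side is faster depends on the target machine, which is the role the abstract assigns to the performance prediction models.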
Derivation of Efficient Data Parallel Programs
 In 17th Australasian Computer Science Conference
, 1993
Cited by 6
This paper considers the expression and derivation of efficient data parallel programs for SIMD and MIMD machines. It is shown that efficient parallel programs must utilise both sequential and parallel computation; these are termed hybrid programs. The Bird-Meertens formalism, a calculus of higher order functions, is used to derive and express programs. Our goal is to derive efficient parallel programs for a variety of machines by: starting with an abstract specification, deriving an abstract algorithm and successively refining this to more efficient and machine dependent algorithms incorporating greater implementation detail. Nested data structures are used to express hybrid algorithms. Using this technique efficient accumulate (scan/parallel prefix) algorithms are derived for SIMD and MIMD machines. 1 Introduction The main reason for parallel programming is to achieve high performance. Unfortunately designing and writing efficient parallel programs, especially for MIMD machines, i...
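The idea of a hybrid scan over a nested data structure can be sketched as follows. This is a generic three-phase blocked scan, not the derivation from the paper; the function name `hybrid_scan` and the `blocks` parameter (a stand-in for the number of processors) are assumptions, and the chunks are processed one after another here rather than in parallel. The operator must be associative.

```python
from itertools import accumulate
import operator


def hybrid_scan(xs, blocks, op=operator.add):
    """Inclusive scan of xs, organised as `blocks` chunks.

    `blocks` is a hypothetical stand-in for the processor count
    and must be at least 1; `op` must be associative.
    """
    if not xs:
        return []
    size = -(-len(xs) // blocks)  # ceiling division
    # Nested structure: a list of chunks, one per (virtual) processor.
    chunks = [xs[i:i + size] for i in range(0, len(xs), size)]
    # Phase 1: sequential scan inside each chunk (chunks are independent).
    local = [list(accumulate(c, op)) for c in chunks]
    # Phase 2: scan over the last element (total) of each chunk.
    totals = list(accumulate((l[-1] for l in local), op))
    # Phase 3: combine every later chunk with the total of all
    # preceding chunks, keeping the operand order for non-commutative ops.
    out = list(local[0])
    for prev_total, l in zip(totals, local[1:]):
        out.extend(op(prev_total, v) for v in l)
    return out


sums = hybrid_scan([1, 2, 3, 4, 5, 6, 7, 8], blocks=3)
prefixes = hybrid_scan(list("abcd"), blocks=2)
```

Phases 1 and 3 are the sequential part performed by each processor, and phase 2 is the only step that needs communication between processors. This matches the abstract's point that efficient programs mix sequential and parallel computation.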
Parallelization of Divide-and-Conquer in the Bird-Meertens Formalism
, 1995
Cited by 4
An SPMD parallel implementation schema for divide-and-conquer specifications is proposed and derived by formal refinement (transformation) of the specification. The specification is in the form of a mutually recursive functional definition. In a first phase, a parallel functional program schema is constructed which consists of a communication tree and a functional program that is shared by all nodes of the tree. The fact that this phase proceeds by semantics-preserving transformations in the Bird-Meertens formalism of higher-order functions guarantees the correctness of the resulting functional implementation. A second phase yields an imperative distributed message-passing implementation of this schema. The derivation process is illustrated with an example: a two-dimensional numerical integration algorithm. 1. Introduction One of the main problems in exploiting modern multiprocessor systems is how to develop correct and efficient programs for them. We address this problem using the ap...
From Transformations to Methodology in Parallel Program Development: A Case Study
 Microprocessing and Microprogramming
, 1996
Cited by 4
The Bird-Meertens formalism (BMF) of higher-order functions over lists is a mathematical framework supporting formal derivation of algorithms from functional specifications. This paper reports results of a case study on the systematic use of BMF in the process of parallel program development. We develop a parallel program for polynomial multiplication, starting with a straightforward mathematical specification and arriving at the target processor topology together with a program for each processor of it. The development process is based on formal transformations; design decisions concerning data partitioning, processor interconnections, etc. are governed by formal type analysis and performance estimation rather than made ad hoc. The parallel target implementation is parameterized for an arbitrary number of processors; for the particular number, the target program is both time- and cost-optimal. We compare our results with systolic solutions to polynomial multiplication.
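For orientation, a direct specification of polynomial multiplication and one simple way to partition it can be sketched as below. This is not the derivation or the processor topology from the paper: `poly_mul`, `poly_mul_partitioned` and the parameter `p` (number of blocks, standing in for processors) are hypothetical names, and the blocks are evaluated one after another.

```python
def poly_mul(a, b):
    """Coefficients of a(x) * b(x), where a[i] is the coefficient of x**i."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


def poly_mul_partitioned(a, b, p):
    """Same result, with `a` split into at most p contiguous blocks.

    Each block's partial product could be computed by a separate
    processor; the partial results are then added at the right offset.
    """
    if not a or not b:
        return []
    size = -(-len(a) // p)  # ceiling division
    c = [0] * (len(a) + len(b) - 1)
    for start in range(0, len(a), size):
        partial = poly_mul(a[start:start + size], b)
        for k, v in enumerate(partial):
            c[start + k] += v  # shift by the block's starting power of x
    return c


product = poly_mul([1, 2], [3, 4])  # (1 + 2x)(3 + 4x)
partitioned = poly_mul_partitioned([1, 2, 3], [4, 5], p=2)
```

The partitioned version relies on multiplication distributing over addition: the product of a sum of blocks is the sum of the block products.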
A lazy, self-optimising parallel matrix library
 Functional Programming Workshop, Ullapool
, 1995
Formal Derivation and Implementation of Divide-and-Conquer on a Transputer Network
 Transputer Applications and Systems '94
, 1994
Cited by 2
This paper considers parallel program development based on functional mutually recursive specifications. The development yields a communication structure linking an arbitrary fixed number of processors and an SPMD program executable on the structure. There are two steps in the development process: first, a parallel functional implementation is obtained through formal transformations in the Bird-Meertens formalism; it is then systematically transformed into an imperative target program with message passing. The approach is illustrated with a divide-and-conquer algorithm for numerical two-dimensional sparse grid integration. The optimization of the target program and the results of experimental performance measurements on a 64-transputer network under OS Parix are presented. 1 Introduction We take the following approach to parallelization: we try to identify certain standard patterns of high-level functional specifications and to associate equivalent parallel programs to them...
Authors:
Cited by 2
This document is a draft proposal whose purpose is to solicit additional input and convey the current state of the ebXML packaging recommendations. This document defines the structure (or envelope) used to encapsulate data for transport between parties, following the specifications defined by ebXML. Every attempt has been made to ensure that ebXML requirements related to transport, routing and packaging are addressed within this specification. Adherence to industry standards, consideration of existing business-to-business practices and support for Small and Medium Enterprises were key factors influencing the direction of this specification.
A Preliminary Case Study in a Methodology for Deriving Parallel Programs Using APMs
We present a methodology based on Abstract Parallel Machines (APMs) for deriving an executable parallel program from a high-level specification. The specification is given initially in mathematical notation and then transformed into a functional specification which is not explicitly parallel. This is refined through a sequence of intermediate executable programs in the functional language using equational reasoning. At many of the steps in this process there are decisions which need to be made producing a variety of possible derivation paths, leading to a range of possible implementations. Hence the final implementation can be in a variety of languages and for a variety of programming models and architectures. We illustrate the method with a simple case study: the summation of the columns of a triangular matrix using load balancing to improve performance. We use Haskell in the derivation and C+MPI as the target language, and show the intermediate steps in the derivation and the transfo...
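To make the case study concrete, here is a minimal sketch of column summation for a lower triangular matrix with one possible load-balancing scheme. The paper works in Haskell and C+MPI and its actual scheme is not given in the abstract, so everything here is an assumption: the matrix is stored as rows of increasing length, and the hypothetical `balanced_tasks` pairs a long column with a short one so that every task touches at most n + 1 entries.

```python
def column_sums(rows):
    """Reference version. rows[i] holds columns 0..i of row i."""
    n = len(rows)
    return [sum(rows[i][j] for i in range(j, n)) for j in range(n)]


def balanced_tasks(n):
    """Pair column j (n - j entries) with column n-1-j (j + 1 entries).

    Each pair covers n + 1 entries; for odd n the middle column
    forms a task of its own.
    """
    tasks = [(j, n - 1 - j) for j in range(n // 2)]
    if n % 2:
        tasks.append((n // 2,))
    return tasks


def column_sums_balanced(rows):
    """Same sums, organised by task; each task could go to one worker."""
    n = len(rows)
    sums = [0] * n
    for task in balanced_tasks(n):
        for j in task:
            sums[j] = sum(rows[i][j] for i in range(j, n))
    return sums


triangle = [[1], [2, 3], [4, 5, 6]]
result = column_sums_balanced(triangle)
```

Without pairing, the worker holding column 0 would sum n entries while the worker holding the last column sums one, which is the imbalance the case study sets out to remove.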
Efficient Functional Programming Communication Functions on the AP1000
, 1994
One problem of parallel computing is that parallel computers vary greatly in architecture, so that a program written to run efficiently on a particular architecture would, when ported to a different architecture, often need to be changed and adapted substantially in order to run with reasonable performance on the target architecture. Porting with performance is, hence, labour-intensive and costly. A method of parallel programming using the Bird-Meertens Formalism, where programs are formulated as compositions of (mainly) higher-order functions on some data type in the data parallel functional style, has been proposed as a solution. The library of (mainly) higher-order functions in which all communication and parallelism in a program is embedded could (it is argued) be implemented efficiently on different parallel architectures. This gives the advantage of portability between different architectures with reasonable and predictable performance without change in program source. ...