Results 1 - 6 of 6
The HDG-Machine: A Highly Distributed Graph-Reducer for a Transputer Network
- The Computer Journal, 1991
Cited by 28 (0 self)
Distributed implementations of programming languages with implicit parallelism hold out the prospect that parallel programs are immediately scalable. This paper presents some of the results of our part of Esprit 415, in which we considered the implementation of lazy functional programming languages on distributed architectures. A compiler and abstract machine were designed to achieve this goal. The abstract parallel machine was formally specified using Miranda. Each instruction of the abstract machine was then implemented as a macro in the Transputer Assembler. Although macro expansion results in non-optimal code, the Miranda specification makes it possible to validate the compiler before the Transputer code is generated. The hardware currently available consists of five T800-25s, each board having 16 Mbytes of memory. Benchmark timings using this hardware are given. In spite of the straightforward code generation, the resulting system compar...
Using Projection Analysis in Compiling Lazy Functional Programs
- In Proceedings of the 1990 ACM Conference on Lisp and Functional Programming, 1990
Cited by 15 (6 self)
Projection analysis is a technique for finding out information about lazy functional programs. We show how the information obtained from this analysis can be used to speed up sequential implementations, and introduce parallelism into parallel implementations. The underlying evaluation model is evaluation transformers, where the amount of evaluation that is allowed of an argument in a function application depends on the amount of evaluation allowed of the application. We prove that the transformed programs preserve the semantics of the original programs. Compilation rules, which encode the information from the analysis, are given for sequential and parallel machines.
1 Introduction
A number of analyses have been developed which find out information about programs. The methods that have been developed fall broadly into two classes, forwards analyses such as those based on the ideas of abstract interpretation (e.g. [9, 18, 19, 7, 17, 12, 4, 20]), and backward analyses such as those based...
Implementing the Evaluation Transformer Model of Reduction on Parallel Machines
- 1991
Cited by 7 (1 self)
The evaluation transformer model of reduction generalises lazy evaluation in two ways: it can start the evaluation of expressions before their first use, and it can evaluate expressions further than weak head normal form. Moreover, the amount of evaluation required of an argument to a function may depend on the amount of evaluation required of the function application. It is a suitable candidate model for implementing lazy functional languages on parallel machines. In this paper we explore the implementation of lazy functional languages on parallel machines, both shared- and distributed-memory architectures, using the evaluation transformer model of reduction. We will see that the same code can be produced for both styles of architecture, and the definition of the instruction set is virtually the same for each style. The essential difference is that a distributed-memory architecture has one extra node type for non-local pointers, and instructions which involve the value of such nodes need their definitions extended to cover this new type of node. To make our presentation accessible, we base our description on a variant of the well-known G-machine, a machine for executing lazy functional programs.
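The degrees of evaluation this abstract mentions (weak head normal form and evaluation beyond it) can be illustrated with a small sketch. It is written in Python purely for illustration and is not the paper's G-machine variant: the names `Thunk`, `Cons`, `lazy_list`, `eval_whnf`, `eval_spine`, `eval_full` and `progress` are all hypothetical, and a list is modelled as a chain of memoised thunks.

```python
class Thunk:
    """A suspended computation that is run at most once (hypothetical helper)."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        # Run the suspended computation on first use and cache the result.
        if not self._done:
            self._value = self._compute()
            self._compute = None
            self._done = True
        return self._value

    @property
    def evaluated(self):
        return self._done


class Cons:
    """A list cell whose head and tail are both thunks."""

    def __init__(self, head, tail):
        self.head = head
        self.tail = tail


def lazy_list(values):
    """Build a fully suspended list: a thunk yielding a Cons cell or None."""
    values = list(values)

    def build(i):
        if i == len(values):
            return Thunk(lambda: None)
        return Thunk(lambda: Cons(Thunk(lambda: values[i]), build(i + 1)))

    return build(0)


def eval_whnf(xs):
    """Evaluate only the outermost constructor (weak head normal form)."""
    xs.force()
    return xs


def eval_spine(xs):
    """Evaluate every cell of the list but none of its elements."""
    cell = xs.force()
    while cell is not None:
        cell = cell.tail.force()
    return xs


def eval_full(xs):
    """Evaluate every cell and every element."""
    cell = xs.force()
    while cell is not None:
        cell.head.force()
        cell = cell.tail.force()
    return xs


def progress(xs):
    """Return (cells evaluated, elements evaluated) without forcing anything new."""
    cells = heads = 0
    node = xs
    while node.evaluated:
        cell = node.force()  # already evaluated, so this only reads the cache
        if cell is None:
            break
        cells += 1
        heads += int(cell.head.evaluated)
        node = cell.tail
    return cells, heads


if __name__ == "__main__":
    xs = lazy_list([1, 2, 3])
    print(progress(eval_whnf(xs)))   # (1, 0)
    print(progress(eval_spine(xs)))  # (3, 0)
    print(progress(eval_full(xs)))   # (3, 3)
```

In this reading, a function such as list length would need only `eval_spine` of its argument when its own result is demanded, while a function that sums the list would need `eval_full`. Which evaluator applies to which argument is what an evaluation transformer would record; the mapping given here is an assumption for illustration.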
A Parallel Functional Language Compiler for Message-Passing Multicomputers
- 1998
Cited by 4 (2 self)
The research presented in this thesis is about the design and implementation of Naira, a parallel, parallelising compiler for a rich, purely functional programming language. The source language of the compiler is a subset of Haskell 1.2. The front end of Naira is written entirely in the Haskell subset being compiled. Naira has been successfully parallelised, and it is the largest successfully parallelised Haskell program, having achieved good absolute speedups on a network of Sun workstations. Having the same basic structure as other production compilers of functional languages, Naira's parallelisation technology should carry forward to other functional language compilers. The back end of Naira is written in C and generates parallel code in the C language, which is intended to run on distributed-memory machines. The code generator is based on a novel compilation scheme specified using a restricted form of Milner's π-calculus which achieves asynchronous communication. We present the f...
The ν-STG machine: a parallelized Spineless Tagless Graph Reduction Machine in a distributed memory architecture
- 1992
This paper describes the ν-STG machine, a parallelized Spineless Tagless Graph Reduction (STG) machine in a distributed memory architecture. In the ν-STG machine, a stack for each task is distributed by allocating a context frame for each tail-call sequence on the heap. Two sparking mechanisms, BSpark and ISpark, are supported to introduce parallelism; they are similar to those in the HDG machine. A BSpark cannot be ignored and should be synchronized with the parent task to prevent it being blocked and resumed too frequently, while an ISpark may be ignored. Even though a variable is sparked by BSpark or ISpark, it may be evaluated in-line by the parent task. Creating a new task to evaluate a boxed variable according to the spark annotation is delayed until it is really needed. A message passing mechanism is used to distribute jobs and synchronize them in a machine. A lazy task creation mechanism is supported to exploit parallelism in unboxed arithmetic expressions by boxing up on d...
Automatic Parallelization of Lazy Functional Programs
- Proc. of 4th European Symposium on Programming, ESOP'92, LNCS 582:254-268, 1992
We present a parallelizing compiler for lazy functional programs that uses strictness analysis to detect the implicit parallelism within programs. It generates an intermediate functional program, where a special syntactic construct `letpar', which is semantically equivalent to the well-known let-construct, is used to indicate subexpressions for which a parallel execution is allowed. Parallelization is worthwhile only for sufficiently complex expressions; for small expressions the communication overhead may outweigh the benefits of parallel execution. Therefore, the parallelizing compiler uses some heuristics to estimate the complexity of expressions. The distributed implementation of parallelized functional programs described in [Loogen et al. 89] enabled us to investigate the impact of various parallelization strategies on the runtimes and speedups. The strategy, which only allows the parallel execution of non-predefined function calls in strict positions, shows t...
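The choice between `let` and `letpar' described in this abstract can be sketched as a small decision procedure. This is only a guess at the general shape, not the paper's compiler: the expression types `Var`, `Lit` and `Call`, the set `PREDEFINED`, the node weights and the cut-off `THRESHOLD` are all assumptions made for illustration, and the strictness flag is taken as given rather than computed by a strictness analysis.

```python
from dataclasses import dataclass

# Assumed set of predefined (built-in) operators.
PREDEFINED = {"+", "-", "*", "<"}
# Assumed cut-off; the heuristics actually used in the paper are not given here.
THRESHOLD = 3


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Call:
    func: str
    args: tuple


def complexity(expr):
    """Crude size estimate: user-defined calls weigh 2, predefined ones 1."""
    if isinstance(expr, (Var, Lit)):
        return 0
    weight = 1 if expr.func in PREDEFINED else 2
    return weight + sum(complexity(arg) for arg in expr.args)


def choose_construct(expr, strict):
    """Return 'letpar' for a strict, sufficiently complex user-defined call, else 'let'."""
    is_user_call = isinstance(expr, Call) and expr.func not in PREDEFINED
    if strict and is_user_call and complexity(expr) >= THRESHOLD:
        return "letpar"
    return "let"


if __name__ == "__main__":
    fib_call = Call("fib", (Call("-", (Var("n"), Lit(1))),))
    print(choose_construct(fib_call, strict=True))                           # letpar
    print(choose_construct(fib_call, strict=False))                          # let
    print(choose_construct(Call("+", (Var("x"), Lit(1))), strict=True))      # let
```

The sketch mirrors the strategy named in the abstract in one respect only: a binding is marked for parallel execution when it is a call to a non-predefined function in a strict position. The size threshold stands in for the unspecified complexity heuristics.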

