A Cost Analysis for a Higher-order Parallel Programming Model
, 1996
Cited by 17 (1 self)
Programming parallel computers remains a difficult task. An ideal programming environment should let the user concentrate on problem solving at a convenient level of abstraction, while managing the intricate low-level details without sacrificing performance. This thesis investigates a model of parallel programming based on the Bird-Meertens Formalism (BMF), a set of higher-order functions, many of which are implicitly parallel. Programs are expressed in terms of functions borrowed from BMF. A parallel implementation is defined for each of these functions for a particular topology, and the associated execution costs are derived. The topologies considered include the hypercube, 2-D torus, tree and linear array. An analyser estimates the costs associated with different implementations of a given program and selects a cost-effective one for a given topology. All the analysis is performed at compile time, which has the advantage of reducing run-...
Functional Skeletons Generate Process Topologies in Eden
- In: Int. Symp. on Programming Languages, Implementations, Logics and Programs (PLILP’96)
, 1996
Cited by 12 (2 self)
We present a collection of skeletons that are appropriate to instantiate process systems in the functional-concurrent language Eden [BLOM96]. Eden is a functional language providing facilities for the explicit definition and instantiation of processes. Skeletons in this language are just higher-order functions having process definitions as parameters. We introduce skeletons for both transformational (i.e. deterministic) and reactive (usually non-deterministic) process topologies and illustrate their use by applying them to several examples. Some pointers to the skeletons literature are also given.
Keywords: Functional programming, concurrent programming, parallel programming, skeletons, higher-order functions.
1 Introduction
Functional languages are often said to be amenable to implicit parallelism because referential transparency allows evaluating expressions in any order, or even in parallel, without changing the denotational meaning of programs. However, to exploit pa...
An Accumulative Parallel Skeleton for All
, 2001
Cited by 11 (9 self)
Parallel skeletons intend to encourage programmers to build...
The AIM is laziness in a data-parallel language
, 1993
Cited by 10 (5 self)
Although many data-parallel functional languages exist (Lisp, NESL, Paralation Lisp, FX and Parallel EuLisp), few researchers have investigated incorporating data-parallelism into a lazy language. This paper describes data-parallel extensions which have been incorporated into the lazy functional language Haskell. We describe PODs, parallel data structures that share many of the characteristics of Haskell arrays; their distinguishing feature, however, is that they are unbounded. We present POD comprehensions, a framework within which communication and parallel operations on PODs can be expressed. The semantics of these extensions is given in terms of translation rules into a core set of primitive parallel operations. Particular attention is given to the non-strict nature of these extensions. Development of the higher-order parallel map, fold, and scan is presented, a trio of functions that is widely accepted as being fundamental to a data-parallel paradigm. Ladner classifies a problem as being susceptible to parallel scanning if it is of a fixed size and can be solved by a finite state transducer. We show that by utilising lazy evaluation, Ladner's requirements can be relaxed such that the lazy version of scan presented here has the potential to scan an infinite POD.
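The idea of scanning an unbounded structure on demand can be illustrated with a small sketch. This is not the paper's POD implementation, and it is sequential rather than parallel; it only assumes that Python generators are an acceptable stand-in for lazy evaluation. The name lazy_scan is hypothetical.

```python
from itertools import accumulate, count, islice
import operator

def lazy_scan(op, xs):
    """Return an iterator of running results of op over xs.

    Elements of xs are consumed only when a result is requested,
    so xs may be unbounded.
    """
    return accumulate(xs, op)

# An unbounded sequence 1, 2, 3, ... (stand-in for an infinite POD).
naturals = count(1)

# Nothing is computed yet; the scan is only a description of work.
prefix_sums = lazy_scan(operator.add, naturals)

# Demand the first five results; only five inputs are ever touched.
first_five = list(islice(prefix_sums, 5))
print(first_five)  # [1, 3, 6, 10, 15]
```

Because results are produced only on demand, the scan never needs to know the size of its input, which is the relaxation of the fixed-size requirement that the abstract describes.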
Clumps: A Candidate Model Of Efficient, General Purpose Parallel Computation
, 1994
Cited by 10 (6 self)
A new model of parallel computation is proposed, CLUMPS (Campbell's Lenient, Unified Model of Parallel Systems). This is composed of an abstract machine with an associated cost model, and aims to be more portable, reflective of costs, expressible and encouraging of more efficient implementations of algorithms than other existing models. It is shown that each basic parallel architecture class can congruently perform each other's computations, but the congruent simulation of each other's communication is not generally possible (where for a simulation to be congruent the simulation costs on the target architecture are asymptotically equivalent to the implementation costs on the native architectures). This is reflected in the CLUMPS abstract machine through its flexibility in terms of program control and memory access. The congruence requirement is relaxed so that though strict congruence may not be achieved according to the above definition, communication costs are reflectively accounted ...
Towards a Scalable Parallel Object Database - The Bulk Synchronous Parallel Approach
, 1996
Cited by 8 (2 self)
Parallel computers have been successfully deployed in many scientific and numerical application areas, although their use in non-numerical and database applications has been scarce. In this report, we first survey the architectural advancements beginning to make general-purpose parallel computing cost-effective, the requirements for non-numerical (or symbolic) applications, and the previous attempts to develop parallel databases. The central theme of the Bulk Synchronous Parallel model is to provide a high level abstraction of parallel computing hardware whilst providing a realisation of a parallel programming model that enables architecture independent programs to deliver scalable performance on diverse hardware platforms. Therefore, the primary objective of this report is to investigate the feasibility of developing a portable, scalable, parallel object database, based on the Bulk Synchronous Parallel model of computation. In particular, we devise a way of providing high-level abstra...
The Transformational Derivation of Parallel Programs using Data-Distribution Algebras and Skeletons
, 1997
Cited by 8 (0 self)
The transformational derivation of parallel programs for distributed-memory architectures using skeleton-based approaches is one of the most promising methods for parallel program development. These approaches support the derivation of provably correct, efficient and portable parallel programs using a predefined set of encapsulated, efficiently implemented parallel base algorithms. Encapsulation requires that compositions of skeletons are explicitly defined by means of transformations. More flexible approaches which enable the compositional development of parallel programs (that is, without reliance on ad hoc transformations) are, however, often advantageous. The research
Diffusion: Calculating Efficient Parallel Programs
- In 1999 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation (PEPM ’99)
, 1999
Cited by 8 (7 self)
Parallel primitives (skeletons) are intended to encourage programmers to build a parallel program from ready-made components for which efficient implementations are known to exist, making the parallelization process easier. However, programmers often find it difficult to choose a suitable combination of parallel primitives from which to construct efficient parallel programs. To overcome this difficulty, we propose a new transformation, called diffusion, which can efficiently decompose a recursive definition into several functions such that each function can be described by some parallel primitive. This allows programmers to describe algorithms in a more natural recursive form. We demonstrate our idea with several interesting examples. Our diffusion transformation should be significant not only in the development of new parallel algorithms, but also in the construction of parallelizing compilers.
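To make the idea of decomposing a recursive definition into primitives concrete, here is a hand-worked sketch. It is not the paper's transformation algorithm: the example function and all names are hypothetical, and sequential Python built-ins stand in for the parallel scan, map and reduce primitives. The recursive version counts the elements that exceed the sum of everything before them, threading that sum through an accumulating parameter.

```python
from functools import reduce
from itertools import accumulate
import operator

def count_dominant_recursive(xs, acc=0):
    """Natural recursive form with an accumulating parameter acc."""
    if not xs:
        return 0
    head, tail = xs[0], xs[1:]
    here = 1 if head > acc else 0
    return here + count_dominant_recursive(tail, acc + head)

def count_dominant_skeletons(xs, acc=0):
    """Same result, expressed as scan, then map, then reduce."""
    # scan: the accumulator value each element would see
    # (exclusive prefix sums; the final total is not needed).
    prefixes = list(accumulate(xs, operator.add, initial=acc))[:-1]
    # map: an independent test per element, given its prefix.
    flags = [1 if x > p else 0 for x, p in zip(xs, prefixes)]
    # reduce: combine the per-element results.
    return reduce(operator.add, flags, 0)

sample = [1, 2, 4, 3, 20]
print(count_dominant_recursive(sample), count_dominant_skeletons(sample))
```

The second form has no data dependency between elements apart from the scan, so each stage could in principle be handed to a parallel primitive.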
Tuning Task Granularity and Data Locality of Data Parallel GpH Programs
, 2001
Cited by 7 (3 self)
The performance of data parallel programs often hinges on two key coordination aspects: the computational costs of the parallel tasks relative to their management overhead (task granularity), and the communication costs induced by the distance between tasks and their data (data locality). In data parallel programs both granularity and locality can be improved by clustering, i.e. arranging for parallel tasks to operate on related sub-collections of data.
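Clustering can be sketched as a map that submits one task per chunk rather than one task per element. This is an illustration only, not the GpH mechanism the paper studies: the names chunks and clustered_map are hypothetical, and a Python thread pool is assumed purely to show how chunk size controls the number of tasks (CPython threads do not give CPU-bound speedups).

```python
from concurrent.futures import ThreadPoolExecutor

def chunks(xs, size):
    """Split xs into consecutive sub-lists of at most size elements."""
    return [xs[i:i + size] for i in range(0, len(xs), size)]

def clustered_map(f, xs, chunk_size, workers=4):
    """Apply f to every element of xs, one task per chunk.

    Larger chunks mean fewer tasks (less management overhead);
    each task works on neighbouring elements (better locality).
    """
    def run_chunk(chunk):
        return [f(x) for x in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves the order of the input chunks.
        per_chunk = pool.map(run_chunk, chunks(xs, chunk_size))
        return [y for chunk in per_chunk for y in chunk]

squares = clustered_map(lambda x: x * x, list(range(10)), chunk_size=4)
print(squares)
```

With chunk_size=4 the ten elements are handled by three tasks instead of ten; tuning that parameter is the granularity trade-off the abstract refers to.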
Constructing Skeletons in Clean: The Bare Bones
- High Performance Functional Computing
, 1995
Cited by 7 (0 self)
Skeletons are well-suited to structure parallel programming. They allow the easy use of some well-known parallel programming paradigms to construct portable, efficient programs. Much research has been focused on the use of skeletons in functional programming languages, because they can be expressed elegantly as higher-order functions. On the other hand, little attention has been paid to an elementary weakness of skeletons: how to implement them. In this paper we will show that functional languages with some low-level constructs for parallelism can be used to efficiently implement a range of high-level skeletons. We will construct skeletons for data parallelism, for parallel I/O, and for stream processing. Our experiments demonstrate that no performance penalty needs to be paid, compared to more restrictive solutions.
1. Introduction
The development of efficient parallel programs is an important non-trivial problem that programmers and compiler writers have to deal with. The use of fun...

