Nested Algorithmic Skeletons from Higher Order Functions
, 2000
Cited by 25 (12 self)
Algorithmic skeletons provide a promising basis for the automatic utilisation of parallelism at sites of higher-order function use through static program analysis. However, decisions about whether or not to realise particular higher-order function instances as skeletons must be based on information about available processing resources, and such resources may change subsequent to program analysis. In principle, nested higher-order functions may be realised as nested skeletons. However, where higher-order function arguments result from partially applied functions, free-variable bindings must be identified and communicated through the corresponding skeleton hierarchy to where those arguments are actually applied. Here, a skeleton-based parallelising compiler from Standard ML to native code is presented. Hybrid skeletons, which can change from parallel to serial evaluation at run-time, are considered and mechanisms for their nesting are discussed. Compilation stages are illustra...
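The idea of a hybrid skeleton can be sketched in a few lines. The following is only an illustration in Python, not the Standard ML compiler the abstract describes: `hybrid_map`, `scale`, `scale_rows`, and the `workers` and `threshold` parameters are hypothetical names chosen for this example. The skeleton decides at run time whether to evaluate in parallel or serially, and `functools.partial` stands in for a partially applied function whose free-variable binding (`k`) travels with it into the nested skeleton.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def hybrid_map(f, xs, workers=1, threshold=4):
    """Map f over xs, choosing parallel or serial evaluation at run time."""
    xs = list(xs)
    # Go parallel only if more than one worker is available and the input
    # is large enough; otherwise fall back to plain serial evaluation.
    if workers > 1 and len(xs) >= threshold:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(f, xs))
    return [f(x) for x in xs]


def scale(k, x):
    """Multiply x by the factor k."""
    return k * x


def scale_rows(k, rows, workers):
    """Nested use: an outer map over rows, an inner serial map over each row."""
    # partial(scale, k) carries the binding of k down to where it is applied.
    inner = partial(hybrid_map, partial(scale, k), workers=1)
    return hybrid_map(inner, rows, workers=workers)
```

Calling `scale_rows(3, [[1, 2], [3, 4]], 2)` applies the outer skeleton to two rows; with the assumed default threshold of 4 it runs serially, while a longer list of rows would be handed to the thread pool.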
A Cost Analysis for a Higher-order Parallel Programming Model
, 1996
Cited by 17 (1 self)
Programming parallel computers remains a difficult task. An ideal programming environment should enable the user to concentrate on the problem solving activity at a convenient level of abstraction, while managing the intricate low-level details without sacrificing performance. This thesis investigates a model of parallel programming based on the Bird-Meertens Formalism (BMF). This is a set of higher-order functions, many of which are implicitly parallel. Programs are expressed in terms of functions borrowed from BMF. A parallel implementation is defined for each of these functions for a particular topology, and the associated execution costs are derived. The topologies which have been considered include the hypercube, 2-D torus, tree and the linear array. An analyser estimates the costs associated with different implementations of a given program and selects a cost-effective one for a given topology. All the analysis is performed at compile time, which has the advantage of reducing run-...
Co-ordinating Heterogeneous Parallel Computation
- Europar '96
, 1996
Cited by 17 (1 self)
There is a growing interest in heterogeneous high performance computing environments. These systems are difficult to program owing to the complexity of choosing the appropriate resource allocations and the difficulties in expressing these choices in traditional parallel languages. In this paper we propose that functional skeletons be used to express these resource allocation strategies. By associating performance models with each skeleton it is possible to predict and optimise the performance of different resource allocation strategies, thus providing a tool for guiding the choice of resource allocation. Through a case study of a parallel conjugate gradient algorithm on a mixed vector and scalar parallel machine we demonstrate these features of the SPP(X) approach. 1 Introduction Parallel computation platforms often now have a heterogeneous structure arising from either exploiting clusters of workstations or from using parallel computers with specialist hardware such as ...
High-Performance Data Mining with Skeleton-based Structured Parallel Programming
- PARALLEL COMPUTING, SPECIAL ISSUE ON PARALLEL DATA INTENSIVE COMPUTING
, 2001
Cited by 13 (4 self)
We show how to apply a Structured Parallel Programming methodology based on skeletons to Data Mining problems, reporting several results about three commonly used mining techniques, namely association rules, decision tree induction and spatial clustering. We analyze the structural patterns common to these applications, looking at application performance and software engineering efficiency. Our aim is to clearly state what features a Structured Parallel Programming Environment should have to be useful for parallel Data Mining. Within the skeleton-based PPE SkIE that we have developed, we study the different patterns of data access of parallel implementations of Apriori, C4.5 and DBSCAN. We need to address large partition reads, frequent and sparse access to small blocks, as well as an irregular mix of small and large transfers, to allow efficient development of applications on huge databases. We examine the addition of an object/component interface to the skeleton structured model, to simplify the development of environment-integrated, parallel Data Mining applications.
The Performance of Parallel Algorithmic Skeletons
, 1995
Cited by 10 (4 self)
Several authors have proposed the use of algorithmic skeletons as a high-level, machine-independent means of developing parallel programs. This paper addresses the question of modelling the performance of such skeletons. The execution time for a skeleton is presented as a generic higher order complexity function. Instantiation of the skeleton with a specific set of functional parameters enables the time complexity of the particular application to be derived. The approach is illustrated by examples based on existing special purpose languages for image processing, and is extended to analyse the scalability of skeleton-based applications, using isoefficiency functions. 1. Introduction Parallel programming is widely regarded as a complex, machine-dependent and time-consuming task. This perceived difficulty has hindered the more widespread use of parallel computer systems. Several high-level parallel programming approaches have arisen to meet this challenge, such as [1, 2]. These abstract f...
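A higher-order complexity function of the kind the abstract mentions can be pictured as a function that takes cost parameters and returns a time model. The sketch below is a toy example in Python and is not taken from the paper: the task-farm cost formula, the names `farm_time` and `efficiency`, and the parameters `t_item` and `t_comm` are all assumptions made for illustration.

```python
import math


def farm_time(t_item, t_comm):
    """Return a time model T(n, p) for a hypothetical task-farm skeleton.

    t_item: assumed cost of processing one item
    t_comm: assumed per-processor communication overhead
    """
    def time(n, p):
        # Each processor handles at most ceil(n / p) items, and the farm
        # pays a communication cost that grows linearly with p.
        return math.ceil(n / p) * t_item + p * t_comm
    return time


def efficiency(t_item, t_comm, n, p):
    """Serial time divided by (p times parallel time) under the toy model."""
    serial = n * t_item
    parallel = farm_time(t_item, t_comm)(n, p)
    return serial / (p * parallel)
```

Instantiating the model with `farm_time(2.0, 0.5)` and evaluating it at `n = 100`, `p = 4` gives 25 items per processor plus the overhead term. Holding `efficiency` constant while `p` grows, and solving for the required `n`, is the kind of question an isoefficiency analysis asks.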
Data Distribution Algebras - A Formal Basis for Programming Using Skeletons
, 1994
Cited by 8 (1 self)
this paper functional languages are proposed as such a methodology using an extension of the concept of skeletons --- higher-order functions coupled with parallel implementation templates. An essential part of the proposed methodology is the use of data distribution algebras
Performance Models for Co-ordinating Parallel Data Classification
- In Proceedings of the Seventh International Parallel Computing Workshop (PCW-97)
, 1997
Cited by 7 (0 self)
In this paper we investigate the use of performance models for structuring parallel programs through a case study in data mining. Performance models have been shown to be an integral part of providing a more structured approach to the problems of performance portability and resource allocation in parallel programming. This is particularly true in the context of skeletons, where parallel programs are expressed as combinations of predefined, often higher-order, functions. The use of performance models has, to some extent, been limited by the difficulty in applying the approach to irregular and dynamic parallel algorithms. We explore this problem in the context of a well-known data mining algorithm, C4.5, which exhibits both irregular and dynamic characteristics. C4.5 is rich in inherent parallelism, making the choice of a suitable parallel implementation for a given architecture non-trivial. We demonstrate how a structured approach to developing the performance models enables a c...
Practical Parallel Divide-and-Conquer Algorithms
, 1997
Cited by 5 (2 self)
Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested data-parallel languages have relied on a vector PRAM implementation layer that cannot be efficiently mapped to MPPs with high inter-processor latency. This thesis shows that by restricting the problem set to that of data-parallel divide-and-conquer algorithms I can maintain the expressibility of full nested data-parallel languages while achieving good efficiency on current distributed-memory machines. Specifically, I define
A lazy, self-optimising parallel matrix library
- Functional Programming Workshop, Ullapool
, 1995
"... Published in collaboration with the ..."

