Results 1–9 of 9
A cost calculus for parallel functional programming
JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1995
Cited by 61 (6 self)
Abstract:
Building a cost calculus for a parallel program development environment is difficult because of the many degrees of freedom available in parallel implementations, and because of difficulties with compositionality. We present a strategy for building cost calculi for skeleton-based programming languages which can be used for derivational software development and which deals in a pragmatic way with the difficulties of composition. The approach is illustrated for the Bird-Meertens theory of lists, a parallel functional language with an associated equational transformation system.
Keywords: functional programming, parallel programming, program transformation, cost calculus, equational theories, architecture independence, Bird-Meertens formalism.
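The idea of a skeleton cost calculus can be sketched in miniature. The following is a toy illustration only, not the paper's calculus: every name and cost formula below is our own assumption. The point it shows is that each skeleton carries a cost expression and that sequential composition of skeletons adds costs, so alternative derivations can be compared before implementation.

```python
# Toy skeleton cost calculus (illustrative assumptions throughout).
import math

def map_cost(n, p, f_cost):
    # Data-parallel map of f over n elements on p processors:
    # each processor handles about ceil(n/p) elements.
    return math.ceil(n / p) * f_cost

def reduce_cost(n, p, op_cost, comm_cost):
    # Tree reduction: a local fold of ceil(n/p) elements, then
    # log2(p) combining steps, each paying one op and one message.
    local = (math.ceil(n / p) - 1) * op_cost
    tree = math.ceil(math.log2(p)) * (op_cost + comm_cost)
    return local + tree

def composed_cost(n, p, f_cost, op_cost, comm_cost):
    # Cost of reduce(op) . map(f): sequential composition adds costs.
    return map_cost(n, p, f_cost) + reduce_cost(n, p, op_cost, comm_cost)

# e.g. a sum-of-squares over 10**6 elements on 64 processors:
print(composed_cost(10**6, 64, f_cost=1, op_cost=1, comm_cost=50))
```

Under these assumed parameters the communication term is dwarfed by local work, which is the kind of trade-off such a calculus lets a developer inspect before committing to an implementation.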
The Bird-Meertens Formalism as a Parallel Model
Software for Parallel Computation, volume 106 of NATO ASI Series F, 1993
Cited by 45 (0 self)
Abstract:
The expense of developing and maintaining software is the major obstacle to the routine use of parallel computation. Architecture-independent programming offers a way of avoiding the problem, but the requirements for a model of parallel computation that will permit it are demanding. The Bird-Meertens formalism is an approach to developing and executing data-parallel programs; it encourages software development by equational transformation; it can be implemented efficiently across a wide range of architecture families; and it can be equipped with a realistic cost calculus, so that trade-offs in software design can be explored before implementation. It makes an ideal model of parallel computation.
Keywords: general-purpose parallel computing, models of parallel computation, architecture-independent programming, categorical data type, program transformation, code generation.
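The equational style of development can be illustrated with two of the formalism's standard laws. This is a minimal sketch in Python over ordinary lists (the formalism itself works over categorical data types); the function names are ours, but the laws are the well-known map-fusion and reduction-promotion identities.

```python
# Two Bird-Meertens-style equational laws, checked extensionally.
from functools import reduce

def compose(f, g):
    return lambda x: f(g(x))

f = lambda x: x + 1
g = lambda x: 2 * x
xs = list(range(10))

# Map fusion: map f . map g = map (f . g).
# Rewriting left-to-right replaces two data-parallel passes with one.
two_passes = list(map(f, map(g, xs)))
one_pass = list(map(compose(f, g), xs))
assert two_passes == one_pass

# Reduction promotion: reduce(+) . concat = reduce(+) . map(reduce(+)),
# the law that justifies summing per-processor blocks before combining.
add = lambda a, b: a + b
xss = [[1, 2, 3], [4, 5], [6]]
concat_then_reduce = reduce(add, [x for block in xss for x in block])
promote_then_reduce = reduce(add, [reduce(add, block) for block in xss])
assert concat_then_reduce == promote_then_reduce
print(one_pass)
```

Because each rewrite preserves meaning, a derivation can be steered purely by the cost model, choosing whichever equal form is cheaper on the target architecture.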
A practical hierarchical model of parallel computation: binary tree and FFT graph algorithms
1991
Cited by 37 (5 self)
Abstract:
We introduce a model of parallel computation that retains the ideal properties of the PRAM by using it as a submodel, while simultaneously being more reflective of realistic parallel architectures by accounting for, and providing abstract control over, communication and synchronization costs. The Hierarchical PRAM (H-PRAM) model controls conceptual complexity in the face of asynchrony in two ways. First, it provides the simplifying assumption of synchronization to the design of individual algorithms while allowing those algorithms to work asynchronously with each other, organizing this "control asynchrony" via an implicit hierarchy relation. Second, it allows the restriction of "communication asynchrony" in order to obtain determinate algorithms (thus greatly simplifying proofs of correctness). It is shown that the model is reflective of a variety of existing and proposed parallel architectures, particularly ones that can support massive parallelism.
The Heterogeneous Bulk Synchronous Parallel Model
In Parallel and Distributed Processing, 2000
Cited by 11 (2 self)
Abstract:
Trends in parallel computing indicate that heterogeneous parallel computing will be one of the most widespread platforms for computation-intensive applications. A heterogeneous computing environment offers considerably more computational power at a lower cost than a parallel computer. We propose the Heterogeneous Bulk Synchronous Parallel (HBSP) model, which is based on the BSP model of parallel computation, as a framework for developing applications for heterogeneous parallel environments. HBSP enhances the applicability of the BSP model by incorporating parameters that reflect the relative speeds of the heterogeneous computing components. Moreover, we demonstrate the utility of the model by developing parallel algorithms for heterogeneous systems.
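The flavor of such a model can be sketched with a hedged cost formula of our own (the parameter names below are not the paper's): a BSP-style superstep on heterogeneous machines might cost max_i(w_i / r_i) + h*g + L, where w_i is the work assigned to machine i, r_i its relative speed, h the maximum message volume, g the per-word communication cost, and L the barrier latency.

```python
# Hedged sketch of a heterogeneous BSP-style superstep cost.
def superstep_cost(work, speeds, h, g, L):
    assert len(work) == len(speeds)
    # Compute term: the slowest-finishing machine dominates the superstep.
    compute = max(w / r for w, r in zip(work, speeds))
    return compute + h * g + L

# Uniform work on machines of unequal speed: the r = 1.0 machine dominates.
print(superstep_cost([1000, 1000, 1000], [1.0, 2.0, 4.0], h=100, g=2, L=50))  # 1250.0

# Distributing the same 3000 units in proportion to speed lowers the cost.
speeds = [1.0, 2.0, 4.0]
balanced = [3000 * r / sum(speeds) for r in speeds]
print(superstep_cost(balanced, speeds, h=100, g=2, L=50))
```

The second call illustrates why speed-aware parameters matter: speed-proportional load balancing cuts the compute term from 1000 to about 429 time units in this made-up configuration.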
Towards a Scalable Parallel Object Database: The Bulk Synchronous Parallel Approach
1996
Cited by 10 (2 self)
Abstract:
Parallel computers have been successfully deployed in many scientific and numerical application areas, although their use in non-numerical and database applications has been scarce. In this report, we first survey the architectural advancements beginning to make general-purpose parallel computing cost-effective, the requirements for non-numerical (or symbolic) applications, and the previous attempts to develop parallel databases. The central theme of the Bulk Synchronous Parallel model is to provide a high-level abstraction of parallel computing hardware whilst providing a realisation of a parallel programming model that enables architecture-independent programs to deliver scalable performance on diverse hardware platforms. Therefore, the primary objective of this report is to investigate the feasibility of developing a portable, scalable, parallel object database based on the Bulk Synchronous Parallel model of computation. In particular, we devise a way of providing high-level abstractions ...
Transgressing The Boundaries: Unified Scalable Parallel Programming
1996
Cited by 1 (1 self)
Abstract:
The diverse architectural features of parallel computers, and the lack of commonly accepted parallel-programming environments, have meant that software development for these systems is significantly more difficult than in the sequential case. Until better approaches are developed, the programming environment will remain a serious obstacle to mainstream scalable parallel computing. The work reported in this paper attempts to integrate architecture-independent scalable parallel programming in the Bulk Synchronous Parallel (BSP) model with shared-memory parallel programming using the theoretical PRAM model. We start with a discussion of problem parallelism, that is, the parallelism inherent in a problem rather than in a specific algorithm, and the parallel-programming techniques that allow the capture of this notion. We then review the ubiquitous PRAM model in terms of its pragmatic limitations, with particular attention paid to simulations on practical machines. The BSP model ...
EXPLOITING MULTIPLE LEVELS OF PARALLELISM IN SPARSE MATRIX-MATRIX MULTIPLICATION
Abstract:
Sparse matrix-matrix multiplication (or SpGEMM) is a key primitive for many high-performance graph algorithms as well as for some linear solvers, such as algebraic multigrid. The scaling of existing parallel implementations of SpGEMM is heavily bound by communication. Even though 3D (or 2.5D) algorithms have been proposed and theoretically analyzed in the flat MPI model on Erdős-Rényi matrices, those algorithms had not been implemented in practice and their complexities had not been analyzed for the general case. In this work, we present the first ever implementation of the 3D SpGEMM formulation that also exploits multiple (intra-node and inter-node) levels of parallelism, achieving significant speedups over the state-of-the-art publicly available codes at all levels of concurrency. We extensively evaluate our implementation and identify bottlenecks that should be subject to further research.
Key words: parallel computing, numerical linear algebra, sparse matrix-matrix multiplication, 2.5D algorithms, 3D algorithms, multithreading, SpGEMM, 2D decomposition, graph algorithms.
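The sequential kernel underlying SpGEMM can be sketched in a few lines; the sketch below is Gustavson's classic row-by-row formulation over a naive coordinate representation, not the paper's implementation (whose contribution lies in distributing such kernels across 3D process grids and threads).

```python
# Sketch of a sequential SpGEMM kernel (Gustavson-style row streaming).
# Matrices are stored as {(row, col): value} dicts of nonzeros.
from collections import defaultdict

def spgemm(A, B):
    # Index B by row so we can stream over A's nonzeros once.
    B_rows = defaultdict(list)
    for (k, j), v in B.items():
        B_rows[k].append((j, v))
    C = defaultdict(float)
    for (i, k), a in A.items():
        for j, b in B_rows[k]:
            C[(i, j)] += a * b  # accumulate partial products
    return dict(C)

A = {(0, 0): 1.0, (0, 2): 2.0, (1, 1): 3.0}
B = {(0, 1): 4.0, (2, 1): 5.0, (1, 0): 6.0}
print(spgemm(A, B))  # {(0, 1): 14.0, (1, 0): 18.0}
```

Note that the kernel touches only nonzeros; the communication-bound behavior the abstract describes arises when A's rows and B's columns live on different processors and the partial products must be moved and merged.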
Deriving Good Transformations for Mapping Nested Loops on Hierarchical Parallel Machines in Polynomial Time
In Proceedings of the 1992 ACM International Conference on Supercomputing, 1992
Abstract:
We present a computationally efficient method for deriving the most appropriate transformation and mapping of a nested loop for a given hierarchical parallel machine. This method is set in the context of our systematic and general theory of unimodular loop transformations for the problem of iteration-space partitioning [7]. Finding an optimal mapping or an optimal associated unimodular transformation is NP-complete. We present a polynomial-time method for obtaining a `good' transformation using a simple parameterized model of the hierarchical machine. We outline a systematic methodology for obtaining the most appropriate mapping.
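What a unimodular transformation does can be shown in miniature. The sketch below is illustrative only (it is not the paper's selection method): an integer matrix T with determinant ±1 maps each iteration vector to a new one, reordering the loop nest while executing exactly the same set of iterations.

```python
# Illustrative unimodular loop transformation: loop interchange.
# Interchange of a doubly nested loop is the permutation matrix T.
T = [[0, 1],
     [1, 0]]

def apply(T, it):
    # Integer matrix-vector product: it -> T @ it.
    return tuple(sum(T[r][c] * it[c] for c in range(len(it)))
                 for r in range(len(T)))

original = [(i, j) for i in range(2) for j in range(3)]   # the (i, j) nest
transformed = sorted(apply(T, it) for it in original)     # the (j, i) nest

# T is invertible over the integers, so no iteration is lost or duplicated;
# applying T again (T is its own inverse here) recovers the original set.
assert sorted(original) == sorted(apply(T, it) for it in transformed)
print(transformed)
```

Because every unimodular T preserves the iteration set, the search problem the abstract describes is over which legal T best matches the machine hierarchy, not over whether the transformed program is still correct.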
Model Programs for Computational Science: A Programming Methodology
Abstract:
We describe a programming methodology for computational science based on programming paradigms for multicomputers. Each paradigm is a class of algorithms that have the same control structure. For every paradigm, a general parallel program is developed. The general program is then used to derive two or more model programs, which solve specific problems in science and engineering. These programs have been tested on a Computing Surface and published with every detail open to scrutiny. We explain the steps involved in developing model programs and conclude that the study of programming paradigms provides an architectural vision of parallel scientific computing.