Runtime Interprocedural Data Placement Optimisation for Lazy Parallel Libraries (extended abstract)
In LCR98: Languages, Compilers and Runtime Systems for Scalable Computers, number 1511 in LNCS, 1997
Cited by 10 (9 self)
We are developing a lazy, self-optimising parallel library of vector-matrix routines. The aim is to allow users to parallelise certain computationally expensive parts of numerical programs by simply linking with a parallel rather than sequential library of subroutines. The library performs interprocedural data placement optimisation at runtime, which requires the optimiser itself to be very efficient. We achieve this firstly by working from aggregate loop nests which have been optimised in isolation, and secondly by using a carefully constructed mathematical formulation for data distributions and the distribution requirements of library operators, which allows us largely to replace searching with calculation in our algorithm.

1 Introduction

This paper describes an approach to interprocedural data placement optimisation in the context of a parallel numerical library. The idea for such a library, as described in our previous paper [4], is to make it easy for users to parall...
Automatic data distribution optimisation in a lazy, self-optimising parallel matrix library (Extended Abstract)
1996
Cited by 1 (1 self)
This short paper describes a matrix-vector library implementation running on the Fujitsu AP1000. The library optimises data distribution at runtime, taking advantage of information about how operands and results are used by delaying evaluation where possible. The work extends our earlier paper on the subject [5] by giving a general methodology for representing data distributions, which is then used for formulating the optimisation problem and for describing an optimisation algorithm.

1 Introduction

This paper describes a methodology that aims to provide a way for users to incrementally have certain computationally expensive parts of their program execute in parallel. The basic idea is to have a library of parallel implementations for a number of common numerical problems, which users can simply plug into their program to execute parts of it in parallel. A naive implementation for such a library would, though, have a number of problems: Since, from the library implementor's point of...
Experiments with Parallelising Numerical Applications via DESOLibraries (extended abstract)
1997
DESOLibraries are "delayed evaluation, self-optimising" parallel libraries of numerical routines. The aim is to allow users to parallelise computationally expensive parts of numerical programs by simply linking with a parallel rather than sequential library of subroutines. The library performs interprocedural data placement optimisation at runtime, which requires the optimiser itself to be very efficient. This paper outlines the techniques we use to achieve this and describes the current state of our implementation. We show performance results for an implementation of the conjugate gradient iterative solver on the AP1000 that uses our library.

1 Introduction

This short paper outlines some aspects of our approach to runtime interprocedural data placement optimisation in the context of a DESOLibrary of parallel vector-matrix routines which we are currently developing. The fundamental ideas behind our approach have been outlined in previous publications [5, 2, 3]. Bri...