Results 1 - 2 of 2
Runtime Interprocedural Data Placement Optimisation for Lazy Parallel Libraries (extended abstract)
In LCR98: Languages, Compilers and Run-time Systems for Scalable Computers, number 1511 in LNCS, 1997
Cited by 9 (8 self)
We are developing a lazy, self-optimising parallel library of vector-matrix routines. The aim is to allow users to parallelise certain computationally expensive parts of numerical programs by simply linking with a parallel rather than sequential library of subroutines. The library performs interprocedural data placement optimisation at runtime, which requires the optimiser itself to be very efficient. We achieve this firstly by working from aggregate loop nests which have been optimised in isolation, and secondly by using a carefully constructed mathematical formulation for data distributions and the distribution requirements of library operators, which allows us largely to replace searching with calculation in our algorithm.
1 Introduction
This paper describes an approach to interprocedural data placement optimisation in the context of a parallel numerical library. The idea for such a library, as described in our previous paper [4], is to make it easy for users to parall...
Automatic data distribution optimisation in a lazy, self-optimising parallel matrix library (Extended Abstract)
1996
Cited by 1 (1 self)
This short paper describes a matrix-vector library implementation running on the Fujitsu AP1000. The library optimises data distribution at run-time, taking advantage of information about how operands and results are used by delaying evaluation where possible. The work extends our earlier paper on the subject [5] by giving a general methodology for representing data distributions, which is then used for formulating the optimisation problem and for describing an optimisation algorithm.
1 Introduction
This paper describes a methodology that aims to let users incrementally have certain computationally expensive parts of their program execute in parallel. The basic idea is to have a library of parallel implementations for a number of common numerical problems, which users can simply plug into their program to execute parts of it in parallel. A naive implementation of such a library would, though, have a number of problems: Since, from the library implementor's point of...

