Results 1 - 5 of 5
Performance Models for the Processor Farm Paradigm
IEEE Transactions on Parallel and Distributed Systems, 1997
Cited by 16 (0 self)
In this paper, we describe the design, implementation, and modeling of a runtime kernel to support the processor farm paradigm on multicomputers. We present a general topology-independent framework for obtaining performance models to predict the performance of the start-up, steady-state, and wind-down phases of a processor farm. An algorithm is described, which for any interconnection network determines a tree-structured subnetwork that optimizes farm performance. The analysis technique is applied to the important case of k-ary tree topologies. The models are compared with the measured performance on a variety of topologies using both constant and varied task sizes. Index Terms: Parallel programming paradigms, performance evaluation, processor farm, tree networks, message passing architecture, network flow, master-slave. 1 Introduction The major problems in parallel computation revolve around questions of ease of...
An overview of the Adl language project
1995
Cited by 4 (0 self)
The purpose of the Adl project is to demonstrate the efficient implementation of data parallel functional programming, firstly on the TMC CM-5 but ultimately on other parallel machines. We have designed a small polymorphic non-recursive language (Adl), which emphasizes high-level operations (second-order combinators) on aggregate structures. The Adl project incorporates the formal description of Adl semantics, translation and optimization, and the design of an abstract data-parallel machine which describes not only the CM-5 but also other distributed memory multicomputers and hence encourages architecture-independent code generation. An executable natural semantic description of translation to the Bird-Meertens Formalism (BMF) has been completed; similar techniques are being used with an optimizer. We also describe an implementation of the abstract machine for the CM-5. 1 Introduction Exploitation of parallelism in applications is a very attractive concept, but there is...
Mapping Adl to the Bird-Meertens Formalism
1994
Cited by 1 (1 self)
Bulk data operations such as map and reduce are an elegant medium for expressing repetitive computation over aggregate data structures. They also serve as a tool for abstraction: not all details of the computation, such as the exact ordering of the constituent operations, need to be specified by the programmer. A precise description of the behaviour of the bulk data operator is the preserve of the language implementor. If the implementation of these operators is parallel then they become a medium for expressing implicit data parallelism. There is a large body of work formally relating bulk data operators to each other and to their underlying data types. Much of this research stems from Category Theory where a number of general properties of types and operators have been established. One theoretical framework in particular, the Bird-Meertens Formalism (BMF), has proved to be extremely useful. The BMF theory of a type provides a set of operators on that type and a set of algebraic id...
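As an illustration of the kind of relationship between bulk data operators mentioned in this abstract, the sketch below checks two well-known list laws in Python: map fusion (two maps collapse into one map of the composed function) and a reduce that can be split across sublists when the operator is associative. It is not code from the paper; the helper `compose` and the sample data are assumptions chosen for the example.

```python
from functools import reduce
from operator import add

def compose(f, g):
    """Return the function x -> f(g(x)). Hypothetical helper for this sketch."""
    return lambda x: f(g(x))

def inc(x):
    return x + 1

def double(x):
    return 2 * x

xs = [1, 2, 3, 4]

# Map fusion: mapping inc and then double equals one map of (double . inc).
two_pass = list(map(double, map(inc, xs)))
one_pass = list(map(compose(double, inc), xs))

# Reduce over sublists: because add is associative with identity 0, reducing
# each chunk and then reducing the partial results matches a flat reduce.
# The order in which the chunks are reduced is left unspecified, which is
# what lets an implementation evaluate them in parallel.
chunks = [[1, 2], [3], [4, 5, 6]]
flat_total = reduce(add, [x for chunk in chunks for x in chunk], 0)
chunked_total = reduce(add, (reduce(add, chunk, 0) for chunk in chunks), 0)
```

The fused form traverses the list once, and the chunked form shows why an associative operator leaves the ordering of constituent operations to the implementor.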
EMPIRICAL PARALLEL PERFORMANCE PREDICTION FROM SEMANTICS-BASED PROFILING
Cited by 1 (1 self)
Abstract. The PMLS parallelizing compiler for Standard ML is based upon the automatic instantiation of algorithmic skeletons at sites of higher order function (HOF) use. Rather than mechanically replacing HOFs with skeletons, which in general leads to poor parallel performance, PMLS seeks to predict run-time parallel behaviour to optimise skeleton use. Static extraction of analytic cost models from programs is undecidable, and practical heuristic approaches are intractable. In contrast, PMLS utilises a hybrid approach by combining static analytic cost models for skeletons with dynamic information gathered from the sequential instrumentation of HOF argument functions. Such instrumentation is provided by an implementation independent SML interpreter, based on the language's Structural Operational Semantics (SOS), in the form of SOS rule counts. PMLS then tries to relate the rule counts to program execution times through numerical techniques. This paper considers the design and implementation of the PMLS approach to parallel performance prediction. The formulation of a general rule count cost model as a set of over-determined linear equations is discussed, and their solution by singular value decomposition, and by a genetic algorithm, are presented. Key words. Parallel computation, profiling, performance prediction, program transformation. 1. Introduction. The
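To make the idea of an over-determined rule count model concrete, here is a minimal sketch, assuming each profiled run gives a vector of rule counts and one measured time, and that time is modelled as a weighted sum of the counts. It solves the least-squares problem through the normal equations with exact rational arithmetic, which is a simpler stand-in for the singular value decomposition and genetic algorithm the abstract names. The function `least_squares` and all data values are hypothetical.

```python
from fractions import Fraction

def least_squares(rows, times):
    """Return weights w minimising ||A w - t||^2 via (A^T A) w = A^T t.

    rows:  one list of rule counts per profiled run (more runs than rules).
    times: the measured time for each run.
    Assumes the columns of A are linearly independent; a singular system
    raises StopIteration when no pivot can be found.
    """
    n = len(rows[0])
    # Build the n x n normal matrix and right-hand side with exact Fractions.
    ata = [[sum(Fraction(r[i]) * r[j] for r in rows) for j in range(n)]
           for i in range(n)]
    atb = [sum(Fraction(r[i]) * Fraction(t) for r, t in zip(rows, times))
           for i in range(n)]
    # Gauss-Jordan elimination, swapping in the first non-zero pivot row.
    for col in range(n):
        pivot = next(k for k in range(col, n) if ata[k][col] != 0)
        ata[col], ata[pivot] = ata[pivot], ata[col]
        atb[col], atb[pivot] = atb[pivot], atb[col]
        for k in range(n):
            if k != col:
                factor = ata[k][col] / ata[col][col]
                ata[k] = [a - factor * b for a, b in zip(ata[k], ata[col])]
                atb[k] -= factor * atb[col]
    return [atb[i] / ata[i][i] for i in range(n)]

# Hypothetical data: four runs, two rules, so four equations in two unknowns.
rule_counts = [[1, 0], [0, 1], [1, 1], [2, 1]]
run_times = [2, 3, 5, 7]  # consistent with per-rule costs of 2 and 3

weights = least_squares(rule_counts, run_times)
predicted = sum(c * w for c, w in zip([3, 2], weights))  # time for a new run
```

Real measurements are noisy and the normal equations can be badly conditioned, which is one reason a decomposition-based solver is usually preferred in practice.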

