Performance Prediction of Parallel Processing Systems: The PAMELA Methodology
- in Proc. 7th ACM Int. Conf. on Supercomputing
, 1993
Cited by 37 (17 self)
In this paper we present a new methodology for the performance prediction of parallel programs on parallel platforms ranging from shared-memory to distributed-memory (vector) machines. The methodology comprises a procedural program and machine specification paradigm based on Pamela (PerformAnce ModEling LAnguage), along with a performance calculus, called "serialization analysis". This calculus extends conventional parallel program analysis technology by explicitly accounting for resource contention, yet at the low evaluation cost typical for static techniques. It is shown that, where conventional techniques introduce fundamental errors, predictions from serialization analysis remain realistic. Apart from the merits of the methodology itself, this high reliability/cost ratio makes Pamela an attractive candidate for compile-time application within the performance prediction hierarchy often found in parallel programming environments. 1 Introduction The performance of a concurrent syste...
Static Performance Prediction of Data-Dependent Programs
, 2000
Cited by 11 (3 self)
Static program performance prediction has been less successful in providing information on the distribution of execution times when considering a large space of input data sets. Typically, low-cost performance analysis techniques model system parameters in terms of deterministic variables, ignoring the fact that most of these parameters are stochastic due to data dependencies in programs. In this paper we present a symbolic program performance prediction approach based on characterizing loop bounds and branch probabilities, as well as basic block execution time delays in terms of their statistical moments, which reflect the data-dependent behavior of these parameters. Our compositional approach allows for a low-cost analysis yielding the statistical moments of the overall program execution time distribution. We present the approach and report on its application to two small example codes.
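As an illustration of what composing statistical moments can look like, the following Python sketch propagates only the mean and variance through sequential, branch, and loop constructs. It is not the algorithm from the paper: the names `Moments`, `seq`, `branch`, and `loop` are hypothetical, only the first two moments are tracked, and all random quantities are assumed to be mutually independent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Moments:
    """Mean and variance of an execution time (hypothetical helper type)."""
    mean: float
    var: float


def seq(a: Moments, b: Moments) -> Moments:
    # Sequential composition of two independent delays: means and variances add.
    return Moments(a.mean + b.mean, a.var + b.var)


def branch(p: float, a: Moments, b: Moments) -> Moments:
    # Branch taken with probability p: the result is a two-component mixture.
    mean = p * a.mean + (1 - p) * b.mean
    second = p * (a.var + a.mean ** 2) + (1 - p) * (b.var + b.mean ** 2)
    return Moments(mean, second - mean ** 2)


def loop(n: Moments, body: Moments) -> Moments:
    # Loop with a random iteration count n and i.i.d. body times that are
    # independent of n (random-sum mean and law of total variance).
    return Moments(n.mean * body.mean,
                   n.mean * body.var + n.var * body.mean ** 2)


if __name__ == "__main__":
    block = Moments(2.0, 1.0)
    program = seq(block, loop(Moments(10.0, 4.0), block))
    print(program)
```

Even with deterministic basic blocks, a branch or a data-dependent loop bound contributes variance to the total, which is the effect a purely mean-value analysis cannot capture.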
A Probabilistic Approach to the Analysis of Program Execution Time
, 1998
Cited by 8 (1 self)
We present a new approach to the performance prediction of parallel programs that provides information on the distribution of execution times when considering a large space of input data sets. The research aims to extend low-cost performance analysis techniques by accounting for the stochastic behavior of system parameters. Current analysis techniques are based on path analysis with the assumption of deterministic task times (mean values) instead of accounting for variance. Most of the system model parameters, however, are stochastic rather than deterministic due to data dependency in programs, for example in terms of branches and loop bounds, as well as due to various other probabilistic model abstractions. The approach is based on moment representations of distributions. We present a low-cost algorithm that computes the moments of the program execution time based on the moments associated with branching, loop bounds, and basic blocks. The novelty of the analysis technique is the combi...
Performance Prediction of Data-Dependent Task Parallel Programs
- in Proc. of the 7th Intl. Conference on Parallel Processing (EuroPar 2001), Manchester, United Kingdom
, 2001
Cited by 6 (2 self)
Current analytic solutions to the execution time prediction Y of binary parallel compositions of tasks with arbitrary execution time distributions X1 and X2 are either computationally complex or very inaccurate. In this paper we introduce an analytical approach based on the use of lambda distributions to approximate execution time distributions. This allows us to predict the first 4 statistical moments of Y in terms of the first 4 moments of Xi at negligible solution complexity. The prediction method applies to a wide range of workload distributions as found in practice, while its accuracy is equal to or better than that of comparable low-cost approaches.
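To make the parallel-composition problem concrete: if two tasks run in parallel and both must finish, the completion time is Y = max(X1, X2). The Python sketch below does not use lambda distributions. It assumes, purely for illustration, that X1 and X2 are independent and exponentially distributed, a case where the moments of Y have a simple closed form. The function names are hypothetical.

```python
def max_exponential_moments(rate1: float, rate2: float) -> tuple[float, float]:
    """Mean and variance of Y = max(X1, X2) for independent exponential
    X1, X2 with the given rates (an illustrative special case only)."""
    total = rate1 + rate2
    # The density of Y is a signed mix of three exponential densities with
    # rates rate1, rate2 and rate1 + rate2, so its moments follow term by term.
    mean = 1 / rate1 + 1 / rate2 - 1 / total
    second = 2 / rate1 ** 2 + 2 / rate2 ** 2 - 2 / total ** 2
    return mean, second - mean ** 2


def deterministic_estimate(rate1: float, rate2: float) -> float:
    # Mean-value analysis: take the maximum of the two task means.
    return max(1 / rate1, 1 / rate2)


if __name__ == "__main__":
    mean, var = max_exponential_moments(1.0, 1.0)
    print(mean, var, deterministic_estimate(1.0, 1.0))
```

For two tasks with unit mean, the expected completion time is 1.5, whereas the maximum of the means is 1.0. That gap is why a parallel composition needs distribution information rather than mean values alone.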
Compiling Performance Models from Parallel Programs
- In Proceedings of the 8th ACM International Conference on Supercomputing
, 1994
Cited by 4 (0 self)
A technique is described to automatically compile performance models in the course of program translation. The performance models are fully symbolic in order to preserve as much diagnostic information as possible. Although compiled statically, the models account for the effects of resource contention, due to the introduction of a novel algorithm within the symbolic compilation scheme. It is shown that the compilation approach fundamentally outperforms traditional static estimation procedures in terms of precision at a negligible increase in cost. This claim is illustrated by a case study of an LU factorization algorithm on a multiprocessor. 1 Introduction Low-cost, compile-time performance prediction provides essential, early feedback to enable program and machine parameter optimization by both the user and the compiler. In this paper we present a technique to automatically compile a symbolic performance model which accurately predicts the execution time of a parallel program given a...
On the Analysis of PAMELA Models
, 1993
Cited by 4 (3 self)
While last year's report [16] loosely introduced the general concepts behind the Pamela approach toward modeling and analysis of parallel systems, this report exclusively focuses on the calculus of the methodology. In particular, it defines an algorithmic approach toward serialization analysis, which enables (future) mechanization of the analysis. Thus, a technique is developed to automatically compile symbolic performance models in the course of program translation. It is shown that the resulting performance models fundamentally outperform traditional static estimation approaches at a negligible increase in cost. This claim is illustrated by two case studies, i.e., an LU factorization algorithm on a multiprocessor, and a matrix-vector update on a multicomputer. Contents: 1 Introduction; 2 Analysis; 2.1 Mathematical Preliminaries; 2.2 Formalism; 2.3 Homomorphic Mapping...
A simple run-time concurrency measure
- In Proceedings of the 3rd Australian Transputer and OCCAM User Group Conference
, 1990
Cited by 4 (0 self)
A “concurrency measure” provides an objective means of comparing the level of parallelism achievable by a distributed computation. To date such measures have only been applicable after a computation has successfully terminated. This paper develops a notion of “observed concurrency” that can be continuously evaluated as a computation proceeds. It is suitable for nested parallelism and evaluation either in situ or during simulation.
Predicting Parallel System Performance with PAMELA
, 1995
Cited by 3 (1 self)
A compile-time prediction technique is outlined that yields approximate, yet low-cost, analytical performance models of parallel systems, to be used for parameter optimization during the initial design loops. In this paper we present the technique and report on its accuracy when compared to results from simulation as well as to measurement results on a distributed-memory machine. 1 Introduction In the performance prediction of parallel systems many approaches exist which represent a specific trade-off between analysis accuracy and cost. Approaches aimed for accuracy involve the use of probabilistic techniques based on queueing networks [10, 15], stochastic graphs [13, 18], timed Petri nets [1, 20], and stochastic process algebras [8], as well as simulation [16, 19]. Due to the exponential state space complexity associated with these models, however, the computational costs may easily prohibit frequent use of such techniques in a design loop. Although compile-time techniques entail a s...
Performance Estimation for Embedded Systems
, 2000
Cited by 2 (2 self)
In this document we propose a symbolic performance modeling technique to be used as the basis of the JOSES cost estimator. The approach is inspired by the need for highly parametric cost models in the initial stages in parallel program design, where absolute prediction accuracy is of less priority than solution cost, and where symbolic feedback on the effects of user mapping decisions and machine parameters is of primary concern. As illustrated by the case study, the symbolic approach provides good feedback on the effects of partitioning choices as well as the influence of computation and communication parameters on application performance.

