Abstract:
Although there are many situations in which a model of application performance is valuable, performance modeling of parallel programs is not commonplace, largely because of the difficulty of developing accurate models of real applications executing on real multiprocessors. This paper describes a toolkit for performance tuning and prediction based on lost cycles analysis. Lost cycles analysis decomposes parallel overheads into meaningful categories that are amenable to modeling, and uses a priori knowledge of the sources and characteristics of overhead in parallel systems to guide and constrain the modeling process. The Lost Cycles Toolkit automates the process of constructing a performance model for a parallel application by integrating empirical model-building techniques from statistics with measurement and modeling techniques for parallel programs. We present several examples to show how the toolkit facilitates the construction of performance models, and to illustrate the use of the ...
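The decomposition idea described above can be illustrated with a minimal sketch. It is not the Lost Cycles Toolkit itself. It assumes that total lost cycles are taken as aggregate processor time minus serial time, p * T(p) - T(1), that each overhead category is measured in processor-seconds, and that each category is modeled by a straight line in the processor count. The record layout, function names, and the choice of a linear model are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Run:
    """One measured execution (hypothetical record layout)."""
    procs: int                   # number of processors used
    wall_time: float             # elapsed time of the parallel run, seconds
    overheads: Dict[str, float]  # per-category overhead, processor-seconds


def lost_cycles(run: Run, serial_time: float) -> float:
    """Aggregate processor time not spent on useful work: p*T(p) - T(1)."""
    return run.procs * run.wall_time - serial_time


def unexplained(run: Run, serial_time: float) -> float:
    """Lost cycles not accounted for by any measured category."""
    return lost_cycles(run, serial_time) - sum(run.overheads.values())


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def predict_time(serial_time: float,
                 models: Dict[str, Tuple[float, float]],
                 procs: int) -> float:
    """Predicted T(p) = (T(1) + sum of modeled overheads at p) / p.

    Assumes each category model is a line (a, b) in the processor count.
    """
    overhead = sum(a + b * procs for a, b in models.values())
    return (serial_time + overhead) / procs
```

A nonzero result from `unexplained` suggests that some overhead category is missing or mismeasured, which is the kind of check a decomposition into categories makes possible. In practice the per-category model would be chosen from knowledge of how that overhead behaves rather than fixed to a line.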