Analyzing Scalability of Parallel Algorithms and Architectures
 Journal of Parallel and Distributed Computing
, 1994
"... The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithmarchitecture combination for a problem under different constraints on the growth of ..."
Abstract

Cited by 90 (18 self)
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithm-architecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
Scalable Problems and Memory-Bounded Speedup
, 1992
"... In this paper three models of parallel speedup are studied. They are fixedsize speedup, fixedtime speedup and memorybounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set consi ..."
Abstract

Cited by 53 (13 self)
In this paper three models of parallel speedup are studied. They are fixed-size speedup, fixed-time speedup and memory-bounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set considers uneven workload allocation and communication overhead and gives more accurate estimation. Another set considers a simplified case and provides a clear picture on the impact of the sequential portion of an application on the possible performance gain from parallel processing. The simplified fixed-size speedup is Amdahl's law. The simplified fixed-time speedup is Gustafson's scaled speedup. The simplified memory-bounded speedup contains both Amdahl's law and Gustafson's scaled speedup as special cases. This study leads to a better understanding of parallel processing.
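The two simplified formulations named in this abstract can be sketched directly. A minimal illustration in Python, assuming a serial fraction `f` and processor count `p` (the example values 0.05 and 64 are hypothetical, chosen only to show how the two models diverge):

```python
def amdahl_speedup(f, p):
    """Simplified fixed-size speedup (Amdahl's law):
    f is the serial fraction, p the number of processors."""
    return 1.0 / (f + (1.0 - f) / p)

def gustafson_speedup(f, p):
    """Simplified fixed-time (scaled) speedup (Gustafson):
    the parallel workload grows with p, giving f + (1 - f) * p."""
    return f + (1.0 - f) * p

# With a 5% serial fraction on 64 processors, the fixed-size model
# caps speedup far below 64, while the scaled model stays near-linear.
print(round(amdahl_speedup(0.05, 64), 2))     # -> 15.42
print(round(gustafson_speedup(0.05, 64), 2))  # -> 60.85
```

The memory-bounded model of the paper interpolates between these two, depending on how fast the workload can grow within available memory.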
Parallel Programming using Functional Languages
, 1991
"... I am greatly indebted to Simon Peyton Jones, my supervisor, for his encouragement and technical assistance. His overwhelming enthusiasm was of great support to me. I particularly want to thank Simon and Geoff Burn for commenting on earlier drafts of this thesis. Through his excellent lecturing Cohn ..."
Abstract

Cited by 48 (3 self)
I am greatly indebted to Simon Peyton Jones, my supervisor, for his encouragement and technical assistance. His overwhelming enthusiasm was of great support to me. I particularly want to thank Simon and Geoff Burn for commenting on earlier drafts of this thesis. Through his excellent lecturing, Colin Runciman initiated my interest in functional programming. I am grateful to Phil Trinder for his simulator, on which mine is based, and Will Partain for his help with LaTeX and graphs. I would like to thank the Science and Engineering Research Council of Great Britain for their financial support. Finally, I would like to thank Michelle, whose culinary skills supported me whilst I was writing up.

The Imagination / the only nation worth defending / a nation without alienation / a nation whose flag is invisible / and whose borders are forever beyond the horizon / a nation whose motto is / why have one or the other / when you can have one, the other, and both
An Approach to Scalability Study of Shared Memory Parallel Systems
, 1994
"... The overheads in a parallel system that limit its scalability need to be identified and separated in order to enable parallel algorithm design and the development of parallel machines. Such overheads may be broadly classified into two components. The first one is intrinsic to the algorithm and arise ..."
Abstract

Cited by 33 (18 self)
The overheads in a parallel system that limit its scalability need to be identified and separated in order to enable parallel algorithm design and the development of parallel machines. Such overheads may be broadly classified into two components. The first one is intrinsic to the algorithm and arises due to factors such as the work imbalance and the serial fraction. The second one is due to the interaction between the algorithm and the architecture and arises due to latency and contention in the network. A top-down approach to scalability study of shared memory parallel systems is proposed in this research. We define the notion of overhead functions associated with the different algorithmic and architectural characteristics to quantify the scalability of parallel systems; we isolate the algorithmic overhead and the overheads due to network latency and contention from the overall execution time of an application; we design and implement an execution-driven simulation platform that incorporates these methods for quantifying the overhead functions; and we use this simulator to study the scalability characteristics of five applications on shared memory platforms with different communication topologies.
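The two-component decomposition this abstract describes can be sketched as arithmetic on overhead terms. A minimal illustration in Python; the function name, argument names, and example figures are hypothetical, not taken from the paper:

```python
def overhead_breakdown(t_serial, p, t_algo, t_network):
    """Split parallel execution time into an ideal part (t_serial / p)
    and two overhead components in the spirit of the abstract above:
    t_algo    -- algorithmic overhead (work imbalance, serial fraction)
    t_network -- architectural overhead (network latency + contention)
    All arguments are assumed to be measured quantities, in seconds."""
    t_ideal = t_serial / p
    t_total = t_ideal + t_algo + t_network
    return {
        "speedup": t_serial / t_total,
        "efficiency": t_serial / (p * t_total),
        "algorithmic_share": t_algo / t_total,
        "network_share": t_network / t_total,
    }

# Example: a 100 s serial run on 16 processors, with 2 s of algorithmic
# overhead and 1.5 s of combined latency and contention.
print(overhead_breakdown(100.0, 16, 2.0, 1.5))
```

Separating the shares this way shows whether a redesigned algorithm (shrinking `t_algo`) or a different network (shrinking `t_network`) would pay off more.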
Performance and scalability of preconditioned conjugate gradient methods on parallel computers
 Department of Computer Science, University of Minnesota
, 1995
"... ..."
A Simulation-based Scalability Study of Parallel Systems
 Journal of Parallel and Distributed Computing
, 1993
"... Scalability studies of parallel architectures have used scalar metrics to evaluate their performance. Very often, it is difficult to glean the sources of inefficiency resulting from the mismatch between the algorithmic and architectural requirements using such scalar metrics. Lowlevel performance s ..."
Abstract

Cited by 21 (15 self)
Scalability studies of parallel architectures have used scalar metrics to evaluate their performance. Very often, it is difficult to glean the sources of inefficiency resulting from the mismatch between the algorithmic and architectural requirements using such scalar metrics. Low-level performance studies of the hardware are also inadequate for predicting the scalability of the machine on real applications. We propose a top-down approach to scalability study that alleviates some of these problems. We characterize applications in terms of the frequently occurring kernels, and their interaction with the architecture in terms of overheads in the parallel system. An overhead function is associated with the algorithmic characteristics as well as their interaction with the architectural features. We present a simulation platform called SPASM (Simulator for Parallel Architectural Scalability Measurements) that quantifies these overhead functions. SPASM separates the algorithmic overhead into ...
Parallel evolutionary algorithms can achieve superlinear performance
 Information Processing Letters
, 2002
"... performance ..."
The Consequences of Fixed Time Performance Measurement
 In Proceedings of the 25th Hawaii International Conference on System Sciences: Volume III
, 1992
"... In measuring performance of parallel computers, the usual method is to choose a problem and test execution time as the processor count is varied. This model underlies definitions of “speedup, ” “efficiency,” and arguments against parallel processing such as Ware’s formulation of Amdahl’s law. Fixed ..."
Abstract

Cited by 19 (2 self)
In measuring performance of parallel computers, the usual method is to choose a problem and measure execution time as the processor count is varied. This model underlies definitions of “speedup” and “efficiency,” and arguments against parallel processing such as Ware’s formulation of Amdahl’s law. Fixed-time models use problem size as the figure of merit. Analysis and experiments based on fixed time instead of fixed size have yielded surprising consequences: the fixed-time method does not reward slower processors with higher speedup; it predicts a new limit to speedup, more optimistic than Amdahl’s; it shows efficiency independent of processor speed and ensemble size; it sometimes gives non-spurious superlinear speedup; and it provides a practical means (SLALOM) of comparing computers of widely varying speeds without distortion.
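The fixed-time idea, reporting the problem size solvable in a fixed wall-clock budget rather than the runtime of a fixed problem, can be sketched with a toy cost model. The cubic work function and flop rates below are illustrative assumptions, not SLALOM's actual benchmark definition:

```python
def fixed_time_problem_size(rate_flops, budget_s, work=lambda n: 2 * n**3 / 3):
    """Largest problem size n whose work fits in a fixed time budget at a
    given sustained flop rate. The default cost model ~2n^3/3 flops is an
    LU-factorization-style assumption chosen for illustration.
    Fixed-time benchmarking reports n, not elapsed time."""
    n = 1
    while work(n + 1) <= rate_flops * budget_s:
        n += 1
    return n

# A machine 10x faster earns a larger solvable problem in the same
# 60-second budget, rather than a 10x-shorter run of a fixed problem.
print(fixed_time_problem_size(1e9, 60))   # n at a sustained 1 GFLOP/s
print(fixed_time_problem_size(1e10, 60))  # n at a sustained 10 GFLOP/s
```

Because the figure of merit grows only as the cube root of machine speed here, slow and fast machines can be ranked on the same scale without the distortion the abstract describes.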