Results 1–9 of 9
Analyzing Scalability of Parallel Algorithms and Architectures
 Journal of Parallel and Distributed Computing
, 1994
Abstract

Cited by 90 (18 self)
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithm-architecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and to motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
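The metrics this abstract builds on can be stated in a few lines. A minimal sketch, using hypothetical timing numbers rather than anything from the paper: speedup S(p) = T(1)/T(p) and efficiency E(p) = S(p)/p.

```python
def speedup(t_serial, t_parallel):
    """Speedup of a parallel run relative to the single-processor run."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """Fraction of ideal (linear) speedup achieved on p processors."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical measured execution times (seconds) on 1, 4, and 16 processors.
times = {1: 64.0, 4: 18.0, 16: 6.0}

for p, t in times.items():
    s = speedup(times[1], t)
    e = efficiency(times[1], t, p)
    print(f"p={p:2d}  speedup={s:.2f}  efficiency={e:.2f}")
```

Declining efficiency as p grows is exactly the behavior that scalability analysis tries to characterize: how fast must the problem size grow to hold efficiency constant.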
Performance and scalability of preconditioned conjugate gradient methods on parallel computers
 Department of Computer Science, University of Minnesota
, 1995
Performance Properties of Large Scale Parallel Systems
 Department of Computer Science, University of Minnesota
, 1993
Abstract

Cited by 23 (7 self)
There are several metrics that characterize the performance of a parallel system, such as parallel execution time, speedup, and efficiency. A number of properties of these metrics have been studied. For example, it is a well-known fact that given a parallel architecture and a problem of a fixed size, the speedup of a parallel algorithm does not continue to increase with an increasing number of processors. It usually tends to saturate or peak at a certain limit. Thus it may not be useful to employ more than an optimal number of processors for solving a problem on a parallel computer. This optimal number of processors depends on the problem size, the parallel algorithm, and the parallel architecture. In this paper we study the impact of parallel processing overheads and the degree of concurrency of a parallel algorithm on the optimal number of processors to be used when the criterion for optimality is minimizing the parallel execution time. We then study a more general criterion of optimalit...
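The existence of an optimal processor count can be illustrated with a toy overhead model (an assumption for illustration, not the paper's model): if parallel time is T(p) = W/p + c·p, i.e., ideal work division plus an overhead that grows linearly with p, then calculus gives the minimizer p* = sqrt(W/c).

```python
import math

def parallel_time(W, c, p):
    """Toy model: ideal division of work W plus linear overhead c*p."""
    return W / p + c * p

def optimal_processors(W, c):
    """Minimizer of W/p + c*p, from dT/dp = -W/p^2 + c = 0."""
    return math.sqrt(W / c)

W, c = 10_000.0, 0.1   # hypothetical work units and per-processor overhead
p_star = optimal_processors(W, c)
print(f"optimal p ~ {p_star:.0f}, time at optimum = {parallel_time(W, c, p_star):.1f}")
```

Past p*, adding processors increases execution time, which matches the saturation/peaking behavior the abstract describes.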
The Consequences of Fixed Time Performance Measurement
 In Proceedings of the 25th Hawaii International Conference on System Sciences: Volume III
, 1992
Abstract

Cited by 19 (2 self)
In measuring performance of parallel computers, the usual method is to choose a problem and test execution time as the processor count is varied. This model underlies definitions of “speedup” and “efficiency,” and arguments against parallel processing such as Ware’s formulation of Amdahl’s law. Fixed-time models use problem size as the figure of merit. Analysis and experiments based on fixed time instead of fixed size have yielded surprising consequences: the fixed-time method does not reward slower processors with higher speedup; it predicts a new limit to speedup, more optimistic than Amdahl’s; it shows efficiency independent of processor speed and ensemble size; it sometimes gives nonspurious superlinear speedup; and it provides a practical means (SLALOM) of comparing computers of widely varying speeds without distortion.
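The contrast between the two measurement models can be sketched with the standard textbook formulas (these are the conventional forms, not necessarily the paper's exact ones): fixed-size speedup is Amdahl's 1/(f + (1-f)/p), while fixed-time (scaled) speedup is f + (1-f)·p, where f is the serial fraction. The value of f below is an illustrative assumption.

```python
def amdahl(f, p):
    """Fixed-size speedup: the problem stays the same as p grows."""
    return 1.0 / (f + (1.0 - f) / p)

def fixed_time(f, p):
    """Fixed-time (scaled) speedup: the problem grows to fill the time."""
    return f + (1.0 - f) * p

f = 0.05  # hypothetical serial fraction
for p in (4, 64, 1024):
    print(f"p={p:4d}  Amdahl={amdahl(f, p):6.2f}  fixed-time={fixed_time(f, p):7.2f}")
```

With f = 0.05, Amdahl's speedup is capped near 1/f = 20 regardless of p, while the fixed-time figure keeps growing with p, which is the "more optimistic limit" the abstract refers to.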
Analysis and Design of Scalable Parallel Algorithms for Scientific Computing
, 1995
Abstract

Cited by 8 (5 self)
This dissertation presents a methodology for understanding the performance and scalability of algorithms on parallel computers and the scalability analysis of a variety of numerical algorithms. We demonstrate the analytical power of this technique and show how it can guide the development of better parallel algorithms. We present some new highly scalable parallel algorithms for sparse matrix computations that were widely considered to be poorly suited to large-scale parallel computers. We present some laws governing the performance and scalability properties that apply to all parallel systems. We show that our results generalize or extend a range of earlier research results concerning the performance of parallel systems. Our scalability analysis of algorithms such as the fast Fourier transform (FFT), dense matrix multiplication, sparse matrix-vector multiplication, and the preconditioned conjugate gradient (PCG) method provides many interesting insights into their behavior on parallel computer...
Performance Evaluation for Parallel Systems: A Survey
, 1997
Abstract

Cited by 8 (0 self)
Performance is often a key factor in determining the success of a parallel software system. Performance evaluation...
The Performance and Scalability of Parallel Systems
, 1994
Abstract

Cited by 4 (1 self)
In this thesis we develop an analytical performance model for parallel computer systems. This model is built on three abstract performance elements: loading intensity, contention, and delay. These elements correspond to performance measures that are the outcome of features of both software and hardware components of a computing system. The profile of these components can in turn be derived from an analysis of the performance-related behaviour of the individual processes that constitute a complete system.
The Unified Parallel Speedup Model And Simulator
 Southeast Regional ACM Conference, Athens GA
, 2001
Abstract
This paper develops a unified parallel processing speedup model that integrates parallel processing models from pipelines within the CPU to clustered and distributed multicomputers. The different software/algorithm parallelism models are analyzed at each level of parallelism, as well as the hardware architecture developed to capitalize on the potential speedup. By integrating the different levels of parallelism in a single unified model through the use of encapsulation, researchers and designers are able to explore the state-space of possibilities and look for optimal performance returns with minimal hardware resources. Multiple levels of parallelism represent subdivisions of a parallelism continuum where performance can be scaled by adding additional levels of parallel architecture when supported by the workload. The paper also presents and discusses a simulation model developed that implements the unified speedup model. The simulation model is a Java applet posted to the web and available for experimentation. This work extends previous work that unified Amdahl's classic parallel speedup model with process scaling and a new workload parallel model [7], additionally integrating pipeline speedup, superscalar speedup, and a new n-tier client-server distributed multicomputer parallel speedup. Index Terms: parallel speedup, levels of parallelism, efficiency, unified parallel model, scaling, clustered computing.
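Two of the levels this paper combines can be sketched with standard textbook formulas (these are the conventional forms and a naive multiplicative composition for illustration, not the paper's own model): a k-stage pipeline over n operations speeds execution up by nk/(k + n - 1), and the processor level follows Amdahl's 1/(f + (1-f)/p). All parameter values below are hypothetical.

```python
def pipeline_speedup(k, n):
    """Speedup of a k-stage pipeline over n operations vs. unpipelined execution."""
    return (n * k) / (k + n - 1)

def amdahl_speedup(f, p):
    """Classic Amdahl speedup with serial fraction f on p processors."""
    return 1.0 / (f + (1.0 - f) / p)

def combined(k, n, f, p):
    """Naive encapsulation of levels: multiply speedups from independent levels."""
    return pipeline_speedup(k, n) * amdahl_speedup(f, p)

# Hypothetical: 5-stage pipeline, 1000 operations, 2% serial fraction, 16 CPUs.
print(f"combined speedup ~ {combined(5, 1000, 0.02, 16):.1f}")
```

Multiplying level-wise speedups assumes the levels are independent; capturing the interactions between levels is precisely what a unified model and its simulator are for.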