The Wisconsin Wind Tunnel: Virtual Prototyping of Parallel Computers
In Proceedings of the 1993 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, 1993
Cited by 187 (26 self)
We have developed a new technique for evaluating cache coherent, shared-memory computers. The Wisconsin Wind Tunnel (WWT) runs a parallel shared-memory program on a parallel computer (CM-5) and uses execution-driven, distributed, discrete-event simulation to accurately calculate program execution time. WWT is a virtual prototype that exploits similarities between the system under design (the target) and an existing evaluation platform (the host). The host directly executes all target program instructions and memory references that hit in the target cache. WWT's shared memory uses the CM-5 memory's error-correcting code (ECC) as valid bits for a fine-grained extension of shared virtual memory. Only memory references that miss in the target cache trap to WWT, which simulates a cache-coherence protocol. WWT correctly interleaves target machine events and calculates target program execution time. WWT runs on parallel computers with greater speed and memory capacity than uniprocessors. WWT'...
NAMD2: Greater Scalability for Parallel Molecular Dynamics
Journal of Computational Physics, 1998
Cited by 136 (31 self)
Molecular dynamics programs simulate the behavior of biomolecular systems, leading to insights and understanding of their functions. However, the computational complexity of such simulations is enormous. Parallel machines provide the potential to meet this computational challenge. To harness this potential, it is necessary to develop a scalable program. It is also necessary that the program be easily modified by application-domain programmers. The
Scalable Load Balancing Techniques for Parallel Computers
1994
Cited by 89 (16 self)
In this paper we analyze the scalability of a number of load balancing algorithms which can be applied to problems that have the following characteristics: the work done by a processor can be partitioned into independent work pieces; the work pieces are of highly variable sizes; and it is not possible (or very difficult) to estimate the size of total work at a given processor. Such problems require a load balancing scheme that distributes the work dynamically among different processors. Our goal here is to determine the most scalable load balancing schemes for different architectures such as hypercube, mesh and network of workstations. For each of these architectures, we establish lower bounds on the scalability of any possible load balancing scheme. We present the scalability analysis of a number of load balancing schemes that have not been analyzed before. This gives us valuable insights into their relative performance for different problem and architectural characteristi...
Job Characteristics of a Production Parallel Scientific Workload on the NASA Ames iPSC/860
1995
Cited by 89 (23 self)
Statistics of a parallel workload on a 128-node iPSC/860 located at NASA Ames are presented. It is shown that while the number of sequential jobs dominates the number of parallel jobs, most of the resources (measured in node-seconds) were consumed by parallel jobs. Moreover, most of the sequential jobs were for system administration. The average runtime of jobs grew with the number of nodes used, so the total resource requirements of large parallel jobs were larger by more than the number of nodes they used. The job submission rate during peak day activity was somewhat lower than one every two minutes, and the average job size was small. At night, submission rate was low but job sizes and system utilization were high, mainly due to NQS. Submission rate and utilization over the weekend were lower than on weekdays. The overall utilization was 50%, after accounting for downtime. About 2/3 of the applications were executed repeatedly, some for a significant number of times....
Analyzing Scalability of Parallel Algorithms and Architectures
Journal of Parallel and Distributed Computing, 1994
Cited by 84 (17 self)
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithm-architecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
The Workload on Parallel Supercomputers: Modeling the Characteristics of Rigid Jobs
Journal of Parallel and Distributed Computing, 2001
Cited by 80 (10 self)
The analysis of workloads is important for understanding how systems are used. In addition, workload models are needed as input for the evaluation of new system designs, and for the comparison of system designs. This is especially important in costly large-scale parallel systems. Luckily, workload data is available in the form of accounting logs. Using such logs from three different sites, we analyze and model the job-level workloads with an emphasis on those aspects that are universal to all sites. As many distributions turn out to span a large range, we typically first apply a logarithmic transformation to the data, and then fit it to a novel hyper-Gamma distribution or one of its special cases. This is a generalization of distributions proposed previously, and leads to good goodness-of-fit scores. The parameters for the distribution are found using the iterative EM algorithm. The results of the analysis have been codified in a modeling program that creates a synthetic workload based on the results of the analysis.
Special Purpose Parallel Computing
Lectures on Parallel Computation, 1993
Cited by 77 (5 self)
A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given.
1. Introduction
Turing [365] demonstrated that, in principle, a single general purpose sequential machine could be designed which would be capable of efficiently performing any computation which could be performed by a special purpose sequential machine. The importance of this universality result for subsequent practical developments in computing cannot be overstated. It showed that, for a given computational problem, the additional efficiency advantages which could be gained by designing a special purpose sequential machine for that problem would not be great. Around 1944, von Neumann produced a proposal [66, 389] for a general purpose stored-program sequential computer which captured the fundamental principles of...
Scalable Problems and Memory-Bounded Speedup
1992
Cited by 49 (12 self)
In this paper three models of parallel speedup are studied. They are fixed-size speedup, fixed-time speedup and memory-bounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set considers uneven workload allocation and communication overhead and gives more accurate estimation. Another set considers a simplified case and provides a clear picture on the impact of the sequential portion of an application on the possible performance gain from parallel processing. The simplified fixed-size speedup is Amdahl's law. The simplified fixed-time speedup is Gustafson's scaled speedup. The simplified memory-bounded speedup contains both Amdahl's law and Gustafson's scaled speedup as special cases. This study leads to a better understanding of parallel processing.
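The relationship among the three simplified models described above can be sketched in a few lines. This is a minimal illustration using the commonly cited textbook forms, not formulas quoted from the paper: `serial_fraction` (the sequential share of the fixed-size workload) and `growth` (how much the parallel workload grows when memory scales with the processor count) are names assumed here for illustration.

```python
def amdahl_speedup(serial_fraction, processors):
    """Fixed-size speedup: the workload stays constant as processors are added."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction, processors):
    """Fixed-time speedup: the parallel part grows linearly with the processor count."""
    return serial_fraction + (1.0 - serial_fraction) * processors


def memory_bounded_speedup(serial_fraction, processors, growth):
    """Memory-bounded speedup: the parallel part grows by growth(processors).

    `growth` is a hypothetical callable standing in for the workload increase
    allowed by the larger aggregate memory.
    """
    scaled_parallel = (1.0 - serial_fraction) * growth(processors)
    scaled_work = serial_fraction + scaled_parallel
    scaled_time = serial_fraction + scaled_parallel / processors
    return scaled_work / scaled_time
```

Under these assumptions, passing `growth=lambda p: 1` gives the same value as `amdahl_speedup`, and `growth=lambda p: p` gives the same value as `gustafson_speedup`, which is the special-case relationship the abstract states.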
Aspects of Networking in Multiplayer Computer Games
2001
Cited by 43 (1 self)
Distributed, real-time multiplayer computer games (MCGs) are in the vanguard of utilizing the networking possibilities. Although related research has been done in military simulations, virtual reality systems, and computer supported cooperative working, the suggested solutions diverge from the problems posed by MCGs. With this in mind, this paper provides a concise overview of four aspects affecting networking in MCGs. Firstly, networking resources (bandwidth, latency, and computational power) set the technical boundaries within which the MCG must operate. Secondly, distribution concepts encompass communication architectures (peer-to-peer, client/server, server-network), and both data and control architectures (centralized, distributed, replicated). Thirdly, scalability allows the MCG to adapt to the resource changes by parametrization. Finally, security aims at fighting back against cheating and vandalism, which are common in online gaming. Keywords: Computer games, networking, online entertainment, distributed interactive simulation, virtual environments.

