Bigsim: A parallel simulator for performance prediction of extremely large parallel machines
- In 18th Intl. Parallel and Distr. Proc. Symp. (IPDPS), 2004
Abstract
Cited by 25 (5 self)
We present a parallel simulator, BigSim, for predicting the performance of machines with a very large number of processors. The simulator provides the ability to make performance predictions for machines such as Blue-Gene/L, based on actual execution of real applications. We present this capability using case studies of some application benchmarks. Such a simulator is useful for evaluating the performance of specific applications on such machines even before they are built. A sequential simulator may be too slow or infeasible. However, a parallel simulator faces problems of causality violations. We describe our scheme, which is based on ideas from parallel discrete event simulation and utilizes the inherent determinacy of many parallel applications. We also explore techniques for optimizing such parallel simulations of machines with a large number of processors on existing machines with fewer processors.
Scaling applications to massively parallel machines using projections performance analysis tool
- In Future Generation Computer Systems Special Issue on: Large-Scale System Performance Modeling and Analysis, 2005
Abstract
Cited by 7 (4 self)
Some of the most challenging applications to parallelize scalably are the ones that present a relatively small amount of computation per iteration. Multiple interacting performance challenges must be identified and solved to attain high parallel efficiency in such cases. We present case studies involving NAMD, a parallel classical molecular dynamics application for large biomolecular systems, and CPAIMD, a Car-Parrinello ab initio molecular dynamics application, and efforts to scale them to a large number of processors. Both applications are implemented in Charm++, and the performance analysis was carried out using Projections, the performance visualization/analysis tool associated with Charm++. We showcase a series of optimizations facilitated by Projections. The resultant performance of NAMD led to a Gordon Bell award at SC2002 with unprecedented speedup on 3,000 processors with teraflops-level peak performance. We also explore techniques for applying the performance visualization/analysis tool on future-generation extreme-scale parallel machines and discuss the scalability issues with Projections.
We present Penumbra Limit Maps, a technique
Abstract
A powerwall is an array of separate screens that work together to provide a single unified display. Powerwalls are often driven by a small cluster, which requires parallel software to organize and synchronize the distributed rendering process. This paper describes MPIglut, our powerwall-friendly implementation of the popular sequential GLUT OpenGL 3D programming interface. MPIglut internally communicates using MPI to provide a single coherent display even across a distributed-memory parallel machine. Uniquely, MPIglut is source-code compatible with ordinary sequential GLUT code while providing high performance.

