Results 1 - 8 of 8
Fast File Access for Fast Agents
2000
Cited by 11 (8 self)
Mobile agents are a powerful tool for coordinating general-purpose distributed computing, where the main goal is high performance. In this paper we demonstrate how the inherent mobility of agents may be exploited to achieve fast file access, which is necessary for most general-purpose applications. We present a file system for mobile agents based exclusively on the local disks of the participating workstations. The mobility of agents allows us to make all file operations local, which significantly reduces access time. We also demonstrate how code files and special system files can be handled efficiently in a local-disk-based environment.
Scalable Molecular Dynamics for Large Biomolecular Systems
In Proceedings of Supercomputing (SC) 2000, 2000
Cited by 8 (1 self)
We present an optimized parallelization scheme for molecular dynamics simulations of large biomolecular systems, implemented in the production-quality molecular dynamics program NAMD. With an object-based hybrid force and spatial decomposition scheme, and an aggressive measurement-based predictive load balancing framework, we have attained speeds and speedups that are much higher than any reported in the literature so far. The paper first summarizes the broad methodology we are pursuing and the basic parallelization scheme we used. It then describes the optimizations that were instrumental in increasing performance, and presents performance results on benchmark simulations.
1 Introduction
Understanding the structure and function of biomolecules such as proteins and DNA is crucial to our ability to understand the mechanisms of diseases, drugs, and normal life processes. With the experimental determination of structures for an increasing set of proteins it has become possible to em...
Impostors for Parallel Interactive Computer Graphics
2004
Cited by 2 (2 self)
We demonstrate an interactive parallel rendering system based on the impostors technique. Impostors increase the latency tolerance of an interactive rendering system, which allows us to use the power of a parallel machine even at high resolutions and framerates. Impostors also decrease the required rendering bandwidth, which makes possible the interactive use of a variety of advanced rendering techniques. These techniques are demonstrated by the interactive high-quality rendering of very large detailed models on large distributed-memory parallel machines.
To TRUTH, without which everybody would be lying.
Acknowledgments
This work was made possible by the efforts of hundreds of teachers and friends over the span of nearly three decades. I can only mention a few here. Thanks to my adviser, Dr. Kale, who provided me continual support and a steady stream of good ideas. May you always have enough good students to implement your grand designs. Thanks to my committee
Automatic Dynamic Load Balancing for a Crack Propagation Application
Cited by 1 (0 self)
Automatic, adaptive load balancing is essential for handling load imbalance that may occur during parallel finite element simulations involving mesh adaptivity, nonlinear material behavior and other localized effects. This paper demonstrates the successful application of a measurement-based dynamic load balancing concept to the finite element analysis of elasto-plastic wave propagation and dynamic fracture events. The simulations are performed with the aid of a parallel framework for unstructured meshes called ParFUM, which is based on Charm++ and Adaptive MPI (AMPI) and involves migratable user-level threads. The performance was analyzed using Projections, a performance analysis and post factum visualization tool. The bottlenecks to scalability are identified and eliminated using a variety of strategies, resulting in performance gains ranging from moderate to highly significant.
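As a rough illustration of the measurement-based idea mentioned in this abstract, the sketch below reassigns work objects to processors using loads measured at run time. It is a generic greedy heuristic, not the strategy used by ParFUM, Charm++ or AMPI; the function names `greedy_rebalance` and `proc_loads` and the sample loads are hypothetical.

```python
import heapq


def greedy_rebalance(measured_loads, num_procs):
    """Map each object to a processor, heaviest object first, always
    choosing the processor with the smallest accumulated load.

    measured_loads: dict of (hypothetical) object id -> measured time.
    Returns a dict of object id -> processor index.
    """
    # Min-heap of (accumulated load, processor index).
    heap = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = {}
    # Sort by descending load; break ties by object id for determinism.
    ordered = sorted(measured_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for obj, load in ordered:
        total, proc = heapq.heappop(heap)
        assignment[obj] = proc
        heapq.heappush(heap, (total + load, proc))
    return assignment


def proc_loads(measured_loads, assignment, num_procs):
    """Sum the measured load placed on each processor."""
    totals = [0.0] * num_procs
    for obj, proc in assignment.items():
        totals[proc] += measured_loads[obj]
    return totals


if __name__ == "__main__":
    # Hypothetical per-object timings from a previous simulation phase.
    loads = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0, "f": 1.0}
    mapping = greedy_rebalance(loads, 2)
    print(mapping, proc_loads(loads, mapping, 2))
```

A real runtime would also weigh migration cost and communication locality, which this sketch ignores.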
Maintaining Communication Links with Migrating Components
In systems aiming at high-speed distributed applications, load balancing plays an essential role. If a process can be easily broken down into several sub-parts, it is possible to do the load balancing transparently to the user application. This is possible if processes consist of multiple threads, distributed objects, or logical nodes. Some of these parts can then be transparently migrated from overloaded processes to underloaded ones by the runtime system. During component migration, the communication links among the system components must be preserved. To allow efficient load balancing, the migration should have minimal effect on the performance of the non-migrating components of the system. This paper presents a migration mechanism for preserving communication links between system components. The presented mechanism uses minimal knowledge of the system topology and requires only the component being moved to stop computing while in transit, which makes this algorithm ap...
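The abstract above does not spell out the mechanism, so the following is only a sketch of one common way to keep links alive across a migration: the old host buffers messages while the component is in transit and afterwards forwards them to the new host. This forwarding-pointer scheme is an assumption for illustration, not the paper's algorithm, and the `Host` class and its method names are hypothetical.

```python
class Host:
    """A toy host that stores components and relays their messages."""

    def __init__(self, name):
        self.name = name
        self.components = {}  # component id -> list of delivered messages
        self.forward = {}     # component id -> Host it migrated to
        self.in_transit = {}  # component id -> messages buffered mid-move

    def deliver(self, comp_id, msg):
        if comp_id in self.components:
            # Component lives here: deliver directly.
            self.components[comp_id].append(msg)
        elif comp_id in self.in_transit:
            # Component is being moved: hold the message for now.
            self.in_transit[comp_id].append(msg)
        elif comp_id in self.forward:
            # Component has left: relay to its new host.
            self.forward[comp_id].deliver(comp_id, msg)
        else:
            raise KeyError(f"unknown component {comp_id!r} on {self.name}")

    def begin_migration(self, comp_id):
        """Detach the component; only it stops, senders keep sending."""
        state = self.components.pop(comp_id)
        self.in_transit[comp_id] = []
        return state

    def finish_migration(self, comp_id, state, dest):
        """Install the component on dest and flush buffered messages."""
        dest.components[comp_id] = state
        self.forward[comp_id] = dest
        for msg in self.in_transit.pop(comp_id):
            dest.deliver(comp_id, msg)


if __name__ == "__main__":
    a, b = Host("a"), Host("b")
    a.components["c1"] = []
    a.deliver("c1", "m1")
    state = a.begin_migration("c1")
    a.deliver("c1", "m2")            # arrives while c1 is in transit
    a.finish_migration("c1", state, b)
    a.deliver("c1", "m3")            # sender still uses the old address
    print(b.components["c1"])
```

Senders never need to learn the new location in this sketch, at the cost of an extra hop per message; a fuller design would eventually update the senders.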
Jade: Compiler-Supported . . . VIRTUALIZATION-BASED PARALLEL PROGRAMMING
2004
Current parallel programming approaches, which typically use message-passing and shared-memory threads, require the programmer to write considerable low-level work management and distribution code to partition and distribute data, perform load distribution and balancing, pack and unpack data into messages, and so on. One solution to this low level of programming is to use processor virtualization, wherein the programmer assumes a large number of available virtual processors and creates a large number of work objects, combined with an adaptive runtime system (ARTS) that intelligently maps work to processors and performs dynamic load balancing to optimize performance. Charm++ and AMPI are implementations of this approach. Although Charm++ and AMPI enable the use of an ARTS, the program specification is still low-level, requiring many details. Furthermore, the only mechanisms for information exchange are asynchronous method invocation and message passing, although some applications are more easily expressed in a shared-memory paradigm. We explore the thesis that compiler support and optimizations, together with a disciplined shared-memory abstraction, can substantially improve programmer productivity while retaining most of the performance benefits of processor virtualization and the ARTS. The
Model Based Load Indices (MBLI) for Scientific Simulation
This paper presents the data relationships necessary to discover and implement a model based load index (MBLI) for load balancing scientific applications on distributed parallel systems. An MBLI is an alternative quantity to run-time measurement-based load indices (RLIs) such as processing time. This newly characterized index must be a quantity produced by or required of the scientific model being simulated. An MBLI correlates with a measured process performance parameter that directly represents heterogeneous computational loads and can be used to resolve load imbalances that reduce an application's time to completion. An MBLI is obtained during a pre-processing step and does not incur a run-time cost after implementation. Atomic mass, temperature tendency and surface flux are examples of MBLIs found in Molecular Dynamics (MD) models, Atmospheric General Circulation Models (AGCM) and Ocean Circulation Models (OCM) respectively. This paper presents the discovery processes for MBLIs in AGCMs, MD models and OCMs. MBLI implementations and performance of an AGCM and MD model are discussed while executing on Pentium4 Xeon, IBM Power5-p575 and IBM BlueGene/L systems.
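The abstract says an MBLI must correlate with a measured performance parameter and is found in a pre-processing step. One simple way to screen a candidate quantity, sketched below, is to compute its Pearson correlation with measured processing times. This is an illustration under assumptions, not the paper's discovery process; the function `pearson` and the sample numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical data: one candidate model quantity per subdomain and
    # the processing time measured for that subdomain in a trial run.
    candidate = [1.0, 2.0, 3.0, 4.0]
    measured_time = [2.1, 3.9, 6.2, 7.8]
    print(pearson(candidate, measured_time))
```

A candidate with correlation near 1 could then stand in for the measured time when partitioning work, so no timing is needed at run time.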
A FAULT TOLERANCE PROTOCOL FOR FAST RECOVERY (© 2008 Sayantan Chakravorty)
2008
Urbana, Illinois
...ence of faults, an application using our fault tolerance protocol takes less time to complete than a traditional checkpoint-based protocol.
To Ma and Baba
Acknowledgments
I would like to thank my advisor Prof. L. V. Kalé for his encouragement, guidance and patience, without which this thesis would not have been possible. His willingness to have long discussions, even in the middle of the busiest day, has helped me and boosted my morale no end. I would also like to thank my dissertation committee, Prof. Indranil Gupta, Prof. Keshav Pingali, Prof. Josep Torrellas and Prof. Yuanyuan Zhou, for their very helpful suggestions and advice. This thesis builds on a large body of work by current and former members of the Parallel Programming Laboratory. I am very grateful to Gengbin Zheng and Orion S. Lawlor for their insights and suggestions that helped me out of many a difficult corner. Gengbin has been my tutor in the dark art of debugging parallel programs. Several other members have been of great help by being sounding boards for my fanciful ideas, proofreading papers and acting as another pair of eyes in search of bugs. I would like to thank, in no certain

