Optimizing Communication in Time-Warp Simulators
- Society for Computer Simulation
, 1998
Cited by 21 (6 self)
In message passing environments, the message send time is dominated by overheads that are relatively independent of the message size. Therefore, fine-grained applications (such as Time-Warp simulators) suffer high overheads because of frequent communication. In this paper, we investigate the optimization of the communication subsystem of Time-Warp simulators using dynamic message aggregation. Under this scheme, Time-Warp messages with the same destination LP that occur in close temporal proximity are dynamically aggregated and sent as a single physical message. Several aggregation strategies are developed that attempt to minimize the communication overhead without harming the progress of the simulation (because of messages being delayed). The performance of the strategies is evaluated on a network of workstations and on an SMP, using a number of applications that have different communication behavior.
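The aggregation idea in this abstract can be illustrated with a minimal sketch. It assumes a simple fixed time window per destination; the class name `WindowAggregator`, the `send` callback, and the explicit `now` argument are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class WindowAggregator:
    """Buffers messages per destination and sends each buffer as one batch."""

    def __init__(self, window, send):
        self.window = window            # window length (assumed unit: seconds)
        self.send = send                # hypothetical callback: send(dest, batch)
        self.buffers = defaultdict(list)
        self.opened_at = {}             # dest -> arrival time of first buffered message

    def post(self, dest, msg, now):
        """Queue msg for dest; flush dest's buffer once its window has expired."""
        if dest not in self.opened_at:
            self.opened_at[dest] = now
        self.buffers[dest].append(msg)
        if now - self.opened_at[dest] >= self.window:
            self.flush(dest)

    def flush(self, dest):
        """Send everything buffered for dest as a single physical message."""
        batch = self.buffers.pop(dest, [])
        self.opened_at.pop(dest, None)
        if batch:
            self.send(dest, batch)


sent = []
agg = WindowAggregator(window=1.0, send=lambda d, b: sent.append((d, b)))
agg.post("lp1", "a", now=0.0)
agg.post("lp1", "b", now=0.4)
agg.post("lp2", "x", now=0.5)
agg.post("lp1", "c", now=1.2)   # lp1's window has expired: three messages go out together
agg.flush("lp2")                # explicit flush of whatever is left for lp2
```

Four logical messages become two physical sends here. A real simulator would also need to flush on a timer and bound the delay imposed on each message, which this sketch leaves out.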
Causality Representation and Cancellation Mechanism in Time Warp Simulations
- In PADS ’01: Proceedings of the fifteenth workshop on Parallel and distributed simulation
, 2001
Cited by 5 (0 self)
The Time Warp synchronization protocol allows causality errors and then recovers from them with the assistance of a cancellation mechanism. Cancellation can cause the rollback of several other simulation objects that may trigger a cascading rollback situation where the rollback cycles back to the original simulation object. These cycles of rollback can cause the simulation to enter an unstable (or thrashing) state where little real forward simulation progress is achieved. To address this problem, knowledge of causal relations between events can be used during cancellation to avoid cascading rollbacks and to initiate early recovery operations from causality errors. In this paper, we describe a logical time representation for Time Warp simulations that is used to disseminate causality information. The new timestamp representation, called Total Clocks, has two components: (i) a virtual time component, and (ii) a vector of event counters similar to Vector clocks. The virtual time component provides a one-dimensional global simulation time, and the vector of event counters records event processing rates by the simulation objects. This time representation allows us to disseminate causality information during event execution that can be used to allow early recovery during cancellation. We propose a cancellation mechanism using Total Clocks that avoids cascading rollbacks in Time Warp simulations that have FIFO communication channels.
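The two-component timestamp described in this abstract might be represented roughly as follows. This is only a sketch: the class `TotalClock` and its methods are hypothetical names, and the merge and ordering rules are the standard vector-clock ones, assumed here because the abstract does not spell out the paper's own rules.

```python
from dataclasses import dataclass, field


@dataclass
class TotalClock:
    virtual_time: float                            # one-dimensional simulation time
    counters: list = field(default_factory=list)   # one event counter per simulation object

    def tick(self, obj_id, new_virtual_time):
        """Timestamp after object obj_id processes one more event."""
        counters = list(self.counters)
        counters[obj_id] += 1
        return TotalClock(new_virtual_time, counters)

    def merge(self, other):
        """Element-wise max of the counters (vector-clock rule, assumed)."""
        merged = [max(a, b) for a, b in zip(self.counters, other.counters)]
        return TotalClock(self.virtual_time, merged)

    def happened_before(self, other):
        """Vector-clock partial order: <= in every slot and < in at least one."""
        pairs = list(zip(self.counters, other.counters))
        return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


# Object 0 processes an event at virtual time 5 and sends a message to object 1.
a = TotalClock(0.0, [0, 0]).tick(0, 5.0)
# Object 1 merges the incoming counters, then processes the event at time 7.
b = TotalClock(0.0, [0, 0]).merge(a).tick(1, 7.0)
```

In this example `a.happened_before(b)` holds while the reverse does not, which is the kind of causal relation a cancellation mechanism could consult before rolling back an object.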
Relaxing Causal Constraints in PDES
- In Proceedings of the 13th International Parallel Processing Symposium (IPPS/SPDP '99)
, 1999
Cited by 3 (1 self)
One of the major overheads that prohibits the widespread deployment of parallel discrete event simulation (PDES) is the need to synchronize the distributed processes in the simulation. Considerable investigations have been conducted to analyze and optimize the two widely used synchronization strategies, namely the conservative and the optimistic simulation paradigms. However, little attention has been focused on the definition and strictness of causality. Does causality need to be preserved in all types of simulations? Previously, we suggested an answer to this question. We argued that significant performance gains can be achieved by reconsidering this definition to decide whether the parallel simulation really needs to subscribe to the preservation of causality. In this paper, we investigate this issue more closely. An in-depth analysis using several example simulation models is presented. In addition, a comparative analysis between unsynchronized and Time Warp simulation is presented.
External adjustment of runtime parameters in Time Warp synchronized parallel simulators
- In 11th International Parallel Processing Symposium (IPPS'97)
, 1997
Cited by 3 (1 self)
Several optimizations to the Time Warp synchronization protocol for parallel discrete event simulation have been proposed and studied. Many of these optimizations have included some form of dynamic adjustment (or control) of the operating parameters of the simulation (e.g., checkpoint interval, cancellation strategy). Traditionally, dynamic parameter adjustment has been performed at the simulation object level; each simulation object collects measures of its operating behaviors (e.g., rollback frequency, rollback length) and uses them to adjust its operating parameters. The performance data collection functions and parameter adjustment are overhead costs that are incurred in the expectation of higher throughput. This paper presents a method of eliminating some of these overheads through the use of an external object to adjust the control parameters. That is, instead of inserting code for adjusting simulation parameters in the simulation object, an external control object is defined to periodically analyze each simulation object's performance data and revise that object's operating parameters. An implementation of an external control object in the WARPED Time Warp simulation kernel has been completed. The simulation parameters updated by the implemented control system are checkpoint interval and cancellation strategy (lazy or aggressive). A comparative analysis of three test cases shows that the external control mechanism provides speedups between 5% and 17% over the best performing embedded dynamic adjustment algorithms.
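The separation between simulation objects and an external controller can be sketched as below. Everything in it is an illustrative assumption: the names `SimObject` and `ExternalController`, the rollback-rate thresholds, and the one-step adjustment rule are hypothetical and do not describe the WARPED implementation.

```python
from dataclasses import dataclass


@dataclass
class SimObject:
    name: str
    checkpoint_interval: int = 4
    events: int = 0        # events processed in the current measurement period
    rollbacks: int = 0     # rollbacks observed in the current measurement period


class ExternalController:
    """Revises parameters from outside; the objects themselves only keep counters."""

    def __init__(self, objects, high=0.2, low=0.05):
        self.objects = objects
        self.high = high   # hypothetical threshold, in rollbacks per event
        self.low = low

    def adjust(self):
        """One periodic pass over all objects' performance data."""
        for obj in self.objects:
            if obj.events == 0:
                continue
            rate = obj.rollbacks / obj.events
            if rate > self.high:
                # Frequent rollbacks: checkpoint more often (assumed heuristic).
                obj.checkpoint_interval = max(1, obj.checkpoint_interval - 1)
            elif rate < self.low:
                # Rare rollbacks: checkpoint less often to save state-saving cost.
                obj.checkpoint_interval += 1
            obj.events = obj.rollbacks = 0   # start a fresh measurement period


stable = SimObject("stable", events=100, rollbacks=1)
thrashy = SimObject("thrashy", events=100, rollbacks=40)
ExternalController([stable, thrashy]).adjust()
```

After the pass, the rarely rolled-back object checkpoints less often and the frequently rolled-back one more often, while neither object contains any adjustment logic of its own.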
A Performance and Scalability Analysis Framework for Parallel Discrete Event Simulators
- J. Cryptology
, 1992
Cited by 1 (0 self)
The development of efficient parallel discrete event simulators is hampered by the large number of interrelated factors affecting performance. This problem is made more difficult by the lack of scalable representative models that can be used to analyze optimizations and isolate bottlenecks. This paper proposes a performance and scalability analysis framework (PSAF) for parallel discrete event simulators. PSAF is built on a platform-independent workload specification language (WSL). WSL is a language that represents simulation models using a set of fundamental performance-critical parameters. For each simulator under study, a WSL translator generates synthetic platform-specific simulation models that conform to the performance and scalability characteristics specified by the WSL description. Moreover, sets of portable simulation models that explore the effects of the different parameters, individually or collectively, on the execution performance can easily be constructed using the synthetic workload generator (SWG). The SWG automatically generates simulation workloads with different performance properties. In addition, PSAF supports the seamless integration of real simulation models into the workload specification. Thus, a benchmark with both real and synthetically generated models can be built, allowing for realistic and thorough exploration of the performance space. The utility of PSAF in determining the boundaries of performance and scalability of simulation environments and models is demonstrated.
Reducing Communication Overhead in Asynchronous Distributed Applications
, 1998
Communication is expensive and inevitable in distributed environments. In distributed applications, the speedup achieved by exploiting spatial parallelism and concurrency depends on several factors such as communication overhead, message latency, and the number of messages communicated. There have been different approaches to reduce these factors and hence the cost associated with communication operations. This thesis explores runtime mechanisms that reduce the communication overhead of distributed applications. It studies the aggregation of messages, i.e., collecting application messages in close temporal and spatial proximity to reduce the overall overhead involved in sending and receiving messages. The aggregation mechanism, called Dynamic Message Aggregation (DyMA), makes runtime decisions and has been proposed for asynchronous message passing applications. Three aggregation strategies have been proposed and studied. Two strategies, namely Fixed Aggregation Window (FAW) and Simple Adaptive Aggregation Window (SAAW), are windowing schemes that aggregate messages for a specified time window and send them as a single physical message. The third strategy, namely Fixed Message Count (FMC), aggregates a specified number of messages and sends them as a single physical message. As a case study, the aggregation strategies have been designed and studied for the Time Warp simulation paradigm. DyMA strategies reduce the overall communication overhead and are minimally intrusive on the progress of the simulation. Message aggregation strategies incorporated at the communication layer of asynchronous distributed applications can be employed with other application optimizations or hardware-specific communication optimizations.
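Of the three strategies named in this abstract, Fixed Message Count is the simplest to illustrate. The sketch below is a hedged reading of the description, not the thesis's code: the class name, the `send` callback, and the `flush_all` helper are hypothetical.

```python
class FixedMessageCount:
    """Sends a destination's buffer as one batch once it holds `count` messages."""

    def __init__(self, count, send):
        self.count = count     # number of messages per physical send
        self.send = send       # hypothetical callback: send(dest, batch)
        self.buffers = {}

    def post(self, dest, msg):
        """Buffer msg for dest and send the batch when the count is reached."""
        buf = self.buffers.setdefault(dest, [])
        buf.append(msg)
        if len(buf) >= self.count:
            self.send(dest, self.buffers.pop(dest))

    def flush_all(self):
        """Send all partially filled buffers, e.g. when the sender goes idle."""
        for dest in list(self.buffers):
            self.send(dest, self.buffers.pop(dest))


fmc_sent = []
fmc = FixedMessageCount(count=2, send=lambda d, b: fmc_sent.append((d, b)))
fmc.post("lp1", "a")
fmc.post("lp1", "b")    # count reached: one physical message with two payloads
fmc.post("lp1", "c")
fmc.flush_all()         # the leftover message is sent on its own
```

A count-based trigger alone can hold a message indefinitely when traffic is sparse, so some form of explicit or timed flush (as in `flush_all`) is assumed here.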
An Extensible and Hierarchically Distributed Run-time Control System for Optimistic Discrete-Event Simulators
, 1998
Many Time Warp simulation tools are used by a wide variety of application developers, each with different demands and patterns of use. Under these circumstances, off-the-shelf simulation software is unlikely to be "optimal" for any application in any processing environment. The main form of adaptation presently available is hand-crafted and problem specific: the needs and patterns of use of the application are defined, and the Time Warp simulation kernel software is fitted to optimize the performance of this typical application. The problem is that, by their nature, Time Warp simulations are subject to constant change and adaptation. This situation is exacerbated by changes in network topology and hardware platforms. For most simulations, successfully adapting to the imbalances in the system is often a question of dynamically adjusting the right set of parameters in the executing simulation. Unfortunately, due to the dynamic nature of Time Warp simulation systems, identifying this critical set of parameters is not trivial. Also, modifying these parameters in the simulation system affects both the executing simulation and the execution environment. Hence, in addition to studying methods to adjust the set of critical parameters, the effect of these adjustments on the execution and the system resources must also be investigated.

