Efficient optimistic parallel simulations using reverse computation
- ACM Transactions on Modeling and Computer Simulation
, 1999
Supporting Parallel Applications on Clusters of Workstations: The Intelligent Network Interface Approach
- Cluster Computing, Special Issue on High Performance Distributed Computing
, 1997
Cited by 35 (17 self)
This paper presents a novel networking architecture designed for communication-intensive parallel applications running on clusters of workstations (COWs) connected by high-speed networks. This architecture permits (1) the transfer of selected communication-related functionality from the host machine to the network interface coprocessor, and (2) the exposure of this functionality directly to applications as instructions of a Virtual Communication Machine (VCM) implemented by the coprocessor. The user-level code interacts directly with the network coprocessor, as the host kernel only 'connects' the application to the VCM and does not participate in the data transfers. The distinctive feature of our design is its flexibility: the integration of the network with the application can be varied to maximize performance. The resulting communication architecture is characterized by a very low overhead on the host processor, by latency and bandwidth close to the hardware limits, and by an applicatio...
Scheduling Critical Channels in Conservative Parallel Discrete Event Simulation
- In Proceedings of the 13th Workshop on Parallel and Distributed Simulation
, 1999
Cited by 30 (5 self)
This paper introduces the Critical Channel Traversing (CCT) algorithm, a new scheduling algorithm for both sequential and parallel discrete event simulation. CCT is a general conservative algorithm that is aimed at the simulation of low-granularity network models on shared-memory multi-processor computers. An implementation of the CCT algorithm within a kernel called TasKit has demonstrated excellent performance for large ATM network simulations when compared to previous sequential, optimistic and conservative kernels. TasKit has achieved two to three times speedup on a single processor with respect to a splay tree central-event-list based sequential kernel. On a 16 processor (R8000) Silicon Graphics PowerChallenge, TasKit has achieved an event-rate of 1.2 million events per second and a speedup of 26 relative to the sequential kernel for a large ATM network model. Performance is achieved through a multi-level scheduling scheme that supports the scheduling of large grains of computati...
Cloning parallel simulations
- ACM Transactions on Modeling and Computer Simulation
, 2001
Cited by 27 (2 self)
NOTE: This is a preliminary release of an article accepted by the ACM Transactions on Modeling and Computer Simulation. The definitive version is currently in production at ACM and, when released, will supersede this version. © 1998 by the Association for Computing Machinery, Inc. Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. Permissions may be requested from
ROSS: A High-Performance, Low Memory, Modular Time Warp System
- Journal of Parallel and Distributed Computing
, 2000
Cited by 27 (5 self)
In this paper, we introduce a new Time Warp system called ROSS: Rensselaer's Optimistic Simulation System. ROSS is an extremely modular kernel that is capable of achieving event rates as high as 1,250,000 events per second when simulating a wireless telephone network model (PCS) on a quad processor PC server. In a head-to-head comparison, we observe that ROSS outperforms the Georgia Tech Time Warp (GTW) system by up to 180% on a quad processor PC server and up to 200% on the SGI Origin 2000. ROSS only requires a small constant amount of memory buffers greater than the amount needed by the sequential simulation for a constant number of processors. ROSS demonstrates for the first time that stable, highly-efficient execution using little memory above what the sequential model would require is possible for low-event granularity simulation models. The driving force behind these high-performance and low memory utilization results is the coupling of an efficient pointer-based implementation framework, Fujimoto's fast GVT algorithm for shared memory multiprocessors, reverse computation and the introduction of Kernel Processes (KPs). KPs lower fossil collection overheads by aggregating processed event lists. This aspect allows fossil collection to be done with greater frequency, thus lowering the overall memory necessary to sustain stable, efficient parallel execution. These characteristics make ROSS an ideal system for use in large-scale networking simulation models. The principal conclusion drawn from this study is that the performance of an optimistic simulator is largely determined by its memory usage.
Computing Global Virtual Time in Shared-Memory Multiprocessors
- ACM Transactions on Modeling and Computer Simulation
, 1997
BigSim: A parallel simulator for performance prediction of extremely large parallel machines
- In 18th Intl. Parallel and Distr. Proc. Symp. (IPDPS)
, 2004
Cited by 25 (5 self)
We present a parallel simulator, BigSim, for predicting performance of machines with a very large number of processors. The simulator provides the ability to make performance predictions for machines such as Blue-Gene/L, based on actual execution of real applications. We present this capability using case studies of some application benchmarks. Such a simulator is useful to evaluate the performance of specific applications on such machines even before they are built. A sequential simulator may be too slow or infeasible. However, a parallel simulator faces problems of causality violations. We describe our scheme based on ideas from parallel discrete event simulation and utilize the inherent determinacy of many parallel applications. We also explore techniques for optimizing such parallel simulations of machines with a large number of processors on existing machines with fewer processors.
Scalable Parallel Simulations of Wireless Networks with WiPPET: Modeling of Radio Propagation, Mobility and Protocols
, 1999
Cited by 21 (3 self)
In this paper we have summarized the parallel simulation of wireless networks that is ongoing at WINLAB. Future work on simulation of third generation (3G) wireless systems includes simulating interference cancelation in wideband code-division-multiple-access (W-CDMA). This is being accomplished using the next generation modeling framework SSF (Scalable Simulation Framework) [33] that was designed on the basis of experiences with TeD. The specific simulator that is currently under development, WiPPET signal, is a generic tool capable of multicell simulations at the level of bits, chips and waveforms
µsik - A Micro-Kernel for Parallel/Distributed Simulation Systems
- Workshop on Principles of Advanced and Distributed Simulation
, 2005
Cited by 20 (9 self)
We present a novel micro-kernel approach to parallel/distributed simulation. Using the micro-kernel approach, we develop a unified architecture for incorporating multiple types of simulation processes. The processes hold potential to employ a variety of synchronization mechanisms, and could alter their choice of mechanism dynamically. Supported mechanisms include traditional lookahead-based conservative and state saving-based optimistic execution approaches, as well as newer mechanisms such as reverse computation-based optimistic execution and aggregation-based event processing, all within a single parsimonious application programming interface (API). We also present the internal implementation and a preliminary performance evaluation of this interface in µsik, which is an efficient parallel/distributed realization of our micro-kernel architecture in C++.
The Dark Side of Risk (What your mother never told you about Time Warp)
, 1996
Cited by 16 (3 self)
This paper is a reminder of the danger of allowing "risk" when synchronizing a parallel discrete-event simulation: a simulation code that runs correctly on a serial machine may, when run in parallel, fail catastrophically. This can happen when Time Warp presents an "inconsistent" message to an LP, a message that makes absolutely no sense given the LP's state. Failure may result if the simulation modeler did not anticipate the possibility of this inconsistency. While the problem is not new, there has been little discussion of how to deal with it; furthermore, the problem may not be evident to new users or potential users of parallel simulation. This paper shows how the problem may occur, and the damage it may cause. We show how one may eliminate inconsistencies due to lagging rollbacks and stale state, but then show that so long as risk is allowed it is still possible for an LP to be placed in a state that is inconsistent with model semantics, again making it vulnerable to failure. We f...

