## Adaptive Runtime Support for Direct Simulation Monte Carlo Methods on Distributed Memory Architectures

Citations: 30 (15 self)

### BibTeX

@MISC{Moon_adaptiveruntime,

author = {Bongki Moon and et al.},

title = {Adaptive Runtime Support for Direct Simulation Monte Carlo Methods on Distributed Memory Architectures},

year = {}

}

### Abstract

In highly adaptive irregular problems such as many Particle-In-Cell (PIC) codes and Direct Simulation Monte Carlo (DSMC) codes, data access patterns may vary from time step to time step. This fluctuation may hinder efficient utilization of distributed memory parallel computers because of the resulting overhead for data redistribution and dynamic load balancing. To efficiently parallelize such adaptive irregular problems on distributed memory parallel computers, several issues such as effective methods for domain partitioning and fast data transportation must be addressed. This paper presents efficient runtime support methods for such problems. A simple one-dimensional domain partitioning method is implemented and compared with unstructured mesh partitioners such as recursive coordinate bisection and recursive inertial bisection. A remapping decision policy has been investigated for dynamic load balancing on 3-dimensional DSMC codes. Performance results are presented.
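
As a rough illustration of the one-dimensional partitioning idea mentioned in the abstract, the following Python sketch splits a chain of cells into contiguous blocks of roughly equal load. It is not the paper's algorithm: the function name `partition_chain`, the use of per-cell weights (for example, particle counts) as the load measure, and the greedy prefix-sum cut rule are all assumptions made for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def partition_chain(weights, num_procs):
    """Split a chain of work units into num_procs contiguous blocks.

    Returns one half-open (start, end) index range per processor.
    Hypothetical greedy rule: the k-th cut is placed right after the
    first unit at which the running load reaches k/num_procs of the total.
    Only computation cost is considered; communication is ignored.
    """
    prefix = list(accumulate(weights))      # running load along the chain
    total = prefix[-1] if prefix else 0
    bounds = [0]
    for k in range(1, num_procs):
        target = total * k / num_procs
        cut = bisect_left(prefix, target) + 1   # include the unit that reaches the target
        cut = max(cut, bounds[-1])              # keep cuts non-decreasing
        cut = min(cut, len(weights))            # never cut past the end of the chain
        bounds.append(cut)
    bounds.append(len(weights))
    return [(bounds[i], bounds[i + 1]) for i in range(num_procs)]
```

For example, `partition_chain([5, 1, 1, 1, 1, 1], 2)` places the single heavy cell alone on the first processor and the five light cells on the second, so both receive a load of 5. A greedy rule like this is cheap to rerun at every remapping step but is not guaranteed to minimize the maximum load.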

### Citations

515 |
Partitioning sparse matrices with eigenvectors of graphs
- Pothen, Simon, et al.
- 1990
Citation Context ...parallel computers. Recursive bisection algorithms produce partitions of reasonable quality for static irregular problems, with relatively low overhead when compared with Recursive spectral bisection [11] and Simulated Annealing [17]. von Hanxleden [16] and Williams [17] discuss the qualities of partitions produced by the recursive bisection algorithms, and compare their performance with other partiti... |

347 | Molecular Gas Dynamics and the Direct Simulation of Gas Flows - Bird - 1994 |

241 |
A Partitioning Strategy for Nonuniform Problems on Multiprocessors
- Berger, Bokhari
- 1987
Citation Context .... 3.1 Recursive bisection There have been several theoretical and experimental discussions of partitioning strategies based on spatial information for many years. Recursive coordinate bisection (RCB) [1] is a well-known algorithm which bisects a problem domain into two pieces of equal work load recursively until the number of subdomains is equal to the number of processors. Recursive inertial bise... |
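
The recursive coordinate bisection described in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation evaluated in the paper: the function name `rcb`, the choice of the axis with the largest coordinate extent as the cutting direction, and the proportional split used when the number of parts is not a power of two are all hypothetical.

```python
def rcb(points, weights, num_parts):
    """Recursive coordinate bisection over weighted points.

    points: list of coordinate tuples; weights: per-point work load.
    Returns num_parts lists of point indices.
    """
    if not 1 <= num_parts <= len(points):
        raise ValueError("num_parts must be between 1 and len(points)")

    def split(idx, parts):
        if parts == 1:
            return [idx]
        dims = len(points[idx[0]])

        def extent(d):
            coords = [points[i][d] for i in idx]
            return max(coords) - min(coords)

        # Assumption: cut perpendicular to the axis with the largest extent.
        axis = max(range(dims), key=extent)
        order = sorted(idx, key=lambda i: points[i][axis])
        left_parts = parts // 2
        right_parts = parts - left_parts
        # Work load the left side should receive (half when parts is even).
        target = sum(weights[i] for i in order) * left_parts / parts
        running, cut = 0.0, 0
        for i in order:
            if running + weights[i] > target:
                break
            running += weights[i]
            cut += 1
        # Leave each side at least as many points as parts it must produce.
        cut = min(max(cut, left_parts), len(order) - right_parts)
        return split(order[:cut], left_parts) + split(order[cut:], right_parts)

    return split(list(range(len(points))), num_parts)
```

With four unit-weight points on a line, `rcb([(0, 0), (1, 0), (2, 0), (3, 0)], [1, 1, 1, 1], 2)` cuts along the x axis and yields two subdomains of two points each.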

95 |
Partitioning problems in parallel, pipelined, and distributed computing
- Bokhari
- 1988
Citation Context ...such a way that work units i and i+1 are assigned to the same or to adjacent processors. Relatively simple algorithms for finding the optimal partition of a chain-structured problem have been suggested [2, 8, 3]. While these algorithms are developed to optimize computation and communication costs at the same time, we have developed and used a new chain partitioning algorithm which considers computation cost ... |

95 | The design and implementation of a parallel unstructured Euler solver using software primitives
- Das, Mavriplis, et al.
- 1994
Citation Context ...main unchanged. In past work a PARTI runtime support library has been developed for a class of irregular but relatively static problems in which data access patterns do not change during computation. [14, 4, 5] To parallelize such problems, the PARTI runtime primitives coordinate interprocessor data movement, manage the storage of, and access to, copies of off-processor data, and partition work and data stru... |

62 |
Dynamic remapping of parallel computations with varying resource demands
- Nicol, Saltz
- 1988
Citation Context ...s potentially impractical pre-runtime analysis to determine an optimal periodicity. Stop-At-Rise (SAR) remapping decision policy has been implemented and experimented, which was introduced previously [9]. The SAR heuristic trades the cost of problem remapping against time wasted due to load imbalance. It assumes that processors synchronize globally at every time step, that the cost of remapping and i... |
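
The trade-off behind the Stop-At-Rise policy can be illustrated with a short Python sketch. The excerpt above is truncated, so the details here are assumptions rather than the paper's exact formulation: the sketch uses the commonly cited form in which the average cost per step since the last remapping, W(n) = (sum of per-step idle time + remapping cost) / n, is tracked and a remap is triggered the first time W(n) rises. The function name `sar_remap_step` and its parameters are hypothetical.

```python
def sar_remap_step(max_times, avg_times, remap_cost):
    """Return the 1-based time step at which a Stop-At-Rise style
    policy would remap, or None if the average cost never rises.

    max_times[j]: time of the slowest processor in step j
    avg_times[j]: mean processor time in step j
    remap_cost:   assumed known, constant cost of one remapping
    """
    wasted = 0.0      # accumulated idle time caused by load imbalance
    prev_w = None
    for n, (t_max, t_avg) in enumerate(zip(max_times, avg_times), start=1):
        wasted += t_max - t_avg
        w = (wasted + remap_cost) / n   # average cost per step if we remap now
        if prev_w is not None and w > prev_w:
            return n                    # W(n) rose: stop and remap
        prev_w = w
    return None
```

With `max_times = [10, 11, 12, 14, 18]`, `avg_times = [10] * 5`, and `remap_cost = 4`, the running averages are 4, 2.5, about 2.33, and 2.75, so the sketch reports step 4. A perfectly balanced run never triggers a remap, because the remapping cost is simply amortized over more and more steps.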

57 | Solving finite element equations on concurrent computers - Nour-Omid, Raefsky, et al. - 1987 |

17 | Improved algorithms for mapping pipelined and parallel computations
- Nicol, O'Hallaron
- 1991
Citation Context ...such a way that work units i and i+1 are assigned to the same or to adjacent processors. Relatively simple algorithms for finding the optimal partition of a chain-structured problem have been suggested [2, 8, 3]. While these algorithms are developed to optimize computation and communication costs at the same time, we have developed and used a new chain partitioning algorithm which considers computation cost ... |

15 | Parallelizing molecular dynamics codes using the Parti software - Das, Saltz - 1993 |

9 | Spacecraft contamination investigation by direct simulation Monte Carlo - contamination on UARS/HALOE - Rault, Woronowicz |

8 | Solving finite element equations on concurrent computers - Nour-Omid, Raefsky, et al. - 1986 |

4 | The G2/A3 Program System Users Manual, Version 1.6 - Bird - 1991 |

4 |
Efficient algorithms for mapping and partitioning a class of parallel computations
- Choi, Narahari
- 1993
Citation Context ...such a way that work units i and i+1 are assigned to the same or to adjacent processors. Relatively simple algorithms for finding the optimal partition of a chain-structured problem have been suggested [2, 8, 3]. While these algorithms are developed to optimize computation and communication costs at the same time, we have developed and used a new chain partitioning algorithm which considers computation cost ... |

2 |
A comparison of particle simulation implementations on two different parallel architectures
- McDonald, Dagum
- 1991
Citation Context ...od: the movement and collision processes are completely uncoupled over a time step [13]. McDonald and Dagum have compared implementations of direct particle simulation on SIMD and MIMD architectures. [7] 2.2 Computational characteristics Changes in position coordinates may cause the particles to move from current cells to new cells according to their new position coordinates. This implies that the co... |

2 |
Solving finite element equations on concurrent computers
- Nour-Omid, Raefsky, et al.
- 1987
Citation Context ...nown algorithm which bisects a problem domain into two pieces of equal work load recursively until the number of subdomains is equal to the number of processors. Recursive inertial bisection (RIB) [10] is similar to RCB in that it bisects a problem domain recursively based on spatial information, but RIB uses minimum moment of inertia when it selects bisectioning directions, whereas RCB selects bis... |

1 |
Dynamic load balancing in a concurrent plasma PIC code on the JPL/Caltech Mark III hypercube
- Liewer, Leaver, et al.
- 1990
Citation Context ...ied and manipulated to help make remapping decisions dynamically at runtime. A related work that achieves dynamic load balance by monitoring load imbalance at user specified interval has been reported [6]. 4.1 Periodic remapping DSMC codes can be characterized by statistical calculations involving particles associated with each cell, particles moved to new cells as a result of calculations, and cells ... |

1 |
Parallelizing molecular dynamics codes using the Parti software primitives
- Das, Saltz
- 1993
Citation Context ...main unchanged. In past work a PARTI runtime support library has been developed for a class of irregular but relatively static problems in which data access patterns do not change during computation. [14, 4, 5] To parallelize such problems, the PARTI runtime primitives coordinate interprocessor data movement, manage the storage of, and access to, copies of off-processor data, and partition work and data stru... |