Results 11 - 20 of 76
Using Schooner to Support Distribution and Heterogeneity in the Numerical Propulsion System Simulation Project
- Concurrency: Practice and Experience
, 1994
Cited by 11 (5 self)
The Numerical Propulsion System Simulation (NPSS) project has been initiated by NASA to explore the use of computer simulation in the development of new aircraft propulsion technology. With this approach, each engine component is modeled by a separate computational code, with a simulation executive connecting the codes and modeling component interactions. Since each code potentially executes on a different machine in a network, a simulation run is a heterogeneous distributed program in which diverse software and hardware elements are incorporated into a single computation. In this paper, a prototype simulation executive that supports this type of programming is described. The two components of this executive are the AVS scientific visualization system and the Schooner heterogeneous remote procedure call (RPC) facility. In addition, the match between Schooner's capabilities and the needs of NPSS is evaluated based on our experience with a collection of test codes. The basic conclusion i...
Implementing MPI’s One-Sided Communications for WMPI
- In EuroPVM/MPI
, 1999
Cited by 9 (0 self)
One-sided Communications is one of the extensions to MPI set out in the MPI-2 standard. We present here a thread-based implementation of One-sided Communications written for WMPI, an existing Windows implementation of MPI written at the Universidade de Coimbra. This is a major step towards WMPI incorporating the MPI-2 standard, with the further benefit of contributing to the thread safety of WMPI. We discuss the main design decisions associated with the implementation and consider the further research required in this area, both to improve the existing implementation and to assess other implementations of One-sided Communications.
Communicating Across Parallel Message-Passing Environments
- Journal of Systems Architecture
, 1998
"... this paper, we have presented PLUS, a ..."
The influence of coordination on program structure
- in: Proceedings of the 30th Hawaii International Conference on System Sciences (IEEE
, 1997
Cited by 7 (4 self)
In this paper, we examine the inherent properties of some of the most popular coordination models of today and show how they contribute to the difficulty of systematic construction of coordination protocols for the cooperation of concurrent processes, as explicit, tangible pieces of software. They include the object-oriented models of communication and the generative tuple space paradigm of models such as Linda. We then describe a new generic model: Idealized Worker Idealized Manager (IWIM) and discuss its advantages for coordination of concurrent activities, especially in control-oriented applications. We demonstrate its “completeness” by showing that it can trivially emulate other well-known communication and coordination models. Separation of computation and coordination code into different program modules is one of the important properties of this model, which is fully exploited in the pure coordination language MANIFOLD.
Integrating Task and Data Parallelism with the Collective Communication Archetype
, 1994
Cited by 7 (3 self)
A parallel program archetype aids in the development of reliable, efficient parallel applications with common computation/communication structures by providing stepwise refinement methods and code libraries specific to the structure. The methods and libraries help in transforming a sequential program into a parallel program via a sequence of refinement steps that help maintain correctness while refining the program to obtain the appropriate level of granularity for a target machine. The specific archetype discussed here deals with the integration of task and data parallelism by using collective (or group) communication. This archetype has been used to develop several applications.
1 Introduction
Archetypes. Many parallel applications share common features in design, testing, debugging, performance tuning, and program structuring. A parallel program archetype is an abstraction that embodies common features shared by parallel applications within a domain. An archetype aids the develop...
Composites: Trees for Data Parallel Programming
- In Proceedings of the 1994 International Conference on Computer Languages
, 1994
Cited by 7 (3 self)
Data parallel programming languages offer ease of programming and debugging and scalability of parallel programs to increasing numbers of processors. Unfortunately, the usefulness of these languages for non-scientific programmers and loosely coupled parallel machines is currently limited. In this paper, we present the composite tree model, which seeks to provide greater flexibility via parallel data types, support for more general, hierarchical parallelism, parallel control flow, and efficient execution on loosely coupled, coarse-grained parallel machines such as workstation networks. The composite tree model is a new model of parallel programming based on merging data parallelism with object-oriented programming languages, and can be implemented as a small set of extensions to any pure, statically typed, object-oriented programming language.
1 Introduction
Data parallel programming achieves parallelism through the simultaneous execution of the same operation across a set of data [19]. In a...
Volume Integral Equations in Nonlinear 3D Magnetostatics
, 1994
Cited by 6 (6 self)
In this paper, a discussion of volume integral formulations in three-dimensional nonlinear magnetostatics is presented. Integral formulations are examined in connection with Whitney's elements in order to find new approaches. A numerical algorithm based on an h-formulation is introduced. Results for demanding application problems are shown, demonstrating the characteristics of this kind of volume integral approach. In addition, a discussion of the parallelized version of the numerical code based on the h-type approach is presented, accompanied by numerical results illustrating the advantages of combining integral formulations with concurrent computing.
INTRODUCTION
A numerical approximation for static or low-frequency electromagnetic problems can be computed with either partial differential equations (PDEs) or integral equations. Partial differential equations are often favorable because they offer cost-effective solutions for three-dimensional problems. Integral equations have their own advantages; for example, air regions can be excluded, and the exterior boundary condition (i.e., that the magnetic field vanishes at infinity) is satisfied automatically (see, e.g., [1], [2]). Moreover, in linear problems, boundary integral equations often give a reasonably accurate solution with a relatively small number of unknowns. During the past twenty years PDEs have dominated research in the field of numerical computation of low-frequency electromagnetic fields, and they have almost overwhelmed the developments in integral formulations. Probably one of the main reasons that integral equations have not attracted researchers as much as PDEs is the dense integral equation matrix. It is well known that iterative solvers such as ICCG (incomplete Choleski factorization, conjugate gradient) are ...
WMPI Message Passing Interface for Win32 Clusters. Instituto de Engenharia de Coimbra, Portugal and Departamento de Engenharia Informática, Universidade de
- in: Proceedings of the Fifth Euro PVM/MPI, Lecture Notes in Computer Science
, 1998
Cited by 6 (0 self)
This paper describes WMPI 1, the first full implementation of the Message Passing Interface standard (MPI) for clusters of Microsoft's Windows platforms (Win32). Its internal architecture and user interface are presented, along with some performance test results (for release v1.1) that evaluate how much of the total underlying system capacity for communication is delivered to MPI-based parallel applications. WMPI is based on MPICH, a portable implementation of the MPI standard for UNIX® machines from the Argonne National Laboratory and, even when performance requirements cannot be satisfied, it is a useful tool for application development, teaching and training. WMPI processes are also compatible with MPICH processes running on Unix workstations.
Evaluation of MPI Implementations on Grid-connected Clusters using an Emulated WAN
- IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid03)
, 2003
Cited by 6 (2 self)
The MPICH-SCore high performance communication library for cluster computing is integrated into the MPICH-G2 library in order to adapt PC clusters to a Grid environment. The integrated library is called MPICH-G2/SCore. In addition, for the purpose of comparison with other approaches, MPICH-SCore itself is extended to encapsulate its network packet into a UDP packet so that packets are delivered via L3 switches. This extension is called UDP-encapsulated MPICH-SCore. In this paper, three implementations of the MPI library, UDP-encapsulated MPICH-SCore, MPICH-G2/SCore, and MPICH-P4, are evaluated using an emulated WAN environment where two clusters, each consisting of sixteen hosts, are connected by a router PC. The router PC controls the latency of message delivery between clusters, and the added latency is varied from 1 millisecond to 4 milliseconds in round-trip time. Experiments are performed using the NAS Parallel Benchmarks, which show that UDP-encapsulated MPICH-SCore most often performs better than the other implementations. However, the differences are not critical for the benchmarks. The preliminary results show that the performance of the LU benchmark scales up linearly with under 4 millisecond round-trip latency. The CG and MG benchmarks show scalability of 1.13 and 1.24 times, respectively, with 4 millisecond round-trip latency.

