Results 11 - 20 of 55
A Users' Guide to PSTSWM
, 1995
Cited by 11 (7 self)
In this report, we describe how to obtain, compile, and use the code. We also discuss what is involved in porting the code to a new parallel platform. PSTSWM Version 4.0 is a message-passing benchmark code and parallel algorithm testbed that solves the nonlinear shallow water equations on a rotating sphere using the spectral transform method. PSTSWM was developed to evaluate parallel algorithms for the spectral transform method as it is used in global atmospheric circulation models [6]. Multiple parallel algorithms are embedded in the code and can be selected at run-time, as can the problem size, number of processors, and data decomposition. Six different problem test cases are also supported, each with associated solution and error analysis options. The extensive selection of run-time options is included to make a fair parallel algorithm comparison tractable. On each platform, each major algorithm is first tuned to achieve optimum performance before the algorithms are compared. Developing, validating, maintaining, and executing separate versions of the code for each variant of each parallel algorithm would have been impossible. The algorithm comparison is also sensitive to problem specifics, motivating the run-time selection of the problem size and problem test case, and to the parallel platform. To avoid maintaining significantly different versions of the code for outwardly similar parallel architectures, PSTSWM has been structured to be easily ported. PSTSWM is written in Fortran 77 with VMS extensions and a small number of C preprocessor directives. Message passing is implemented using MPI [2], PICL [8], PVM [7], or native message passing libraries, with the choice being made at compile time. Additionally, all message passing is encapsulat...
Automating Parallelization of Regular Computations for Distributed-Memory Multicomputers in the PARADIGM Compiler
- Proceedings of the International Conference on Parallel Processing
, 1993
Cited by 9 (5 self)
Distributed-memory multicomputers such as the Intel iPSC/860, the NCUBE/2, the Intel Paragon and the Connection Machine CM-5 offer significant advantages over shared-memory multiprocessors in terms of cost and scalability. Unfortunately, to extract all the computational power from these machines, users have to parallelize their existing serial programs, which can be an extremely laborious process. One major reason for this difficulty is the absence of a single global shared address space. As a result, the programmer has to distribute code and data on processors and manage communication among tasks explicitly. Clearly there is a need for efficient parallelizing compiler support on these machines. The PARADIGM project at the University of Illinois addresses these problems by developing a fully automated means to translate serial programs for efficient execution on distributed-memory multicomputers. In this paper we discuss parallelization of regular computations using symbolic sets as a ...
Performance Evaluation for Parallel Systems: A Survey
, 1997
Cited by 7 (0 self)
Performance is often a key factor in determining the success of a parallel software system. Performance evaluation...
Compiler and Run-Time Support for Irregular Computations
, 1995
Cited by 7 (1 self)
There are many important applications in computational fluid dynamics, circuit simulation and structural analysis that can be more accurately modeled using iterations on unstructured grids. In these problems, regular compiler analysis for Massively Parallel Processors (MPP) with a distributed address space fails because communication can only be determined at run-time. However, in many of these applications the communication pattern repeats for every iteration. Therefore, optimizations equivalent to those in the regular case can be achieved with a combination of run-time support (RTS) and compiler analysis.
Performance Analysis of Data Parallel Programs
- Proc. Workshop on Parallel Computing Systems, LANL
, 1994
Cited by 7 (1 self)
Effective strategies for performance analysis and tuning will be essential for the success of data parallel languages such as High-Performance Fortran (HPF) and Fortran D. Since compilers for these languages insert all communication, they have considerable knowledge about a program's dynamic structure and the relationship between its parallelism and communication. This paper explores how this compiler knowledge can be exploited to support performance evaluation and tuning. First, the compiler itself can use parameterized models to tune the performance of individual program phases; this approach can be effective provided that the compiler can test and handle violations of the model assumptions. Second, by exploiting compiler knowledge and introducing code transformations to improve monitorability, we can collect dynamic performance information that is far more compact than full communication traces, but well suited to the needs of tuning specific communication patterns. Third, we discus...
Performance Visualisation in a Portable Parallel Programming Environment
- Performance Measurement and Visualization of Parallel Systems
, 1992
Cited by 6 (0 self)
In order to obtain the highest possible performance from programs running on massively parallel machines it is essential to identify precisely where and when computational resources are consumed during their execution. A number of performance visualisation tools have evolved to meet this need for particular systems but they are often not portable to other machines. We regard portability as crucial to the widespread acceptance and use of such tools, and have investigated several approaches to achieving it. Each approach has been based on the public domain ParaGraph tool, which enables trace data collected during a program's execution to be viewed from various different visual perspectives. One approach is for programs to use the portable instrumented communication library PICL, which directly generates trace data in the appropriate format. Alternatively, trace files produced by applications using other libraries can be converted into ParaGraph format using trace filter programs. In this...
Parallel Adaptive Mesh Generation and Decomposition
- WHR
, 1996
Cited by 5 (2 self)
An important class of methodologies for the parallel processing of computational models defined on some discrete geometric data structures (i.e., meshes, grids) is the so-called geometry decomposition or splitting approach. Compared to the sequential processing of such models, the geometry splitting parallel methodology requires an additional computational phase. It consists of the decomposition of the associated geometric data structure into a number of balanced subdomains that satisfy a number of conditions that ensure the load balancing and minimum communication requirement of the underlying computations on a parallel hardware platform. It is well known that the implementation of the mesh decomposition phase requires the solution of a computationally intensive problem. For this reason several fast heuristics have been proposed. In this paper we explore a decomposition approach which is part of a parallel adaptive finite element mesh procedure. The proposed integrated approach consists of five steps. It starts with a coarse background mesh that is optimally decomposed by applying well-known heuristics. Then, the initial mesh is refined in each subdomain after linking the new boundaries introduced by its decomposition. Finally, the decomposition of the new refined mesh is improved so that it satisfies the objectives and conditions of the mesh decomposition problem. Extensive experimentation indicates the effectiveness and efficiency of the proposed parallel mesh and decomposition approach.

