Results 1 - 3 of 3
STEP: A Distributed OpenMP for Coarse-Grain Parallelism Tool
- In Proceedings of the 4th International Conference on OpenMP, 2008
"... Abstract. To benefit from distributed architectures, many applications need a coarse grain paral-lelisation of their programs. In order to help a non-expert parallel programmer to take advantage of this possibility, we have carried out a tool called STEP (Système de Transformation pour l’Exécution ..."
Abstract - Cited by 5 (1 self)
Abstract. To benefit from distributed architectures, many applications need a coarse-grain parallelisation of their programs. In order to help a non-expert parallel programmer take advantage of this possibility, we have developed a tool called STEP (Système de Transformation pour l'Exécution Parallèle). From a code decorated with OpenMP directives, this source-to-source transformation tool automatically produces another code based on the message-passing programming model. Thus, the programs of a legacy application can evolve easily and reliably, without the burden of restructuring the code to insert calls to message-passing API primitives. This tool deals with difficulties inherent in coarse-grain parallelisation, such as inter-procedural analyses and irregular code.
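The abstract describes the transformation only in prose. As a rough illustration, the hand-written sketch below shows the kind of rewrite such a source-to-source tool performs on a simple OpenMP parallel loop. Everything in it (the block schedule, N divisible by the process count, the MPI_Allgather that restores the shared view) is an assumption for the example; it is not STEP's actual generated code.

```c
/* Minimal sketch of an OpenMP-to-MPI rewrite; NOT actual STEP output.
 * Assumptions: static block schedule, N divisible by #processes. */
#include <mpi.h>
#include <stdio.h>

#define N 1024

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    static double a[N], b[N];
    for (int i = 0; i < N; i++)      /* replicated sequential part */
        b[i] = (double)i;

    /* Original OpenMP source:
     *   #pragma omp parallel for
     *   for (i = 0; i < N; i++) a[i] = 2.0 * b[i];
     * Each process now executes only its static-schedule share. */
    int chunk = N / size;
    int lo = rank * chunk;
    for (int i = lo; i < lo + chunk; i++)
        a[i] = 2.0 * b[i];

    /* Restore the shared-memory view the OpenMP program assumed:
     * every process collects the blocks computed by the others. */
    MPI_Allgather(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                  a, chunk, MPI_DOUBLE, MPI_COMM_WORLD);

    if (rank == 0)
        printf("a[N-1] = %f\n", a[N - 1]);
    MPI_Finalize();
    return 0;
}
```

The hard part, which this sketch sidesteps, is exactly what the abstract highlights: deciding through inter-procedural analysis which data must actually be communicated, including for irregular code where the accessed regions are not known statically.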
OpenMP for Clusters via Translation to Global Arrays
, 2004
"... This paper discusses a novel approach to implementing OpenMP on clusters. Traditional approaches to do so rely on Software Distributed Shared Memory systems to handle shared data. We discuss these and then introduce an alternative approach that translates OpenMP to Global Arrays (GA), explaining the ..."
Abstract
This paper discusses a novel approach to implementing OpenMP on clusters. Traditional approaches to do so rely on Software Distributed Shared Memory systems to handle shared data. We discuss these and then introduce an alternative approach that translates OpenMP to Global Arrays (GA), explaining the basic strategy. GA requires a data distribution. We do not expect the user to supply this; rather, we show how we perform data distribution and work distribution according to the user-supplied OpenMP static loop schedules. An inspector-executor strategy is employed for irregular applications in order to gather information on accesses to potentially non-local data, group non-local data transfers and overlap communications with local computations. Furthermore, a new directive INVARIANT is proposed to provide information about the dynamic scope of data access patterns. This directive can help us generate efficient codes for irregular applications using the inspector-executor approach. We also illustrate how to deal with some hard cases containing reshaping and strided accesses during the translation. Our experiments show promising results for the corresponding regular and irregular GA codes.
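To make the inspector-executor strategy concrete, here is a hedged sketch that uses MPI one-sided operations as a stand-in for Global Arrays; the block distribution, the index pattern, and all names are illustrative assumptions, not the paper's implementation.

```c
/* Inspector-executor sketch for irregular reads y[i] = x[idx[i]],
 * where idx[] is known only at run time and elements may be remote. */
#include <mpi.h>
#include <stdio.h>

#define N 16   /* global array size; assumed divisible by #processes */
#define M 4    /* irregular accesses per process */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int chunk = N / size;            /* block distribution of x[] */
    double x_local[N];               /* only the first chunk entries
                                        are this process's block */
    MPI_Win win;
    MPI_Win_create(x_local, chunk * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    for (int i = 0; i < chunk; i++)
        x_local[i] = (double)(rank * chunk + i);

    int idx[M];                      /* run-time access pattern */
    for (int i = 0; i < M; i++)
        idx[i] = (rank * 7 + 3 * i) % N;

    /* Inspector: analyze the index array once, recording the owner
     * and local offset of every access. */
    int owner[M], off[M];
    for (int i = 0; i < M; i++) {
        owner[i] = idx[i] / chunk;
        off[i]   = idx[i] % chunk;
    }

    /* Executor: local elements are read directly; remote elements
     * are fetched with one-sided gets grouped in one fence epoch,
     * so the runtime can batch and overlap the transfers. */
    double y[M];
    MPI_Win_fence(0, win);
    for (int i = 0; i < M; i++) {
        if (owner[i] == rank)
            y[i] = x_local[off[i]];
        else
            MPI_Get(&y[i], 1, MPI_DOUBLE, owner[i], off[i],
                    1, MPI_DOUBLE, win);
    }
    MPI_Win_fence(0, win);           /* completes all pending gets */

    printf("rank %d: y[0] = %f\n", rank, y[0]);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```

The payoff of the split is that the inspector's results can be reused for as long as the index array stays unchanged across outer iterations; the proposed INVARIANT directive is precisely a way of telling the compiler how long the access pattern remains valid.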
Prototyping the automatic generation of MPI code from OpenMP programs in GCC
"... Abstract. Multiprocessor architectures comprising various memory organizations and communi-cation schemes are now widely deployed. Hence, powerful programming tools are expressly needed in order to fully exploit the computation capabilities of such architectures. Classically, a parallel program must ..."
Abstract
Abstract. Multiprocessor architectures comprising various memory organizations and communication schemes are now widely deployed. Hence, powerful programming tools are expressly needed in order to fully exploit the computation capabilities of such architectures. Classically, a parallel program must strictly conform to a single parallel paradigm (either shared memory or message passing). However, this approach is far from suited to current architectures: the shared-memory paradigm is restricted to certain hardware architectures, while the message-passing paradigm hinders the application's evolution. In this paper we investigate the use of GCC as a compiler to transform an OpenMP program into an MPI program. OpenMP directives are used to express parallelism in the application and to provide data-dependence information to the compiler. We analyze in depth GOMP, the OpenMP implementation in GCC, which performs code transformations on the GIMPLE representation, and compare these to the transformations necessary for turning a shared-memory program (annotated with OpenMP pragmas) into a message-passing MPI-based program. To test these ideas, we have developed a limited prototype. The result is that, in addition to the parsing and code transformations GOMP provides, further data-dependence analyses are necessary to generate communications.
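The conclusion that dependence analysis is needed to generate communications is easy to see on a two-loop example. In the sketch below, a static schedule makes the first iteration of each process's block read a value written by its left neighbour, so a translator must detect the cross-process dependence and insert the message shown. The MPI calls are hand-written assumptions for illustration, not output of the GCC/GOMP prototype.

```c
/* Hand-written illustration of compiler-generated communication:
 * loop 2 reads a[i-1], so each process's first iteration needs a
 * value owned by rank-1; a translator must discover this flow
 * dependence and emit the send/receive pair below. */
#include <mpi.h>
#include <stdio.h>

#define N 1024  /* assumed divisible by the number of processes */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int chunk = N / size, lo = rank * chunk, hi = lo + chunk;
    static double a[N], b[N];

    /* Loop 1, originally "#pragma omp parallel for":
     * each process writes its own block of a[]. */
    for (int i = lo; i < hi; i++)
        a[i] = (double)i;

    /* Cross-process flow dependence: b[lo] reads a[lo-1], written
     * by rank-1. This exchange is what the compiler's dependence
     * analysis would have to generate. */
    if (rank < size - 1)
        MPI_Send(&a[hi - 1], 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD);
    if (rank > 0)
        MPI_Recv(&a[lo - 1], 1, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    /* Loop 2, originally "#pragma omp parallel for":
     *   for (i = 1; i < N; i++) b[i] = a[i] - a[i-1]; */
    for (int i = (lo == 0 ? 1 : lo); i < hi; i++)
        b[i] = a[i] - a[i - 1];

    if (rank == 0)
        printf("b[1] = %f\n", b[1]);
    MPI_Finalize();
    return 0;
}
```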