Results 1 - 9 of 9
Compiler Transformations for High-Performance Computing - ACM Computing Surveys, 1994
Cited by 332 (4 self)
In the last three decades a large number of compiler transformations for optimizing programs have been implemented. Most optimizations for uniprocessors reduce the number of instructions executed by the program using transformations based on the analysis of scalar quantities and data-flow techniques. In contrast, optimization for ...
Fortran D Language Specification, 1990
Cited by 278 (47 self)
This paper presents Fortran D, a version of Fortran enhanced with data decomposition specifications. It is designed to support two fundamental stages of writing a data-parallel program: problem mapping using sophisticated array alignments, and machine mapping through a rich set of data distribution functions. We believe that Fortran D provides a simple machine-independent programming model for most numerical computations. We intend to evaluate its usefulness for both programmers and advanced compilers on a variety of parallel architectures.
Compiler Support for Machine-Independent Parallel Programming in Fortran D, 1991
Cited by 76 (16 self)
Because of the complexity and variety of parallel architectures, an efficient machine-independent parallel programming model is needed to make parallel computing truly usable for scientific programmers. We believe that Fortran D, a version of Fortran enhanced with data decomposition specifications, can provide such a programming model. This paper presents the design of a prototype Fortran D compiler for the iPSC/860, a MIMD distributed-memory machine. Issues addressed include data decomposition analysis, guard introduction, communications generation and optimization, program transformations, and storage assignment. A test suite of scientific programs will be used to evaluate the effectiveness of both the compiler technology and programming model for the Fortran D compiler.
Automatic Data Layout for Distributed Memory Machines, 1995
Cited by 35 (5 self)
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a simple yet efficient machine-independent parallel programming model. Besides the algorithm selection, the data layout choice is the key intellectual challenge in writing an efficient program in such languages. The performance of a data layout depends on the target compilation system, the target machine, the problem size, and the number of available processors. This makes the choice of a good layout extremely difficult for most users of such languages. This thesis discusses the design and implementation of a data layout selection tool that generates Fortran D or HPF style data layout specifications automatically. Because the tool is not embedded in the target compiler and will be run only a few times during the tuning phase of an application, it can use techniques that may be considered too computationally expensive for inclusion in today's compilers. The proposed framework for automatic data layout s...
Context Optimization for SIMD Execution - In Proceedings of the 1994 Scalable High Performance Computing Conference, 1994
Cited by 4 (4 self)
One issue that SIMD compilers must address is generating code to change the machine context; i.e., disabling processors not involved in the current computation. We present two compiler optimizations that reduce the cost of context changes. The first optimization, context partitioning, reorders the Fortran 90 code so that as subgrid loops are generated, as many statements as possible that require the same context are placed in the same loop nest. The second optimization, context splitting, splits the iteration space of the subgrid loops into sets that have invariant contexts. This allows us to hoist the code that sets the machine context out of the subgrid loops.
Optimizing Fortran90D/HPF for Distributed-Memory Computers, 1997
Cited by 4 (4 self)
High Performance Fortran (HPF), as well as its predecessor FortranD, has attracted considerable attention as a promising language for writing portable parallel programs for a wide variety of distributed-memory architectures. Programmers express data parallelism using Fortran90 array operations and use data layout directives to direct the partitioning of the data and computation among the processors of a parallel machine. For HPF to gain acceptance as a vehicle for parallel scientific programming, it must achieve high performance on problems for which it is well suited. To achieve high performance with an HPF program on a distributed-memory parallel machine, an HPF compiler must do a superb job of translating Fortran90 data-parallel array constructs into an efficient sequence of operations that minimize the overhead associated with data movement and also maximize data locality. This dissertation presents and analyzes a set of advanced optimizations designed to improve the execution perf...
Optimizing Fortran 90D Programs for SIMD Execution, 1993
SIMD architectures offer an alternative to MIMD architectures for obtaining high performance computation through parallelism. These architectures can offer impressive price/performance ratios for certain classes of problems. However, the effectiveness of such machines is greatly affected by the capabilities of the compilers that produce code for them. Current compilers have many weaknesses that introduce inefficiencies in the code that they produce. It is our thesis that advanced compiler techniques can produce more efficient SIMD code and exploit the massively parallel hardware closer to its full potential. To validate our thesis, we are designing and implementing compiler transformations that optimize computation and communication given the constraint of a single instruction stream.
1 Introduction. Parallel computing has become increasingly popular as a method of obtaining high performance. This trend will continue as parallel computers become less expensive and more readily ...
Compiler Transformations for High-Performance Computing - ACM Computing Surveys, 1993
... this paper are based on loop manipulation. Many of them have been studied or widely applied only in the context of the innermost loop ...
Automatic Localization for Distributed-Memory Multiprocessors Using a Shared-Memory Compilation Framework, 1994
In this paper, we outline an approach for compiling for distributed-memory multiprocessors that is inherited from compiler technologies for shared-memory multiprocessors. We believe that this approach to compiling for distributed-memory machines is promising because it is a logical extension of the shared-memory parallel programming model, a model that is easier for programmers to work with, and that has been studied in great detail as a target for parallelizing and optimizing compilers. In particular, the paper focuses on the localization step, and presents optimal localization algorithms for a single global DOALL, and for special-case structures of multiple global DOALLs.

