Automatic Data Layout for High-Performance Fortran
- IN PROCEEDINGS OF SUPERCOMPUTING '95
, 1994
Cited by 66 (3 self)
High Performance Fortran (HPF) is rapidly gaining acceptance as a language for parallel programming. The goal of HPF is to provide a simple yet efficient machine-independent parallel programming model. Besides the algorithm selection, the data layout choice is the key intellectual step in writing an efficient HPF program. The developers of HPF did not believe that data layouts can be determined automatically in all cases. Therefore HPF requires the user to specify the data layout. It is the task of the HPF compiler to generate efficient code for the user-supplied data layout. The choice
The ADIFOR 2.0 System for the Automatic Differentiation of Fortran 77 Programs
- RICE UNIVERSITY
, 1994
Cited by 50 (16 self)
Automatic Differentiation is a technique for augmenting computer programs with statements for the computation of derivatives based on the chain rule of differential calculus. The ADIFOR 2.0 system provides automatic differentiation of Fortran 77 programs for first-order derivatives. The ADIFOR 2.0 system consists of three main components: The ADIFOR 2.0 preprocessor, the ADIntrinsics Fortran 77 exception-handling system, and the SparsLinC library. The combination of these tools provides the ability to deal with arbitrary Fortran 77 syntax, to handle codes containing single- and double-precision real- or complex-valued data, to fully support and easily customize the translation of Fortran 77 intrinsics, and to transparently exploit sparsity in derivative computations. ADIFOR 2.0 has been successfully applied to a 60,000-line code, which we believe to be a new record in automatic differentiation.
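The chain-rule idea in this abstract can be illustrated with a minimal forward-mode sketch. This is not how ADIFOR 2.0 works internally (the abstract describes a preprocessor that augments Fortran 77 source); the Python below instead uses operator overloading on a hypothetical `Dual` type, and the names `Dual`, `sin`, `derivative`, and `f` are assumptions made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its first-order derivative (hypothetical helper type)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is zero."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """Chain rule for an intrinsic: sin(u)' = cos(u) * u'."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(float(x), 1.0)).der


def f(x):
    # Example function: f(x) = 3*x^2 + sin(x), so f'(x) = 6*x + cos(x)
    return x * x * 3 + sin(x)
```

Calling `derivative(f, 2.0)` propagates derivatives through each arithmetic operation and should agree with the hand-derived `6 * 2.0 + math.cos(2.0)` up to floating-point rounding.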
Interprocedural Compilation of Fortran D
, 1996
Cited by 20 (2 self)
Fortran D is a version of Fortran extended with data decomposition specifications. It is designed to provide a machine-independent programming model for data-parallel applications and has heavily influenced the design of High Performance Fortran (HPF). In previous work we described Fortran D compilation algorithms for individual procedures. This paper presents an interprocedural approach to analyze data & computation partitions, optimize communication, support dynamic data decomposition, and perform other tasks required to compile Fortran D programs. Our algorithms are designed to make interprocedural compilation efficient. First, we collect summary information after edits to solve important data-flow problems in a separate interprocedural propagation phase. Second, for non-recursive programs we compile procedures in reverse topological order to propagate additional interprocedural information during code generation. We thus limit compilation to a single pass over each procedure body. ...
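The ordering step mentioned in this abstract can be sketched as follows. The sketch assumes the call graph has edges directed from caller to callee, so that reverse topological order visits each procedure only after every procedure it calls; the procedure names and the helper `compilation_order` are hypothetical, and the code is not taken from the Fortran D compiler.

```python
from graphlib import TopologicalSorter, CycleError


def compilation_order(call_graph):
    """Return procedures so that every callee precedes its callers.

    call_graph maps each procedure name to the set of procedures it calls.
    TopologicalSorter treats the mapped values as predecessors, so passing
    caller -> callees yields callees first, i.e. the reverse topological
    order of the caller -> callee graph.
    """
    try:
        return list(TopologicalSorter(call_graph).static_order())
    except CycleError as exc:
        # Recursion creates a cycle; no such single-pass order exists.
        raise ValueError("call graph is recursive") from exc


# Hypothetical non-recursive program.
call_graph = {
    "main": {"solve", "report"},
    "solve": {"sweep"},
    "sweep": set(),
    "report": set(),
}

order = compilation_order(call_graph)
```

Each procedure appears exactly once in `order`, which matches the single pass over each procedure body that the abstract describes for non-recursive programs.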
Automatic data layout for distributed-memory machines
- ACM Transactions on Programming Languages and Systems
, 1998
Cited by 18 (0 self)
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a simple yet efficient machine-independent parallel programming model. After the algorithm selection, the data layout choice is the key intellectual challenge in writing an efficient program in such languages. The performance of a data layout depends on the target compilation system, the target machine, the problem size, and the number of available processors. This makes the choice of a good layout extremely difficult for most users of such languages. If languages such as HPF are to find general acceptance, the need for data layout selection support has to be addressed. We believe that the appropriate way to provide the needed support is through a tool that generates data layout specifications automatically. This article discusses the design and implementation of a data layout selection tool that generates HPF-style data layout specifications automatically. Because layout is done in a tool that is not embedded in the target compiler and hence will be run only a few times during the tuning phase of an application, it can use techniques such as integer programming that may be considered too computationally expensive for inclusion in production compilers. The proposed framework for automatic data layout selection builds and examines search spaces of candidate data layouts. A candidate layout is an efficient layout for
The Importance of Synchronization Structure in Parallel Program Optimization
- in Proc. 11th ACM Int'l Conf. on Supercomputing
, 1997
Cited by 10 (6 self)
In automatic, retargetable compilation, low-cost, analytic cost estimation techniques are crucial in order to efficiently steer the optimization process. Programming models aimed at optimum expressiveness of parallelism, however, are not amenable to static cost estimation. We present a new coordination model, called SPC, that imposes specific restrictions on the synchronization structures that can be programmed. Imposing these restrictions enables the efficient computation of reliable cost estimates, paving the way for automatic optimization. Regarding SPC's limited expressiveness, we present a conjecture stating that the loss of parallelism when programming in SPC is typically limited to a constant factor of 2 compared to the unrestricted case. This limited loss is outweighed by the unlocked potential of automatic performance optimization as well as the portability that is achieved. We demonstrate how SPC enables automatic program optimizations through a compilation case study involvin...
Visualization of Distributed Data Structures for HPF-like Languages
Cited by 8 (4 self)
This paper motivates the use of graphics and visualization for efficient utilization of HPF's data distribution facilities. It proposes a graphical toolkit consisting of exploratory tools and estimation tools which allow the programmer to navigate through complex distributions and to obtain graphical ratings with respect to load distribution and communication. The toolkit has been implemented in a mapping design and visualization tool which is coupled with a compilation system for the HPF predecessor Vienna Fortran. Since this language covers a superset of HPF's facilities, the tool may also be used for visualization of HPF data structures.
Tools and Techniques for Automatic Data Layout: A Case Study
- PARALLEL COMPUTING
, 1998
Cited by 7 (1 self)
Parallel architectures with physically distributed memory providing computing cycles and large amounts of memory are becoming more and more common. To make such architectures truly usable, programming models and support tools are needed to ease the programming effort for these parallel systems. Automatic data distribution tools and techniques play an important role in achieving that goal. This paper discusses state-of-the-art approaches to fully automatic data and computation partitioning. A kernel application is used as a case study to illustrate the main differences of four representative approaches. The paper concludes with a discussion of promising future research directions for automatic data layout.
Notes on SPC: A Parallel Programming Model
, 1997
Cited by 4 (4 self)
In automatic, retargetable compilation, low-cost, analytic cost estimation techniques are crucial in order to efficiently steer the optimization process. Programming models aimed at optimum expressiveness of parallelism, however, are not amenable to static cost estimation. We present a new coordination model, called SPC, that imposes specific restrictions on the synchronization structures that can be programmed. Imposing these restrictions enables the efficient computation of reliable cost estimates that can be used in the automatic optimization process. We present a conjecture stating that the loss of parallelism due to the synchronization constraints imposed by SPC is limited to a constant factor of 2 compared to the unrestricted case. We demonstrate the expressiveness of SPC through various programming examples, show how SPC is translated, and show how SPC enables automatic program optimizations through a case study compiling a line relaxation kernel on a distributed-memory machine...
Fortran RED - A Retargetable Environment for Automatic Data Layout
- In Eleventh Workshop on Languages and Compilers for Parallel Computing
, 1998
Cited by 3 (2 self)
The proliferation of parallel platforms over the last ten years has been dramatic. Parallel platforms come in different flavors, including desktop multiprocessor PCs and workstations with a few processors, networks of PCs and workstations, and supercomputers with hundreds of processors or more. This diverse collection of parallel platforms provides not only computing cycles, but other important resources for scientific computing as well, such as large amounts of main memory and fast I/O capabilities. As a result of the proliferation of parallel platforms, the "typical profile" of a potential user of such systems has changed considerably. The specialist user who has a good understanding of the complexities of the target parallel system has been replaced by a user who is largely unfamiliar with the underlying system characteristics. While the specialist's main concern is peak performance, the non-specialist user may be willing to trade off performance for ease of programming. Recent ...

