Fortran D Language Specification
, 1990
Cited by 278 (47 self)
This paper presents Fortran D, a version of Fortran enhanced with data decomposition specifications. It is designed to support two fundamental stages of writing a data-parallel program: problem mapping using sophisticated array alignments, and machine mapping through a rich set of data distribution functions. We believe that Fortran D provides a simple machine-independent programming model for most numerical computations. We intend to evaluate its usefulness for both programmers and advanced compilers on a variety of parallel architectures.
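As a rough illustration of what a data distribution function does, the sketch below maps array indices to processors under a block and a cyclic scheme. It is written in Python rather than Fortran D syntax, and the names `block_owner` and `cyclic_owner` are hypothetical helpers introduced here, not part of the language specification.

```python
def block_owner(i, n, p):
    """Processor owning 0-based index i when n elements are split
    into p contiguous blocks (assumed ceiling-sized blocks)."""
    block = -(-n // p)  # ceiling division: elements per processor
    return i // block


def cyclic_owner(i, p):
    """Processor owning 0-based index i under a round-robin (cyclic) scheme."""
    return i % p


n, p = 10, 4
block_map = [block_owner(i, n, p) for i in range(n)]
cyclic_map = [cyclic_owner(i, p) for i in range(n)]
print(block_map)   # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3]
print(cyclic_map)  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
```

Block distribution keeps neighbouring elements on the same processor, which tends to suit stencil-like access; cyclic distribution spreads elements evenly, which can help when the work per element varies.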
Automatic Data Partitioning on Distributed Memory Multiprocessors
, 1991
Cited by 102 (6 self)
An important problem facing numerous research projects on parallelizing compilers for distributed memory machines is that of automatically determining a suitable data partitioning scheme for a program. Most of the current projects leave this tedious problem almost entirely to the user. In this paper, we present a novel approach to the problem of automatic data partitioning. We introduce the notion of constraints on data distribution, and show how, based on performance considerations, a compiler identifies constraints to be imposed on the distribution of various data structures. These constraints are then combined by the compiler to obtain a complete and consistent picture of the data distribution scheme, one that offers good performance in terms of the overall execution time.
Compiler Support for Machine-Independent Parallel Programming in Fortran D
, 1991
Cited by 76 (16 self)
Because of the complexity and variety of parallel architectures, an efficient machine-independent parallel programming model is needed to make parallel computing truly usable for scientific programmers. We believe that Fortran D, a version of Fortran enhanced with data decomposition specifications, can provide such a programming model. This paper presents the design of a prototype Fortran D compiler for the iPSC/860, a MIMD distributed-memory machine. Issues addressed include data decomposition analysis, guard introduction, communications generation and optimization, program transformations, and storage assignment. A test suite of scientific programs will be used to evaluate the effectiveness of both the compiler technology and programming model for the Fortran D compiler.
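Guard introduction can be pictured with a small simulation. The sketch below assumes an owner-computes style of guard and a block distribution; it is a Python stand-in for generated node code, not the compiler's actual output, and `block_owner` and `spmd_node` are hypothetical names.

```python
def block_owner(i, n, p):
    """Processor owning 0-based index i under an assumed block distribution."""
    block = -(-n // p)  # ceiling division: elements per processor
    return i // block


def spmd_node(my_id, n, p, b):
    """Simulate one node of the SPMD program for the loop a[i] = 2 * b[i].

    Every node runs the same global loop; the guard lets a node execute
    an assignment only when it owns the left-hand side element.
    """
    local_a = {}
    for i in range(n):
        if block_owner(i, n, p) == my_id:  # guard on ownership of a[i]
            local_a[i] = 2 * b[i]
    return local_a


n, p = 8, 4
b = list(range(n))
pieces = [spmd_node(q, n, p, b) for q in range(p)]

# Gather the per-node pieces only to check the simulated result.
a = [None] * n
for piece in pieces:
    for i, value in piece.items():
        a[i] = value
print(a)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

A real compiler would normally go further and shrink the loop bounds to the locally owned range so the guard disappears; the explicit test is kept here only to make the idea visible.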
Automatic Data Layout for Distributed Memory Machines
, 1995
Cited by 35 (5 self)
The goal of languages like Fortran D or High Performance Fortran (HPF) is to provide a simple yet efficient machine-independent parallel programming model. Besides the algorithm selection, the data layout choice is the key intellectual challenge in writing an efficient program in such languages. The performance of a data layout depends on the target compilation system, the target machine, the problem size, and the number of available processors. This makes the choice of a good layout extremely difficult for most users of such languages. This thesis discusses the design and implementation of a data layout selection tool that generates Fortran D or HPF style data layout specifications automatically. Because the tool is not embedded in the target compiler and will be run only a few times during the tuning phase of an application, it can use techniques that may be considered too computationally expensive for inclusion in today's compilers. The proposed framework for automatic data layout s...
Evaluation of Compiler Optimizations for Fortran D on MIMD Distributed-Memory Machines
- In Proceedings of the 1992 ACM International Conference on Supercomputing
, 1992
Cited by 33 (11 self)
The Fortran D compiler uses data decomposition specifications to automatically translate Fortran programs for execution on MIMD distributed-memory machines. This paper introduces and classifies a number of advanced optimizations needed to achieve acceptable performance; they are analyzed and empirically evaluated for stencil computations. Profitability formulas are derived for each optimization. Results show that exploiting parallelism for pipelined computations, reductions, and scans is vital. Message vectorization, collective communication, and efficient coarse-grain pipelining also significantly affect performance.
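To give a feel for why message vectorization matters, the sketch below compares one message per element with a single combined message. The linear cost model and the `alpha` (per-message startup) and `beta` (per-element transfer) values are assumptions made for illustration; they are not the profitability formulas derived in the paper.

```python
def exchange_elementwise(boundary):
    """Unoptimized form: one message per boundary element."""
    return [[x] for x in boundary]


def exchange_vectorized(boundary):
    """Vectorized form: the whole boundary section travels in one message."""
    return [list(boundary)] if boundary else []


def cost(messages, alpha, beta):
    """Assumed cost model: startup cost per message plus a per-element cost."""
    return sum(alpha + beta * len(m) for m in messages)


boundary = [1.0, 2.0, 3.0, 4.0]
naive = exchange_elementwise(boundary)
vectorized = exchange_vectorized(boundary)

alpha, beta = 100, 1  # hypothetical units
print(len(naive), cost(naive, alpha, beta))            # 4 404
print(len(vectorized), cost(vectorized, alpha, beta))  # 1 104
```

The same data is transferred either way; the saving comes entirely from paying the startup cost once instead of once per element.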
PARADIGM: A Compiler for Automatic Data Distribution on Multicomputers
- In International Conference on Supercomputing
, 1993
Cited by 33 (2 self)
One of the most challenging steps in developing a parallel program for a distributed memory machine is determining how data should be distributed across processors. Most of the compilers being developed to make it easier to program such machines still provide no assistance to the programmer in this difficult and machine-dependent task. We have developed Paradigm, a compiler that makes data partitioning decisions for Fortran 77 procedures. A significant feature of the design of Paradigm is the decomposition of the data partitioning problem into a number of sub-problems, each dealing with a different distribution parameter for all the arrays. This paper presents the algorithms that, in conjunction with the computational and the communication cost estimators developed by us, determine those distribution parameters. We also present results obtained on Fortran procedures taken from the Linpack and Eispack libraries, and the Perfect Benchmarks. We believe these are the first results demonstr...
SUPERB Support for Irregular Scientific Computations
, 1992
Cited by 25 (6 self)
Runtime support for parallelization of scientific programs is needed when some information important for decisions in this process cannot be accurately derived at compile time. This paper describes a project which integrates runtime parallelization with advanced compile-time parallelization techniques of SUPERB. Besides the description of implementation techniques, language constructs are proposed, providing means for the specification of irregular computations. SUPERB is an interactive SIMD/MIMD parallelizing system for the SUPRENUM, iPSC/860 and GENESIS-P machines. The implementation of the runtime parallelization is based on the Parti procedures developed at ICASE NASA.
1 Introduction
SUPERB (SUprenum ParallelizER Bonn) ([3, 17]) is a semi-automatic parallelization tool for distributed memory multiprocessors, e.g. SUPRENUM, iPSC/860 and GENESIS-P. It is a source-to-source transformation system which translates Fortran 77 programs into parallel programs written in the Fortran diale...
A Framework for Exploiting Data Availability to Optimize Communication
- In Proceedings of the Sixth Workshop on Languages and Compilers for Parallel Computing
, 1994
Cited by 15 (3 self)
This paper presents a global analysis framework for determining the availability of data on a virtual processor grid. The data availability information obtained is useful for optimizing communication when generating SPMD programs for distributed address-space multiprocessors. We introduce a new kind of array section descriptor, called an Available Section Descriptor, which represents the mapping of an array section onto the processor grid. We present an array data-flow analysis procedure, based on interval analysis, for determining data availability at each statement. Several communication optimizations, including redundant communication elimination, are also described. An advantage of our approach is that it is independent of actual data partitioning and representation of explicit communication.
AP1000+: Architectural Support of PUT/GET Interface for Parallelizing Compiler
, 1994
Cited by 12 (0 self)
The scalability of distributed-memory parallel computers makes them attractive candidates for solving large-scale problems. New languages, such as HPF, FortranD, and VPP Fortran, have been developed to enable existing software to be easily ported to such machines. Many distributed-memory parallel computers have been built, but none of them support the mechanisms required by such languages. We studied the mechanisms required by parallelizing compilers and proposed a new architecture to support them. Based on this proposed architecture, we developed a new distributed-memory parallel computer, the AP1000+, which is an enhanced version of the AP1000. Using scientific applications in VPP Fortran and C, such as NAS parallel benchmarks, we simulated the performance of the AP1000+.
1 Introduction
The scalability of distributed-memory parallel computers makes them attractive candidates for solving large-scale problems. Since the memory model of the distributed-memory architecture is radically d...
High Performance Fortran: History, Status and Future
, 1997
Cited by 10 (5 self)
High Performance Fortran (HPF) is a data-parallel language that was designed to provide the user with a high-level interface for programming scientific applications, while delegating to the compiler the task of generating an explicitly parallel message-passing program. The main objective of this paper is to study the expressivity of the language and related performance issues. After outlining the developments that led to HPF and briefly explaining its major features, we discuss in detail a variety of approaches for solving multiblock problems and applications dealing with unstructured meshes. We argue that the efficient solution of these problems not only needs the full range of the HPF Approved Extensions, but also requires additional features such as the explicit control of communication schedules and support for value-based alignment. The final part of the paper points out some classes of problems that are difficult to deal with efficiently within the HPF paradigm.

