Loop Fusion in High Performance Fortran
- In Proceedings of the 1998 ACM International Conference on Supercomputing, 1998
Cited by 8 (2 self)
In this paper we investigate a unique problem associated with fusing loops within a High Performance Fortran (HPF) program. In particular, we discuss the issue of performing loop fusion in an HPF compiler when compiling Fortran90 array assignment statements for execution on a distributed-memory machine. During compilation of an HPF program, Fortran90 array assignment statements must be scalarized into loop nests. We show how a certain class of these loop nests, when fused, can cause problems for the compiler's distributed-memory code generator. We then present an algorithm that not only prevents the fusion of these loops, but also increases the amount of useful fusion that can be performed.
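The scalarization and fusion steps mentioned in this abstract can be illustrated with a small sketch. It is not the paper's algorithm and does not model the distributed-memory code generation problem the paper studies; it only shows, in Python rather than Fortran, what scalarizing two array assignments into loops looks like, what fusing them means, and the classic data-dependence case in which naive fusion changes the result. All function names and the example statements are hypothetical.

```python
def scalarized(b):
    """Two loops, one per array assignment: A = B + 1, then C = A * 2."""
    n = len(b)
    a, c = [0] * n, [0] * n
    for i in range(n):          # loop from the first array assignment
        a[i] = b[i] + 1
    for i in range(n):          # loop from the second array assignment
        c[i] = a[i] * 2
    return a, c


def fused(b):
    """Same statements in one loop; legal because c[i] only needs a[i]."""
    n = len(b)
    a, c = [0] * n, [0] * n
    for i in range(n):
        a[i] = b[i] + 1
        c[i] = a[i] * 2
    return a, c


def shifted_separate(b):
    """A = B + 1, then C(1:n-1) = A(2:n) * 2, kept as two loops."""
    n = len(b)
    a, c = [0] * n, [0] * n
    for i in range(n):
        a[i] = b[i] + 1
    for i in range(n - 1):
        c[i] = a[i + 1] * 2
    return a, c


def shifted_naively_fused(b):
    """Naive fusion of the shifted case: reads a[i + 1] before it is written."""
    n = len(b)
    a, c = [0] * n, [0] * n
    for i in range(n):
        a[i] = b[i] + 1
        if i < n - 1:
            c[i] = a[i + 1] * 2  # a[i + 1] still holds its initial value here
    return a, c
```

In the first pair the fused loop produces the same arrays as the separate loops, so fusion is safe. In the shifted pair the fused loop reads an element of `a` one iteration before it is computed, so a compiler must either reject that fusion or realign the loops first.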
A Design Methodology For Data-Parallel Applications
- In AIP Design Meeting (12/95), 1995
Cited by 4 (1 self)
A methodology for the design and development of data-parallel applications and components is presented. Data parallelism is a well-understood form of parallel computation, yet developing simple applications can involve substantial efforts to express the problem in low-level notations. We describe a process of software development for data-parallel applications starting from high-level specifications, generating repeated refinements of designs to match different architectural models and performance constraints, enabling a development activity with cost-benefit analysis. Primary issues are algorithm choice, correctness and efficiency, followed by data decomposition, load balancing and message-passing coordination. Development of a data-parallel multitarget tracking application is used as a case study, showing the progression from high to low-level refinements. We conclude by describing tool support for the process.
Context Optimization for SIMD Execution
- In Proceedings of the 1994 Scalable High Performance Computing Conference, 1994
Cited by 4 (4 self)
One issue that SIMD compilers must address is generating code to change the machine context; i.e., disabling processors not involved in the current computation. We present two compiler optimizations that reduce the cost of context changes. The first optimization, context partitioning, reorders the Fortran 90 code so that as subgrid loops are generated, as many statements as possible that require the same context are placed in the same loop nest. The second optimization, context splitting, splits the iteration space of the subgrid loops into sets that have invariant contexts. This allows us to hoist the code that sets the machine context out of the subgrid loops.
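The idea behind context splitting can be sketched with a toy model. This is an illustration under stated assumptions, not the paper's implementation: it assumes a one-dimensional array block-distributed over `num_procs` processors with `subgrid` elements each, an update applied to the global index range `[lo, hi]`, and it simulates lockstep SIMD execution with an ordinary inner loop. The function names and the `sets` counter are hypothetical.

```python
def context(i, num_procs, subgrid, lo, hi):
    """Mask of processors whose local element i falls in global range [lo, hi]."""
    return tuple(lo <= p * subgrid + i <= hi for p in range(num_procs))


def naive_subgrid_loop(data, lo, hi):
    """Set the machine context on every iteration of the subgrid loop."""
    num_procs, subgrid = len(data), len(data[0])
    out = [row[:] for row in data]
    sets = 0                                  # number of context changes issued
    for i in range(subgrid):
        mask = context(i, num_procs, subgrid, lo, hi)
        sets += 1
        for p in range(num_procs):            # stands in for lockstep execution
            if mask[p]:
                out[p][i] += 1
    return out, sets


def split_subgrid_loop(data, lo, hi):
    """Split iterations into runs with an invariant context; set it once per run."""
    num_procs, subgrid = len(data), len(data[0])
    out = [row[:] for row in data]
    # A compiler would derive these runs from the section bounds; here they
    # are found by scanning, purely to keep the sketch short.
    runs = []
    for i in range(subgrid):
        mask = context(i, num_procs, subgrid, lo, hi)
        if runs and runs[-1][0] == mask:
            runs[-1][2] = i + 1
        else:
            runs.append([mask, i, i + 1])
    sets = 0
    for mask, start, stop in runs:
        sets += 1                             # context set hoisted out of the loop
        for i in range(start, stop):
            for p in range(num_procs):
                if mask[p]:
                    out[p][i] += 1
    return out, sets
```

With 4 processors, a subgrid of 4 elements, and the range `[2, 13]`, both versions update the same elements, but the split version issues one context change per run of iterations with the same mask instead of one per iteration.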

