
A framework for unifying reordering transformations (1993)

by Wayne Kelly, William Pugh

Results 1 - 10 of 48

Counting Solutions to Presburger Formulas: How and Why

by William Pugh , 1994
Cited by 78 (2 self)
We describe methods that are able to count the number of integer solutions to selected free variables of a Presburger formula, or sum a polynomial over all integer solutions of selected free variables of a Presburger formula. This answer is given symbolically, in terms of symbolic constants (the remaining free variables in the Presburger formula). For example...
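To make the idea of a symbolic count concrete, here is a minimal Python sketch. The formula is illustrative and is not the example from the paper: the constraint 1 <= i <= j <= n has n(n+1)/2 integer solutions in (i, j) when n >= 1, and none otherwise. The sketch checks that closed form, expressed in the symbolic constant n, against brute-force enumeration. Both function names are hypothetical.

```python
def count_by_enumeration(n: int) -> int:
    """Count integer pairs (i, j) with 1 <= i <= j <= n by listing them."""
    return sum(1 for i in range(1, n + 1) for j in range(i, n + 1))


def count_symbolic(n: int) -> int:
    """Closed form in the symbolic constant n, guarded by the condition n >= 1."""
    return n * (n + 1) // 2 if n >= 1 else 0


if __name__ == "__main__":
    # The two counts agree for every n tried, including n < 1.
    for n in range(-2, 8):
        assert count_by_enumeration(n) == count_symbolic(n)
    print(count_symbolic(10))
```

The guard on n matters: a symbolic answer is only valid together with the conditions on the symbolic constants under which it was derived.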

A Linear Algebra Framework for Static HPF Code Distribution

by Corinne Ancourt, Fabien Coelho, François Irigoin, Ronan Keryell , 1995
Cited by 72 (7 self)
High Performance Fortran (HPF) was developed to support data parallel programming for SIMD and MIMD machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, overlap analysis... The systematic use of an affine framework makes it possible to prove the compilation scheme correct. An early version of this paper was presented at the Fourth International Workshop on Comp...

Code Generation for Multiple Mappings

by Wayne Kelly, William Pugh, Evan Rosser - IN FRONTIERS '95: THE 5TH SYMPOSIUM ON THE FRONTIERS OF MASSIVELY PARALLEL COMPUTATION , 1994
Cited by 70 (2 self)
There has been a great amount of recent work toward unifying iteration reordering transformations. Many of these approaches represent transformations as affine mappings from the original iteration space to a new iteration space. These approaches show a great deal of promise, but they all rely on the ability to generate code that iterates over the points in these new iteration spaces in the appropriate order. This problem has been fairly well-studied in the case where all statements use the same mapping. We have developed an algorithm for the less well-studied case where each statement uses a potentially different mapping. Unlike many other approaches, our algorithm can also generate code from mappings corresponding to loop blocking. We address the important trade-off between reducing control overhead and duplicating code.
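The sketch below illustrates what "each statement uses a potentially different mapping" means for execution order. It is not the paper's code generation algorithm: it only interprets the mappings by sorting statement instances on their mapped coordinates, whereas a code generator would emit loops that visit them in that order. The statement names, domains, and mappings are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, ...]


def execution_order(
    domains: Dict[str, List[Point]],
    mappings: Dict[str, Callable[[Point], Point]],
) -> List[Tuple[str, Point]]:
    """Order statement instances lexicographically by their mapped coordinates."""
    instances = []
    for stmt, points in domains.items():
        for p in points:
            instances.append((mappings[stmt](p), stmt, p))
    # Sort on the new iteration-space coordinates only.
    instances.sort(key=lambda t: t[0])
    return [(stmt, p) for _, stmt, p in instances]


if __name__ == "__main__":
    # Two statements over i = 1..3, each with its own affine mapping.
    domains = {"S1": [(i,) for i in range(1, 4)],
               "S2": [(i,) for i in range(1, 4)]}
    mappings = {
        "S1": lambda p: (p[0], 0),      # S1 keeps its iteration number
        "S2": lambda p: (p[0] + 1, 1),  # S2 is shifted one iteration later
    }
    for stmt, point in execution_order(domains, mappings):
        print(stmt, point)
```

With these mappings the two loops are effectively fused, with S2 delayed by one iteration relative to S1.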

Non-Singular Data Transformations: Definition, Validity and Applications

by M. F. P. O'Boyle, P. M. W. Knijnenburg - In Proc. 6th Workshop on Compilers for Parallel Computers , 1997
Cited by 42 (5 self)
This paper describes a unifying framework for non-singular data transformations. It shows that a wide class of existing transformations may be expressed in this framework, allowing compound transformations to be performed in one step. Validity conditions for such transformations are developed as is the form of the transformed program and data. Constructive algorithms to generate data transformations for different applications are described and applied to example programs. It is shown that they can have a significant impact on program performance and may be used in situations where traditional loop transformations are inappropriate. 1 Introduction Recent years have seen a great improvement in loop transformation theory. By using an affine representation of loops, several loop transformations have been incorporated into one single framework [18]. In [2], Banerjee shows that loop interchange, reversal and skewing can be described as unimodular transformations of the iteration space. In ...
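The unimodular view of loop interchange, reversal and skewing mentioned above can be sketched for a two-deep loop nest as 2x2 integer matrices acting on the iteration vector (i, j). This is a minimal illustration under that assumption, with hypothetical helper names. It shows that a compound transformation is a single matrix product and that the product still has determinant +1 or -1.

```python
from typing import List, Tuple

Matrix = List[List[int]]

INTERCHANGE: Matrix = [[0, 1], [1, 0]]     # (i, j) -> (j, i)
REVERSE_INNER: Matrix = [[1, 0], [0, -1]]  # (i, j) -> (i, -j)
SKEW_INNER: Matrix = [[1, 0], [1, 1]]      # (i, j) -> (i, i + j)


def compose(second: Matrix, first: Matrix) -> Matrix:
    """Matrix of 'apply first, then second' (the product second * first)."""
    return [[sum(second[r][k] * first[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def determinant(m: Matrix) -> int:
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_unimodular(m: Matrix) -> bool:
    """Integer matrix with determinant +1 or -1."""
    return determinant(m) in (1, -1)


def apply(m: Matrix, point: Tuple[int, int]) -> Tuple[int, int]:
    return (m[0][0] * point[0] + m[0][1] * point[1],
            m[1][0] * point[0] + m[1][1] * point[1])


if __name__ == "__main__":
    skew_then_interchange = compose(INTERCHANGE, SKEW_INNER)
    print(skew_then_interchange, is_unimodular(skew_then_interchange))
    print(apply(skew_then_interchange, (2, 3)))
```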

Optimization within a Unified Transformation Framework

by Wayne Anthony Kelly , 1996
Cited by 29 (0 self)
Abstract not found

Finding Legal Reordering Transformations using Mappings

by Wayne Kelly, William Pugh - In Seventh International Workshop on Languages and Compilers for Parallel Computing
Cited by 26 (3 self)
Traditionally, optimizing compilers attempt to improve the performance of programs by applying source-to-source transformations, such as loop interchange, loop skewing and loop distribution. Each of these transformations has its own special legality checks and transformation rules which make it hard to analyze or predict the effects of compositions of these transformations. To overcome these problems we have developed a framework for unifying iteration reordering transformations. The framework is based on the idea that all reordering transformations can be represented as a mapping from the original iteration space to a new iteration space. The framework is designed to provide a uniform way to represent and reason about transformations. An optimizing compiler would use our framework by finding a mapping that both corresponds to a legal transformation and produces efficient code. We present the mapping selection problem as a search problem by decomposing it into a sequence of smal...
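A uniform legality check can be sketched in a much simpler setting than the one the abstract describes. The assumptions here are a single statement, a linear mapping given as an integer matrix, and dependences summarized by constant distance vectors; the framework itself handles more general mappings. Under these assumptions a mapping is legal when every transformed distance vector stays lexicographically positive, so that each dependence source still runs before its sink. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[int]]


def lex_positive(v: Sequence[int]) -> bool:
    """True if the first non-zero entry of v is positive."""
    for x in v:
        if x > 0:
            return True
        if x < 0:
            return False
    return False  # the zero vector is not lexicographically positive


def transform(m: Matrix, d: Sequence[int]) -> Tuple[int, ...]:
    """Apply the linear mapping m to the distance vector d."""
    return tuple(sum(a * b for a, b in zip(row, d)) for row in m)


def is_legal(m: Matrix, distances: List[Tuple[int, ...]]) -> bool:
    """Legal if every dependence distance remains lexicographically positive."""
    return all(lex_positive(transform(m, d)) for d in distances)


if __name__ == "__main__":
    distances = [(1, -1)]                  # one dependence, distance (1, -1)
    interchange = [[0, 1], [1, 0]]
    skew_then_interchange = [[1, 1], [1, 0]]
    print(is_legal(interchange, distances))            # distance becomes (-1, 1)
    print(is_legal(skew_then_interchange, distances))  # distance becomes (0, 1)
```

The same check applies to any composition of transformations, because only the final matrix is inspected rather than each step's own rule.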

Going beyond Integer Programming with the Omega Test to Eliminate False Data Dependences

by William Pugh, David Wonnacott - IEEE Transactions on Parallel and Distributed Systems , 1992
Cited by 26 (11 self)
Array data dependence analysis methods currently in use generate false dependences that can prevent useful program transformations. These false dependences arise because the questions asked are conservative approximations to the questions we really should be asking. Unfortunately, the questions we really should be asking go beyond integer programming and require decision procedures for a subclass of Presburger formulas. In this paper, we describe how to extend the Omega test so that it can answer these queries and allow us to eliminate these false data dependences. We have implemented the techniques described here and believe they are suitable for use in production compilers.

Iterative optimization in the polyhedral model: Part I, one-dimensional time

by Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen, Nicolas Vasilache - In IEEE/ACM Intl. Conf. on Code Generation and Optimization (CGO’07) , 2007
Cited by 25 (6 self)
Emerging microprocessors offer unprecedented parallel computing capabilities and deeper memory hierarchies, increasing the importance of loop transformations in optimizing compilers. Because compiler heuristics rely on simplistic performance models, and because they are bound to a limited set of transformation sequences, they only uncover a fraction of the peak performance on typical benchmarks. Iterative optimization is a maturing framework to address these limitations, but so far it has not been successfully applied to complex loop transformation sequences because of the combinatorics of the optimization search space. We focus on the class of loop transformations which can be expressed as one-dimensional affine schedules. We define a systematic exploration method to enumerate the space of all legal, distinct transformations in this class. This method is based on an upstream characterization, as opposed to state-of-the-art downstream filtering approaches. Our results demonstrate orders of magnitude improvements in the size of the search space and in the convergence speed of a dedicated iterative optimization heuristic.

Optimal Fine and Medium Grain Parallelism Detection in Polyhedral Reduced Dependence Graphs

by Alain Darte, Frederic Vivien , 1996
Cited by 24 (4 self)
This paper presents an optimal algorithm for detecting fine or medium grain parallelism in nested loops whose dependences are described by an approximation of distance vectors by polyhedra. In particular, this algorithm is optimal for the classical approximation by direction vectors. This result generalizes, to the case of several statements, Wolf and Lam's algorithm, which is optimal for a single statement. Our algorithm relies on a dependence uniformization process and on parallelization techniques related to systems of uniform recurrence equations. It can also be viewed as a combination of both Allen and Kennedy's algorithm and Wolf and Lam's algorithm.

Communication-Free Parallelization via Affine Transformations

by Amy Lim, Monica S. Lam - In 24th ACM Symp. on Principles of Programming Languages , 1994
Cited by 22 (2 self)
The paper describes a parallelization algorithm for programs consisting of arbitrary nestings of loops and sequences of loops. The code produced by our algorithm yields all the degrees of communication-free parallelism that can be obtained via loop fission, fusion, interchange, reversal, skewing, scaling, reindexing and statement reordering. The algorithm first assigns the iterations of instructions in the program to processors via affine processor mappings, then generates the correct code by ensuring that the code executed by each processor is a subsequence of the original sequential execution sequence. 1 Introduction Previous research in vectorizing and parallelizing compilers has shown that parallelization can be improved by a host of high-level loop transformations. These loop transformations include loop fission (or loop distribution), loop fusion, loop interchange, loop reversal, loop skewing, loop scaling, loop reindexing (also known as loop alignment or index set shifting), ...
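The notion of a communication-free affine processor mapping can be illustrated with a small Python sketch. It does not reproduce the paper's algorithm, which derives such mappings. It only checks a given mapping, under the assumption that a mapping is communication-free when the source and sink of every dependence land on the same processor. The loop nest, the dependence pattern, and the function names are hypothetical.

```python
from typing import Callable, List, Tuple

Iteration = Tuple[int, int]
Dependence = Tuple[Iteration, Iteration]


def is_communication_free(
    processor_of: Callable[[Iteration], int],
    dependences: List[Dependence],
) -> bool:
    """True if every dependence stays within a single processor."""
    return all(processor_of(src) == processor_of(dst) for src, dst in dependences)


if __name__ == "__main__":
    n = 4
    # Hypothetical nest over 0 <= i, j < n where iteration (i + 1, j + 1)
    # reads a value written by iteration (i, j).
    dependences = [((i, j), (i + 1, j + 1))
                   for i in range(n - 1) for j in range(n - 1)]

    by_diagonal = lambda it: it[0] - it[1]  # affine mapping p = i - j
    by_row = lambda it: it[0]               # affine mapping p = i

    print(is_communication_free(by_diagonal, dependences))
    print(is_communication_free(by_row, dependences))
    # Number of distinct processors used by the diagonal mapping.
    print(len({by_diagonal((i, j)) for i in range(n) for j in range(n)}))
```

Mapping by diagonal keeps each dependence chain on one processor, while mapping by row splits every chain across neighbouring processors.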