The program dependence graph and its use in optimization
 ACM Transactions on Programming Languages and Systems
, 1987
Cited by 826 (3 self)
In this paper we present an intermediate program representation, called the program dependence graph (PDG), that makes explicit both the data and control dependences for each operation in a program. Data dependences have been used to represent only the relevant data flow relationships of a program. Control dependences are introduced to analogously represent only the essential control flow relationships of a program. Control dependences are derived from the usual control flow graph. Many traditional optimizations operate more efficiently on the PDG. Since dependences in the PDG connect computationally related parts of the program, a single walk of these dependences is sufficient to perform many optimizations. The PDG allows transformations such as vectorization, that previously required special treatment of control dependence, to be performed in a manner that is uniform for both control and data dependences. Program transformations that require interaction of the two dependence types can also be easily handled with our representation. As an example, an incremental approach to modifying data dependences resulting from branch deletion or loop unrolling is introduced. The PDG supports incremental optimization, permitting transformations to be triggered by one another and applied only to affected dependences.
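The abstract states that control dependences are derived from the usual control flow graph. A minimal sketch of one common way to do this, via postdominator sets, is given below. It is an illustration under stated assumptions, not the paper's own algorithm: the names `postdominators` and `control_dependences`, the dictionary-based graph encoding, and the example graph are all hypothetical, and every node is assumed to reach the exit node.

```python
def postdominators(succ, exit_node):
    """Iteratively compute reflexive postdominator sets for a CFG.

    succ maps each node to a list of its successors (hypothetical encoding).
    Assumes every node can reach exit_node.
    """
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = set(nodes)
            for s in succ[n]:
                new &= pdom[s]      # must postdominate every successor
            new |= {n}              # every node postdominates itself
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(succ, exit_node):
    """Return cd[y] = set of branch nodes x that y is control dependent on.

    For each edge x -> b, every node that postdominates b (b included)
    but does not strictly postdominate x is control dependent on x.
    """
    pdom = postdominators(succ, exit_node)
    cd = {n: set() for n in succ}
    for x in succ:
        for b in succ[x]:
            for y in pdom[b]:
                if y not in pdom[x] or y == x:
                    cd[y].add(x)
    return cd


# Hypothetical if-then-else diamond.
cfg = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}
deps = control_dependences(cfg, "exit")
```

In this sketch only `then` and `else` depend on `cond`; `join` executes regardless of the branch outcome and so carries no control dependence on it. The sketch omits the conventional extra entry-to-exit edge, so top-level nodes are not marked as dependent on the entry node.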
Automatic Translation of FORTRAN Programs to Vector Form
 ACM Transactions on Programming Languages and Systems
, 1987
Cited by 293 (32 self)
This paper discusses the theoretical concepts underlying a project at Rice University to develop an automatic translator, called PFC (for Parallel FORTRAN Converter), from FORTRAN to FORTRAN 8x. The Rice project, based initially upon the research of Kuck and others at the University of Illinois [6, 17-21, 24, 32, 36], is a continuation of work begun while on leave at IBM Research in Yorktown Heights, N.Y. Our first implementation was based on the Illinois PARAFRASE compiler [20, 36], but the current version is a completely new program (although it performs many of the same transformations as PARAFRASE). Other projects that have influenced our work are the Texas Instruments ASC compiler [9, 33], the Cray-1 FORTRAN compiler [15], and the Massachusetts Computer Associates Vectorizer [22, 25]. The paper is organized into seven sections. Section 2 introduces FORTRAN 8x and gives examples of its use. Section 3 presents an overview of the translation process along with an extended translation example. Section 4 develops the concept of interstatement dependence and shows how it can be applied to the problem of vectorization. Loop carried dependence and loop independent dependence are introduced in this section to extend dependence to multiple statements and multiple loops. Section 5 develops dependence-based algorithms for code generation and transformations for enhancing the parallelism of a statement. Section 6 describes a method for extending the power of data dependence to control statements by the process of IF conversion. Finally, Section 7 details the current state of PFC and our plans for its continued development.
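The distinction between loop carried and loop independent dependence mentioned in this abstract can be illustrated with a brute-force sketch. This is not PFC's analysis: the function `classify_dependences` and its subscript-function encoding are hypothetical, and the sketch assumes a single loop over iterations 0..n-1 in which the writing statement textually precedes the reading statement.

```python
def classify_dependences(write_sub, read_sub, n):
    """Classify dependences between one array write and one array read.

    write_sub and read_sub map a loop iteration to the array index touched.
    Assumes the write statement comes before the read statement in the body.
    Returns a set of (kind, distance) pairs found by enumeration.
    """
    kinds = set()
    for iw in range(n):
        for ir in range(n):
            if write_sub(iw) == read_sub(ir):
                d = ir - iw
                if d == 0:
                    # same iteration: value flows from write to later read
                    kinds.add(("loop-independent", 0))
                elif d > 0:
                    # read happens in a later iteration than the write
                    kinds.add(("loop-carried flow", d))
                else:
                    # read happens in an earlier iteration than the write
                    kinds.add(("loop-carried anti", -d))
    return kinds


# A[i] = ...; ... = A[i-1]  -> value crosses one iteration boundary
carried = classify_dependences(lambda i: i, lambda i: i - 1, 8)
# A[i] = ...; ... = A[i]    -> value is used within the same iteration
independent = classify_dependences(lambda i: i, lambda i: i, 8)
```

A production compiler would solve the subscript equations symbolically rather than enumerate iterations; enumeration is used here only to make the two notions concrete.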
Automatic Program Parallelization
, 1993
Cited by 105 (8 self)
This paper presents an overview of automatic program parallelization techniques. It covers dependence analysis techniques, followed by a discussion of program transformations, including straight-line code parallelization, do loop transformations, and parallelization of recursive routines. The last section of the paper surveys several experimental studies on the effectiveness of parallelizing compilers.
Beyond Induction Variables
, 1992
Cited by 90 (6 self)
Induction variable detection is usually closely tied to the strength reduction optimization. This paper studies induction variable analysis from a different perspective, that of finding induction variables for data dependence analysis. While classical induction variable analysis techniques have been used successfully up to now, we have found a simple algorithm based on the Static Single Assignment form of a program that finds all linear induction variables in a loop. Moreover, this algorithm is easily extended to find induction variables in multiple nested loops, to find nonlinear induction variables, and to classify other integer scalar assignments in loops, such as monotonic, periodic and wraparound variables. Some of these other variables are now classified using ad hoc pattern recognition, while others are not analyzed by current compilers. Giving a unified approach improves the speed of compilers and allows a more general classification scheme. We also show how to use these va...
Structured dataflow analysis for arrays and its use in an optimizing compiler
 Software: Practice and Experience
, 1990
Cited by 53 (0 self)
We extend the well-known interval analysis method so that it can be used to gather global flow information for individual array elements. Data dependences between all array accesses in different basic blocks, different iterations of the same loop, and across different loops are computed and represented as labelled arcs in a program flow graph. This approach results in a uniform treatment of scalars and arrays in the compiler and builds a systematic basis from which the compiler can perform numerous global optimizations. This global dataflow analysis is performed as a separate phase in the compiler. This phase only gathers the global relationships between different accesses to a variable, yet the use of this information is left to the code generator. This organization substantially simplifies the engineering of an optimizing compiler and separates the back end of the compiler (e.g. code generator and register allocator) from the flow analysis part. The global dataflow analysis algorithm described in this paper has been implemented and used in an optimizing compiler for a processor with deep pipelines. This paper describes the algorithm and its compact implementation and evaluates it, both with respect to the accuracy of the information and to the compile-time cost of obtaining and using it. KEY WORDS: Compilation, Global dataflow analysis, Interval analysis, Optimization, Pipelining
An Efficient Data Dependence Analysis for Parallelizing Compilers
, 1990
Cited by 51 (2 self)
this paper, we extend the existing numerical methods to overcome these difficulties. A geometrical analysis reveals that we can take advantage of the regular shape of the convex sets derived from multidimensional arrays in a data dependence test. The general methods proposed before assume very general convex sets; this assumption causes their inefficiency. We have implemented a new algorithm called the λ-test and performed some measurements. Results were quite encouraging (see Section 4). As in earlier numerical methods, the proposed scheme uses Diophantine equations and bounds of real functions. The major difference lies in the way multiple dimensions are treated. In earlier numerical methods, data areas accessed by two array references are examined dimension by dimension. If the examination of any dimension shows that the two areas representing the subscript expressions are disjoint, there is no data dependence between the two references. However, if each pair of areas appears to overlap in each individual dimension, it is unclear whether there is an overlapped area when all dimensions are considered simultaneously. In this case, a data dependence has to be assumed. Our algorithm treats all dimensions simultaneously. Based on the subscripts, it selects a few suitable "viewing angles" so that it gets an exact view of the data areas. Selection of the viewing angles is rather straightforward and only a few angles are needed in most cases. We present the rest of our paper as follows. In Section 2, we give some examples to illustrate the difficulties in data dependence analysis on multidimensional array references. Some measurement results on a large set of real programs are presented to show the actual frequency of such difficult cases. In Section 3, we describe...
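The imprecision of dimension-by-dimension testing described in this abstract can be seen in a small brute-force sketch. This is not the paper's algorithm and uses no viewing angles; the functions `per_dimension_may_depend` and `simultaneous_depend`, the subscript-function encoding, and the example references are hypothetical, chosen only to show a case where each dimension overlaps on its own but no single pair of iterations satisfies both subscript equations.

```python
from itertools import product


def per_dimension_may_depend(ref1, ref2, n):
    """Examine each subscript dimension separately.

    ref1 and ref2 are tuples of functions, one per array dimension, mapping
    a loop index in range(n) to a subscript value. Reports True when every
    dimension, taken alone, has some pair of iterations that collide.
    """
    return all(
        any(f(i) == g(j) for i, j in product(range(n), repeat=2))
        for f, g in zip(ref1, ref2)
    )


def simultaneous_depend(ref1, ref2, n):
    """Require one pair of iterations to collide in all dimensions at once."""
    return any(
        all(f(i) == g(j) for f, g in zip(ref1, ref2))
        for i, j in product(range(n), repeat=2)
    )


# Write A[i][i], read A[j][j+1]: dimension 1 needs i == j, dimension 2
# needs i == j + 1. Each is solvable alone, but never both together.
write_ref = (lambda i: i, lambda i: i)
read_ref = (lambda j: j, lambda j: j + 1)

conservative = per_dimension_may_depend(write_ref, read_ref, 10)  # True
exact = simultaneous_depend(write_ref, read_ref, 10)              # False
```

The per-dimension check must assume a dependence here, while the simultaneous check shows the two references never touch the same element.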
A Tile Selection Algorithm for Data Locality and Cache Interference
 In 1999 ACM International Conference on Supercomputing
, 1999
Cited by 25 (2 self)
Loop tiling is a well-known compiler transformation that increases data locality, exposes parallelism and reduces synchronization costs. Tiling increases the amount of data reuse that can be exploited by reordering the loop iterations so that accesses to the same data are closer together in time. However, tiled loops often suffer from cache interference in the direct-mapped or low-associativity caches typically found in state-of-the-art microprocessors. A solution to this problem is to choose a tile size that does not exhibit self interference. In this paper, we propose a new tile selection algorithm for eliminating self interference and simultaneously minimizing capacity and cross-interference misses. We have automated the algorithm in the SUIF compiler and used it to generate tiles for a range of problem sizes for three scientific computations. Our experimental results show that the algorithm consistently finds tiles that yield lower miss rates than existing tile selection algorithms. ...
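As a minimal sketch of the transformation itself (not of the paper's tile selection algorithm), the following shows a square matrix multiply with and without tiling. The function names and the `tile` parameter are hypothetical, the tile size is chosen arbitrarily rather than selected to avoid interference, and integer data is assumed so that reordering the additions cannot change the result.

```python
def matmul(a, b):
    """Reference i-j-k matrix multiply for square lists of lists."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile):
    """Same computation with all three loops tiled by `tile`.

    The outer loops walk over tile origins; the inner loops stay inside
    one tile, so each block of a, b and c is reused while it is still
    recently touched. min() handles sizes that tile does not divide.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        s = c[i][j]
                        for k in range(kk, min(kk + tile, n)):
                            s += a[i][k] * b[k][j]
                        c[i][j] = s
    return c


n = 5
a = [[i * n + j for j in range(n)] for i in range(n)]
b = [[i - j for j in range(n)] for i in range(n)]
same_result = matmul_tiled(a, b, 2) == matmul(a, b)
```

Pure Python will not show the cache effects the abstract discusses; the sketch only demonstrates that tiling reorders iterations without changing what is computed.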
Violated dependence analysis
 In ACM ICS
, 2006
Cited by 21 (4 self)
The polyhedral model is a powerful framework to reason about high level loop transformations. Yet the lack of scalable algorithms and tools has deterred actors from both academia and industry from putting this model to practical use. Indeed, for fundamental complexity reasons, its applicability has long been limited to simple kernels. Recent developments broke some generally accepted ideas about these limitations. In particular, new algorithms made it possible to compute the target code for full SPEC benchmarks while this code generation step was expected not to be scalable. Instancewise array dependence analysis computes a finite, intensional representation of the (statically unbounded) set of all dynamic dependences. This problem has always been considered non-scalable and/or an overkill with respect to less expressive and faster dependence tests. On the contrary, this article presents experimental evidence of its applicability to full SPEC CPU2000 benchmarks. To make this possible, we revisit the characterization of data dependences, considering relations between time dimensions of the transformed space. Beyond algorithmic benefits, this naturally leads to a novel way of reasoning about violated dependences across arbitrary transformation sequences. Reasoning about violated dependences relieves the compiler designer from the cumbersome task of implementing specific legality checks for each single transformation. It also allows, in the case of invalid transformations, the violated dependences that need to be corrected to be determined precisely. Identifying these violations can in turn enable automatic correction schemes to fix an illegal transformation sequence with minimal changes.
The Polyhedral Model Is More Widely Applicable Than You Think
Cited by 14 (9 self)
The polyhedral model is a powerful framework for automatic optimization and parallelization. It is based on an algebraic representation of programs, allowing to construct and search for complex sequences of optimizations. This model is now mature and reaches production compilers. The main limitation of the polyhedral model is known to be its restriction to statically predictable, loop-based program parts. This paper removes this limitation, allowing to operate on general data-dependent control-flow. We embed control and exit predicates as first-class citizens of the algebraic representation, from program analysis to code generation. Complementing previous (partial) attempts in this direction, our work concentrates on extending the code generation step and does not compromise the expressiveness of the model. We present experimental evidence that our extension is relevant for program optimization and parallelization, showing performance improvements on benchmarks that were thought to be out of reach of the polyhedral model.