The program dependence graph and its use in optimization
- ACM Transactions on Programming Languages and Systems
, 1987
Cited by 749 (3 self)
In this paper we present an intermediate program representation, called the program dependence graph (PDG), that makes explicit both the data and control dependences for each operation in a program. Data dependences have been used to represent only the relevant data flow relationships of a program. Control dependences are introduced to analogously represent only the essential control flow relationships of a program. Control dependences are derived from the usual control flow graph. Many traditional optimizations operate more efficiently on the PDG. Since dependences in the PDG connect computationally related parts of the program, a single walk of these dependences is sufficient to perform many optimizations. The PDG allows transformations such as vectorization, that previously required special treatment of control dependence, to be performed in a manner that is uniform for both control and data dependences. Program transformations that require interaction of the two dependence types can also be easily handled with our representation. As an example, an incremental approach to modifying data dependences resulting from branch deletion or loop unrolling is introduced. The PDG supports incremental optimization, permitting transformations to be triggered by one another and applied only to affected dependences.
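The derivation of control dependences from a control flow graph can be illustrated with postdominators. What follows is a minimal sketch in Python under stated assumptions, not the paper's own algorithm: the function names, the dictionary encoding of the graph, and the example graph are all hypothetical. A node y is taken to be control dependent on x when some edge x -> z exists such that y postdominates z but y does not strictly postdominate x.

```python
def postdominators(succ, exit_node):
    """Iteratively compute, for each node, the set of nodes that postdominate it.

    succ maps every node (including exit_node) to a list of its successors.
    Every node is assumed to reach exit_node.
    """
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            # A node is postdominated by itself plus whatever
            # postdominates all of its successors.
            common = set(nodes)
            for s in succ[n]:
                common &= pdom[s]
            new = common | {n}
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def control_dependences(succ, exit_node):
    """Return the set of pairs (x, y) meaning 'y is control dependent on x'."""
    pdom = postdominators(succ, exit_node)
    deps = set()
    for x, targets in succ.items():
        for z in targets:
            for y in pdom[z]:
                # y postdominates z; keep it unless y strictly postdominates x.
                if y == x or y not in pdom[x]:
                    deps.add((x, y))
    return deps


# Hypothetical if-then-else diamond.
cfg = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
    "exit": [],
}
deps = control_dependences(cfg, "exit")
```

In this example only the two branch arms depend on `cond`; `join` runs regardless of the branch outcome, so it carries no control dependence on `cond`.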
Automatic Translation of FORTRAN Programs to Vector Form
- ACM Transactions on Programming Languages and Systems
, 1987
Cited by 284 (32 self)
This paper discusses the theoretical concepts underlying a project at Rice University to develop an automatic translator, called PFC (for Parallel FORTRAN Converter), from FORTRAN to FORTRAN 8x. The Rice project, based initially upon the research of Kuck and others at the University of Illinois [6, 17-21, 24, 32, 36], is a continuation of work begun while on leave at IBM Research in Yorktown Heights, N.Y. Our first implementation was based on the Illinois PARAFRASE compiler [20, 36], but the current version is a completely new program (although it performs many of the same transformations as PARAFRASE). Other projects that have influenced our work are the Texas Instruments ASC compiler [9, 33], the Cray-1 FORTRAN compiler [15], and the Massachusetts Computer Associates Vectorizer [22, 25]. The paper is organized into seven sections. Section 2 introduces FORTRAN 8x and gives examples of its use. Section 3 presents an overview of the translation process along with an extended translation example. Section 4 develops the concept of interstatement dependence and shows how it can be applied to the problem of vectorization. Loop-carried dependence and loop-independent dependence are introduced in this section to extend dependence to multiple statements and multiple loops. Section 5 develops dependence-based algorithms for code generation and transformations for enhancing the parallelism of a statement. Section 6 describes a method for extending the power of data dependence to control statements by the process of IF conversion. Finally, Section 7 details the current state of PFC and our plans for its continued development.
Automatic Program Parallelization
, 1993
Cited by 97 (8 self)
This paper presents an overview of automatic program parallelization techniques. It covers dependence analysis techniques, followed by a discussion of program transformations, including straight-line code parallelization, do loop transformations, and parallelization of recursive routines. The last section of the paper surveys several experimental studies on the effectiveness of parallelizing compilers.
Symbolic Analysis for Parallelizing Compilers
, 1994
Cited by 95 (4 self)
Symbolic Domain. The objects in our abstract symbolic domain are canonical symbolic expressions. A canonical symbolic expression is a lexicographically ordered sequence of symbolic terms. Each symbolic term is in turn a pair of an integer coefficient and a sequence of pairs of pointers to program variables in the program symbol table and their exponents. The latter sequence is also lexicographically ordered. For example, the abstract value of the symbolic expression $2ij + 3jk$ in an environment in which $i$ is bound to $(1, ((\uparrow i, 1)))$, $j$ is bound to $(1, ((\uparrow j, 1)))$, and $k$ is bound to $(1, ((\uparrow k, 1)))$ is $((2, ((\uparrow i, 1), (\uparrow j, 1))), (3, ((\uparrow j, 1), (\uparrow k, 1))))$. In our framework, the environment is the abstract analogue of the state concept; an environment is a function from program variables to abstract symbolic values. Each environment $e$ associates a canonical symbolic value $e_x$ with each variable $x \in V$; it is said that $x$ is bound to $e_x$. An environment might be represented by...
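The canonical form described in this abstract can be sketched as a small data structure. This is an illustrative Python sketch only: the function name `canonical` is hypothetical, variable names stand in for the symbol-table pointers, and merging of like terms and dropping of zero coefficients are assumptions about what "canonical" implies rather than details stated in the abstract.

```python
from collections import defaultdict


def canonical(terms):
    """Build a canonical symbolic expression.

    terms is an iterable of (coefficient, {variable_name: exponent}) pairs.
    The result is a tuple of (coefficient, ((variable, exponent), ...)) terms,
    with the variable pairs sorted inside each term and the terms themselves
    sorted lexicographically by their variable sequences.
    """
    merged = defaultdict(int)
    for coeff, powers in terms:
        # Sort the (variable, exponent) pairs; ignore zero exponents.
        key = tuple(sorted((v, e) for v, e in powers.items() if e != 0))
        merged[key] += coeff
    return tuple((c, key) for key, c in sorted(merged.items()) if c != 0)


# 2ij + 3jk, written in a deliberately non-canonical order.
expr = canonical([(3, {"k": 1, "j": 1}), (2, {"j": 1, "i": 1})])

# An environment maps each program variable to its abstract symbolic value.
env = {name: canonical([(1, {name: 1})]) for name in ("i", "j", "k")}
```

Here `expr` has the same shape as the abstract's example value for 2ij + 3jk, with strings in place of pointers.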
Compiler-directed Data Prefetching in Multiprocessors with Memory Hierarchies
- In International Conference on Supercomputing
, 1990
Cited by 87 (7 self)
Memory hierarchies are used by multiprocessor systems to reduce large memory access times. It is necessary to automatically manage such a hierarchy, to obtain effective memory utilization. In this paper, we discuss the various issues involved in obtaining an optimal memory management strategy for a memory hierarchy. We present an algorithm for finding the earliest point in a program that a block of data can be prefetched. This determination is based on the control and data dependences in the program. Such a method is an integral part of more general memory management algorithms. We demonstrate our method's potential by using static analysis to estimate the performance improvement afforded by our prefetching strategy and to analyze the reference patterns in a set of Fortran benchmarks. We also study the effectiveness of prefetching in a realistic shared-memory system using an RTL-level simulator and real codes. This differs from previous studies by considering prefetching benefits in th...
An Efficient Data Dependence Analysis for Parallelizing Compilers
, 1990
Cited by 50 (3 self)
this paper, we extend the existing numerical methods to overcome these difficulties. A geometrical analysis reveals that we can take advantage of the regular shape of the convex sets derived from multi-dimensional arrays in a data dependence test. The general methods proposed before assume very general convex sets; this assumption causes their inefficiency. We have implemented a new algorithm called the λ-test and performed some measurements. Results were quite encouraging (see Section 4). As in earlier numerical methods, the proposed scheme uses Diophantine equations and bounds of real functions. The major difference lies in the way multiple dimensions are treated. In earlier numerical methods, data areas accessed by two array references are examined dimension by dimension. If the examination of any dimension shows that the two areas representing the subscript expressions are disjoint, there is no data dependence between the two references. However, if each pair of areas appears to overlap in each individual dimension, it is unclear whether there is an overlapped area when all dimensions are considered simultaneously. In this case, a data dependence has to be assumed. Our algorithm treats all dimensions simultaneously. Based on the subscripts, it selects a few suitable "viewing angles" so that it gets an exact view of the data areas. Selection of the viewing angles is rather straightforward and only a few angles are needed in most cases. We present the rest of our paper as follows. In Section 2, we give some examples to illustrate the difficulties in data dependence analysis on multi-dimensional array references. Some measurement results on a large set of real programs are presented to show the actual frequency of such difficult cases. In Section 3, we describe...
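The dimension-by-dimension approach that this abstract contrasts itself with can be illustrated with the classic GCD test on a linear Diophantine equation. The Python sketch below is not the paper's algorithm; the function names and the `(coefficient, constant)` encoding of subscripts are assumptions, loop bounds are ignored, and each subscript is assumed to have the form `a*i + c` in one reference and `a*j + c` in the other.

```python
from math import gcd


def gcd_test_1d(a1, c1, a2, c2):
    """Return True if a1*i + c1 == a2*j + c2 may have an integer solution.

    The equation a1*i - a2*j = c2 - c1 is solvable over the integers
    exactly when gcd(a1, a2) divides c2 - c1.
    """
    g = gcd(a1, a2)
    if g == 0:
        # Both subscripts are constants.
        return c1 == c2
    return (c2 - c1) % g == 0


def may_depend(ref1, ref2):
    """Dimension-by-dimension test over lists of (coefficient, constant) pairs.

    Independence is proven as soon as one dimension has no solution;
    otherwise a dependence has to be assumed.
    """
    return all(
        gcd_test_1d(a1, c1, a2, c2)
        for (a1, c1), (a2, c2) in zip(ref1, ref2)
    )


# A(2*i) versus A(2*j + 1): even and odd elements never overlap.
disjoint = may_depend([(2, 0)], [(2, 1)])

# A(i, i) versus A(j, j + 1): each dimension alone is solvable, so this
# test assumes a dependence, although i = j and i = j + 1 cannot hold together.
coupled = may_depend([(1, 0), (1, 0)], [(1, 0), (1, 1)])
```

The second case shows the imprecision the abstract describes: examining dimensions separately cannot rule out a dependence that disappears once all dimensions are considered simultaneously.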
A Control-Flow Normalization Algorithm and Its Complexity
- IEEE Transactions on Software Engineering
, 1992
Cited by 38 (0 self)
We present a simple method for normalizing the control-flow of programs to facilitate program transformations, program analysis, and automatic parallelization. While previous methods result in programs whose control flowgraphs are reducible, programs normalized by this technique satisfy a stronger condition than reducibility and are therefore simpler in their syntax and structure than with previous methods. In particular, all control-flow cycles are normalized into single-entry, single-exit while loops, and all goto's are eliminated. Furthermore, the method avoids problems of code replication that are characteristic of node-splitting techniques. This restructuring obviates the control dependence graph, since afterwards control dependence relations are manifest in the syntax tree of the program. In this paper we present transformations that effect this normalization, and study the complexity of the method. Index Terms: Continuations, control-flow, elimination algorithms, normalization,...
A Timestamp-based Cache Coherence Scheme
, 1989
Cited by 29 (1 self)
this paper, we propose a software-assisted cache coherence scheme which overcomes some of the inefficiencies of previous approaches by using a combination of a compile-time marking of references and a hardware-based local incoherence detection scheme. In Section 2, we give the notation used throughout the paper. Section 3 reviews previous software-assisted methods for enforcing cache coherence. In Section 4, a complete description of our approach is given along with a correctness proof. Section 5 gives a qualitative comparison of our scheme and the directory-based approaches. Section 6 provides some concluding remarks. Definitions. In conventional programs, there are four kinds of data dependences: flow-dependence, antidependence, output-dependence and input-dependence [14]. Let r and r
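The four kinds of data dependence named in this abstract follow from the access types of two references to the same location, taken in execution order. The Python sketch below uses the standard textbook classification; the function name and the string encoding of accesses are assumptions for illustration and do not come from the paper.

```python
# Maps (earlier access, later access) to the dependence kind.
DEPENDENCE_KINDS = {
    ("write", "read"): "flow",
    ("read", "write"): "anti",
    ("write", "write"): "output",
    ("read", "read"): "input",
}


def classify_dependence(earlier, later):
    """Classify the dependence between two accesses to the same location.

    earlier and later are each "read" or "write", in execution order.
    """
    return DEPENDENCE_KINDS[(earlier, later)]


# x = ...  followed by  ... = x  is a flow dependence.
kind = classify_dependence("write", "read")
```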
Adaptive And Integrated Data Cache Prefetching For Shared-Memory Multiprocessors
, 1995
Cited by 18 (0 self)
... yield a better overall scheme. We give a detailed description of the compiler analysis necessary for integrated prefetching. The performance of integrated prefetching is compared to software and hardware prefetching, and we show the effect of adapting the scheduling of prefetches at compile time. Finally, we discuss approaches that combine integrated prefetching with the adaptive hardware prefetching technique.
High-Level Semantic Optimization of Numerical Codes
- In Proceedings of the International Conference on Supercomputing 1999
, 1999
Cited by 15 (1 self)
This paper presents a mathematical framework to exploit the semantic properties of matrix operations in loop-based numerical codes. The heart of this framework is an algebraic language called the Abstract Matrix Form which a compiler can use to reason about matrix computations in terms of loop nests, high-level matrix operations, and intermediate forms. We demonstrate how this framework may be used to detect and exploit matrix products in loop-based languages such as FORTRAN and MATLAB, and discuss the resulting performance benefits. 1 Introduction. Algebraic properties of scalar integer and floating point operations are used by most compilers to optimize programs. These properties enable compilers to reduce the strength of expressions, enhance the power of common subexpression elimination, and verify the legality of certain loop transformations [2]. Although matrices are also endowed with a rich algebra, it is less common for compilers to exploit matrix algebra to optimize program...

