Practical Dependence Testing
, 1991
Abstract
Cited by 142 (16 self)
Precise and efficient dependence tests are essential to the effectiveness of a parallelizing compiler. This paper proposes a dependence testing scheme based on classifying pairs of subscripted variable references. Exact yet fast dependence tests are presented for certain classes of array references, as well as empirical results showing that these references dominate scientific Fortran codes. These dependence tests are being implemented at Rice University in both PFC, a parallelizing compiler, and ParaScope, a parallel programming environment.
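One classic member of the family of exact-but-fast tests the abstract alludes to is the GCD test for a single pair of affine subscripts. A minimal sketch (the function name is illustrative, not from PFC or ParaScope):

```python
from math import gcd

def gcd_test(a, b, c):
    """Classic GCD dependence test for one subscript pair.

    Asks whether a*i - b*j = c can have an integer solution, i.e.
    whether references A[a*i + k1] and A[b*j + k2] (with c = k2 - k1)
    may touch the same element. Returns False only when dependence is
    provably impossible; True means "may depend".
    """
    g = gcd(a, b)
    if g == 0:          # both coefficients zero: depends on c alone
        return c == 0
    return c % g == 0

# A[2*i] vs A[2*j + 1]: 2i - 2j = 1 has no integer solution -> independent
assert gcd_test(2, 2, 1) is False
# A[2*i] vs A[4*j + 2]: gcd(2, 4) = 2 divides 2 -> may depend
assert gcd_test(2, 4, 2) is True
```

The test is conservative in the "may depend" direction but exact in the independence direction, which is what makes it safe to apply before falling back to more expensive tests.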
Efficient and exact data dependence analysis
 PROCEEDINGS OF THE ACM SIGPLAN '91 CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION
, 1991
Abstract
Cited by 121 (8 self)
Data dependence testing is the basic step in detecting loop level parallelism in numerical programs. The problem is equivalent to integer linear programming and thus in general cannot be solved efficiently. Current methods in use employ inexact methods that sacrifice potential parallelism in order to improve compiler efficiency. This paper shows that in practice, data dependence can be computed exactly and efficiently. There are three major ideas that lead to this result. First, we have developed and assembled a small set of efficient algorithms, each one exact for special case inputs. Combined with a moderately expensive backup test, they are exact for all the cases we have seen in practice. Second, we introduce a memoization technique to save the results of previous tests, thus avoiding calling the data dependence routines multiple times on the same input. Third, we show that this approach can be extended both to compute distance and direction vectors and to use unknown symbolic terms without any loss of accuracy or efficiency. We have implemented our algorithm in the SUIF system, a general purpose compiler system developed at Stanford. We ran the algorithm on the PERFECT Club Benchmarks, and our data dependence analyzer computed an exact solution efficiently in all cases.
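The second idea — caching the results of previous dependence queries so the same subscript pair is never tested twice — can be sketched in a few lines. This is only an illustration of the caching principle with a cheap GCD test as the underlying query, not the SUIF implementation:

```python
from functools import lru_cache
from math import gcd

@lru_cache(maxsize=None)
def may_depend(a, b, c):
    """GCD test with memoized results: repeated queries on the same
    subscript pair return the cached answer instead of recomputing.
    (A sketch of the caching idea only, not the SUIF code.)"""
    g = gcd(a, b)
    return c % g == 0 if g else c == 0

# The same pair is typically tested many times across a loop nest;
# the second query here is answered from the cache.
may_depend(2, 4, 2)
may_depend(2, 4, 2)
assert may_depend.cache_info().hits == 1
```

In a real compiler the cache key would be a normalized form of the subscript pair, so that syntactically different but equivalent queries also hit the cache.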
The Multiscalar Architecture
, 1993
Abstract
Cited by 119 (8 self)
The centerpiece of this thesis is a new processing paradigm for exploiting instruction level parallelism. This paradigm, called the multiscalar paradigm, splits the program into many smaller tasks, and exploits fine-grain parallelism by executing multiple, possibly (control and/or data) dependent tasks in parallel using multiple processing elements. Splitting the instruction stream at statically determined boundaries allows the compiler to pass substantial information about the tasks to the hardware. The processing paradigm can be viewed as an extension of the superscalar and multiprocessing paradigms, and shares a number of properties with the sequential processing model and the dataflow processing model. The multiscalar paradigm is easily realizable, and we describe an implementation of the multiscalar paradigm, called the multiscalar processor. The central idea here is to connect multiple sequential processors, in a decoupled and decentralized manner, to achieve overall multiple issue. The multiscalar processor supports speculative execution, allows arbitrary dynamic code motion (facilitated by an efficient hardware memory disambiguation mechanism), exploits communication localities, and does all of these with hardware that is fairly straightforward to build. Other desirable aspects of the ...
Symbolic Bounds Analysis of Pointers, Array Indices, and Accessed Memory Regions
 PLDI 2000
, 2000
Abstract
Cited by 117 (14 self)
This paper presents a novel framework for the symbolic bounds analysis of pointers, array indices, and accessed memory regions. Our framework formulates each analysis problem as a system of inequality constraints between symbolic bound polynomials. It then reduces the constraint system to a linear program. The solution to the linear program provides symbolic lower and upper bounds for the values of pointer and array index variables and for the regions of memory that each statement and procedure accesses. This approach eliminates fundamental problems associated with applying standard fixed-point approaches to symbolic analysis problems. Experimental results from our implemented compiler show that the analysis can solve several important problems, including static race detection, automatic parallelization, static detection of array bounds violations, elimination of array bounds checks, and reduction of the number of bits used to store computed values.
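The bounds-check-elimination application can be illustrated with a much simpler numeric stand-in for the paper's symbolic bound polynomials: given a loop variable's range, compute the min/max of an affine index expression and elide the runtime check when the whole range is provably in bounds. All names here are illustrative:

```python
def index_bounds(lo, hi, a, b):
    """Given i in [lo, hi], return the min/max of the affine index
    a*i + b — a numeric stand-in for symbolic bound polynomials."""
    v0, v1 = a * lo + b, a * hi + b
    return min(v0, v1), max(v0, v1)

def check_elidable(lo, hi, a, b, size):
    """True if every access A[a*i + b] for i in [lo, hi] is provably
    inside [0, size), so the per-access bounds check can be removed."""
    lb, ub = index_bounds(lo, hi, a, b)
    return 0 <= lb and ub < size

# for i in range(0, 100): A[2*i + 1] with len(A) == 200 is always in bounds
assert check_elidable(0, 99, 2, 1, 200)
```

The paper's framework generalizes this in two ways: the bounds themselves are symbolic polynomials in the program's variables rather than constants, and the bounds for all variables are solved for simultaneously via a linear program instead of being propagated one expression at a time.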
Automatic Program Parallelization
, 1993
Abstract
Cited by 112 (8 self)
This paper presents an overview of automatic program parallelization techniques. It covers dependence analysis techniques, followed by a discussion of program transformations, including straight-line code parallelization, do-loop transformations, and parallelization of recursive routines. The last section of the paper surveys several experimental studies on the effectiveness of parallelizing compilers.
A Framework for Unifying Reordering Transformations
, 1993
Abstract
Cited by 72 (10 self)
We present a framework for unifying iteration reordering transformations such as loop interchange, loop distribution, skewing, tiling, index set splitting and statement reordering. The framework is based on the idea that a transformation can be represented as a schedule that maps the original iteration space to a new iteration space. The framework is designed to provide a uniform way to represent and reason about transformations. As part of the framework, we provide algorithms to assist in the building and use of schedules. In particular, we provide algorithms to test the legality of schedules, to align schedules and to generate optimized code for schedules.
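The legality test at the heart of this view is simple to state: a schedule is legal iff every dependence source is mapped strictly before its sink in the new iteration order. A minimal one-statement sketch (not the paper's algorithm, which handles symbolic dependence relations rather than enumerated pairs):

```python
def legal(schedule, dependences):
    """A schedule maps each iteration (a tuple) to a new time vector.
    It is legal iff every dependence source is scheduled strictly
    before its sink, under lexicographic order on time vectors."""
    return all(schedule(src) < schedule(dst) for src, dst in dependences)

# A forward dependence i -> i+1 in a loop of N iterations.
N = 10
deps = [((i,), (i + 1,)) for i in range(N - 1)]

assert legal(lambda it: it, deps)                    # identity: legal
assert not legal(lambda it: (N - 1 - it[0],), deps)  # loop reversal: illegal
assert legal(lambda it: (it[0] + 5,), deps)          # uniform shift: legal
```

Because tuples compare lexicographically in Python, the same predicate works unchanged for multi-dimensional time vectors, which is how interchange, skewing, and tiling are checked under one rule.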
An Efficient Data Dependence Analysis for Parallelizing Compilers
, 1990
Abstract
Cited by 53 (2 self)
In this paper, we extend the existing numerical methods to overcome these difficulties. A geometrical analysis reveals that we can take advantage of the regular shape of the convex sets derived from multidimensional arrays in a data dependence test. The general methods proposed before assume very general convex sets; this assumption causes their inefficiency. We have implemented a new algorithm called the λ-test and performed some measurements. Results were quite encouraging (see Section 4). As in earlier numerical methods, the proposed scheme uses Diophantine equations and bounds of real functions. The major difference lies in the way multiple dimensions are treated. In earlier numerical methods, data areas accessed by two array references are examined dimension by dimension. If the examination of any dimension shows that the two areas representing the subscript expressions are disjoint, there is no data dependence between the two references. However, if each pair of areas appears to overlap in each individual dimension, it is unclear whether there is an overlapped area when all dimensions are considered simultaneously. In this case, a data dependence has to be assumed. Our algorithm treats all dimensions simultaneously. Based on the subscripts, it selects a few suitable "viewing angles" so that it gets an exact view of the data areas. Selection of the viewing angles is rather straightforward and only a few angles are needed in most cases. We present the rest of our paper as follows. In Section 2, we give some examples to illustrate the difficulties in data dependence analysis on multidimensional array references. Some measurement results on a large set of real programs are presented to show the actual frequency of such difficult cases. In Section 3, we describe...
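The precision gap between per-dimension and simultaneous testing can be demonstrated by brute force on a tiny iteration space. With coupled subscripts like A[i+1, i] versus A[j, j], each dimension overlaps in isolation, yet the references never collide when both dimensions are required to match at once — exactly the case a dimension-by-dimension test must conservatively report as dependent. This enumeration is an illustration only, not the λ-test itself:

```python
def overlaps(f, g, n):
    """Brute-force comparison (tiny iteration spaces only) of two
    multidimensional subscript functions f(i) and g(j).
    Returns (overlap in every dimension separately,
             overlap in all dimensions simultaneously)."""
    dims = len(f(0))
    per_dim = all(
        any(f(i)[d] == g(j)[d] for i in range(n) for j in range(n))
        for d in range(dims)
    )
    simultaneous = any(f(i) == g(j) for i in range(n) for j in range(n))
    return per_dim, simultaneous

# A[i+1, i] vs A[j, j]: i+1 == j is satisfiable, i == j is satisfiable,
# but i+1 == j and i == j together are contradictory.
per_dim, exact = overlaps(lambda i: (i + 1, i), lambda j: (j, j), 8)
assert per_dim and not exact
```

The λ-test avoids both the brute force and the conservatism by examining all dimensions simultaneously along a few well-chosen directions of the convex sets.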
Finding Legal Reordering Transformations using Mappings
 In Seventh International Workshop on Languages and Compilers for Parallel Computing
Abstract
Cited by 31 (3 self)
Traditionally, optimizing compilers attempt to improve the performance of programs by applying source to source transformations, such as loop interchange, loop skewing and loop distribution. Each of these transformations has its own special legality checks and transformation rules, which makes it hard to analyze or predict the effects of compositions of these transformations. To overcome these problems we have developed a framework for unifying iteration reordering transformations. The framework is based on the idea that all reordering transformations can be represented as a mapping from the original iteration space to a new iteration space. The framework is designed to provide a uniform way to represent and reason about transformations. An optimizing compiler would use our framework by finding a mapping that both corresponds to a legal transformation and produces efficient code. We present the mapping selection problem as a search problem by decomposing it into a sequence of smal...
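The payoff of the mapping view is that composing classic transformations, each with its own ad hoc rules, collapses into multiplying matrices, so a composed transformation can be analyzed once as a single mapping. A minimal sketch with unimodular 2×2 mappings (names illustrative):

```python
def matmul(A, B):
    """2x2 integer matrix product — enough to compose loop mappings."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def apply(M, it):
    """Map an iteration (i, j) through the unimodular transformation M."""
    i, j = it
    return (M[0][0] * i + M[0][1] * j, M[1][0] * i + M[1][1] * j)

interchange = [[0, 1], [1, 0]]   # (i, j) -> (j, i)
skew = [[1, 0], [1, 1]]          # (i, j) -> (i, i + j)

# Skew-after-interchange is one matrix product; the combined mapping can
# then be legality-checked once instead of once per transformation.
composed = matmul(skew, interchange)
assert apply(composed, (2, 5)) == apply(skew, apply(interchange, (2, 5)))
```

Searching over such mappings, rather than over sequences of named transformations, is what turns mapping selection into the search problem the abstract describes.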
Accurate Analysis of Array References
, 1992
Cited by 30 (0 self)