Beyond Induction Variables
, 1992
Cited by 90 (6 self)
Induction variable detection is usually closely tied to the strength reduction optimization. This paper studies induction variable analysis from a different perspective, that of finding induction variables for data dependence analysis. While classical induction variable analysis techniques have been used successfully up to now, we have found a simple algorithm based on the Static Single Assignment form of a program that finds all linear induction variables in a loop. Moreover, this algorithm is easily extended to find induction variables in multiple nested loops, to find nonlinear induction variables, and to classify other integer scalar assignments in loops, such as monotonic, periodic and wraparound variables. Some of these other variables are now classified using ad hoc pattern recognition, while others are not analyzed by current compilers. Giving a unified approach improves the speed of compilers and allows a more general classification scheme. We also show how to use these va...
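As a rough illustration of the SSA-based idea in this abstract, the sketch below recognizes a linear induction variable by following the back-edge operand of a loop-header phi through constant additions until the chain returns to the phi. The tuple encoding of SSA definitions and the name `linear_induction_step` are assumptions made for this example. It covers only the simplest linear case and is not the paper's algorithm, which the abstract describes as extending to nested loops and to nonlinear, monotonic, periodic and wraparound variables.

```python
def linear_induction_step(defs, phi_name):
    """Return (initial_value_name, step) if phi_name is a linear induction
    variable in this simplified model, else None.

    defs maps an SSA name to one of (hypothetical encoding):
      ("phi", init_name, back_edge_name)   loop-header phi
      ("add", operand_name, constant)      operand + constant
      ("copy", operand_name)               plain copy
      ("other",)                           any other operation
    """
    d = defs.get(phi_name)
    if d is None or d[0] != "phi":
        return None
    _, init, back = d

    step = 0
    cur = back
    seen = set()
    # Walk backwards from the value carried around the loop to the phi.
    while cur != phi_name:
        if cur in seen or cur not in defs:
            return None  # cycle that misses the phi, or unknown operand
        seen.add(cur)
        op = defs[cur]
        if op[0] == "add":
            step += op[2]
            cur = op[1]
        elif op[0] == "copy":
            cur = op[1]
        else:
            return None  # not a chain of constant additions
    return (init, step)


# i1 = phi(i0, i2); i2 = i1 + 1          -> linear, step 1
# j1 = phi(j0, j3); j2 = j1 + 2; j3 = j2 + 3 -> linear, step 5
# k1 = phi(k0, k2); k2 = k1 * 2          -> not linear in this model
example = {
    "i1": ("phi", "i0", "i2"),
    "i2": ("add", "i1", 1),
    "j1": ("phi", "j0", "j3"),
    "j2": ("add", "j1", 2),
    "j3": ("add", "j2", 3),
    "k1": ("phi", "k0", "k2"),
    "k2": ("other",),
}
for name in ("i1", "j1", "k1"):
    print(name, linear_induction_step(example, name))
```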
Interprocedural Constant Propagation: A Study of Jump Function Implementations
 In Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation
, 1993
Cited by 43 (5 self)
An implementation of interprocedural constant propagation must model the transmission of values through each procedure. In the framework proposed by Callahan, Cooper, Kennedy, and Torczon in 1986, this intraprocedural propagation is modeled with a jump function. While Callahan et al. propose several kinds of jump functions, they give no data to help choose between them. This paper reports on a comparative study of jump function implementations. It shows that different jump functions produce different numbers of useful constants; it suggests a particular function, called the pass-through parameter jump function, as the most cost-effective in practice.
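To make the idea of a pass-through parameter jump function concrete, here is a minimal sketch under stated assumptions: each actual argument at a call site is modeled as a literal constant, a caller formal passed through unchanged, or unknown. The three-level lattice, the tuple encoding, and the names `meet` and `propagate` are hypothetical choices for this example and are not taken from the paper's implementation.

```python
TOP, BOTTOM = "TOP", "BOTTOM"  # no information yet / not a constant


def meet(a, b):
    """Lattice meet: TOP is the identity, disagreeing values fall to BOTTOM."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM


def propagate(formals, calls, entry):
    """formals: {procedure: [formal names]}.
    calls: list of (caller, callee, [jump function per argument]) where a
    jump function is ("const", value), ("pass", caller_formal) or ("bottom",).
    """
    val = {(p, f): TOP for p, fs in formals.items() for f in fs}
    for f in formals[entry]:
        val[(entry, f)] = BOTTOM  # entry parameters come from outside

    changed = True
    while changed:  # values only move down the lattice, so this terminates
        changed = False
        for caller, callee, args in calls:
            for formal, jf in zip(formals[callee], args):
                if jf[0] == "const":
                    incoming = jf[1]
                elif jf[0] == "pass":
                    incoming = val[(caller, jf[1])]
                else:
                    incoming = BOTTOM
                new = meet(val[(callee, formal)], incoming)
                if new != val[(callee, formal)]:
                    val[(callee, formal)] = new
                    changed = True
    return val


# main calls solve(100, <runtime value>) and kernel(100, 8);
# solve(n, m) calls kernel(n, 4), passing n through unchanged.
formals = {"main": [], "solve": ["n", "m"], "kernel": ["size", "block"]}
calls = [
    ("main", "solve", [("const", 100), ("bottom",)]),
    ("solve", "kernel", [("pass", "n"), ("const", 4)]),
    ("main", "kernel", [("const", 100), ("const", 8)]),
]
print(propagate(formals, calls, "main"))
```

In this example `kernel.size` is found to be the constant 100 only because the pass-through function forwards `solve.n`; a jump function that modeled literal constants alone would lose it.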
Polaris: A New-Generation Parallelizing Compiler for MPPs
 In CSRD Rept. No. 1306. Univ. of Illinois at Urbana-Champaign
, 1993
Cited by 43 (7 self)
ion for Inner Loops
When the algorithm finds a loop nested inside a loop body, it recursively calls itself on the inner loop. To hide the control flow of an inner loop, we introduce some abstraction and extend the previous definition from a basic block to a complete loop. We start by defining the information for one iteration of the loop.
Definition 5. Let $L$ be a loop and $\mathit{VAR}$ be the variables in the program. We define the following sets as the summary set for $\mathrm{body}(L)$:
1. $\mathrm{DEF}_b(L) := \{ v \in \mathit{VAR} : v \text{ has an MRD reaching all exit nodes of } \mathrm{body}(L) \}$
2. $\mathrm{USE}_b(L) := \{ v \in \mathit{VAR} : v \text{ has an outward exposed use in } \mathrm{body}(L) \}$
3. $\mathrm{KILL}_b(L) := \mathrm{DEF}_b(L)$
4. $\mathrm{PRI}_b(L) := \{ v \in \mathit{VAR} : \text{every use of } v \text{ has a reaching MRD in } \mathrm{body}(L) \}$
The summary set is an abstraction of the effect of a loop iteration on the data flow values. Using the summary set, we can ignore the structure of the inner loops in the analysis of the outer loop. The tradeoff is that we have to make a conservative approximation and ma...
The Implementation and Evaluation of Fusion and Contraction in Array Languages
, 1998
Cited by 38 (9 self)
Array languages such as Fortran 90, HPF and ZPL have many benefits in simplifying array-based computations and expressing data parallelism. However, they can suffer large performance penalties because they introduce intermediate arrays, both at the source level and during the compilation process, which increase memory usage and pollute the cache. Most compilers address this problem by simply scalarizing the array language and relying on a scalar language compiler to perform loop fusion and array contraction. We instead show that there are advantages to performing a form of loop fusion and array contraction at the array level. This paper describes this approach and explains its advantages. Experimental results show that our scheme typically yields runtime improvements of greater than 20% and sometimes up to 400%. In addition, it yields superior memory use when compared against commercial compilers and exhibits comparable memory use when compared with scalar languages. We also explore ...
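The effect of loop fusion and array contraction can be seen on a tiny example. The sketch below takes the array statements `T = A + B; C = T * 2` (a made-up pair, not from the paper), shows the naive scalarized form with an intermediate array, and then the fused form in which `T` shrinks to a scalar. It only illustrates the result of the transformation on scalarized code; the paper performs its analysis at the array level, which this sketch does not attempt to reproduce.

```python
def scalarized(a, b):
    """One loop per array statement; T is a full intermediate array."""
    n = len(a)
    t = [0.0] * n
    for i in range(n):
        t[i] = a[i] + b[i]
    c = [0.0] * n
    for i in range(n):
        c[i] = t[i] * 2.0
    return c


def fused_and_contracted(a, b):
    """Both statements share one loop, so T only needs to live for one
    iteration and is contracted from an array to a scalar."""
    n = len(a)
    c = [0.0] * n
    for i in range(n):
        t = a[i] + b[i]
        c[i] = t * 2.0
    return c


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
print(scalarized(a, b) == fused_and_contracted(a, b))
```

The fused version allocates one array fewer and touches each element of `a` and `b` once while it is still in cache, which is the memory and cache benefit the abstract refers to.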
Efficient and precise array access analysis
 ACM Trans. Program. Lang. Syst
, 2000
Cited by 32 (1 self)
A number of existing compiler techniques hinge on the analysis of array accesses in a program. The most important task in array access analysis is to collect the information about array accesses of interest and summarize it in some standard form. Traditional forms used in array access analysis are sensitive to the complexity of array subscripts; that is, they are usually quite accurate and efficient for simple array subscripting expressions, but lose accuracy or require potentially expensive algorithms for complex subscripts. Our study has revealed that in many programs, particularly numerical applications, many access patterns are simple in nature even when the subscripting expressions are complex. Based on this analysis, we have developed a new, general array region representational form, called the linear memory access descriptor (LMAD). The key idea of the LMAD is to relate all memory accesses to the linear machine memory rather than to the shape of the logical data structures of a programming language. This form helps us expose the simplicity of the actual patterns of array accesses in memory, which may be hidden by complex array subscript expressions. Our recent experimental studies show that our new representation simplifies array
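The key idea stated above, describing accesses in terms of linear memory rather than logical array shape, can be sketched as follows. The encoding used here, a base offset plus a list of (stride, count) pairs, and the name `lmad_addresses` are assumptions for illustration and may differ from the paper's exact definition of the LMAD.

```python
from itertools import product


def lmad_addresses(base, dims):
    """Return the set of linear memory offsets described by a base offset
    and a list of (stride, count) pairs, one pair per loop dimension."""
    ranges = [range(count) for _, count in dims]
    return {
        base + sum(k * stride for k, (stride, _) in zip(ks, dims))
        for ks in product(*ranges)
    }


# Zero-based, column-major array with 10 rows. A loop nest accesses
# element (2*i, j) for i in 0..3 and j in 0..2, so the linear offset of
# each access is 2*i + 10*j.
ROWS = 10
from_subscripts = {2 * i + ROWS * j for i in range(4) for j in range(3)}

# The same access set as a descriptor: stride 2 repeated 4 times,
# stride 10 repeated 3 times, starting at offset 0.
from_descriptor = lmad_addresses(base=0, dims=[(2, 4), (ROWS, 3)])

print(sorted(from_descriptor))
print(from_subscripts == from_descriptor)
```

The descriptor depends only on strides and extents in memory, so two subscript expressions that look different in the source but touch the same offsets would yield the same description.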
Evaluation Of Programs And Parallelizing Compilers Using Dynamic Analysis Techniques
, 1993
Cited by 15 (1 self)
results for an unlimited number of processors. Upper and lower bounds of the inherent parallelism, for the case of limited processors, can be derived from the processor activity histogram, which records the number of concurrent operations during each time period. Stress analysis is a derivative of critical path analysis that determines the locations in a program that contribute most to the critical path. Inductions are computations that introduce internal stress. A specific method is presented that measures how removing the serializing effects of inductions affects the inherent parallelism. Dependence analysis is crucial to the effective operation of parallelizing compilers. Static and dynamic evaluations of the effectiveness of compile-time data dependence analysis are presented; the evaluation compares the existing techniques against each other and against the theoretically optimal results. Special attention is paid to the dependences which serialize interproce
Experimental Evaluation of Some Data Dependence Tests (Extended Abstract)
, 1991
Cited by 7 (3 self)
Paul M. Petersen and David A. Padua, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801
1 Introduction
Data dependence analysis is the most important step in the automatic detection and exploitation of implicit parallelism, which is today recognized as an important compiler optimization technique. Its importance will clearly grow as parallel computers become even more pervasive. As discussed in more detail below, the main problem of data dependence analysis is to detect whether or not a system of equations has an integer solution inside a given region of $\mathbb{Z}^n$. One of the first techniques used answered the question accurately [Tow76]. However, this method was too expensive to use in any practical compiler. For this reason, faster but approximate techniques have been developed which will sometimes wrongly assume the existence of a solution to the system of equations. Of course, using such incorrect assumptions neve...
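The difference between an exact answer and a fast approximation can be illustrated with the classic GCD test, chosen here only as an example; the excerpt above does not say which tests the paper evaluates. The GCD test checks whether a single linear equation has any integer solution at all and ignores the loop bounds, so it may report a possible dependence where none exists inside the region. The function names and the brute-force checker are hypothetical helpers for this sketch.

```python
from functools import reduce
from itertools import product
from math import gcd


def gcd_test_may_depend(coeffs, rhs):
    """Approximate test: sum(coeffs[i] * x[i]) == rhs has an integer
    solution iff gcd(coeffs) divides rhs. Loop bounds are ignored, so
    True means only "a dependence cannot be ruled out"."""
    g = reduce(gcd, (abs(c) for c in coeffs), 0)
    if g == 0:
        return rhs == 0
    return rhs % g == 0


def exact_depends(coeffs, rhs, bounds):
    """Exact but exponential check: enumerate every integer point in the
    region given by inclusive (low, high) bounds per variable."""
    ranges = [range(lo, hi + 1) for lo, hi in bounds]
    return any(
        sum(c * x for c, x in zip(coeffs, xs)) == rhs
        for xs in product(*ranges)
    )


# A[2*i] written, A[2*j + 1] read: 2*i - 2*j = 1 has no integer solution.
print(gcd_test_may_depend([2, -2], 1))

# A[i] written, A[j + 100] read, 0 <= i, j <= 9: i - j = 100.
# The GCD test cannot rule it out, but no solution lies inside the bounds.
print(gcd_test_may_depend([1, -1], 100))
print(exact_depends([1, -1], 100, [(0, 9), (0, 9)]))
```

The second case is an instance of the situation described above: the approximate test assumes a solution exists although the bounded region contains none, which is safe but may prevent parallelization.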
A partitioning programming environment for a novel parallel architecture
 In International Parallel Processing Symposium
, 1996
Cited by 5 (3 self)
The paper presents a partitioning and parallelizing programming environment for a novel parallel architecture. This universal embedded accelerator is based on reconfigurable datapath hardware. The partitioning and parallelizing programming environment accepts C programs and carries out both a profiling-driven host/accelerator partitioning for performance optimization in a first step and, in a second step, a resource-driven sequential/structural partitioning of the accelerator source code to optimize the utilization of its reconfigurable resources.
Demand-Driven Interprocedural Constant Propagation: Implementation and Evaluation
, 1994
Cited by 1 (0 self)
We have developed a hybrid algorithm for interprocedural constant propagation combining two prior methods with a new demand-driven approach. We modified a prior intraprocedural constant propagator for incremental use in a demand-driven interprocedural framework. We compare our algorithm to three prior interprocedural methods. Burke and Cytron solve the interprocedural constant propagation problem with an algorithm that uses a pessimistic incremental intraprocedural constant propagator to iterate forward and backward over the call graph until no new information is discovered [BC86]. Wegman and Zadeck solve the intraprocedural constant propagation problem with an optimistic algorithm [WZ91]. Their algorithm solves the sparse conditional constant problem. The interprocedural version of their algorithm links the Static Single Assignment graphs of all procedures together and runs their intraprocedural algorithm over the single SSA graph. Grove and Torczon performed experiments that show Ju...