Results 1 - 10 of 47
Optimal Code Motion: Theory and Practice
1993
Cited by 112 (18 self)
An implementation-oriented algorithm for lazy code motion is presented that minimizes the number of computations in programs while suppressing any unnecessary code motion in order to avoid superfluous register pressure. In particular, this variant of the original algorithm for lazy code motion works on flowgraphs whose nodes are basic blocks rather than single statements, as this format is standard in optimizing compilers. The theoretical foundations of the modified algorithm are given in the first part, where t-refined flowgraphs are introduced for simplifying the treatment of flowgraphs whose nodes are basic blocks. The second part presents the `basic block' algorithm in standard notation, and gives directions for its implementation in standard compiler environments. Keywords: Elimination of partial redundancies, code motion, data flow analysis (bitvector, unidirectional, bidirectional), nondeterministic flowgraphs, t-refined flow graphs, critical edges, lifetimes of registers, com...
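The keywords above mention critical edges, which code motion algorithms commonly remove before placing computations. The following is a minimal sketch of that preprocessing step, not the paper's algorithm. It assumes the flowgraph is a dictionary from node name to successor list, that every node appears as a key, and that there are no duplicate edges. The function name `split_critical_edges` and the synthetic node naming scheme are hypothetical.

```python
def split_critical_edges(succ):
    """Return a new successor map in which every critical edge is split.

    An edge u -> v is critical when u has more than one successor
    and v has more than one predecessor.
    """
    # Count the predecessors of every node.
    pred_count = {n: 0 for n in succ}
    for u, targets in succ.items():
        for v in targets:
            pred_count[v] = pred_count.get(v, 0) + 1

    result = {n: [] for n in succ}
    for u, targets in succ.items():
        for v in targets:
            if len(targets) > 1 and pred_count[v] > 1:
                mid = f"{u}_{v}"  # hypothetical name for the synthetic node
                result[u].append(mid)
                result[mid] = [v]
            else:
                result[u].append(v)
    return result


# A branches to B and C, and B falls through to C, so only A -> C is critical.
cfg = {"A": ["B", "C"], "B": ["C"], "C": []}
split = split_critical_edges(cfg)
```

The synthetic node gives a code motion pass a place to insert a computation that executes only when control flows along the original edge.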
Demand-driven Computation of Interprocedural Data Flow
1995
Cited by 77 (9 self)
This paper presents a general framework for deriving demand-driven algorithms for interprocedural data flow analysis of imperative programs. The goal of demand-driven analysis is to reduce the time and/or space overhead of conventional exhaustive analysis by avoiding the collection of information that is not needed. In our framework, a demand for data flow information is modeled as a set of data flow queries. The derived demand-driven algorithms find responses to these queries through a partial reversal of the respective data flow analysis. Depending on whether minimizing time or space is of primary concern, result caching may be incorporated in the derived algorithm. Our framework is applicable to interprocedural data flow problems with a finite domain set. If the problem's flow functions are distributive, the derived demand algorithms provide as precise information as the corresponding exhaustive analysis. For problems with monotone but nondistributive flow functions the provided dat...
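To make the idea of answering a single query by searching backward concrete, here is a deliberately simplified sketch. It is intraprocedural, handles only a reaching-definitions style question, and is not the framework derived in the paper. The names `make_reaching_query`, `pred`, and `defs` are hypothetical: `pred` maps a node to its predecessors and `defs` maps a node to the one variable it assigns.

```python
def make_reaching_query(pred, defs):
    """Build a query function with a result cache.

    The returned function answers: may the definition at def_node
    reach the entry of query_node?
    """
    cache = {}

    def reaches(def_node, query_node):
        key = (def_node, query_node)
        if key in cache:
            return cache[key]
        var = defs[def_node]
        seen = set()
        worklist = [query_node]
        answer = False
        while worklist and not answer:
            node = worklist.pop()
            for p in pred.get(node, []):
                if p == def_node:
                    answer = True
                    break
                if p in seen or defs.get(p) == var:
                    continue  # already explored, or p redefines var
                seen.add(p)
                worklist.append(p)
        cache[key] = answer
        return answer

    return reaches


# n1: x = ...   n2: y = ...   n3: x = ...   n4, n5: uses
pred = {"n2": ["n1"], "n3": ["n2"], "n4": ["n2", "n3"], "n5": ["n3"]}
defs = {"n1": "x", "n2": "y", "n3": "x"}
reaches = make_reaching_query(pred, defs)
```

Only the part of the graph between the query point and the relevant definitions is visited, and the cache trades space for time on repeated queries, which loosely mirrors the time/space choice the abstract describes.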
A New Algorithm for Partial Redundancy Elimination based on SSA Form
1997
Cited by 67 (3 self)
A new algorithm, SSAPRE, for performing partial redundancy elimination based entirely on SSA form is presented. It achieves optimal code motion similar to lazy code motion [KRS94a, DS93], but is formulated independently and does not involve iterative data flow analysis and bit vectors in its solution. It not only exhibits the characteristics common to other sparse approaches, but also inherits the advantages shared by other SSA-based optimization techniques. SSAPRE also maintains its output in the same SSA form as its input. In describing the algorithm, we state theorems with proofs giving our claims about SSAPRE. We also give additional description about our practical implementation of SSAPRE, and analyze and compare its performance with a bit-vector-based implementation of PRE. We conclude with some discussion of the implications of this work. 1 Introduction. The Static Single Assignment Form (SSA) has become a popular program representation in optimizing compilers, because it provid...
Complete Removal of Redundant Expressions
1998
Cited by 66 (13 self)
Partial redundancy elimination (PRE), the most important component of global optimizers, generalizes the removal of common subexpressions and loop-invariant computations. Because existing PRE implementations are based on code motion, they fail to completely remove the redundancies. In fact, we observed that 73% of loop-invariant statements cannot be eliminated from loops by code motion alone. In dynamic terms, traditional PRE eliminates only half of redundancies that are strictly partial. To achieve a complete PRE, control flow restructuring must be applied. However, the resulting code duplication may cause code size explosion. This paper focuses on achieving a complete PRE while incurring an acceptable code growth. First, we present an algorithm for complete removal of partial redundancies, based on the integration of code motion and control flow restructuring. In contrast to existing complete techniques, we resort to restructuring merely to remove obstacles to code motion, rather th...
Dependence-Based Program Analysis
In Proceedings of the SIGPLAN '93 Conference on Programming Language Design and Implementation, 1993
Cited by 60 (6 self)
Program analysis and optimization can be speeded up through the use of the dependence flow graph (DFG), a representation of program dependences which generalizes def-use chains and static single assignment (SSA) form. In this paper, we give a simple graph-theoretic description of the DFG and show how the DFG for a program can be constructed in O(EV) time. We then show how forward and backward dataflow analyses can be performed efficiently on the DFG, using constant propagation and elimination of partial redundancies as examples. These analyses can be framed as solutions of dataflow equations in the DFG. Our construction algorithm is of independent interest because it can be used to construct a program's control dependence graph in O(E) time and its SSA representation in O(EV) time, which are improvements over existing algorithms. 1 Introduction. A number of recent papers have focused attention on the problem of speeding up program optimization [FOW87, BMO90, CCF91, PBJ+91, CFR+...
The Program Structure Tree: Computing Control Regions in Linear Time
1994
Cited by 57 (2 self)
In this paper, we describe the program structure tree (PST), a hierarchical representation of program structure based on single entry single exit (SESE) regions of the control flow graph. We give a linear-time algorithm for finding SESE regions and for building the PST of arbitrary control flow graphs (including irreducible ones). Next, we establish a connection between SESE regions and control dependence equivalence classes, and show how to use the algorithm to find control regions in linear time. Finally, we discuss some applications of the PST. Many control-flow algorithms, such as construction of Static Single Assignment form, can be speeded up by applying the algorithms in a divide-and-conquer style to each SESE region on its own. The PST is also used to speed up data flow analysis by exploiting `sparsity'. Experimental results from the Perfect Club and SPEC89 benchmarks confirm that the PST approach finds and exploits program structure.
Generation of efficient interprocedural analyzers with PAG
In Proceedings of the Second International Symposium on Static Analysis, 1995
Cited by 48 (7 self)
To produce high quality code, modern compilers use global optimization algorithms based on abstract interpretation. These algorithms are rather complex; their implementation is therefore a nontrivial and error-prone task. However, since they are based on a common theory, they have large similar parts. We conclude that analyzer writing should be replaced with analyzer generation. We present the tool PAG that has a high level functional input language to specify data flow analyses. It offers the specification of even recursive data structures and is therefore not limited to bit vector problems. PAG generates efficient analyzers which can be easily integrated in existing compilers. The analyzers are interprocedural; they can handle recursive procedures with local variables and higher order functions. PAG has successfully been tested by generating several analyzers (e.g. alias analysis, constant propagation) for an industrial quality ANSI-C and Fortran90 compiler. Keywords: d...
Parallelism for Free: Efficient and Optimal Bitvector Analyses for Parallel Programs
1994
Cited by 46 (3 self)
In this paper we show how to construct optimal bitvector analysis algorithms for parallel programs with shared memory that are as efficient as their purely sequential counterparts, and which can easily be implemented. Whereas the complexity result is rather obvious, our optimality result is a consequence of a new Kam/Ullman-style Coincidence Theorem. Thus, the important merits of sequential bitvector analyses survive the introduction of parallel statements. Keywords: Parallelism, interleaving semantics, synchronization, program optimization, data flow analysis, bitvector problems, definition-use chains, partial redundancy elimination, partial dead code elimination. Contents: 1 Motivation; 2 Sequential Programs; 2.1 Representation; 2.2 Data Flow Analysis; 2.2.1 The MOP-Solution of a DFA; 2.2.2 The MFP-Solution o...
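For readers unfamiliar with the sequential bitvector analyses that this abstract takes as its baseline, the sketch below shows one classic instance: available expressions, solved by round-robin iteration with Python integers as bit vectors. It covers sequential programs only and does not attempt the paper's treatment of parallel statements. The function name, the block names, and the `gen`/`kill` encoding are assumptions made for illustration, and the entry block is assumed to have no predecessors.

```python
def available_expressions(blocks, pred, gen, kill, entry, num_exprs):
    """Forward 'must' analysis: bit i is set when expression i is available."""
    full = (1 << num_exprs) - 1
    # Optimistic start: everything available, except at the program entry.
    avin = {b: full for b in blocks}
    avout = {b: full for b in blocks}
    avin[entry] = 0

    changed = True
    while changed:
        changed = False
        for b in blocks:
            if b != entry:
                new_in = full
                for p in pred[b]:
                    new_in &= avout[p]  # meet = intersection over predecessors
                avin[b] = new_in
            new_out = gen[b] | (avin[b] & ~kill[b])
            if new_out != avout[b]:
                avout[b] = new_out
                changed = True
    return avin, avout


# Bit 0 stands for a+b, bit 1 for c*d.
blocks = ["B0", "B1", "B2", "B3"]
pred = {"B0": [], "B1": ["B0"], "B2": ["B0"], "B3": ["B1", "B2"]}
gen = {"B0": 0b01, "B1": 0b10, "B2": 0b00, "B3": 0b01}
kill = {"B0": 0b00, "B1": 0b00, "B2": 0b01, "B3": 0b00}  # B2 assigns a
avin, avout = available_expressions(blocks, pred, gen, kill, "B0", 2)
```

In this diamond, a+b is available on the path through B1 but killed on the path through B2, so it is not available at the entry of B3. The recomputation in B3 is therefore only partially redundant, which is the situation partial redundancy elimination targets.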
Composing Dataflow Analyses and Transformations
2001
Cited by 41 (6 self)
Dataflow analyses can have mutually beneficial interactions. Previous efforts to exploit these interactions have either (1) iteratively performed each individual analysis until no further improvements are discovered or (2) developed "super-analyses" that manually combine conceptually separate analyses. We have devised a new approach that allows analyses to be defined independently while still enabling them to be combined automatically and profitably. Our approach avoids the loss of precision associated with iterating individual analyses and the implementation difficulties of manually writing a super-analysis. The key to our approach is a novel method of implicit communication between the individual components of a super-analysis based on graph transformations. In this paper, we precisely define our approach; we demonstrate that it is sound and it terminates; finally we give experimental results showing that in practice (1) our framework produces results at least as precise as iterating the individual analyses while compiling at least 5 times faster, and (2) our framework achieves the same precision as a manually written super-analysis while incurring a compile-time overhead of less than 20%.