Results 1 
7 of
7
Efficiently computing static single assignment form and the control dependence graph
 ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 1991
"... In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point that advanced optimization features become undesirable. Recently, static single ass ..."
Abstract

Cited by 839 (7 self)
 Add to MetaCart
In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point that advanced optimization features become undesirable. Recently, static single assignment form and the control dependence graph have been proposed to represent data flow and control flow propertiee of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimization. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications. We also give analytical and experimental evidence that all of these data structures are usually linear in the size of the original program. This paper thus presents strong evidence that these structures can be of practical use in optimization.
Interprocedural Symbolic Analysis
, 1994
"... Compiling for efficient execution on advanced computer architectures requires extensive program analysis and transformation. Most compilers limit their analysis to simple phenomena within single procedures, limiting effective optimization of modular codes and making the programmer's job harder. We p ..."
Abstract

Cited by 49 (1 self)
 Add to MetaCart
Compiling for efficient execution on advanced computer architectures requires extensive program analysis and transformation. Most compilers limit their analysis to simple phenomena within single procedures, limiting effective optimization of modular codes and making the programmer's job harder. We present methods for analyzing array side effects and for comparing nonconstant values computed in the same and different procedures. Regular sections, described by rectangular bounds and stride, prove as effective in describing array side effects in Linpack as more complicated summary techniques. On a set of six programs, regular section analysis of array side effects gives 0 to 39 percent reductions in array dependences at call sites, with 10 to 25 percent increases in analysis time. Symbolic analysis is essential to data dependence testing, array section analysis, and other highlevel program manipulations. We give methods for building symb...
On Loops, Dominators, and Dominance Frontiers
, 1999
"... This paper explores the concept of loops and loop nesting forests of controlflow graphs, using the problem of constructing the dominator tree of a graph and the problem of computing the iterated dominance frontier of a set of vertices in a graph as guiding applications. The contributions of this pa ..."
Abstract

Cited by 21 (0 self)
 Add to MetaCart
This paper explores the concept of loops and loop nesting forests of controlflow graphs, using the problem of constructing the dominator tree of a graph and the problem of computing the iterated dominance frontier of a set of vertices in a graph as guiding applications. The contributions of this paper include: (1) An axiomatic characterization, as well as a constructive characterization, of a family of loop nesting forests that includes various specific loop nesting forests that have been previously defined. (2) The definition of a new loop nesting forest, as well as an e#cient, almost linear time, algorithm for constructing this forest. (3) An illustration of how loop nesting forests can be used to transform arbitrary (potentially irreducible) problem instances into equivalent acylic graph problem instances in the case of the two problems of (a) constructing the dominator tree of a graph, and (b) computing the iterated dominance frontier of a set of vertices in a graph, leading to new, almost linear time, algorithms for these problems
Probabilistic Arithmetic
, 1989
"... This thesis develops the idea of probabilistic arithmetic. The aim is to replace arithmetic operations on numbers with arithmetic operations on random variables. Specifically, we are interested in numerical methods of calculating convolutions of probability distributions. The longterm goal is to ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
This thesis develops the idea of probabilistic arithmetic. The aim is to replace arithmetic operations on numbers with arithmetic operations on random variables. Specifically, we are interested in numerical methods of calculating convolutions of probability distributions. The longterm goal is to be able to handle random problems (such as the determination of the distribution of the roots of random algebraic equations) using algorithms which have been developed for the deterministic case. To this end, in this thesis we survey a number of previously proposed methods for calculating convolutions and representing probability distributions and examine their defects. We develop some new results for some of these methods (the Laguerre transform and the histogram method), but ultimately find them unsuitable. We find that the details on how the ordinary convolution equations are calculated are
Automatic Generation Of DataFlow Analyzers: A Tool For Building Optimizers
, 1993
"... Modern compilers generate good code by performing global optimizations. Unlike other functions of the compiler such as parsing and code generation which examine only one statement or one basic block at a time, optimizers examine large parts of a program and coordinate changes in widely separated par ..."
Abstract

Cited by 8 (0 self)
 Add to MetaCart
Modern compilers generate good code by performing global optimizations. Unlike other functions of the compiler such as parsing and code generation which examine only one statement or one basic block at a time, optimizers examine large parts of a program and coordinate changes in widely separated parts of a program. Thus optimizers use more complex data structures and consume more time. To generate the best code, optimizers perform not one global transformation, but many in concert. These transformations can interact in unforeseen ways. This dissertation concerns the building of optimizers that are modular and extensible. It espouses an optimizer architecture, first proposed by Kildall, in which each phase is based on a dataflow analysis (DFA) of the program and on an optimization function that transforms the program. To support the architecture, a set of abstractionsflow values, flow functions, path simplification rules, action routinesis provided. A tool called Sharlit turns a DFA specification consisting of these abstractions into a solver for a DFA problem. At the heart of Sharlit is an algorithm called path simplification, an extension of Tarjan's fast path algorithm. Path simplification unifies several powerful DFA solution techniques. By using path simplification rules, compiler writers can construct a wide range of dataflow analyzers, from simple iterative ones, to solvers that use local analysis, interval analysis, or sparse dataflow evaluation. Sharlit frees compiler writers from the details of how these various solution techniques. The compiler writer can view the program representation as a simple flow graph in which each instruction is a node. Data structures to represent basic blocks and other regions are automatically generated. Sharlit promotes ...
Array Restructuring for Cache Locality
, 1996
"... Caches are used in almost every modern processor design to reduce the long memory access latency, which is increasingly a bottleneck to program performance. For caches to be effective, programs must exhibit good data locality. Thus, an optimizing compiler may have to restructure programs to enhance ..."
Abstract

Cited by 8 (0 self)
 Add to MetaCart
Caches are used in almost every modern processor design to reduce the long memory access latency, which is increasingly a bottleneck to program performance. For caches to be effective, programs must exhibit good data locality. Thus, an optimizing compiler may have to restructure programs to enhance their locality. We focus on the class of restructuring techniques that target array accesses in loops. There are two approaches to enhancing the locality of such accesses: loop restructuring and array restructuring. Under loop restructuring, a compiler adopts a canonical array layout but transforms the order in which loop iterations are performed and thereby reorders the execution of array accesses. Under array restructuring, in contrast, a compiler lays out array elements in an order that matches the access pattern, whi...
Convergence and Scalarization for DataParallel Architectures
"... Modern throughput processors such as GPUs achieve high performance and efficiency by exploiting data parallelism in application kernels expressed as threaded code. One drawback of this approach compared to conventional vector architectures is redundant execution of instructions that are common acros ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
Modern throughput processors such as GPUs achieve high performance and efficiency by exploiting data parallelism in application kernels expressed as threaded code. One drawback of this approach compared to conventional vector architectures is redundant execution of instructions that are common across multiple threads, resulting in energy inefficiency due to excess instruction dispatch, register file accesses, and memory operations. This paper proposes to alleviate these overheads while retaining the threaded programming model by automatically detecting the scalar operations and factoring them out of the parallel code. We have developed a scalarizing compiler that employs convergence and variance analyses to statically identify values and instructions that are invariant across multiple threads. Our compiler algorithms are effective at identifying convergent execution even in programs with arbitrary control flow, identifying twothirds of the opportunity captured by a dynamic oracle. The compiletime analysis leads to a reduction in instructions dispatched by 29%, register file reads and writes by 31%, memory address counts by 47%, and data access counts by 38%.