Program Analysis and Specialization for the C Programming Language
, 1994
Cited by 527 (0 self)
Software engineers are faced with a dilemma. They want to write general and well-structured programs that are flexible and easy to maintain. On the other hand, generality has a price: efficiency. A specialized program solving a particular problem is often significantly faster than a general program. However, the development of specialized software is time-consuming, and is likely to exceed the production of today’s programmers. New techniques are required to solve this so-called software crisis. Partial evaluation is a program specialization technique that reconciles the benefits of generality with efficiency. This thesis presents an automatic partial evaluator for the ANSI C programming language. The content of this thesis is analysis and transformation of C programs. We develop several analyses that support the transformation of a program into its generating extension. A generating extension is a program that produces specialized programs when executed on parts of the input. The thesis contains the following main results.
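As an illustration of the generating-extension idea described in this abstract, the sketch below hand-writes a generating extension for a toy power function. It is in Python rather than C, and the names `power` and `power_generating_extension` are hypothetical; nothing here is taken from the thesis itself. The exponent plays the role of the known (static) part of the input, and the result is a residual program that only needs the remaining input.

```python
def power(base, exponent):
    """General program: handles any non-negative integer exponent."""
    result = 1
    for _ in range(exponent):
        result *= base
    return result


def power_generating_extension(exponent):
    """Hypothetical generating extension of `power`.

    Given only the static input (the exponent), it emits the source of a
    specialized program with the loop fully unrolled and compiles it.
    """
    body = " * ".join(["base"] * exponent) if exponent > 0 else "1"
    source = f"def power_{exponent}(base):\n    return {body}\n"
    namespace = {}
    exec(source, namespace)  # turn the generated source into a function
    return namespace[f"power_{exponent}"], source


# Specialize once for exponent 3, then reuse the residual program.
cube, cube_source = power_generating_extension(3)
```

The residual program `cube` contains no loop and no exponent parameter, which is where the efficiency gain of specialization comes from in this toy case.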
Debugging concurrent programs
 ACM Computing Surveys
, 1989
Cited by 169 (1 self)
The main problems associated with debugging concurrent programs are increased complexity, the “probe effect,” nonrepeatability, and the lack of a synchronized global clock. The probe effect refers to the fact that any attempt to observe the behavior of a distributed system may change the behavior of that system. For some parallel programs,
Lazy Code Motion
, 1992
Cited by 157 (20 self)
We present a bit-vector algorithm for the optimal and economical placement of computations within flow graphs, which is as efficient as standard unidirectional analyses. The point of our algorithm is the decomposition of the bidirectional structure of the known placement algorithms into a sequence of a backward and a forward analysis, which directly implies the efficiency result. Moreover, the new compositional structure opens the algorithm for modification: two further unidirectional analysis components exclude any unnecessary code motion. This laziness of our algorithm minimizes the register pressure, which has drastic effects on the run-time behaviour of the optimized programs in practice, where an economical use of registers is essential.
A fast algorithm for finding dominators in a flowgraph
 ACM Transactions on Programming Languages and Systems
, 1979
Cited by 144 (3 self)
A fast algorithm for finding dominators in a flowgraph is presented. The algorithm uses depth-first search and an efficient method of computing functions defined on paths in trees. A simple implementation of the algorithm runs in O(m log n) time, where m is the number of edges and n is the number of vertices in the problem graph. A more sophisticated implementation runs in O(m α(m, n)) time, where α(m, n) is a functional inverse of Ackermann's function. Both versions of the algorithm were implemented in Algol W, a Stanford University version of Algol, and tested on an IBM 370/168. The programs were compared with an implementation by Purdom and Moore of a straightforward O(mn)-time algorithm, and with a bit vector algorithm described by Aho and Ullman. The fast algorithm beat the straightforward algorithm and the bit vector algorithm on all but the smallest graphs tested.
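To make the notion of dominators concrete, here is a minimal sketch of the simple iterative set-intersection formulation. It is not the fast algorithm from this paper, nor the Purdom and Moore implementation; it only shows what is being computed. The function name `dominators`, the adjacency-dictionary input, and the example graph are assumptions made for illustration, and every node is assumed to be reachable from the entry.

```python
def dominators(successors, entry):
    """Return {node: set of nodes that dominate it} for a flowgraph.

    `successors` maps each node to an iterable of its successors.
    Assumes every node is reachable from `entry`.
    """
    nodes = set(successors)
    for targets in successors.values():
        nodes.update(targets)

    # Build predecessor sets from the successor map.
    preds = {n: set() for n in nodes}
    for n, targets in successors.items():
        for t in targets:
            preds[t].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            incoming = [dom[p] for p in preds[n]]
            new = set.intersection(*incoming) if incoming else set()
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# Hypothetical diamond-shaped flowgraph followed by a single exit node.
cfg = {"entry": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
dom = dominators(cfg, "entry")
```

In this graph neither `a` nor `b` dominates `c`, because `c` can be reached through either branch, while `c` does dominate `d`.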
A data flow oriented program testing strategy
 IEEE Trans. Software Eng
, 1983
Cited by 124 (1 self)
Some properties of a program data flow can be used to guide program testing. The presented approach aims to exercise use-definition chains that appear in the program. Two such data oriented testing strategies are proposed; the first involves checking liveness of every definition of a variable at the point(s) of its possible use; the second deals with liveness of vectors of variables treated as arguments to an instruction or program block. Reliability of these strategies is discussed with respect to a program containing an error. Index Terms: Control flow, data context, data environment, data flow, data oriented testing, program testing, liveness, variable definition.
Optimal Code Motion: Theory and Practice
, 1993
Cited by 112 (18 self)
An implementation-oriented algorithm for lazy code motion is presented that minimizes the number of computations in programs while suppressing any unnecessary code motion in order to avoid superfluous register pressure. In particular, this variant of the original algorithm for lazy code motion works on flowgraphs whose nodes are basic blocks rather than single statements, as this format is standard in optimizing compilers. The theoretical foundations of the modified algorithm are given in the first part, where t-refined flowgraphs are introduced for simplifying the treatment of flowgraphs whose nodes are basic blocks. The second part presents the `basic block' algorithm in standard notation, and gives directions for its implementation in standard compiler environments. Keywords: Elimination of partial redundancies, code motion, data flow analysis (bit-vector, unidirectional, bidirectional), nondeterministic flowgraphs, t-refined flow graphs, critical edges, lifetimes of registers, com...
Sharlit: A Tool for Building Optimizers
, 1992
Cited by 70 (6 self)
This paper presents Sharlit, a tool to support the construction of modular and extensible global optimizers. We will show how Sharlit helps in constructing dataflow analyzers and the transformations that use dataflow analysis information: both are major components of any optimizer.
Analysis of Cache-related Preemption Delay in Fixed-priority Preemptive Scheduling
, 1996
Cited by 64 (4 self)
We propose a technique for analyzing cache-related preemption delays of tasks that cause unpredictable variation in task execution time in the context of fixed-priority preemptive scheduling. The proposed technique consists of two steps. The first step performs a per-task analysis to estimate cache-related preemption cost for each execution point in a given task. The second step computes the worst-case response time of each task that includes the cache-related preemption delay using a response time equation and a linear programming technique. This step takes as its input the preemption cost information of tasks obtained in the first step. This paper also compares the proposed approach with previous approaches. The results show that the proposed approach gives a prediction of the worst-case cache-related preemption delay that is up to 60% tighter than the best of predictions obtained from the previous approaches. Index Terms: real-time system, fixed-priority scheduling, cache memory,...
Elimination algorithms for data flow analysis
 ACM Computing Surveys
, 1986
Cited by 53 (8 self)
A unified model of a family of data flow algorithms, called elimination methods, is presented. The algorithms, which gather information about the definition and use of data in a program or a set of programs, are characterized by the manner in which they solve the systems of equations that describe data flow problems of interest. The unified model
Inline function expansion for compiling C programs
 ACM SIGPLAN Notices
, 1989
Cited by 49 (2 self)
Inline function expansion replaces a function call with the function body. With automatic inline function expansion, programs can be constructed with many small functions to handle complexity and then rely on the compilation to eliminate most of the function calls. Therefore, inline expansion serves as a tool for satisfying two conflicting goals: minimizing the complexity of the program development and minimizing the function call overhead of program execution. A simple inline expansion procedure is presented which uses profile information to address three critical issues: code expansion, stack expansion, and unavailable function bodies. Experiments show that a large percentage of function calls/returns (about 59%) can be eliminated with a modest code expansion cost (about 17%) for twelve UNIX* programs. (*UNIX is a trademark of AT&T Bell Laboratories.)