Results 1  10
of
87
Efficient planarity testing
 J. ASSOC. COMPUT. MACH
, 1974
"... This paper describes an efficient algorithm to determine whether an arbitrary graph G can be embedded in the plane. The algorithm may be viewed as an iterative version of a method originally proposed by Auslander and Parter and correctly formulated by Goldstein. The algorithm uses depthfirst sear ..."
Abstract

Cited by 259 (5 self)
 Add to MetaCart
This paper describes an efficient algorithm to determine whether an arbitrary graph G can be embedded in the plane. The algorithm may be viewed as an iterative version of a method originally proposed by Auslander and Parter and correctly formulated by Goldstein. The algorithm uses depthfirst search and has O(V) time and space bounds, where V is the number of vertices in G. An ALGOS implementation of the algorithm successfully tested graphs with as many as 900 vertices in less than 12 seconds.
CrossArchitecture Performance Predictions for Scientific Applications Using Parameterized Models
, 2004
"... This paper describes a toolkit for semiautomatically measuring and modeling static and dynamic characteristics of applications in an architectureneutral fashion. For predictable applications, models of dynamic characteristics have a convex and diferentiable profile. Our toolkit operates on applica ..."
Abstract

Cited by 103 (5 self)
 Add to MetaCart
This paper describes a toolkit for semiautomatically measuring and modeling static and dynamic characteristics of applications in an architectureneutral fashion. For predictable applications, models of dynamic characteristics have a convex and diferentiable profile. Our toolkit operates on application binaries and succeeds in modeling key application characteristics that determine program performance. We use these characterizations to explore the interactions between an application and a target architecture. We apply our toolkit to SPARC binaries to develop architectureneutral models of computation and memory access patterns of the ASCI Sweep3D and the NAS SP, BT and LU benchmarks. From our models, we predict the L1, L2 and TLB cache miss counts as well as the overall execution time of these applications on an Origin 2000 system. We evaluate our predictions by comparing them against measurements collected using hardware performance counters.
Register allocation via hierarchical graph coloring
 In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation
, 1991
"... We present a graph coloring register allocator de signed to minimize the number of dynamic memory references. We cover the program with sets of blocks called tiles and group these tiles into a tree reflecting the program's hierarchical control structure. Registers are allocated for each tile u ..."
Abstract

Cited by 99 (0 self)
 Add to MetaCart
We present a graph coloring register allocator de signed to minimize the number of dynamic memory references. We cover the program with sets of blocks called tiles and group these tiles into a tree reflecting the program's hierarchical control structure. Registers are allocated for each tile using standard graph coloring techniques and the local allocation and conflict information is passed around the tree in a two phase algorithm. This results in an allocation of reg isters that is sensitive to local usage patterns while retaining a global perspective. Spill code is placed in less frequently executed portions of the program and the choice of variables to spill is based on usage pat terns between the spills and the reloads rather than usage patterns over the entire program. 1
Combining Analyses, Combining Optimizations
, 1995
"... This thesis presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. Also presented is a simple iter ..."
Abstract

Cited by 83 (4 self)
 Add to MetaCart
This thesis presents a framework for describing optimizations. It shows how to combine two such frameworks and how to reason about the properties of the resulting framework. The structure of the framework provides insight into when a combination yields better results. Also presented is a simple iterative algorithm for solving these frameworks. A framework is shown that combines Constant Propagation, Unreachable Code Elimination, Global Congruence Finding and Global Value Numbering. For these optimizations, the iterative algorithm runs in O(n^2) time.
This thesis then presents an O(n log n) algorithm for combining the same optimizations. This technique also finds many of the common subexpressions found by Partial Redundancy Elimination. However, it requires a global code motion pass to make the optimized code correct, also presented. The global code motion algorithm removes some Partially Dead Code as a sideeffect. An implementation demonstrates that the algorithm has shorter compile times than repeated passes of the separate optimizations while producing runtime speedups of 4%–7%.
While global analyses are stronger, peephole analyses can be unexpectedly powerful. This thesis demonstrates parsetime peephole optimizations that find more than 95% of the constants and common subexpressions found by the best combined analysis. Finding constants and common subexpressions while parsing reduces peak intermediate representation size. This speeds up the later global analyses, reducing total compilation time by 10%. In conjunction with global code motion, these peephole optimizations generate excellent code very quickly, a useful feature for compilers that stress compilation speed over code quality.
Sharlit  A Tool for Building Optimizers
, 1992
"... This paper presents Sharlit, a tool to support the construction of modular and extensible global optimizers. We will show how Sharlit helps in constructing dataflow analyzers and the transformations that use dataflow analysis information: both are major components of any optimizer. ..."
Abstract

Cited by 74 (6 self)
 Add to MetaCart
This paper presents Sharlit, a tool to support the construction of modular and extensible global optimizers. We will show how Sharlit helps in constructing dataflow analyzers and the transformations that use dataflow analysis information: both are major components of any optimizer.
HPCView: A tool for topdown analysis of node performance
 The Journal of Supercomputing
, 2002
"... ..."
(Show Context)
SpaceEfficient Closure Representations
, 1994
"... Many modern compilers implement function calls (or returns) in two steps: first, a closure environment is properly installed to provide access for free variables in the target program fragment; second, the control is transferred to the target by a "jump with arguments (or results)." Closur ..."
Abstract

Cited by 66 (12 self)
 Add to MetaCart
Many modern compilers implement function calls (or returns) in two steps: first, a closure environment is properly installed to provide access for free variables in the target program fragment; second, the control is transferred to the target by a "jump with arguments (or results)." Closure conversion, which decides where and how to represent closures at runtime, is a crucial step in compilation of functional languages. We have a new algorithm that exploits the use of compiletime control and data flow information to optimize closure representations. By extensive closure sharing and allocating as many closures in registers as possible, our new closure conversion algorithm reduces heap allocation by 36% and memory fetches for local/global variables by 43%; and improves the alreadyefficient code generated by the Standard ML of New Jersey compiler by about 17% on a DECstation 5000. Moreover, unlike most other approaches, our new closure allocation scheme satisfies the strong "safe for sp...
Elimination algorithms for data flow analysis
 ACM Computing Surveys
, 1986
"... A unified model of a family of data flow algorithms, called elimination methods, is presented. The algorithms, which gather information about the definition and use of data in a program or a set of programs, are characterized by the manner in which they solve the systems of equations that describe d ..."
Abstract

Cited by 55 (8 self)
 Add to MetaCart
A unified model of a family of data flow algorithms, called elimination methods, is presented. The algorithms, which gather information about the definition and use of data in a program or a set of programs, are characterized by the manner in which they solve the systems of equations that describe data flow problems of interest. The unified model
Interprocedural Symbolic Analysis
, 1994
"... Compiling for efficient execution on advanced computer architectures requires extensive program analysis and transformation. Most compilers limit their analysis to simple phenomena within single procedures, limiting effective optimization of modular codes and making the programmer's job harder. ..."
Abstract

Cited by 48 (1 self)
 Add to MetaCart
Compiling for efficient execution on advanced computer architectures requires extensive program analysis and transformation. Most compilers limit their analysis to simple phenomena within single procedures, limiting effective optimization of modular codes and making the programmer's job harder. We present methods for analyzing array side effects and for comparing nonconstant values computed in the same and different procedures. Regular sections, described by rectangular bounds and stride, prove as effective in describing array side effects in Linpack as more complicated summary techniques. On a set of six programs, regular section analysis of array side effects gives 0 to 39 percent reductions in array dependences at call sites, with 10 to 25 percent increases in analysis time. Symbolic analysis is essential to data dependence testing, array section analysis, and other highlevel program manipulations. We give methods for building symb...