Efficiently computing static single assignment form and the control dependence graph
- ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 1991
Cited by 749 (7 self)
In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point that advanced optimization features become undesirable. Recently, static single assignment form and the control dependence graph have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications. We also give analytical and experimental evidence that all of these data structures are usually linear in the size of the original program. This paper thus presents strong evidence that these structures can be of practical use in optimization.
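To make the notion of a dominance frontier concrete, here is a minimal sketch that computes it straight from the usual definition: Y is in the frontier of X when X dominates some predecessor of Y but does not strictly dominate Y. This is not the efficient algorithm the paper presents; it is a simple iterative illustration. The function names and the dict-of-successors graph encoding are assumptions made for this example, and every node is assumed to appear as a key in `succ`.

```python
def dominators(succ, entry):
    """Iteratively compute the dominator set of every node.

    succ maps each node to a list of its successors (assumed encoding).
    """
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, pred


def dominance_frontiers(succ, entry):
    """DF(x) = {y : x dominates a predecessor of y, x does not strictly dominate y}."""
    dom, pred = dominators(succ, entry)
    df = {n: set() for n in succ}
    for y in succ:
        for p in pred[y]:
            for x in dom[p]:                   # x dominates a predecessor of y
                if x == y or x not in dom[y]:  # x does not strictly dominate y
                    df[x].add(y)
    return df


# Diamond-shaped graph: the join node is where SSA form would place a phi-function.
diamond = {"entry": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
diamond_df = dominance_frontiers(diamond, "entry")

# Simple loop: the loop header lies in its own dominance frontier.
loop = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
loop_df = dominance_frontiers(loop, "entry")
```

In the diamond, both branch nodes have the join node in their frontier, which is where a variable assigned on either branch would need a merge in static single assignment form.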
VISTA: The Visual Interface for Scheduling Transformations and Analysis
- In Languages and Compilers for Parallel Computing
, 1993
Cited by 11 (4 self)
VISTA is a visually oriented, interactive environment for parallelizing sequential programs at the instruction level for execution on fine-grain architectures. Fully automatic parallelization techniques often perform well, but may not be able to achieve the strict performance and code size requirements needed for some critical applications. In such cases, manual manipulation by an expert user can often provide enough improvements in the parallelization process to meet the requirements of the application. Using VISTA, an expert user fine-tunes the parallelization process by providing rules and directives to the system in response to graphical and numeric feedback provided by the system.
1 Introduction
The Visual Interface for Scheduling Transformations and Analysis (VISTA) is a visually oriented, interactive environment for the semi-automatic parallelization of sequential programs at the instruction level for execution on fine-grain architectures. Fully automatic parallelizing compil...
Linear Loop Transformations in Optimizing Compilers for Parallel Machines
, 1995
Cited by 10 (0 self)
We present the linear loop transformation framework, which is the formal basis for state-of-the-art optimization techniques in restructuring compilers for parallel machines. The framework unifies most existing transformations and provides a systematic set of code generation techniques for arbitrary compound loop transformations. The algebraic representation of the loop structure and its transformation gives way to quantitative techniques for optimizing performance on parallel machines. We discuss in detail the techniques for generating the transformed loop and deriving the desired linear transformation.
Key Words: Dependence Analysis, Iteration Spaces, Parallelism, Locality, Load Balance, Conventional Loop Transformations, Linear Loop Transformations
Corresponding author. Parallel Systems Group, Department of Computer Science, 10 King's College Road, University of Toronto, Toronto, ON M5S 1A4, CANADA. Email: kulki@cs.toronto.edu
Kulkarni and Stumm: Linear Loop Transformations
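As a rough illustration of the algebraic view described in this abstract, the sketch below represents a loop transformation as an integer matrix applied to iteration and dependence vectors. It uses the standard legality criterion that every transformed dependence distance vector must remain lexicographically positive. The helper names and the example dependence are assumptions made for illustration, not code or data from the paper.

```python
def apply(T, v):
    """Multiply matrix T (list of rows) by vector v."""
    return tuple(sum(row[j] * v[j] for j in range(len(v))) for row in T)


def compose(A, B):
    """Matrix product A * B: apply B first, then A."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def lex_positive(v):
    """True if the first nonzero component of v is positive."""
    for x in v:
        if x > 0:
            return True
        if x < 0:
            return False
    return False  # the zero vector is not lexicographically positive


def is_legal(T, deps):
    """A transformation is legal if it keeps every dependence lexicographically positive."""
    return all(lex_positive(apply(T, d)) for d in deps)


INTERCHANGE = [[0, 1], [1, 0]]   # swap the two loops
SKEW = [[1, 0], [1, 1]]          # skew the inner loop by the outer index

# Hypothetical doubly nested loop with one dependence of distance (1, -1).
deps = [(1, -1)]

interchange_ok = is_legal(INTERCHANGE, deps)          # (1, -1) -> (-1, 1)
skew_ok = is_legal(SKEW, deps)                        # (1, -1) -> (1, 0)
skew_then_interchange = compose(INTERCHANGE, SKEW)    # compound transformation
compound_ok = is_legal(skew_then_interchange, deps)   # (1, -1) -> (0, 1)
```

Interchange alone reverses this dependence and is rejected, while skewing first makes the subsequent interchange legal. Because compound transformations are just matrix products, a single legality check covers the whole sequence.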
Loop and Data Transformations: A Tutorial
- University of Toronto
, 1993
Cited by 3 (0 self)
In this tutorial, we address the problem of restructuring a (possibly sequential) program to improve execution efficiency on parallel machines. This restructuring involves the transformation and partitioning of loop structures and data so as to improve parallelism, static and dynamic locality, and load balance. We present previous and ongoing work on loop and data transformations and motivate a unified framework.
Key Words: Dependence Analysis, Iteration and Data Spaces, Hierarchical Memory, Parallelism, Locality, Load Balance, Conventional and Unified Loop Transformations, Data Alignment, Data Distributions.
Kulkarni: Loop and Data Transformations
Access Regions: Toward a Powerful Parallelizing Compiler
, 1996
Cited by 2 (2 self)
The bulk of the work within a scientific program involves processing data stored in arrays. We present a general and efficient means of representing the region of an array accessed by a section of a program. We introduce a notation for access regions, and a set of region operations for manipulating them. We show how a region processor which implements our region operations can form the basis for a parallelizer which handles array privatization, run-time parallelization, communication generation, and interprocedural analysis.
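The abstract does not spell out the region notation or the exact set of region operations, so the following is only a loose sketch of the general idea. It assumes the simplest possible representation, a rectangular section with inclusive bounds per dimension, and shows one operation (intersection) being used to decide whether two accesses can overlap. The `Region` class, `intersect`, and `may_conflict` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular array region: inclusive (lo, hi) bounds per dimension."""
    array: str
    bounds: Tuple[Tuple[int, int], ...]


def intersect(a: Region, b: Region) -> Optional[Region]:
    """Return the overlap of two regions, or None if they cannot overlap."""
    if a.array != b.array or len(a.bounds) != len(b.bounds):
        return None
    overlap = []
    for (alo, ahi), (blo, bhi) in zip(a.bounds, b.bounds):
        lo, hi = max(alo, blo), min(ahi, bhi)
        if lo > hi:
            return None  # disjoint in this dimension, hence disjoint overall
        overlap.append((lo, hi))
    return Region(a.array, tuple(overlap))


def may_conflict(write: Region, other: Region) -> bool:
    """Two accesses may conflict if one is a write and their regions overlap."""
    return intersect(write, other) is not None


# Two loop halves writing disjoint parts of A could run in parallel.
first_half = Region("A", ((0, 49),))
second_half = Region("A", ((50, 99),))
# A read that straddles the boundary overlaps the first write.
straddling_read = Region("A", ((40, 60),))

halves_conflict = may_conflict(first_half, second_half)
read_overlap = intersect(first_half, straddling_read)
```

A parallelizer built on such operations would use an empty intersection as evidence that two program sections touch disjoint array elements. Real access regions would need to handle strides, symbolic bounds, and non-rectangular shapes, which this sketch ignores.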
A Generalized Theory of Linear Loop Transformations
, 1994
Cited by 2 (2 self)
In this paper we present a new theory of linear loop transformations called Computation Decomposition and Alignment (CDA). A CDA transformation has two components: Computation Decomposition first decomposes the computations in the loop into computations of finer granularity, from iterations to instances of subexpressions. Computation Alignment then linearly transforms each of these sets of computations, possibly using a different transformation for each set. This framework subsumes all existing linear transformation frameworks in that it reduces to a conventional linear loop transformation when the smallest granularity is an iteration, and it reduces to some of the more recently extended frameworks when the smallest granularity is a statement instance. The ability to align computations at arbitrary granularities adds a new dimension to performance optimization on high-performance computing platforms. We describe Computation Decomposition and Alignment and pro...
The Program Compaction Revisited: the Functional Framework
- In International Conference on Parallel Processing (EURO-PAR'95), LNCS 966
, 1994
Cited by 1 (1 self)
This paper presents a general method to compact the first-order part of functional languages with call-by-value semantics for fine-grain parallel machines like VLIW or super-scalars. This work extends previous work on compaction in two ways. First, it defines a new formal system for the compaction problem that can be used to design a meta-compiler for these machines. Second, the compaction is directly applied to functional expressions instead of graph-based representations (control flow or dependence flow based representations), leading to a very uniform and simple presentation.
1 Introduction
VLIW (Very Long Instruction Word) [14] and super-scalar architectures [11, 5] are fine-grain parallel and compiled architectures in the sense that they can execute many instructions per cycle, gathered together by a compiler. VLIW machines are controlled by a single instruction stream (one program counter) where each processor executes a dedicated field of a long instruction. In a super-scalar machine, the process...
Mutation Scheduling: A Unified Approach to Compiling for Fine-Grain Parallelism
- In Languages and Compilers for Parallel Computing
, 1994
Trade-offs between code selection, register allocation, and instruction scheduling are inherently interdependent, especially when compiling for fine-grain parallel architectures. However, the conventional approach to compiling for such machines arbitrarily separates these phases so that decisions made during any one phase place unnecessary constraints on the remaining phases. Mutation Scheduling attempts to solve this problem by combining code selection, register allocation, and instruction scheduling into a unified framework in which trade-offs between the functional, register, and memory bandwidth resources of the target architecture are made "on the fly" in response to changing resource constraints and availability.
1 Introduction
In this paper we present Mutation Scheduling (MS), a unified, compiler-based approach for exploiting the functional, register, and memory bandwidth characteristics of arbitrary single-threaded fine-grain parallel architectures, such as VLIW, s...
Fine Grain Parallelisation of Functional Programs for VLIW or Super-scalar Architectures
This paper presents a method for compacting functional programs (e.g., ML programs) for super-scalar or VLIW architectures. It is a generalisation of the Percolation scheduling system [1] and Perfect Pipelining [2]. It is described by a set of program transformations respecting data dependences. Instead of managing a control-flow based representation of programs, the compaction is directly applied to functional expressions. This leads to a simple expression of compaction with direct renaming and an efficient implementation. It greedily realizes local and global compaction. The software pipelining principle, initially applied to loops, is extended to general recursive functions. To our knowledge, it is the first method to compact a functional language.

