Results 1-10 of 20
SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers
ACM SIGPLAN Notices, 1994
Cited by 189 (21 self)
Compiler infrastructures that support experimental research are crucial to the advancement of high-performance computing. New compiler technology must be implemented and evaluated in the context of a complete compiler, but developing such an infrastructure requires a huge investment in time and resources. We have spent a number of years building the SUIF compiler into a powerful, flexible system, and we would now like to share the results of our efforts. SUIF consists of a small, clearly documented kernel and a toolkit of compiler passes built on top of the kernel. The kernel defines the intermediate representation, provides functions to access and manipulate the intermediate representation, and structures the interface between compiler passes. The toolkit currently includes C and Fortran front ends, a loop-level parallelism and locality optimizer, an optimizing MIPS back end, a set of compiler development tools, and support for instructional use. Although we do not expect SUIF to be suitable for everyone, we think it may be useful for many other researchers. We thus invite you to use SUIF and welcome your contributions to this infrastructure. Directions for obtaining the SUIF software are included at the end of this paper.
ParaScope: a parallel programming environment
Proceedings of the IEEE, 1993
Cited by 120 (33 self)
The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, includes a collection of tools that use global program analysis to help users develop and debug parallel programs. This paper focuses on ParaScope’s compilation system, its parallel program editor, and its parallel debugging system. The compilation system extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. Thus, ParaScope can support both traditional single-procedure optimization and optimization across procedure boundaries. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. It assists the knowledgeable user by displaying and managing analysis and by providing a variety of interactive program transformations that are effective in exposing parallelism. The debugging system detects and reports timing-dependent errors, called data races, in execution of parallel programs. The system combines static analysis, program instrumentation, and run-time reporting to provide a mechanical system for isolating errors in parallel program executions. Finally, we describe a new project to extend ParaScope to support programming in Fortran D, a machine-independent parallel programming language intended for use with both distributed-memory and shared-memory parallel computers.
FIAT: A Framework for Interprocedural Analysis and Transformation
1995
Cited by 48 (7 self)
Modern architectures with deep memory hierarchies or parallelism require the use of increasingly sophisticated code analysis and optimization to achieve maximum performance for large, scientific programs. In such ...
Designing the McCAT Compiler Based on a Family of Structured Intermediate Representations
In Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing, number 757 in LNCS, 1992
Cited by 44 (16 self)
The effective exploitation of advanced technology for the development of the next-generation high-performance computers requires the integrated development of compiler techniques and architectural design. In order to provide a research tool with which we can experiment with both new architectural features and compiler support for those features, we have been developing the McGill Compiler/Architecture Testbed, McCAT. In this paper we focus on the design of the McCAT compiler. The central theme of the paper is that the design of the family of intermediate representations should be driven by the analyses and transformations that are most important for effective compilation for architectures supporting some level of fine-grain parallelism. A primary objective of our design was to provide a natural way of supporting a framework for alias analysis that is general (handles scalars, arrays and pointers), accurate (provides accurate enough estimates for parallelizing transformations), and pe...
Accurate Analysis of Array References
1992
Cited by 30 (0 self)
I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a dissertation for the degree of Doctor of Philosophy. John L. Hennessy (Principal Adviser)
Compiler Support for Software Prefetching
1998
Cited by 17 (0 self)
Due to the growing disparity between processor speed and main memory speed, techniques that improve cache utilization and hide memory latency are often needed to help applications achieve peak performance. Compiler-directed software prefetching is a hybrid software/hardware strategy that addresses this need. In this form of prefetching, the compiler inserts cache prefetch instructions into a program during the compilation process. During the program's execution, the hardware executes the prefetch instructions in parallel with other operations, bringing data items into the cache prior to the point where they are actually used, eliminating processor stalls due to cache misses. In this dissertation, we focus on the compiler's role in software prefetching. In a set of experimental studies, we evaluate the performance of current software prefetching strategies, first for sequential benchmark programs running on a simulated uniprocessor machine, and then for a set of parallel benchmarks on a...
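The effect described in this abstract can be illustrated with a toy timing model. The sketch below is not taken from the dissertation: the function name `run_loop`, the one-cycle-per-iteration cost, the fixed miss latency, and the unbounded cache are all simplifying assumptions chosen only to show how a prefetch issued some iterations ahead can overlap memory latency with useful work.

```python
def run_loop(n, line_size=8, latency=20, distance=0):
    """Return total stall cycles for a sequential scan of n array elements.

    Hypothetical model: every iteration does one cycle of work, a cache line
    arrives `latency` cycles after it is first requested, and nothing is ever
    evicted. distance=0 means the loop contains no prefetches.
    """
    ready_at = {}  # cache line -> cycle at which its data arrives
    clock = 0
    stalls = 0
    for i in range(n):
        if distance:
            # Stand-in for the compiler-inserted prefetch of element i + distance.
            ahead = (i + distance) // line_size
            ready_at.setdefault(ahead, clock + latency)
        line = i // line_size
        # A line that was never requested before is a demand miss issued now.
        arrival = ready_at.setdefault(line, clock + latency)
        if arrival > clock:
            stalls += arrival - clock  # processor waits for the data
            clock = arrival
        clock += 1  # one cycle of useful work
    return stalls


baseline = run_loop(64)                  # every line is a demand miss
prefetched = run_loop(64, distance=32)   # prefetch four lines ahead
```

In this model the prefetched loop still stalls on the first four lines, because no prefetch was issued for them before the loop started; a real compiler would typically handle that with a prologue, which is left out here to keep the sketch short.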
Processor Tagged Descriptors: A Data Structure for Compiling for Distributed-Memory Multicomputers
Proceedings of the Parallel Architectures and Compiler Technology Conference, 1994
Cited by 14 (9 self)
The computation partitioning, communication analysis, and optimization phases performed during compilation for distributed-memory multicomputers require an efficient way of describing distributed sets of iterations and regions of data. Processor Tagged Descriptors (PTDs) provide these capabilities through a single set representation parameterized by the processor location for each dimension of a virtual mesh. A uniform representation is maintained for every processor in the mesh, whether it is a boundary or an interior node. As a result, operations on the sets are very efficient because the effect on all processors in a dimension can be captured in a single symbolic operation. In addition, PTDs are easily extended to an arbitrary number of dimensions, necessary for describing iteration sets in multiply nested loops as well as sections of multidimensional arrays. Using the symbolic features of PTDs it is also possible to generate code for variable numbers of processors, thereby allowi...
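The idea of a single set description parameterized by processor location can be sketched in one dimension. This is not the PTD data structure from the paper: `ParamRange`, `block_owner_range`, and the affine lower and upper bounds are hypothetical names and simplifications, and the special handling of boundary processors that the abstract mentions is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRange:
    """Closed index interval [lo_coef*p + lo_off, hi_coef*p + hi_off] for processor p."""
    lo_coef: int
    lo_off: int
    hi_coef: int
    hi_off: int

    def shift(self, k):
        # One symbolic operation moves the interval for every processor at once.
        return ParamRange(self.lo_coef, self.lo_off + k,
                          self.hi_coef, self.hi_off + k)

    def on(self, p):
        # Instantiate the interval for one concrete processor.
        return range(self.lo_coef * p + self.lo_off,
                     self.hi_coef * p + self.hi_off + 1)


def block_owner_range(block):
    # Under a block distribution, processor p owns p*block .. p*block + block - 1.
    return ParamRange(block, 0, block, block - 1)


owned = block_owner_range(4)
read = owned.shift(1)  # indices touched by a reference such as a[i + 1]
# Elements processor 2 reads but does not own, i.e. what it must receive.
nonlocal_reads = set(read.on(2)) - set(owned.on(2))
```

The point of the sketch is that `shift` is applied once to the symbolic description rather than once per processor; only the final `on(p)` call depends on a concrete processor number.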
Using Tracing and Dynamic Slicing to Tune Compilers
University of Wisconsin Computer Sciences Department, 1993
Cited by 10 (0 self)
Performance tuning improves a compiler's performance by detecting errors and missed opportunities in its analysis, optimization, and code generation stages. Normally, a compiler's author tunes it by examining the generated code to find suboptimal code sequences. This paper describes a collection of tools, called compiler auditors, that assist a compiler writer by partially mechanizing the process of finding suboptimal code sequences. Although these code sequences do not always exhibit compiler bugs, they frequently illustrate problems in a compiler. Experiments show that auditors effectively find suboptimal code, even in a high-quality, commercial compiler. After writing a high-quality compiler, its authors improve it with the time-consuming and tedious process of examining generated assembly code to find inefficient code sequences that could run faster or consume less space. These sequences direct a compiler writer's attention to places in the compiler at which improved analysis, optimization, or code generation would result in better code. Typically a compiler writer finds this suboptimal code by reading assembly-language listings of compiled code (see, for example, Briggs [6]). Performance tuning of this sort is unavoidable
Architectural Support For Compile-Time Speculation
1994
Cited by 9 (3 self)
Studies on instruction-level parallelism (ILP) have shown that there are few independent instructions within the basic blocks of non-numerical applications. To uncover more independent instructions within these applications, instruction schedulers and microarchitectures must support the speculative execution of instructions. This paper describes an architectural mechanism for speculative execution called boosting. Boosting exploits ILP across conditional branches without adversely affecting the instruction count of the application or the cycle time of the processor. This paper also presents the results of a case study which found that boosting can take full advantage of the parallel execution resources within a superscalar microarchitecture. For this case study, we implemented a novel trace-based, global scheduling algorithm that supports various configurations of boosting hardware.
Automatic Generation Of Data-Flow Analyzers: A Tool For Building Optimizers
1993
Cited by 8 (0 self)
Modern compilers generate good code by performing global optimizations. Unlike other functions of the compiler, such as parsing and code generation, which examine only one statement or one basic block at a time, optimizers examine large parts of a program and coordinate changes in widely separated parts of a program. Thus optimizers use more complex data structures and consume more time. To generate the best code, optimizers perform not one global transformation, but many in concert. These transformations can interact in unforeseen ways. This dissertation concerns the building of optimizers that are modular and extensible. It espouses an optimizer architecture, first proposed by Kildall, in which each phase is based on a data-flow analysis (DFA) of the program and on an optimization function that transforms the program. To support the architecture, a set of abstractions (flow values, flow functions, path simplification rules, action routines) is provided. A tool called Sharlit turns a DFA specification consisting of these abstractions into a solver for a DFA problem. At the heart of Sharlit is an algorithm called path simplification, an extension of Tarjan's fast path algorithm. Path simplification unifies several powerful DFA solution techniques. By using path simplification rules, compiler writers can construct a wide range of data-flow analyzers, from simple iterative ones to solvers that use local analysis, interval analysis, or sparse data-flow evaluation. Sharlit frees compiler writers from the details of these various solution techniques. The compiler writer can view the program representation as a simple flow graph in which each instruction is a node. Data structures to represent basic blocks and other regions are automatically generated. Sharlit promotes ...
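The simplest kind of analyzer this abstract mentions, an iterative one over a flow graph in which each instruction is a node, can be sketched as follows. This is not Sharlit and does not implement path simplification; it is a plain worklist iteration, and `solve_forward`, `reaching_defs`, and the five-node example program are hypothetical. Flow values are sets of definition labels and the flow function implements reaching definitions.

```python
from collections import deque


def solve_forward(succs, flow_fn):
    """Iterate a forward data-flow problem to a fixed point.

    succs maps each node to its successors; flow_fn(node, in_value) returns
    the node's out value. Values are frozensets combined by union at joins.
    """
    preds = {n: [] for n in succs}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)
    in_val = {n: frozenset() for n in succs}
    out_val = {n: frozenset() for n in succs}
    work = deque(succs)
    while work:
        n = work.popleft()
        in_val[n] = frozenset().union(*(out_val[p] for p in preds[n]))
        new_out = flow_fn(n, in_val[n])
        if new_out != out_val[n]:
            out_val[n] = new_out
            for s in succs[n]:
                if s not in work:
                    work.append(s)  # successors must be re-evaluated
    return in_val, out_val


# Hypothetical program: d1: x = 1; d2: y = 2; n3: loop test;
# d4: x = x + y (back edge to n3); n5: use of x after the loop.
succs = {"d1": ["d2"], "d2": ["n3"], "n3": ["d4", "n5"], "d4": ["n3"], "n5": []}
defines = {"d1": "x", "d2": "y", "d4": "x"}


def reaching_defs(node, reaching):
    if node not in defines:
        return reaching
    var = defines[node]
    # Kill earlier definitions of the same variable, then add this one.
    survivors = {d for d in reaching if defines[d] != var}
    return frozenset(survivors | {node})


in_val, out_val = solve_forward(succs, reaching_defs)
```

After the solver converges, `in_val["n5"]` holds the definitions that may reach the use after the loop. Swapping in a different flow function and join would give a different analysis over the same driver, which is the kind of separation between specification and solver that the abstract describes.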

