Eliminating Synchronization Overhead in Automatically Parallelized Programs Using Dynamic Feedback
- ACM Transactions on Computer Systems, 1999
Cited by 8 (1 self)
This article presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code; each version uses a different optimization policy. The generated code alternately performs sampling phases and production phases. Each sampling phase measures the overhead of each version in the current environment. Each production phase uses the version with the least overhead in the previous sampling phase. The computation periodically resamples to adjust dynamically to changes in the environment.
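The alternation of sampling and production phases described above can be sketched roughly as follows. This is only an illustrative sketch, not the article's generated code: the names `dynamic_feedback`, `measure`, `sample_len`, and `production_len` are hypothetical, and the overhead measurement is injected as a callback rather than taken from real timers or lock counters.

```python
from itertools import islice


def dynamic_feedback(versions, items, measure, sample_len=1, production_len=4):
    """Alternate sampling and production phases over a stream of work items.

    versions: dict mapping a version name to a callable(item) -> result.
    measure:  hypothetical hook, callable(name, item) -> observed overhead.
    Returns (results, choices), where choices lists the version picked
    after each sampling phase.
    """
    it = iter(items)
    results = []
    choices = []
    while True:
        # Sampling phase: run every version briefly and record its overhead.
        overhead = {}
        for name, run in versions.items():
            batch = list(islice(it, sample_len))
            if not batch:
                return results, choices
            results.extend(run(x) for x in batch)
            overhead[name] = sum(measure(name, x) for x in batch) / len(batch)

        # Production phase: use the version with the least sampled overhead.
        best = min(overhead, key=overhead.get)
        choices.append(best)
        batch = list(islice(it, production_len))
        results.extend(versions[best](x) for x in batch)
        if len(batch) < production_len:
            return results, choices
        # Looping back resamples, so the choice can follow the environment.
```

Because the loop returns to the sampling phase after every production phase, a change in the measured overheads partway through the run can lead to a different version being chosen for the next production phase.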
A New Approach to Forth Native Code Generation
- In EuroForth '92, 1992
Cited by 7 (1 self)
RAFTS is a framework for applying state of the art compiler technology to the compilation of Forth. The heart of RAFTS is a simple method for transforming Forth programs into data flow graphs and static single assignment form. Standard code generation and optimization techniques can be applied to programs in these forms. Specifically, RAFTS uses interprocedural register allocation to eliminate nearly all stack accesses. It also removes nearly all stack pointer updates. Inlining and tail call optimization reduce the call overhead. RAFTS compiles all of Forth, including difficult cases like unknown stack heights, PICK, ROLL and EXECUTE. And last, but not least, RAFTS is designed for interactive Forth systems; it is not restricted to batch compilers.
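To give a feel for how stack code can be turned into a data flow graph, here is a minimal sketch that symbolically executes a straight-line sequence of Forth words at compile time. It is not the RAFTS algorithm itself: the function `forth_to_dataflow`, the node layout, and the small set of supported words are assumptions made for illustration, and control flow, unknown stack heights, PICK, ROLL, and EXECUTE are not handled.

```python
def forth_to_dataflow(words, inputs=0):
    """Translate straight-line Forth words into a list of data flow nodes.

    Each node is (op, operands); a node's index acts as its single-assignment
    name. Stack shuffling words only permute compile-time references, so they
    produce no nodes at all.
    """
    nodes = []
    stack = []  # compile-time stack of node indices

    def emit(op, *operands):
        nodes.append((op, operands))
        return len(nodes) - 1

    # Values already on the stack at entry become "arg" nodes
    # (their operand is the argument position, not a node index).
    for i in range(inputs):
        stack.append(emit("arg", i))

    for word in words:
        if word == "DUP":
            stack.append(stack[-1])
        elif word == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif word == "DROP":
            stack.pop()
        elif word in ("+", "-", "*"):
            right = stack.pop()
            left = stack.pop()
            stack.append(emit(word, left, right))
        else:
            # Anything else is treated as an integer literal in this sketch.
            stack.append(emit("lit", int(word)))

    return nodes, stack
```

For the word sequence `DUP * 1 +` with one input, the DUP leaves no trace in the graph: the multiply node simply refers to the same argument node twice, which is the sense in which stack accesses disappear once values are named.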
Fusion-Based Register Allocation
- ACM Transactions on Programming Languages and Systems, 1997
Cited by 7 (0 self)
This paper describes fusion based register allocation in detail and compares it with other approaches to register allocation. For programs from the SPEC92 suite, fusion based register allocation can improve the execution time (of optimized programs, for the MIPS architecture) by up to 8.4% over Chaitin-style register allocation.
Demand-Driven Register Allocation
- ACM Transactions on Programming Languages and Systems, 1996
Cited by 6 (0 self)
This paper was presented at the 1992 SIGPLAN Conference on Programming Language Design and Implementation. ... upon good local allocation by eliminating unnecessary loads at the entrance to a basic block and by eliminating unnecessary stores at the exit from a basic block. An initial load of v is unnecessary if all predecessors exit with v in some register, and a terminal store of w is unnecessary if w is dead or all succeeding loads of w can be replaced with a reference to w's register. Of course we do not initially know which values will ultimately be allocated to registers, so we estimate each value's chances of ultimate register residence. At the basic block level, we know values loaded locally into a register and not overwritten have a 100% chance of exiting in a register. The likelihood of other values residing in a register on exit depends on their likelihood of residing in a register upon entrance, the number of unused registers in a block, and the pattern of local register usage. The likelihood that a value will be in a register upon entrance to a block depends on the likelihood it will exit in a register from all predecessor blocks. During global register allocation, we will model the competition between register candidates with estimates of the likelihood of register residence based on the demand for registers. Each instruction that requires a target ...
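The two conditions stated above for unnecessary loads and stores can be written down as simple predicates. This is a sketch of those conditions only, not of the paper's allocator: the function names and the set-based inputs are hypothetical, and the likelihood estimates the excerpt mentions are deliberately left out because it does not give their formulas.

```python
def initial_load_unnecessary(value, pred_exit_registers):
    """True if every predecessor block exits with `value` in some register.

    pred_exit_registers: one set per predecessor block, holding the values
    that block leaves in registers on exit. A block with no predecessors
    (an assumption for the entry block) always needs the load.
    """
    return bool(pred_exit_registers) and all(
        value in regs for regs in pred_exit_registers
    )


def terminal_store_unnecessary(value, live_out, later_loads_use_register):
    """True if `value` is dead at block exit, or if all succeeding loads of
    it can be replaced with a reference to its register."""
    return value not in live_out or later_loads_use_register
```

Note that a single predecessor that exits without the value in a register is enough to make the initial load necessary.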
Precise Instruction Scheduling without a Precise Machine Model
- ACM Computer Architecture News, 1991
"... this paper. ..."
Code Generation Techniques
- In INFOCOM (1, 1992
Cited by 5 (0 self)
Optimal instruction scheduling and register allocation are NP-complete problems that require heuristic solutions. By restricting the problem of register allocation and instruction scheduling for delayed-load architectures to expression trees we are able to find optimal schedules quickly. This thesis presents a fast, optimal code scheduling algorithm for processors with a delayed load of 1 instruction cycle. The algorithm minimizes both execution time and register use and runs in time proportional to the size of the expression tree. In addition, the algorithm is simple
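The thesis's delayed-load scheduling algorithm is not given in this excerpt. As a related and much older illustration of why expression trees admit fast optimal answers, the sketch below computes classic Sethi-Ullman register-need labels in one pass over the tree. It is a swapped-in textbook technique, not the thesis's method; the tuple encoding of trees is an assumption, and the load/store variant is used, in which every leaf must first be loaded into a register.

```python
def sethi_ullman(node):
    """Return the minimum number of registers needed to evaluate `node`
    without spilling, on a load/store machine.

    node: a leaf (any non-tuple, e.g. a variable name) or a tuple
    (op, left, right) for a binary operation.
    """
    if not isinstance(node, tuple):
        return 1  # a leaf is loaded into one register

    _, left, right = node
    need_left = sethi_ullman(left)
    need_right = sethi_ullman(right)

    if need_left != need_right:
        # Evaluate the needier subtree first; its result then occupies one
        # register while the cheaper subtree is evaluated.
        return max(need_left, need_right)
    # Equal needs: one extra register holds the first result.
    return need_left + 1
```

Each node is visited once, so the labeling runs in time proportional to the size of the expression tree, the same complexity the abstract claims for its scheduler.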
Rationale and Design of BULK
1991
Cited by 5 (2 self)
BULK is a very-high-level persistent programming language and environment for prototyping and implementing database applications. BULK provides sets and sequences as primitive type constructors, provides high-level operations on them, and allows programmers to define application-oriented bulk types, e.g. syntax trees, bond portfolios, or (geographic) maps. BULK encourages separation of correctness and efficiency concerns by distinguishing logical type from representation. BULK supports a three-step development paradigm consisting of (i) prototyping, (ii) intensive analysis, optimization, and data structure selection by the compiler to achieve efficiency, and (iii) if efficiency is still inadequate, hot-spot refinement [CGK89]. (In hot-spot refinement developers remove performance bottlenecks by providing the compiler with more information, by directing its optimization efforts, or by re-implementation.) Step (i) focuses on correctness, steps (ii) and (iii) on efficiency. Our goal is an implementation that can usually achieve acceptable efficiency by step (ii) and that provides a tractable interface for hot-spot refinement.
The Use of Control-Flow and Control Dependence in Software Tools
1993
Cited by 4 (0 self)
Program development, debugging, and maintenance can be greatly improved by the use of software tools that provide information about program behavior. This thesis focuses on a number of useful software tools and shows how their efficiency, generality, and precision can be increased through the use of control-flow and control dependence analysis. We consider two classes of tools: execution measurement tools, which collect information about a particular program execution; and program analysis tools, which provide information about potential program behavior by statically analyzing the program. We consider three tools that measure aspects of a program's execution: profiling, tracing, and event counting tools. We describe algorithms for profiling and tracing programs that use a combination of control-flow analysis and program instrumentation to produce exact profiles and traces with low run-time overhead. Rather than record information at every point in a program, the algorithms record info...
General Purpose Optimization Technology
1994
Cited by 3 (0 self)
This paper is particularly concerned with the optimization components of the compiler. Examples of optimizations include the classical sequential optimizations like strength reduction, common sub-expression elimination, and the like plus optimizations specific to BSP style programs. Some of these are:
Models, Languages, and Compiler Technology for High Performance Computers
- Mathematical Foundations of Computer Science, 1994
Cited by 1 (0 self)
... Interpretation For Various Analysis Tasks. We here consider several examples of abstract interpretation for various analysis tasks. 5.1 Type Annotation. Consider the program (where the term p is ignored): P 1 = 1 (f: 2 (f 1) 3 (cond p x: 4 (+ x 1) y: 5 ( y 2))) Here, P 1 binds f to one of two abstractions and then applies f to 1. The first phase of abstract interpretation results in the behavior graph shown in Figure 2. [Fig. 2. Behavior Graph for P1] Some comments: -- The interpretation of the node fi(\Phi f ) is the set of flows of the bodies of those abstractions that f can be bound to. A fi node has no arcs emanating and is dealt with in a manner discussed below. -- The interpretation of th...

