Register Promotion by Sparse Partial Redundancy Elimination of Loads and Stores
In Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation, 1998
Cited by 37 (2 self)
An algorithm for register promotion is presented based on the observation that the circumstances for promoting a memory location's value to register coincide with situations where the program exhibits partial redundancy between accesses to the memory location. The recent SSAPRE algorithm for eliminating partial redundancy using a sparse SSA representation forms the foundation for the present algorithm to eliminate redundancy among memory accesses, enabling us to achieve both computational and live range optimality in our register promotion results. We discuss how to effect speculative code motion in the SSAPRE framework. We present two different algorithms for performing speculative code motion: the conservative speculation algorithm used in the absence of profile data, and the profile-driven speculation algorithm used when profile data are available. We define the static single use (SSU) form and develop the dual of the SSAPRE algorithm, called SSUPRE, to perform the partial redun...
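The link between register promotion and redundant memory accesses can be illustrated with a toy model. The sketch below is not the SSAPRE-based algorithm from the abstract; it only counts loads and stores for a loop before and after the memory location is kept in a local variable standing in for a register. `CountingMemory`, `sum_unpromoted`, and `sum_promoted` are hypothetical names introduced for illustration.

```python
class CountingMemory:
    """Toy memory that counts loads and stores (hypothetical helper)."""

    def __init__(self, cells):
        self.cells = dict(cells)
        self.loads = 0
        self.stores = 0

    def load(self, addr):
        self.loads += 1
        return self.cells[addr]

    def store(self, addr, value):
        self.stores += 1
        self.cells[addr] = value


def sum_unpromoted(mem, values):
    # Every iteration reloads and re-stores the memory location "total".
    for v in values:
        mem.store("total", mem.load("total") + v)


def sum_promoted(mem, values):
    # The value lives in a local (the "register") for the whole loop:
    # one load before the loop, one store after it.
    reg = mem.load("total")
    for v in values:
        reg += v
    mem.store("total", reg)


before = CountingMemory({"total": 0})
after = CountingMemory({"total": 0})
sum_unpromoted(before, [1, 2, 3, 4])
sum_promoted(after, [1, 2, 3, 4])
print(before.cells["total"], before.loads, before.stores)  # 10 4 4
print(after.cells["total"], after.loads, after.stores)     # 10 1 1
```

Note that `sum_promoted` still performs one load and one store when `values` is empty, whereas `sum_unpromoted` performs none. Moving accesses onto paths that did not originally contain them is the kind of trade-off that speculative code motion has to weigh.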
Bidirectional Data Flow Analysis in Code Motion: Myth and Reality
In Proc. 5th Static Analysis Symposium (SAS'98), LNCS 1503, 1998
Cited by 3 (0 self)
Bidirectional data flow analysis has become the standard technique for solving bitvector-based code motion problems in the presence of critical edges. Unfortunately, bidirectional analyses have turned out to be conceptually and computationally harder than their unidirectional counterparts. In this paper we show that code motion in the presence of critical edges can be achieved without bidirectional data flow analyses. This is demonstrated by means of an adaption of our algorithm for lazy code motion [15], which is developed from a fresh, specification-oriented view. Besides revealing a better conceptual understanding of the phenomena caused by critical edges, this also settles the foundation for a new and efficient hybrid iteration strategy that intermixes conventional round-robin iteration with the exhaustive iteration on critical subparts. 1 Motivation In data flow analysis equation systems involving bidirectional dependencies, i.e. dependencies from predecessor nodes as well as fr...
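For readers unfamiliar with the term, a critical edge runs from a node with more than one successor to a node with more than one predecessor. The sketch below finds such edges in a small control flow graph and splits them with a synthetic node, which is the common workaround and not the approach of the paper above. The adjacency-dict representation and the names `critical_edges` and `split_critical_edges` are assumptions made for illustration.

```python
def critical_edges(succ):
    """Return edges (u, v) where u has >1 successors and v has >1 predecessors."""
    pred_count = {}
    for node, targets in succ.items():
        pred_count.setdefault(node, 0)
        for t in targets:
            pred_count[t] = pred_count.get(t, 0) + 1
    return [
        (u, v)
        for u, targets in succ.items()
        for v in targets
        if len(targets) > 1 and pred_count[v] > 1
    ]


def split_critical_edges(succ):
    """Return a new graph with a synthetic node placed on every critical edge."""
    new_succ = {n: list(ts) for n, ts in succ.items()}
    for u, v in critical_edges(succ):
        synthetic = f"{u}_{v}"  # hypothetical naming scheme for the new node
        new_succ[u] = [synthetic if t == v else t for t in new_succ[u]]
        new_succ[synthetic] = [v]
    return new_succ


# A branches to B and C; B falls through to C. The edge A -> C is critical.
cfg = {"A": ["B", "C"], "B": ["C"], "C": []}
print(critical_edges(cfg))                        # [('A', 'C')]
print(critical_edges(split_critical_edges(cfg)))  # []
```

Code placed on the synthetic node executes only when control flows along the original edge A -> C, which is why splitting gives code motion a safe insertion point there.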
E-path PRE: partial redundancy elimination made easy
ACM SIGPLAN Notices, 2002
Cited by 2 (0 self)
Partial redundancy elimination (PRE) subsumes the classical optimizations of loop invariant movement and common subexpression elimination. The original formulation of PRE involved complex bidirectional data flows and had two major deficiencies: missed optimization opportunities and redundant code movement. To eliminate redundant code movement, most current PRE approaches use a hoisting-followed-by-sinking approach. Unfortunately, this approach has a high conceptual complexity and requires complicated correctness proofs. We show that optimization by partial redundancy elimination is simpler than it has been made out to be. Its essence is the concept of eliminatability of an expression. We show that E-path PRE, a formulation of PRE based on the concept of eliminatability paths (E-paths), is easy to understand and simple to prove correct. It uses only well-known data flow concepts of available expressions and anticipatable (i.e. very busy) expressions to directly identify code insertion points which avoid redundant code movement. These features reduce the conceptual complexity of PRE considerably. Interestingly, performance studies show that E-path PRE is also less expensive to perform than the closest equivalent approach to PRE. This is a sheer bonus.
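The two data flow concepts named in the abstract, available and anticipatable expressions, can be sketched for a single expression on a small control flow graph. This is not E-path PRE itself, only an illustration of its ingredients under simplifying assumptions: each node either computes the expression (`comp`) or leaves its operands untouched (`transp`), and the graph, the node names, and the `solve` helper are hypothetical.

```python
def solve(cfg, comp, transp, meet, init):
    """Iterate a forward bit problem for one expression to a fixed point.

    meet=all with init=True gives an all-paths problem (availability);
    meet=any with init=False gives an any-path problem (partial availability).
    """
    preds = {n: [p for p in cfg if n in cfg[p]] for n in cfg}
    inn = {n: False for n in cfg}
    out = {n: init for n in cfg}
    changed = True
    while changed:
        changed = False
        for n in cfg:
            # Nodes without predecessors are boundary nodes: nothing flows in.
            new_in = meet(out[p] for p in preds[n]) if preds[n] else False
            new_out = comp[n] or (new_in and transp[n])
            if (new_in, new_out) != (inn[n], out[n]):
                inn[n], out[n] = new_in, new_out
                changed = True
    return inn, out


# Diamond: a+b is computed in B1 and B3, but not on the path through B2.
cfg = {"entry": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
comp = {"entry": False, "B1": True, "B2": False, "B3": True}
transp = {n: True for n in cfg}

av_in, _ = solve(cfg, comp, transp, all, True)     # available on all paths
pav_in, _ = solve(cfg, comp, transp, any, False)   # available on some path

# Anticipability is a backward problem: solve it on the reversed graph,
# where the solver's "in" is ANTOUT and its "out" is ANTIN.
reverse = {n: [p for p in cfg if n in cfg[p]] for n in cfg}
ant_out, ant_in = solve(reverse, comp, transp, all, True)

print(av_in["B3"], pav_in["B3"], ant_out["B2"])  # False True True
```

At B3 the expression is partially but not fully available, so its computation there is partially redundant; it is anticipatable at the exit of B2, so inserting a computation there is safe and would make the one in B3 fully redundant.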
REALM: A Loop Memory Access Optimization Technique with Loop-Carried Data Dependence Analysis for DSP Applications
Reducing memory accesses is particularly important for DSP applications since they are widely used in embedded systems and need to be executed with high performance and low power consumption. In this paper, we focus on optimizing loops, which are the most critical sections for DSP applications. We propose a machine-independent, intermediate-code-level loop memory access optimization technique, REALM (REdundAnt Load Exploration & Migration), to explore hidden redundant loads and migrate them outside loops based on loop-carried data dependence analysis. In REALM, we first build up a data-flow graph to describe the inter-iteration data dependencies among memory operations. Then we perform code transformation by exploiting these dependencies with registers to hold the values of unnecessary loads and migrating these loads outside loops. Different from the previous work based on data-flow analysis, our data-flow-graph-based approach is easy to implement and more suitable for optimizing loop kernels of DSP applications that have simple control-flow structure. We implement our technique in the IMPACT compiler [24] and conduct experiments using a set of benchmarks from DSPstone [28] on the cycle-accurate VLIW simulator of Trimaran [1]. The experimental results show that our technique significantly reduces the number of memory accesses.
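A loop-carried redundant load of the kind described above can be shown on a small kernel. The sketch below is a hand-written illustration, not the REALM transformation as implemented in IMPACT: the load of `x[i - 1]` in one iteration repeats the load of `x[i]` from the previous iteration, so a local variable standing in for a register carries the value across iterations and the first load moves in front of the loop. `CountingArray` and both function names are hypothetical.

```python
class CountingArray:
    """Toy array that counts element loads (hypothetical helper)."""

    def __init__(self, data):
        self.data = list(data)
        self.loads = 0

    def load(self, i):
        self.loads += 1
        return self.data[i]


def adjacent_sums_naive(x, n):
    # Two loads per iteration; x[i - 1] was already loaded as x[i] last time.
    return [x.load(i) + x.load(i - 1) for i in range(1, n)]


def adjacent_sums_migrated(x, n):
    if n < 2:
        return []
    prev = x.load(0)  # load migrated outside the loop
    out = []
    for i in range(1, n):
        cur = x.load(i)        # the only load left inside the loop
        out.append(cur + prev)
        prev = cur             # value carried to the next iteration
    return out


data = [3, 1, 4, 1, 5]
naive, migrated = CountingArray(data), CountingArray(data)
print(adjacent_sums_naive(naive, 5), naive.loads)           # [4, 5, 5, 6] 8
print(adjacent_sums_migrated(migrated, 5), migrated.loads)  # [4, 5, 5, 6] 5
```

For n elements the naive loop issues 2(n - 1) loads while the transformed loop issues n, so the saving grows with the trip count.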
A Framework for Representing Data Parallel Programs and its Application in Program Reordering
1995
In this paper, we present a framework for describing the dataflow and dependence information of a data parallel program. Our framework initially represents the program using a directed Intermediate Program Locality Graph (IPLG). Using Dependence Access Relations (DARs), the IPLG is reduced to a Program Locality Graph (PLG). The information provided by the PLG is used by the compiler to reorder the program to improve program locality. We view the program reordering problem as an optimization problem. To solve this problem, we present a polynomial time heuristic, called the Range Reduction Heuristic. The best case time complexity of the heuristic is $O(m^2 n^2)$, where m is the number of statements and n is the number of arrays used in the program. The average case running time of the heuristic approaches the best case performance. This framework will be implemented as a part of the PASSION (Parallel And Scalable Software for I/O) compiler to compile out-of-core HPF programs. Th...