Results 1 - 2 of 2
Localizing Nonaffine Array References
, 1999
Cited by 48 (9 self)
Existing techniques can enhance the locality of arrays indexed by affine functions of induction variables. This paper presents a technique to localize nonaffine array references, such as the indirect memory references common in sparse-matrix computations. Our optimization combines elements of tiling, data-centric tiling, data remapping, and inspector-executor parallelization. We describe our technique, bucket tiling, which includes the tasks of permutation generation, data remapping, and loop regeneration. We show that profitability cannot generally be determined at compile time, but requires an extension to runtime. We demonstrate our technique on three codes: integer sort, conjugate gradient, and a kernel used in simulating a beating heart. We observe speedups of 1.91 on integer sort, 1.57 on conjugate gradient, and 2.69 on the heart kernel.
1. Introduction
Researchers have long sought to increase data locality and exploit parallelism in loop nests [34, 32, 16, 5, 33, 18]. These wor...
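The three tasks named in this abstract (permutation generation, data remapping, loop regeneration) can be illustrated with a small sketch. This is only one simplified reading of the idea, not the paper's algorithm: the loop being optimized is assumed to be the scatter update `a[index[i]] += x[i]`, and the function names and the `bucket_size` parameter are hypothetical.

```python
def generate_permutation(index, bucket_size):
    """Inspector step (assumed form): order iterations by the bucket
    that their indirect target index[i] falls into (a counting sort)."""
    num_buckets = (max(index) // bucket_size + 1) if index else 0
    starts = [0] * (num_buckets + 1)
    for target in index:
        starts[target // bucket_size + 1] += 1
    for b in range(num_buckets):          # prefix sum gives bucket offsets
        starts[b + 1] += starts[b]
    perm = [0] * len(index)
    for i, target in enumerate(index):    # stable placement within a bucket
        b = target // bucket_size
        perm[starts[b]] = i
        starts[b] += 1
    return perm


def remap(values, perm):
    """Data remapping: copy values into the permuted iteration order."""
    return [values[i] for i in perm]


def run_original(a, index, x):
    """Original loop: the writes to `a` jump around unpredictably."""
    for i in range(len(index)):
        a[index[i]] += x[i]
    return a


def run_bucket_tiled(a, index, x, bucket_size):
    """Regenerated loop: streams through remapped arrays, so consecutive
    iterations touch the same bucket-sized region of `a`."""
    perm = generate_permutation(index, bucket_size)
    index_r, x_r = remap(index, perm), remap(x, perm)
    for k in range(len(index_r)):
        a[index_r[k]] += x_r[k]
    return a
```

Reordering is safe here because integer addition is commutative and associative; with floating-point data the reordered sum can differ in rounding. The inspector and remapping passes are extra work paid at runtime, which is consistent with the abstract's point that profitability cannot generally be decided at compile time.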
Guiding Program Transformations with Modal Performance Models
, 2000
Cited by 4 (1 self)
Successful program optimization requires analysis of profitability. From this analysis, a compiler or runtime system can decide where and how to apply an assortment of program transformations. This two-faced problem is called transformation guidance. We consider the desired goal of robust guidance of performance optimizations for hierarchical systems. A guidance system is robust if it unifies disparate sources of knowledge and makes reasonable decisions that hold up despite a lack of definitive information. In particular, we seek to address concerns presented by aspects of syntax, architecture, and data set. Syntax may not be statically analyzable; for example, the data dependences due to A(B(i)) (an indirect memory reference) cannot be determined until runtime. Architecture poses a problem in the complexity of the relationship between its properties and performance. Data set shares both problems: on the one hand, we cannot analyze properties of unavailable data; on the other, once the data is available, we cannot easily predict how its properties, combined with architectural properties, influence execution time. This thesis solves aspects of this robust guidance problem. First, we present bucket tiling, a program transformation for locality that handles nonaffine array references (such as the indirect reference mentioned above). Bucket tiling improves the performance of codes such as conjugate gradient and integer sort by 1.5 to 2.8 times. We have developed a tool that automatically applies bucket tiling to C or Fortran codes. Guiding locality optimizations such as bucket tiling in a robust manner requires a new modeling strategy. We present the abstraction of modal models. A modal model recognizes and leverages the following observation: many aspects of a program's behavior can be assigned to a small, finite number of distinguishable categories.
We develop a modal model for guiding locality transformations which uses three parameterized modes to represent three different access patterns. We show how to experimentally determine parameterized formulas for the execution time of these modes on any given target platform. Further, we use these modes as the basis for a calculus of performance modeling for our guidance system. Given any program, represented as a tree of modes, we show how to determine an execution time formula for the program. For bucket tiling, we determine execution time formulas for the original and transformed programs, and use these to guide the decision on performing the transformation. We also contrast the modal modeling approach with a static-combinatoric approach. Such an approach models performance by counting some observable property of behavior, such as cache misses. This contrast highlights the principal advantage of modal modeling: robustness to syntax, architecture, and data set properties.
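The idea of deriving an execution time formula from a tree of modes, then comparing the original and transformed programs, might be sketched as follows. Everything specific here is an assumption for illustration: the abstract does not name the three modes or give their formulas, so the mode names, the linear cost functions, their coefficients, and the tree node types (`Leaf`, `Seq`, `Loop`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical per-mode cost formulas (time units per n accesses).
# In the thesis these would be fitted experimentally per platform;
# the names and coefficients below are invented placeholders.
MODE_COST = {
    "streaming": lambda n: 1.0 * n,
    "cached_random": lambda n: 2.0 * n,
    "uncached_random": lambda n: 20.0 * n,
}


@dataclass
class Leaf:
    mode: str        # one of the keys in MODE_COST
    accesses: int    # parameter fed to that mode's formula


@dataclass
class Seq:
    children: List["Node"]   # parts executed one after another


@dataclass
class Loop:
    trips: int       # number of times the body runs
    body: "Node"


Node = Union[Leaf, Seq, Loop]


def estimate(node):
    """Compose an execution time estimate bottom-up over the mode tree."""
    if isinstance(node, Leaf):
        return MODE_COST[node.mode](node.accesses)
    if isinstance(node, Seq):
        return sum(estimate(child) for child in node.children)
    if isinstance(node, Loop):
        return node.trips * estimate(node.body)
    raise TypeError(f"unknown node type: {type(node).__name__}")


def should_transform(original, transformed):
    """Guidance decision: transform only if the model predicts a win."""
    return estimate(transformed) < estimate(original)
```

Under these assumed formulas, a program modeled as one large uncached random sweep can be compared against a transformed version modeled as a streaming remap pass followed by a loop over cache-sized random sweeps; the decision reduces to comparing the two composed estimates.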