Is Search Really Necessary to Generate High-Performance BLAS?
2005
Cited by 32 (7 self)
Abstract: A key step in program optimization is the estimation of optimal values for parameters such as tile sizes and loop unrolling factors. Traditional compilers use simple analytical models to compute these values. In contrast, library generators like ATLAS use global search over the space of parameter values by generating programs with many different combinations of parameter values, and running them on the actual hardware to determine which values give the best performance. It is widely believed that traditional model-driven optimization cannot compete with search-based empirical optimization because tractable analytical models cannot capture all the complexities of modern high-performance architectures, but few quantitative comparisons have been done to date. To make such a comparison, we replaced the global search engine in ATLAS with a model-driven optimization engine, and measured the relative performance of the code produced by the two systems on a variety of architectures. Since both systems use the same code generator, any differences in the performance of the code produced by the two systems can come only from differences in optimization parameter values. Our experiments show that model-driven optimization can be surprisingly effective, and can generate code with performance comparable to that of code generated by ATLAS using global search.
Index Terms: program optimization, empirical optimization, model-driven optimization, compilers, library generators, BLAS, high-performance computing
Programming for Locality and Parallelism with Hierarchically Tiled Arrays
In Proc. of the 16th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2003, 2003
Cited by 10 (7 self)
This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programming and programming for locality. It is argued that HTAs enable a natural representation for many algorithms with a high degree of locality. The paper also shows that, with HTAs, parallel computations and the associated communication operations can be expressed as array operations within single-threaded programs. This, it is argued, facilitates reasoning about the resulting programs and encourages the development of code that is highly readable and easy to modify. The new data type is illustrated using examples written in an extended version of MATLAB.

