Tiling Imperfectly-nested Loop Nests
- In Proc. of SC 2000, 2000
"... Tiling is one of the more important transformations for enhancing locality of reference in programs. Tiling of perfectly-nested loop nests (which are loop nests in which all assignment statements are contained in the innermost loop) is well understood. In practice, most loop nests are imperfectly-ne ..."
Cited by 39 (0 self)
for dense numerical linear algebra benchmarks, relaxation codes, and the tomcatv code from the...
Synthesizing transformations for locality enhancement of imperfectly-nested loop nests
- In Proceedings of the 2000 ACM International Conference on Supercomputing, 2000
"... We present an approach for synthesizing transformations to enhance locality in imperfectly-nested loops. The key idea is to embed the iteration space of every statement in a loop nest into a special iteration space called the product space. The product space can be viewed as a perfectly-nested loop ..."
Cited by 64 (3 self)
... The product space is then transformed further to enhance locality, after which fully permutable loops are tiled, and code is generated. We evaluate the effectiveness of this approach for dense numerical linear algebra benchmarks, relaxation codes, and the tomcatv code from the SPEC benchmarks.
Tiling Imperfectly-nested Loop Nests
"... Tiling is one of the more important transformations for enhancing locality of reference in programs. Tiling of perfectly-nested loop nests (which are loop nests in which all assignment statements are contained in the innermost loop) is well understood. In practice, most loop nests are imperfectly-ne ..."
for dense numerical linear algebra benchmarks, relaxation codes, and the tomcatv code from the SPEC benchmarks. No other single approach in the literature can tile all these codes automatically. 1 Background and Previous Work. The memory systems of computers are organized as a hierarchy in which the latency
Compiling Several Classes of Communication Patterns on a Multithreaded Architecture
"... In compiling or developing applications for execution on distributed memory machines, communication optimizations are critical for performance. Multithreaded architectures support multiple threads of execution on each processor, with low-cost thread initiation, low-overhead communication, a ..."
further show how a compiler can generate threaded code for loops with such patterns. We present a cost model to guide thread granularity selection while generating multithreaded code. We present experimental results from three benchmark programs, CG, Tomcatv, and Jacobi. Our results show that: 1
Compiling several classes of communication patterns on a multithreaded architecture
- 2001
"... Communication optimizations play a crucial role in performance of parallel applications which are compiled and executed on distributed memory machines. Multithreaded architectures can support multiple threads of execution on each processor, with low-cost thread initiation, low-overhead co ..."
Cited by 3 (0 self)
with each of these patterns. We further show how a compiler can generate threaded code for loops with such patterns. We present experimental results from two benchmark programs, CG and Tomcatv. Our results show that: 1) the compiler generated multithreaded code achieves high performance, not previously
Automatic coarse grain task parallel processing on SMP using OpenMP
- Proc. of 13th International Workshop on Languages and Compilers for Parallel Computing, 2001
"... This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on an SMP machine. The OSCAR multigrain parallelizing compiler automatically generates parallelized code including OpenMP directives, and its performance is evaluated on a comme ..."
Cited by 10 (7 self)