## Recursive Array Layouts and Fast Parallel Matrix Multiplication (1999)


Venue: Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures

Citations: 48 (4 self)

### BibTeX

@INPROCEEDINGS{Chatterjee99recursivearray,
  author = {Siddhartha Chatterjee and Alvin R. Lebeck and Praveen K. Patnala and Mithuna Thottethodi},
  title = {Recursive Array Layouts and Fast Parallel Matrix Multiplication},
  booktitle = {Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures},
  year = {1999},
  pages = {222--231}
}

### Abstract

Matrix multiplication is an important kernel in linear algebra algorithms, and the performance of both serial and parallel implementations is highly dependent on the memory system behavior. Unfortunately, due to false sharing and cache conflicts, traditional column-major or row-major array layouts incur high variability in memory system performance as matrix size varies. This paper investigates the use of recursive array layouts for improving the performance of parallel recursive matrix multiplication algorithms. We extend previous work by Frens and Wise on recursive matrix multiplication to examine several recursive array layouts and three recursive algorithms: standard matrix multiplication, and the more complex algorithms of Strassen and Winograd. We show that while recursive array layouts significantly outperform traditional layouts (reducing execution times by a factor of 1.2--2.5) for the standard algorithm, they offer little improvement for Strassen's and Winograd's algorithms;...
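To make the idea of a recursive array layout concrete, here is a minimal serial sketch in Python. It assumes a Z-Morton (bit-interleaved) ordering, which is one well-known recursive layout; the abstract does not say which layouts the paper evaluates, so that choice is an assumption. It also assumes square matrices whose size is a power of two. The helper names (`interleave_bits`, `to_morton`, `from_morton`, `matmul_rec`, `morton_matmul`) are hypothetical and illustrate only the standard recursive algorithm, not Strassen's or Winograd's variants and not the paper's parallel implementation.

```python
def interleave_bits(i, j, bits):
    """Z-Morton index of element (i, j): row bits go to odd positions,
    column bits to even positions."""
    z = 0
    for b in range(bits):
        z |= ((i >> b) & 1) << (2 * b + 1)
        z |= ((j >> b) & 1) << (2 * b)
    return z


def to_morton(a):
    """Copy a square nested-list matrix (size a power of two) into a flat
    list in Z-Morton order."""
    n = len(a)
    bits = n.bit_length() - 1
    assert n == 1 << bits, "illustrative sketch: size must be a power of two"
    flat = [0] * (n * n)
    for i in range(n):
        for j in range(n):
            flat[interleave_bits(i, j, bits)] = a[i][j]
    return flat


def from_morton(flat, n):
    """Inverse of to_morton: rebuild a nested-list matrix."""
    bits = n.bit_length() - 1
    return [[flat[interleave_bits(i, j, bits)] for j in range(n)]
            for i in range(n)]


def matmul_rec(a, b, c, ao, bo, co, n):
    """Accumulate C += A * B for n-by-n blocks stored in Z-Morton order,
    starting at offsets ao, bo, co in the flat lists a, b, c."""
    if n == 1:
        c[co] += a[ao] * b[bo]
        return
    h = n // 2
    q = h * h  # each quadrant is a contiguous run of q elements
    # Quadrant number 2*r + s selects block row r, block column s.
    for qi in range(2):
        for qj in range(2):
            for qk in range(2):
                matmul_rec(a, b, c,
                           ao + (2 * qi + qk) * q,
                           bo + (2 * qk + qj) * q,
                           co + (2 * qi + qj) * q,
                           h)


def morton_matmul(x, y):
    """Multiply two nested-list matrices via the Morton-order recursion."""
    n = len(x)
    c = [0] * (n * n)
    matmul_rec(to_morton(x), to_morton(y), c, 0, 0, 0, n)
    return from_morton(c, n)


if __name__ == "__main__":
    print(morton_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    # prints [[19, 22], [43, 50]]
```

In this layout every quadrant of a block occupies a contiguous range of memory at every level of the recursion, so each recursive call works on a compact region rather than on strided rows or columns. That contiguity is the property the abstract credits for the more stable memory system behavior of the standard algorithm.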