Auto-Blocking Matrix-Multiplication or Tracking BLAS3 Performance from Source Code (1997)

by Jeremy Frens, David S. Wise
Venue: In Proceedings of the Sixth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Citations: 69 - 6 self

Documents Related by Co-Citation

7324 Cache-oblivious algorithms – M. Frigo, C. E. Leiserson, H. Prokop, S. Ramachandran - 1999
486 The Cache Performance and Optimizations of Blocked Algorithms – Monica S. Lam, Edward E. Rothberg, Michael E. Wolf - 1991
44 Recursive Array Layouts and Fast Parallel Matrix Multiplication – Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K. Patnala, Mithuna Thottethodi - 1999
222 Space-Filling Curves – Hans Sagan - 1994
430 Cilk: An Efficient Multithreaded Runtime System – Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou - 1995
304 Gaussian elimination is not optimal – V. Strassen - 1969
681 A set of level 3 basic linear algebra subprograms – Jack J. Dongarra, Jeremy Du Croz, Sven Hammarling, Iain Duff - 1990
43 Towards a Theory of Cache-Efficient Algorithms – Sandeep Sen, Siddhartha Chatterjee, Neeraj Dumir - 1999
31 High Performance Fortran for Highly Irregular Problems – Yu Charlie Hu, S. Lennart Johnsson, Shang-Hua Teng - 1996
51 Dynamic Partitioning of Non-Uniform Structured Workloads with Spacefilling Curves – John R. Pilkington, Scott B. Baden - 1995
286 External Memory Algorithms and Data Structures – Jeffrey Scott Vitter - 1998
67 Nonlinear Array Layouts for Hierarchical Memory Systems – Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Lebeck, Shyam Mundhra, Mithuna Thottethodi - 1999
199 Optimizing Matrix Multiply using PHiPAC: a Portable, High-Performance, ANSI C Coding Methodology – Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, Jim Demmel - 1996
312 Automatically tuned linear algebra software – R. Clint Whaley, Jack J. Dongarra - 1998
133 Data-centric Multi-level Blocking – Induprakas Kodukula, Nawaaz Ahmed, Keshav Pingali - 1997
105 Recursion leads to automatic variable blocking for dense linear-algebra algorithms – F. G. Gustavson - 1997
654 Accuracy and Stability of Numerical Algorithms – N. J. Higham
25 Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free – David S. Wise - 2000
3637 Computer Architecture: A Quantitative Approach, Fourth Edition – J. L. Hennessy, D. A. Patterson - 2007