Cache-Oblivious Algorithms (1999)

Abstract:

This thesis presents "cache-oblivious" algorithms that use asymptotically optimal amounts of work and move data asymptotically optimally among multiple levels of cache. An algorithm is cache oblivious if no program variables dependent on hardware configuration parameters, such as cache size and cache-line length, need to be tuned to minimize the number of cache misses. We show that the ordinary algorithms for matrix transposition, matrix multiplication, sorting, and Jacobi-style multipass filtering are not cache optimal. We present algorithms for rectangular matrix transposition, FFT, sorting, and multipass filters, which are asymptotically optimal on computers with multiple levels of caches. For a cache with size $Z$ and cache-line length $L$, where $Z = \Omega(L^2)$, the number of cache misses for an $m \times n$ matrix transpose is $\Theta(1 + mn/L)$. The number of cache misses for either an $n$-point FFT or the sorting of $n$ numbers is $\Theta(1 + (n/L)(1 + \log_Z n))$. The cache complexity of computing n ...
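The recursive structure behind a cache-oblivious matrix transpose can be illustrated with a short Python sketch. This is an illustrative reconstruction under stated assumptions, not the thesis's own code: the function names `cache_oblivious_transpose` and `_transpose_block` and the `base` parameter are hypothetical. The idea shown is to split the larger dimension in half until the subproblem is small, so that at some depth of the recursion the blocks fit in cache without the code ever naming $Z$ or $L$. The `base` cutoff only bounds recursion overhead and is not derived from any cache parameter. Python lists of lists are not stored contiguously, so the sketch shows the access pattern only, not the miss bound.

```python
def _transpose_block(a, b, r0, r1, c0, c1, base):
    """Write the transpose of a[r0:r1][c0:c1] into b, so b[j][i] = a[i][j]."""
    rows, cols = r1 - r0, c1 - c0
    if rows <= base and cols <= base:
        # Small block: transpose directly.
        for i in range(r0, r1):
            for j in range(c0, c1):
                b[j][i] = a[i][j]
    elif rows >= cols:
        # Split the row range (the larger dimension) in half.
        mid = r0 + rows // 2
        _transpose_block(a, b, r0, mid, c0, c1, base)
        _transpose_block(a, b, mid, r1, c0, c1, base)
    else:
        # Split the column range in half.
        mid = c0 + cols // 2
        _transpose_block(a, b, r0, r1, c0, mid, base)
        _transpose_block(a, b, r0, r1, mid, c1, base)


def cache_oblivious_transpose(a, base=16):
    """Return the n-by-m transpose of an m-by-n matrix given as a list of rows."""
    if base < 1:
        raise ValueError("base must be at least 1")
    m = len(a)
    n = len(a[0]) if m else 0
    b = [[None] * m for _ in range(n)]
    _transpose_block(a, b, 0, m, 0, n, base)
    return b


if __name__ == "__main__":
    matrix = [[1, 2, 3], [4, 5, 6]]
    print(cache_oblivious_transpose(matrix, base=1))
```

Because each call halves the larger dimension, the subproblems stay roughly square, which is what lets the analysis apply at every level of a multilevel cache at once.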
