Results 1 - 3 of 3
Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free
In Euro-Par 2000 – Parallel Processing, 2000

Cited by 29 (5 self)
Abstract. Definitions for the uniform representation of d-dimensional matrices serially in Morton-order (or Z-order) support both their use with cartesian indices, and their divide-and-conquer manipulation as quaternary trees. In the latter case, d-dimensional arrays are accessed as 2^d-ary trees. This data structure is important because, at once, it relaxes serious problems of locality and latency, and the tree helps schedule multiprocessing. It enables algorithms that avoid cache misses and page faults at all levels in hierarchical memory, independently of a specific runtime environment. This paper gathers the properties of Morton order and its mappings to other indexings, and outlines for compiler support of it. Statistics elsewhere show that the new ordering and block algorithms achieve high flop rates and, indirectly, parallelism without any low-level tuning.
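The mapping between cartesian indices and Morton order mentioned in this abstract can be illustrated with a minimal Python sketch for the two-dimensional case. It interleaves the bits of the row and column indices. The function names `morton_encode` and `morton_decode`, the `bits` parameter, and the choice to place row bits in the odd positions are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Tuple


def morton_encode(row: int, col: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of row and col into one Morton index.

    Assumed convention: column bits land in even positions and row bits
    in odd positions, so the four quadrants appear in Z order.
    """
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)      # column bit -> even position
        index |= ((row >> b) & 1) << (2 * b + 1)  # row bit -> odd position
    return index


def morton_decode(index: int, bits: int = 16) -> Tuple[int, int]:
    """Recover (row, col) from a Morton index built by morton_encode."""
    row = col = 0
    for b in range(bits):
        col |= ((index >> (2 * b)) & 1) << b
        row |= ((index >> (2 * b + 1)) & 1) << b
    return row, col


if __name__ == "__main__":
    # Print the Morton index of every cell of a 4 x 4 matrix.
    for r in range(4):
        print([morton_encode(r, c) for c in range(4)])
```

Under this convention each 2 x 2 quadrant of the 4 x 4 example occupies four consecutive positions in the serial array (0 to 3, 4 to 7, 8 to 11, 12 to 15), which is the locality property the abstract refers to.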
QR Factorization with Morton-Ordered Quadtree Matrices for Memory Reuse and Parallelism
In Proc. 2003 ACM Symp. on Principles and Practice of Parallel Programming, 2003

Cited by 14 (1 self)
Quadtree matrices using Morton-order storage provide natural blocking on every level of a memory hierarchy. Writing the natural recursive algorithms to take advantage of this blocking results in code that honors the memory hierarchy without the need for transforming the code. Furthermore, the divide-and-conquer algorithm breaks problems down into independent computations. These independent computations can be dispatched in parallel for straightforward parallel processing. Proof of concept is given by an algorithm for QR factorization based on Givens rotations for quadtree matrices in Morton-order storage. The algorithms deliver positive results, competing with and even beating the LAPACK equivalent.
Morton-order Matrices Deserve Compilers' Support
1999
A proof of concept is offered for the uniform representation of matrices serially in Morton-order (or Z-order) representation, as well as their divide-and-conquer processing as quaternary trees. Generally, d-dimensional arrays are accessed as 2^d-ary trees. This data structure is important because, at once, it relaxes serious problems of locality and latency, while the tree helps schedule multiprocessing. It enables algorithms that avoid cache misses and page faults at all levels in hierarchical memory, independently of a specific runtime environment. This paper gathers the properties of Morton order and its mappings to other indexings, and outlines for compiler support of it. Statistics on matrix multiplication, a critical example, show how the new ordering and block algorithms achieve high flop rates and, indirectly, parallelism without low-level tuning. Perhaps because of the early success of column-major representation with strength reduction, quadtree representation has been reinvented and redeveloped in areas far from the center that is Programming Languages. As target architectures move to multiprocessing, superscalar pipes, and hierarchical memories, compilers must support quadtrees better, so that more programmers invent algorithms that use them to exploit the hardware.
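The divide-and-conquer processing of a Morton-order matrix as a quaternary tree can be sketched for the matrix multiplication example the abstract mentions. The Python below is a minimal illustration, not the paper's algorithm: it assumes square matrices whose side is a power of two, stored as flat lists with quadrants in the order northwest, northeast, southwest, southeast, and it recurses down to single elements. All names (`morton_index`, `to_morton`, `from_morton`, `morton_matmul`) are hypothetical.

```python
def morton_index(row, col, bits=16):
    """Interleave bits: column bits in even positions, row bits in odd ones."""
    index = 0
    for b in range(bits):
        index |= ((col >> b) & 1) << (2 * b)
        index |= ((row >> b) & 1) << (2 * b + 1)
    return index


def to_morton(matrix):
    """Copy a square list-of-lists matrix into a flat Morton-order list."""
    n = len(matrix)
    flat = [0] * (n * n)
    for r in range(n):
        for c in range(n):
            flat[morton_index(r, c)] = matrix[r][c]
    return flat


def from_morton(flat, n):
    """Rebuild an n x n list-of-lists matrix from a flat Morton-order list."""
    return [[flat[morton_index(r, c)] for c in range(n)] for r in range(n)]


def morton_matmul(a, b, c, a_off, b_off, c_off, size):
    """Accumulate c += a * b on blocks holding `size` elements each.

    Each block is a contiguous slice of a flat Morton-order list, and its
    four quadrants are the four contiguous quarters of that slice.
    """
    if size == 1:
        c[c_off] += a[a_off] * b[b_off]
        return
    q = size // 4  # number of elements in one quadrant
    for i in (0, 1):          # block row of c
        for j in (0, 1):      # block column of c
            for k in (0, 1):  # inner block index
                morton_matmul(a, b, c,
                              a_off + (2 * i + k) * q,
                              b_off + (2 * k + j) * q,
                              c_off + (2 * i + j) * q,
                              q)


if __name__ == "__main__":
    a = to_morton([[1, 2], [3, 4]])
    b = to_morton([[5, 6], [7, 8]])
    c = [0] * 4
    morton_matmul(a, b, c, 0, 0, 0, 4)
    print(from_morton(c, 2))
```

Because every recursive call touches only contiguous sub-ranges of the three arrays, the same code is blocked at every level of the recursion. A practical version would stop the recursion at a larger base block and could dispatch calls that write to different quadrants of `c` in parallel, as the abstracts suggest.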