Results 1–3 of 3
Ahnentafel indexing into Morton-ordered arrays, or matrix locality for free
In Euro-Par 2000 – Parallel Processing, 2000
Abstract

Cited by 27 (5 self)
Abstract. Definitions for the uniform representation of d-dimensional matrices serially in Morton order (or Z-order) support both their use with Cartesian indices and their divide-and-conquer manipulation as quaternary trees. In the latter case, d-dimensional arrays are accessed as 2^d-ary trees. This data structure is important because, at once, it relaxes serious problems of locality and latency, and the tree helps schedule multiprocessing. It enables algorithms that avoid cache misses and page faults at all levels in hierarchical memory, independently of a specific runtime environment. This paper gathers the properties of Morton order and its mappings to other indexings, and outlines compiler support for it. Statistics elsewhere show that the new ordering and block algorithms achieve high flop rates and, indirectly, parallelism without any low-level tuning.
Morton-order Matrices Deserve Compilers' Support
, 1999
Abstract
A proof of concept is offered for the uniform representation of matrices serially in Morton-order (or Z-order) representation, as well as their divide-and-conquer processing as quaternary trees. Generally, d-dimensional arrays are accessed as 2^d-ary trees. This data structure is important because, at once, it relaxes serious problems of locality and latency, while the tree helps schedule multiprocessing. It enables algorithms that avoid cache misses and page faults at all levels in hierarchical memory, independently of a specific runtime environment. This paper gathers the properties of Morton order and its mappings to other indexings, and outlines compiler support for it. Statistics on matrix multiplication, a critical example, show how the new ordering and block algorithms achieve high flop rates and, indirectly, parallelism without low-level tuning. Perhaps because of the early success of column-major representation with strength reduction, quadtree representation has been reinvented and redeveloped in areas far from the center that is Programming Languages. As target architectures move to multiprocessing, superscalar pipes, and hierarchical memories, compilers must support quadtrees better, so that more programmers invent algorithms that use them to exploit the hardware.
Language Support for Morton-order Matrices
Abstract
The uniform representation of 2-dimensional arrays serially in Morton order (or Z-order) supports both their iterative scan with Cartesian indices and their divide-and-conquer manipulation as quaternary trees. This data structure is important because it relaxes serious problems of locality and latency, and the tree helps to schedule multiprocessing. Results here show how it facilitates algorithms that avoid cache misses and page faults at all levels in hierarchical memory, independently of a specific runtime environment. We have built a rudimentary C-to-C translator that implements matrices in Morton order from source that presumes a row-major implementation. Early performance from LAPACK's reference implementation of dgesv (linear solver) and all its supporting routines (including dgemm matrix multiplication) forms a successful research demonstration. Its performance predicts improvements from new algebra in backend optimizers. We also present results from a more stylish dgemm algorithm that takes better advantage of this representation. With only routine backend optimizations inserted by hand (unfolding the base case and passing arguments in registers), we achieve machine performance exceeding …