## An Algebra of Scans (2004)

Venue: Mathematics of Program Construction

Citations: 7 (0 self)

### BibTeX

@INPROCEEDINGS{Hinze04analgebra,
    author = {Ralf Hinze},
    title = {An Algebra of Scans},
    booktitle = {Mathematics of Program Construction},
    year = {2004},
    pages = {186--210},
    publisher = {Springer}
}

### Abstract

A parallel prefix circuit takes n inputs x1, x2, ..., xn and produces the n outputs x1, x1 # x2, ..., x1 # x2 # ... # xn, where '#' is an arbitrary associative binary operation. Parallel prefix circuits and their counterparts in software, parallel prefix computations or scans, have numerous applications ranging from fast integer addition and parallel sorting to convex hull problems. A parallel prefix circuit can be implemented in a variety of ways, taking into account constraints on size, depth, or fan-out. Traditionally, implementations are either defined graphically or by enumerating the underlying graph. Both approaches have their pros and cons. A figure, if well drawn, conveys the possibly recursive structure of the scan, but it is not amenable to formal manipulation. A description in the form of a graph, while rigorous, obscures the structure of a scan and is equally hard to manipulate. In this paper we show that parallel prefix circuits enjoy a very pleasant algebra. Using only two basic building blocks and four combinators, all standard designs can be described succinctly and rigorously. The rules of the algebra allow us to prove the circuits correct and to derive circuit designs in a systematic manner.

Lord Darlington. ... [Sees a fan lying on the table.] And what a wonderful fan! May I look at it?
Lady Windermere. Do. Pretty, isn't it! It's got my name on it, and everything. I have only just seen it myself. It's my husband's birthday present to me. You know to-day is my birthday?
(Oscar Wilde, Lady Windermere's Fan)
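As a minimal sketch of the specification in the abstract, the Haskell fragment below computes the prefix outputs for an arbitrary associative operation, once sequentially and once by a simple divide-and-conquer split. The names `scanSpec` and `scanDC` are illustrative assumptions, not identifiers from the paper, and the list-based formulation is only a stand-in for the circuit combinators the paper develops.

```haskell
-- Reference behaviour: x1, x1 # x2, ..., x1 # x2 # ... # xn.
-- scanl1 from the Prelude folds from the left, keeping every prefix.
scanSpec :: (a -> a -> a) -> [a] -> [a]
scanSpec = scanl1

-- Divide-and-conquer variant (hypothetical name): scan both halves
-- independently, then combine the last output of the left half with
-- every output of the right half. It agrees with scanSpec only when
-- the operation is associative.
scanDC :: (a -> a -> a) -> [a] -> [a]
scanDC _ [] = []
scanDC _ [x] = [x]
scanDC op xs = ls' ++ map (op (last ls')) rs'
  where
    (ls, rs) = splitAt (length xs `div` 2) xs
    ls' = scanDC op ls -- non-empty, since length xs >= 2
    rs' = scanDC op rs
```

The two halves in `scanDC` do not depend on each other, which is what lets a hardware or parallel implementation evaluate them side by side; the cost is that the last left output fans out to every right output.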

### Citations

272 | Parallel prefix computation
- Ladner, Fischer
- 1980
Citation Context: ...gets away with fewer operation nodes? Yes, we can! Reconsider the circuit rec32 in Section 4 and note that the left part does not occupy the bottom level. The idea, which is due to Ladner and Fischer [9], is to use the Brent-Kung decomposition for the left part (recall that it increases the depth by two) and the ‘usual’ decomposition for the right part. The following combinator captures one step of the...

222 | A programming language
- Iverson
- 1962
Citation Context: ...efix computations are not confined to addition, any associative operation can be used. Functional programmers know parallel prefix computations as scans, a term which originates from the language APL [1]. We will use both terms synonymously. Parallel prefix computations have numerous applications; the most well-known is probably the carry-lookahead adder [2], a parallel prefix circuit. Other...

95 | Prefix Sums and Their Applications
- Blelloch
- 1990
Citation Context: ...y the carry-lookahead adder [2], a parallel prefix circuit. Other applications include the maximum segment sum problem, parallel sorting, solving recurrences, and convex hull problems, see [3]. A parallel prefix computation seems to be inherently sequential. However, it can be made to run in logarithmic time on a parallel architecture or in hardware. In fact, scans can be implemented in a ...

59 | Powerlist: A structure for parallel recursion
- Misra
- 1994
Citation Context: ...There are a few papers that deal with the derivation of parallel prefix circuits. Misra [12] calculates the Brent-Kung circuit via the data structure of powerlists. Since powerlists capture the recursive decomposition of Brent-Kung, the approach, while elegant, is not easily applicable to ot...

54 | Haskell 98 Language and Libraries
- Peyton Jones
- 2003
Citation Context: ...an-out. Finally, Section 7 reviews related work and Section 8 concludes. 2 Basic combinators. This section defines the algebra of scans. Throughout the paper we employ the programming language Haskell [5] as the meta language. In particular, Haskell’s class system is put to good use: classes allow us to define algebras and instances allow us to define associated models. 2.1 Monoids. The binary operatio...

18 | New bounds for parallel prefix circuits
- Fich
- 1983
Citation Context: ... + [id 1 | odd n ]) � s ⌈n/2⌉
Using double we can define a depth-optimal parallel prefix circuit that has the minimal number of operation nodes among all minimum-depth circuits [10].
opt n | n � 1 = id n
      | otherwise = double opt ⌈n/2⌉ � opt ⌊n/2⌋
The following example circuit of width 32 illustrates that all layers are nicely exploited. [figure omitted] ...

17 | The chip complexity of binary arithmetic
- Brent, Kung
- 1980
Citation Context: ...The recn family of circuits implements a simple divide-and-conquer scheme. A different recursive decomposition was devised by Brent and Kung [8]. As an example, here is a Brent-Kung circuit of width 32. [figure omitted] ...

12 | Extracting and implementing list homomorphisms in parallel program development
- Gorlatch
Citation Context: ...s can be transformed into a single scan. The scan function itself is an instance of a so-called list homomorphism. For this class of functions, parallel programs can be derived in a systematic manner [15]. Applying the approach of [15] to scan yields the optimal hypercube algorithm. This algorithm can be seen as a clocked circuit. Consequently, there is no direct correspondence to any of the algorithm...

11 | A logarithmic implementation of flexible arrays. Memorandum MR83/4
- Braun, Rem
- 1983
Citation Context: ... nodes that computes the last output is fully balanced, which explains why the depth is minimal. If the width is not a power of two, then recn constructs a slightly skewed tree, known as a Braun tree [6]. Since ‘�’ is associative, we can, of course, realize arbitrary tree shapes; other choices include left-complete trees or quasi left-complete trees [7]. For your amusement, here is a Fibonacci-tree o...

9 | Constructing Red-Black Trees
- Hinze
- 1999
Citation Context: ...s a slightly skewed tree, known as a Braun tree [6]. Since ‘�’ is associative, we can, of course, realize arbitrary tree shapes; other choices include left-complete trees or quasi left-complete trees [7]. For your amusement, here is a Fibonacci-tree of width 34 [figure omitted] defined in the obvious way. ...

8 | (De)Composition Rules for Parallel Scan and Reduction
- Gorlatch, Lengauer
- 1997
Citation Context: ...conveniently expressed in terms of scans [3]. Besides encouraging well-structured programming, this coarse-grained approach to parallelism allows for various program optimizations. Gorlatch and Lengauer [14], for instance, show that a composition of two scans can be transformed into a single scan. The scan function itself is an instance of a so-called list homomorphism. For this class of functions, paral...

4 | A one-microsecond adder using one-megacycle circuitry
- Weinberger, Smith
- 1956
Citation Context: ...Parallel prefix computations are nearly as old as the history of computers. One of the first implementations of fast integer addition using carry-lookahead was described by Weinberger and Smith [11]. However, the operation of the circuit seemed to rely on the particularities of carry propagation. It was only 20 years later that Ladner and Fischer formulated the abstract problem of prefix computa...

2 | Federal Geographic Data Committee 1998. Content Standard for Digital Geospatial Metadata. FGDC-STD-001-1998
- unknown authors
- 2004
Citation Context: ...s. Since powerlists capture the recursive decomposition of Brent-Kung, the approach, while elegant, is not easily applicable to other implementations of scans. In a recent pearl, O’Donnell and Rünger [13] derive the recursive implementation using the digital circuit description language Hydra. The resulting specification contains all the necessary information to simulate or fabricate a circuit. The pa...

1 | Introduction to Algorithms. First edn
- Cormen, Leiserson, Rivest
- 1990
Citation Context: ...nt stretching. As a warm-up in scan calculations, let us derive two simple consequences, which we need later on.
f −� x + [j + k ] = (f −� x + [j ]) × id k    (1)
(f × id #y−1) −� x + y = f −� x + [Σy ]    (2)
The rules allow us to push the identity, id n, in and out of a stretch. To prove (1) we argue
f −� x + [j + k ]
= { flip law } ([1] + x �− f ) × id j +k−1
= { composition } ([1] + x �− f ) × id j −1 ...

1 | European Community 1996 Directive 96/9/EC of the European Parliament and of the Council of 11
- unknown authors
- 1996
Citation Context: ..., the graphical approach is not an option, especially when it comes to proving correctness. Some papers define a family of graphs by numbering the nodes and enumerating the edges, see, for instance, [4]. While this certainly counts as a rigorous definition, it is way too concrete: an explicit graph representation obscures the structure of the design and is hard to manipulate formally. In this paper w...
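Several of the citation contexts above refer to the Brent-Kung decomposition. As a rough illustration only, the following Haskell sketch expresses the commonly described recursion on plain lists: combine adjacent pairs, scan the half-length list recursively, then fill in the remaining positions. The function `brentKung` and its local helpers are hypothetical names chosen for this sketch; the paper itself works with circuit combinators rather than lists, and this code assumes the operation is associative.

```haskell
-- List-based sketch of a Brent-Kung style scan (illustrative, not the
-- paper's definition).
brentKung :: (a -> a -> a) -> [a] -> [a]
brentKung _ [] = []
brentKung op xs@(x : _) = x : weave zs odds
  where
    -- zs !! (k-1) is x1 # ... # x(2k): the scan of the pairwise sums.
    zs = brentKung op (pairUp xs)
    -- x3, x5, x7, ...: inputs whose prefix still needs one more step.
    odds = drop 1 (everyOther xs)

    -- Combine adjacent inputs; a trailing unpaired input is dropped here
    -- and handled through odds instead.
    pairUp (a : b : t) = op a b : pairUp t
    pairUp _ = []

    -- Keep the elements at positions 1, 3, 5, ... (1-based).
    everyOther (a : _ : t) = a : everyOther t
    everyOther t = t

    -- Interleave even-position outputs with the fixed-up odd positions.
    weave (z : zt) (o : ot) = z : op z o : weave zt ot
    weave zt _ = zt
```

Each recursive call halves the input, so the recursion depth is logarithmic in the width, while every level adds one pairing step before and one fix-up step after the recursive scan.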