Results 11 - 20
of
57
Google’s MapReduce Programming Model — Revisited
"... Google’s MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google’s domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce a ..."
Abstract
-
Cited by 29 (1 self)
- Add to MetaCart
Google’s MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google’s domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce and Sawzall, and we capture our findings as an executable specification. We also identify and resolve some obscurities in the informal presentation given in the seminal papers. We use typed functional programming (specifically Haskell) as a tool for design recovery and executable specification. Our development comprises three components: (i) the basic program skeleton that underlies MapReduce computations; (ii) the opportunities for parallelism in executing MapReduce computations; (iii) the fundamental characteristics of Sawzall’s aggregators as an advancement of the MapReduce approach. Our development does not formalize the more implementational aspects of an actual, distributed execution of MapReduce computations.
Polytypic Pattern Matching
- In Conference Record of FPCA '95, SIGPLAN-SIGARCH-WG2.8 Conference on Functional Programming Languages and Computer Architecture
, 1995
"... The (exact) pattern matching problem can be informally specified as follows: given a pattern and a text, find all occurrences of the pattern in the text. The pattern and the text may both be lists, or they may both be trees, or they may both be multi-dimensional arrays, etc. This paper describes a g ..."
Abstract
-
Cited by 28 (8 self)
- Add to MetaCart
The (exact) pattern matching problem can be informally specified as follows: given a pattern and a text, find all occurrences of the pattern in the text. The pattern and the text may both be lists, or they may both be trees, or they may both be multi-dimensional arrays, etc. This paper describes a general pattern-matching algorithm for all datatypes definable as an initial object in a category of F -algebras, where F is a regular functor. This class of datatypes includes mutual recursive datatypes and lots of different kinds of trees. The algorithm is a generalisation of the Knuth, Morris, Pratt like pattern-matching algorithm on trees first described by Hoffmann and O'Donnell. 1 Introduction Most editors provide a search function that takes a string of symbols and returns the first position in the text being edited at which this string of symbols occurs. The string of symbols is called a pattern, and the algorithm that detects the position at which a pattern occurs is called a (exa...
Generic Haskell: applications
- In Generic Programming, Advanced Lectures, volume 2793 of LNCS
, 2003
"... Generic Haskell is an extension of Haskell that supports the construction of generic programs. These lecture notes discuss three advanced generic programming applications: generic dictionaries, compressing XML documents, and the zipper: a data structure used to represent a tree together with a s ..."
Abstract
-
Cited by 28 (15 self)
- Add to MetaCart
Generic Haskell is an extension of Haskell that supports the construction of generic programs. These lecture notes discuss three advanced generic programming applications: generic dictionaries, compressing XML documents, and the zipper: a data structure used to represent a tree together with a subtree that is the focus of attention, where that focus may move left, right, up or down the tree. When describing and implementing these examples, we will encounter some advanced features of Generic Haskell, such as type-indexed data types, dependencies between and generic abstractions of generic functions, adjusting a generic function using a default case, and generic functions with a special case for a particular constructor.
Generalised Folds for Nested Datatypes
- Formal Aspects of Computing
, 1999
"... Nested datatypes generalise regular datatypes in much the same way that context-free languages generalise regular ones. Although the categorical semantics of nested types turns out to be similar to the regular case, the fold functions are more limited because they can only describe natural transform ..."
Abstract
-
Cited by 28 (1 self)
- Add to MetaCart
Nested datatypes generalise regular datatypes in much the same way that context-free languages generalise regular ones. Although the categorical semantics of nested types turns out to be similar to the regular case, the fold functions are more limited because they can only describe natural transformations. Practical considerations therefore dictate the introduction of a generalised fold function in which this limitation can be overcome. In the paper we show how to construct generalised folds systematically for each nested datatype, and show that they possess a uniqueness property analogous to that of ordinary folds. As a consequence, generalised folds satisfy fusion properties similar to those developed for regular datatypes. Such properties form the core of an effective calculational theory of inductive datatypes.
Datatype Laws without Signatures
, 1996
"... ing from syntax. Conventionally an equation for algebra ' is just a pair of terms built from variables, the constituent operations of ' , and some fixed standard operations. An equation is valid if the two terms are equal for all values of the variables. We are going to model a syntactic term as a m ..."
Abstract
-
Cited by 21 (6 self)
- Add to MetaCart
ing from syntax. Conventionally an equation for algebra ' is just a pair of terms built from variables, the constituent operations of ' , and some fixed standard operations. An equation is valid if the two terms are equal for all values of the variables. We are going to model a syntactic term as a morphism that has the values of the variables as source. For example, the two terms ` x ' and ` x join x ' (with variable x of type tree ) are modeled by morphisms id and id \Delta id ; join of type tree ! tree . So, an equation for ' is modeled by a pair of terms (T '; T 0 ') , T and T 0 being mappings of morphisms which we call `transformers'. This faces us with the following problem: what properties must we require of an arbitrary mapping T in order that it model a classical syntactic Datatype Laws without Signatures 7 term? Or, rather, what properties of classical syntactic terms are semantically essential, and how can we formalise these as properties of a transformer T ? Of course...
Fold and Unfold for Program Semantics
- In Proc. 3rd ACM SIGPLAN International Conference on Functional Programming
, 1998
"... In this paper we explain how recursion operators can be used to structure and reason about program semantics within a functional language. In particular, we show how the recursion operator fold can be used to structure denotational semantics, how the dual recursion operator unfold can be used to str ..."
Abstract
-
Cited by 21 (4 self)
- Add to MetaCart
In this paper we explain how recursion operators can be used to structure and reason about program semantics within a functional language. In particular, we show how the recursion operator fold can be used to structure denotational semantics, how the dual recursion operator unfold can be used to structure operational semantics, and how algebraic properties of these operators can be used to reason about program semantics. The techniques are explained with the aid of two main examples, the first concerning arithmetic expressions, and the second concerning Milner's concurrent language CCS. The aim of the paper is to give functional programmers new insights into recursion operators, program semantics, and the relationships between them. 1 Introduction Many computations are naturally expressed as recursive programs defined in terms of themselves, and properties proved of such programs using some form of inductive argument. Not surprisingly, many programs will have a similar recursive stru...
Recursion Schemes from Comonads
, 2001
"... . Within the setting of the categorical approach to programming with total functions, a \many-in-one" recursion scheme is introduced that neatly unies a variety of recursion schemes looking as diverging generalizations of the basic recursion scheme of iteration. The scheme is doubly generic: in addi ..."
Abstract
-
Cited by 20 (4 self)
- Add to MetaCart
. Within the setting of the categorical approach to programming with total functions, a \many-in-one" recursion scheme is introduced that neatly unies a variety of recursion schemes looking as diverging generalizations of the basic recursion scheme of iteration. The scheme is doubly generic: in addition to behaving uniformly with respect to a functor determining an inductive type, it is also uniform in a comonad and a distributive law which together determine a particular recursion scheme for this inductive type. By way of examples, it is shown to subsume iteration, a scheme subsuming primitive recursion, and a scheme subsuming course-of-value iteration.
Parallel Implementation of Tree Skeletons
- Journal of Parallel and Distributed Computing
, 1995
"... Trees are a useful data type, but they are not routinely included in parallel programming systems because their irregular structure makes them seem hard to compute with efficiently. We present a method for constructing implementations of skeletons, high-level homomorphic operations on trees, that ex ..."
Abstract
-
Cited by 18 (2 self)
- Add to MetaCart
Trees are a useful data type, but they are not routinely included in parallel programming systems because their irregular structure makes them seem hard to compute with efficiently. We present a method for constructing implementations of skeletons, high-level homomorphic operations on trees, that execute in parallel. In particular, we consider the case where the size of the tree is much larger than the the number of processors available, so that tree data must be partitioned. The approach uses the theory of categorical data types to derive implementation templates based on tree contraction. Many useful tree operations can be computed in time logarithmic in the size of their argument, on a wide range of parallel systems. 1 Contribution One common approach to general-purpose parallel computation is based on packaging complex operations as templates, or skeletons [3, 12]. Skeletons encapsulate the control and data flow necessary to compute useful operations. This permits software to be...
Monadic Maps and Folds for Arbitrary Datatypes
- Memoranda Informatica, University of Twente
, 1994
"... Each datatype constructor comes equiped not only with a so-called map and fold (catamorphism), as is widely known, but, under some condition, also with a kind of map and fold that are related to an arbitrary given monad. This result follows from the preservation of initiality under lifting from the ..."
Abstract
-
Cited by 15 (0 self)
- Add to MetaCart
Each datatype constructor comes equiped not only with a so-called map and fold (catamorphism), as is widely known, but, under some condition, also with a kind of map and fold that are related to an arbitrary given monad. This result follows from the preservation of initiality under lifting from the category of algebras in a given category to a certain other category of algebras in the Kleisli category related to the monad.
Between Functions and Relations in Calculating Programs
, 1992
"... This thesis is about the calculational approach to programming, in which one derives programs from specifications. One such calculational paradigm is Ruby, the relational calculus developed by Jones and Sheeran for describing and designing circuits. We identify two shortcomings with derivations made ..."
Abstract
-
Cited by 15 (4 self)
- Add to MetaCart
This thesis is about the calculational approach to programming, in which one derives programs from specifications. One such calculational paradigm is Ruby, the relational calculus developed by Jones and Sheeran for describing and designing circuits. We identify two shortcomings with derivations made using Ruby. The first is that the notion of a program being an implementation of a specification has never been made precise. The second is to do with types. Fundamental to the use of type information in deriving programs is the idea of having types as special kinds of programs. In Ruby, types are partial equivalence relations (pers). Unfortunately, manipulating some formulae involving types has proved difficult within Ruby. In particular, the preconditions of the `induction' laws that are much used within program derivation often work out to be assertions about types; such assertions have typically been verified either by informal arguments or by using predicate calculus, rather than by ap...

