Google’s MapReduce Programming Model — Revisited
Cited by 29 (1 self)
Google’s MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google’s domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce and Sawzall, and we capture our findings as an executable specification. We also identify and resolve some obscurities in the informal presentation given in the seminal papers. We use typed functional programming (specifically Haskell) as a tool for design recovery and executable specification. Our development comprises three components: (i) the basic program skeleton that underlies MapReduce computations; (ii) the opportunities for parallelism in executing MapReduce computations; (iii) the fundamental characteristics of Sawzall’s aggregators as an advancement of the MapReduce approach. Our development does not formalize the more implementational aspects of an actual, distributed execution of MapReduce computations.
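The basic program skeleton named in component (i) can be illustrated with a small sequential sketch. This is not the paper's Haskell specification; it is a minimal Python illustration of the generally known map, group-by-key, reduce structure, and the names `map_reduce`, `word_count_mapper`, and `word_count_reducer` are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def map_reduce(mapper, reducer, inputs):
    """Sequential sketch of a MapReduce computation (illustrative only).

    mapper(k1, v1)      -> iterable of intermediate (k2, v2) pairs
    reducer(k2, values) -> one reduced result per intermediate key
    """
    # Map phase: apply the mapper to every input pair and group
    # the intermediate values by their intermediate key.
    groups = defaultdict(list)
    for k1, v1 in inputs:
        for k2, v2 in mapper(k1, v1):
            groups[k2].append(v2)
    # Reduce phase: reduce each group of values independently.
    return {k2: reducer(k2, values) for k2, values in groups.items()}


def word_count_mapper(doc_name, text):
    # Emit (word, 1) for every word occurrence in the document.
    return [(word, 1) for word in text.split()]


def word_count_reducer(word, counts):
    # Sum the per-occurrence counts for one word.
    return sum(counts)


docs = [("doc1", "fold the fold"), ("doc2", "the fold")]
counts = map_reduce(word_count_mapper, word_count_reducer, docs)
```

Because each input pair is mapped independently and each key group is reduced independently, both loops are candidates for parallel execution, which is the kind of opportunity component (ii) refers to.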
Structural Parallel Algorithmics, 1991
Cited by 11 (4 self)
The first half of the paper is a general introduction which emphasizes the central role that the PRAM model of parallel computation plays in algorithmic studies for parallel computers. Some of the collective knowledge-base on non-numerical parallel algorithms can be characterized in a structural way. Each structure relates a few problems and techniques to one another, from the basic to the more involved. The second half of the paper provides a bird's-eye view of such structures for: (1) list, tree and graph parallel algorithms; (2) very fast deterministic parallel algorithms; and (3) very fast randomized parallel algorithms.
1 Introduction
Parallelism is a concern that is missing from "traditional" algorithmic design. Unfortunately, it turns out that most efficient serial algorithms become rather inefficient parallel algorithms. The experience is that the design of parallel algorithms requires new paradigms and techniques, offering an exciting intellectual challenge. We note that it had...
A parallel algorithm for record clustering - ACM Trans. on Database Systems, 1990
Cited by 7 (1 self)
We present an efficient heuristic algorithm for record clustering that can run on a SIMD machine. We introduce the P-tree, and its associated numbering scheme, which in the split phase allows each processor independently to compute the unique cluster number of a record satisfying an arbitrary query. We show that by restricting ourselves in the merge phase to combining only sibling clusters, we obtain a parallel algorithm whose speedup ratio is optimal in the number of processors used. Finally, we report on experiments showing that our method produces substantial savings in an environment with relatively little overlap among the queries.
Another PRAM Algorithm for Finding Connected Components of Sparse Graphs, 1999
We present an algorithm which exploits a new approach to the problem of finding the connected components of an undirected graph, CCug for short, with v vertices and e edges. The algorithm has depth O(log² (e)) on a CREW PRAM using e processors, hence its cost is not affected by the number v of graph vertices. This makes it the algorithm with the best speedup and best cost for CCug on highly sparse graphs. On dense graphs, conversely, its performance is comparable to that of the algorithm in [12] and slightly worse than that of the algorithm in [5]. A variant of the algorithm with the same bound but running on the EREW model is also included. The algorithm can be used to find the transitive closure of binary, symmetric relations. In this case e is the number of axioms and v is the range of the relation.
Deterministic P-RAM Simulation with Constant Redundancy (Preliminary Version)
Abstract: In this paper, we show that distributing the memory of a parallel computer and, thereby, decreasing its granularity allows a reduction in the redundancy required to achieve polylog simulation time for each P-RAM step. Previously, realistic models of parallel computation assigned one memory module to each processor and, as a result, insisted on relatively coarse-grain memory. We propose, on the other hand, a more flexible, but equally valid model of computation, the distributed-memory, bounded-degree network (DMBDN) model. This model allows the use of fine-grain memory while maintaining the realism of a bounded-degree interconnection network. We describe a P-RAM simulation scheme, which is admitted under the DMBDN model, that exploits the increased memory bandwidth provided by a two-dimensional mesh of trees (2DMOT) network to achieve an overhead in memory redundancy lower than that required by other fast, deterministic P-RAM simulations. Specifically, for a deterministic simulation of an n-processor P-RAM on a bounded-degree network, we are able to reduce the number of copies of each variable from O(log n / log log n) to Θ(1) and still simulate each P-RAM step in polylog time.
Lecture 19: Sorting – II Professor: S.
Histogramming is used extensively in image processing, but is also used, for instance, in physics applications where it may be of interest to count the number of particles in certain energy bands. Histogramming can also be used for sorting purposes. We have already discussed a few sorting algorithms, namely
• odd–even transposition sort
• up–down sort
• shear sort
• heap sort
Odd–even transposition sort is suitable for sorting on linear arrays in that its sorting time is compatible with the worst case lower bound for sorting on linear arrays. However, the running time of the algorithm is the same whether the data is already sorted or not. The up–down sorter is suitable for sorting on a linear array, with data originally and finally external to the array. But, as for the odd–even transposition sort, the sorting time is independent of the data distribution. With data external to the array, a sorting time proportional to the number of elements to be sorted is a lower bound, since the elements
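The odd–even transposition sort named in this excerpt can be sketched as follows. This is a sequential Python simulation of the standard textbook algorithm, not code from the lecture; the function name `odd_even_transposition_sort` is hypothetical. On a linear array, each inner loop would be one parallel compare-exchange step across disjoint neighbor pairs.

```python
def odd_even_transposition_sort(values):
    """Sequentially simulate odd-even transposition sort (illustrative sketch).

    Returns the sorted list and the number of phases executed.
    """
    a = list(values)
    n = len(a)
    # The schedule is fixed: n phases are run regardless of the input
    # order, which is why the running time is data-independent.
    for phase in range(n):
        start = phase % 2  # even phases pair (0,1),(2,3),...; odd phases pair (1,2),(3,4),...
        for i in range(start, n - 1, 2):
            # Compare-exchange between neighboring array cells.
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a, n


unsorted_result = odd_even_transposition_sort([5, 1, 4, 2, 3])
sorted_result = odd_even_transposition_sort([1, 2, 3, 4, 5])
```

Both calls execute the same number of phases, which mirrors the excerpt's remark that the running time is the same whether the data is already sorted or not.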

