Results 1–10 of 12
Gradual Refinement: Blending Pattern Matching with Data Abstraction
Cited by 6 (3 self)
Abstract. Pattern matching is advantageous for understanding and reasoning about function definitions, but it tends to tightly couple the interface and implementation of a datatype. Significant effort has been invested in tackling this loss of modularity; however, decoupling patterns from concrete representations while maintaining soundness of reasoning has been a challenge. Inspired by the development of invertible programming, we propose an approach to abstract datatypes based on a right-invertible language rinv: every function has a right (or pre-) inverse. We show how this new design permits a smooth incremental transition from programs with algebraic datatypes and pattern matching to ones with proper encapsulation (implemented as abstract datatypes), while maintaining simple and sound reasoning.
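The right-inverse idea above can be sketched outside rinv itself. The following Python illustration (hypothetical names, not rinv syntax) shows a queue stored as two stacks that exposes a list "view" via `to_list`; `from_list` is a right inverse of `to_list`, so callers can pattern-match on the view without ever seeing the two-stack layout:

```python
# Hypothetical illustration of the right-inverse idea (not rinv syntax).
# The concrete representation is a pair of stacks; the abstract "view"
# is a plain list. from_list is a right inverse of to_list, which is
# the law that keeps reasoning on the view sound.

class Queue:
    def __init__(self, front=None, back=None):
        self.front = front or []   # elements ready to dequeue
        self.back = back or []     # elements in reverse arrival order

    def to_list(self):             # abstraction function to the view
        return self.front + list(reversed(self.back))

def from_list(xs):                 # a right inverse of to_list
    return Queue(front=list(xs), back=[])

def uncons(q):
    """Pattern-match a queue as (head, tail) through its list view."""
    xs = q.to_list()
    if not xs:
        return None
    return xs[0], from_list(xs[1:])

# Soundness rests on the right-inverse law:
#   from_list(xs).to_list() == xs   for every list xs
```

Note that `from_list` only needs to produce *some* representation with the right view, which is exactly why a right (rather than full) inverse suffices.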
Towards Systematic Parallel Programming over MapReduce
Cited by 5 (2 self)
Abstract. MapReduce is a useful and popular programming model for data-intensive distributed parallel computing. But it is still a challenge to develop parallel programs with MapReduce systematically, since it is usually not easy to derive a proper divide-and-conquer algorithm that matches MapReduce. In this paper, we propose a homomorphism-based framework named Screwdriver for systematic parallel programming with MapReduce, making use of the program calculation theory of list homomorphisms. Screwdriver is implemented as a Java library on top of Hadoop. For any problem that can be resolved by two sequential functions satisfying the requirements of the third homomorphism theorem, Screwdriver can automatically derive a parallel algorithm as a list homomorphism and transform the initial sequential programs into an efficient MapReduce program. Users need neither care about parallelism nor have deep knowledge of MapReduce. In addition to the simplicity of the programming model of our framework, such a calculational approach enables us to resolve many problems that would be nontrivial to resolve directly with MapReduce.
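A minimal sketch of the list-homomorphism theory behind Screwdriver (illustrative Python, not Screwdriver's Java API): maximum prefix sum, tupled with the plain sum, forms a list homomorphism `h(xs ++ ys) = h(xs) ⊙ h(ys)` with an associative `⊙`, which is exactly the shape MapReduce can execute:

```python
from functools import reduce

# Classic example behind the third homomorphism theorem: maximum
# prefix sum. Tupling it with the total sum yields a list
# homomorphism, so the list can be split anywhere, the parts solved
# independently (map), and the results merged associatively (reduce).

def hom_single(x):                       # h [x]
    return (max(0, x), x)                # (max prefix sum, total sum)

def combine(a, b):                       # associative: h (xs ++ ys)
    (m1, s1), (m2, s2) = a, b
    return (max(m1, s1 + m2), s1 + s2)

IDENTITY = (0, 0)                        # h []

def mps(xs):
    """Maximum prefix sum via the homomorphism
    (MapReduce shape: map = hom_single, reduce = combine)."""
    m, _ = reduce(combine, map(hom_single, xs), IDENTITY)
    return m
```

The split-anywhere property is what the third homomorphism theorem guarantees once the function is shown computable both leftwards and rightwards.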
Financial Software on GPUs: Between Haskell and Fortran
In Proceedings of the 1st ACM SIGPLAN Workshop on Functional High-Performance Computing (FHPC '12), 2012
Cited by 3 (2 self)
This paper presents a real-world pricing kernel for financial derivatives and evaluates the language and compiler tool chain that would allow expressive, hardware-neutral algorithm implementation and efficient execution on graphics processing units (GPUs). The language issues refer to preserving algorithmic invariants, e.g., inherent parallelism made explicit by map-reduce-scan functional combinators. Efficient execution is achieved by manually applying a series of generally applicable compiler transformations that allow the generated OpenCL code to yield speedups as high as 70× and 540× on a commodity mobile and desktop GPU, respectively. Apart from the concrete speedups attained, our contributions are twofold: first, from a language perspective, we illustrate that even state-of-the-art auto-parallelization techniques are incapable of discovering all the requisite data parallelism when rendering the …
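The map-reduce-scan combinators the abstract credits with making parallelism explicit can be sketched as follows (an illustrative Python sketch with a toy payoff pipeline, not the paper's pricing kernel): map applies an independent per-element function, reduce needs only an associative operator, and scan admits well-known parallel schedules.

```python
from functools import reduce
from itertools import accumulate
import operator

# The combinators named in the abstract, written sequentially but
# carrying parallel semantics: map is elementwise-independent, reduce
# requires an associative operator, and scan (prefix computation) has
# standard parallel implementations. Illustrative only.

def par_map(f, xs):
    return [f(x) for x in xs]                    # independent per element

def par_reduce(op, xs, e):
    return reduce(op, xs, e)                     # op must be associative

def par_scan(op, xs, e):
    return list(accumulate(xs, op, initial=e))   # prefix computation

# A toy call-option "payoff" pipeline in combinator form
# (hypothetical numbers, strike 100.0):
payoffs = par_map(lambda s: max(s - 100.0, 0.0), [95.0, 103.0, 110.0])
total   = par_reduce(operator.add, payoffs, 0.0)
running = par_scan(operator.add, payoffs, 0.0)
```

Writing the kernel against these three combinators is what keeps the data parallelism explicit rather than something an auto-parallelizer must rediscover.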
Functional Bulk Synchronous Parallel Programs
, 2010
With the current generalization of parallel architectures arises the concern of applying formal methods to parallelism, which allows specifications of parallel programs to be precisely stated and the correctness of an implementation to be verified. However, the complexity of parallel programs, compared to sequential ones, makes them more error-prone and difficult to verify. This calls for a strongly structured form of parallelism, which should not only ease programming by providing abstractions that conceal much of the complexity of parallel computation, but also provide a systematic way of developing practical programs from specifications. Bulk Synchronous Parallelism (BSP) is a model of computation which offers a high degree of abstraction like PRAM models and yet a realistic cost model based on structured parallelism. We propose a framework for refining a sequential specification toward a functional BSP program, the whole process being done with the help of a proof assistant. The main technical contributions of this paper are as follows: we define BH, a new homomorphic skeleton, which captures the essence of BSP computation at the algorithmic level, and …
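The BSP model that BH targets structures execution into supersteps: local computation, global communication, then a barrier at which all messages take effect at once. A minimal Python simulation of one superstep (this sketches the model only; the abstract does not give BH's actual definition):

```python
# Minimal simulation of one BSP superstep: every processor computes
# locally and posts messages, and all communication takes effect only
# at the barrier. This sketches the superstep model BH targets; it is
# not the paper's BH skeleton.

def bsp_superstep(states, local_compute):
    """local_compute(pid, state) -> (new_state, [(dest_pid, msg)])."""
    outboxes = []
    new_states = []
    for pid, st in enumerate(states):
        new_st, msgs = local_compute(pid, st)
        new_states.append(new_st)
        outboxes.extend(msgs)
    # ---- barrier: deliver all posted messages at once ----
    inboxes = [[] for _ in states]
    for dest, msg in outboxes:
        inboxes[dest].append(msg)
    return [(st, inbox) for st, inbox in zip(new_states, inboxes)]

# Example: 4 processors each send their value to processor 0.
def send_to_root(pid, value):
    return value, ([(0, value)] if pid != 0 else [])

result = bsp_superstep([10, 20, 30, 40], send_to_root)
# processor 0's inbox now holds [20, 30, 40]
```

The barrier is what makes BSP's cost model realistic: a superstep's cost is local work plus communication volume plus a fixed synchronization cost.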
Automatic Parallelization of Canonical Loops
Abstract. This paper presents a compilation technique that performs automatic parallelization of canonical loops. Canonical loops are a pattern observed in many well-known algorithms, such as frequent itemsets, K-means, and K nearest neighbors. Automatic parallelization allows application developers to focus on the algorithmic details of the problem they are solving, leaving to the compiler the task of generating correct and efficient parallel code. Our method splits tasks and data among stream processing elements and uses a novel technique based on labeled streams to minimize the communication between filters. Experiments performed on a cluster of 36 computers indicate that, for the three algorithms mentioned above, our method produces code that scales linearly with the number of available processors. These experiments also show that the automatically generated code is competitive when compared to hand-tuned programs.
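The abstract does not define "canonical loop" formally, but the algorithms it names share a common shape: iterations independent except for an associative reduction. A sketch under that assumption (illustrative Python with threads, not the paper's stream-processing compilation):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Sketch of the loop shape such compilers target (an assumption based
# on the algorithms named, not the paper's formal definition):
# iterations are independent apart from an associative/commutative
# reduction, so the iteration space can be split among workers and
# the partial results merged afterwards.

def parallel_canonical_loop(data, body, combine, identity, workers=4):
    chunks = [data[i::workers] for i in range(workers)]
    def run(chunk):                      # each worker's private loop
        acc = identity
        for x in chunk:
            acc = combine(acc, body(x))
        return acc
    with ThreadPoolExecutor(max_workers=workers) as ex:
        partials = ex.map(run, chunks)
    return reduce(combine, partials, identity)

# e.g. a K-means-style squared-distance accumulation, reduced with +
total = parallel_canonical_loop(range(100), body=lambda x: x * x,
                                combine=lambda a, b: a + b, identity=0)
```

Associativity of `combine` is what licenses the split: any partition of the iteration space yields the same merged result.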
Rapport no. RR-2010-01: Systematic Development of Functional Bulk Synchronous Parallel Programs
, 2010
With the current generalization of parallel architectures arises the concern of applying formal methods to parallelism, which allows specifications of parallel programs to be precisely stated and the correctness of an implementation to be verified. However, the complexity of parallel programs, compared to sequential ones, makes them more error-prone and difficult to verify. This calls for a strongly structured form of parallelism, which should not only ease programming by providing abstractions that conceal much of the complexity of parallel computation, but also provide a systematic way of developing practical programs from specifications. Bulk Synchronous Parallelism (BSP) is a model of computation which offers a high degree of abstraction like PRAM models and yet a realistic cost model based on structured parallelism. We propose a framework for refining a sequential specification toward a functional BSP program, the whole process being done with the help of a proof assistant. The main technical contributions of this paper are as follows: we define BH, a new homomorphic skeleton, which captures the essence of BSP computation at the algorithmic level, and …
Programming with BSP Homomorphisms
Abstract. Algorithmic skeletons in conjunction with list homomorphisms play an important role in the formal development of parallel algorithms. We have designed a notion of homomorphism dedicated to bulk synchronous parallelism. In this paper we derive two applications using this theory: sparse matrix-vector multiplication and the all nearest smaller values problem. We implement support for BSP homomorphisms in the Orléans Skeleton Library and evaluate it on these two applications.
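For the first of the two applications, sparse matrix-vector multiplication, each row's dot product is an independent reduction, which is what makes the computation a good fit for homomorphism-based parallelization. A sketch in CSR (compressed sparse row) form (illustrative Python, not the Orléans Skeleton Library's C++ API):

```python
# Sparse matrix-vector multiplication in CSR form: values holds the
# nonzeros, col_idx their column indices, and row_ptr[r]:row_ptr[r+1]
# delimits row r's entries. Each row is an independent reduction.

def csr_matvec(values, col_idx, row_ptr, x):
    y = []
    for r in range(len(row_ptr) - 1):
        start, end = row_ptr[r], row_ptr[r + 1]
        y.append(sum(values[k] * x[col_idx[k]] for k in range(start, end)))
    return y

# Matrix [[1, 0, 2],
#         [0, 3, 0]]
vals, cols, rows = [1, 2, 3], [0, 2, 1], [0, 2, 3]
print(csr_matvec(vals, cols, rows, [1, 1, 1]))   # -> [3, 3]
```

In a BSP setting, rows would be distributed across processors and the needed entries of `x` exchanged in a communication superstep before the local reductions.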
tokyo.ac.jp
Generate-Test-Aggregate (GTA for short) is a novel programming model for MapReduce, dramatically simplifying the development of efficient parallel algorithms. Under the GTA model, a parallel computation is encoded into a simple pattern: generate all candidates, test them to filter out invalid ones, and aggregate the valid ones to make the result. Once users specify their parallel computations in the GTA style, they get efficient MapReduce programs for free, owing to an automatic optimization given by the GTA theory. In this paper, we report our implementation of a GTA library to support programming in the GTA model. In this library, we provide a compact programming interface that hides the complexity of GTA's internal transformation, so that many problems can be encoded in the GTA style easily and straightforwardly. The GTA transformation and optimization mechanism implemented inside is a black box to the end users, while users can extend the library by modifying existing (or implementing new) generators, testers, or aggregators through the standard programming interfaces of the GTA library. The library supports both sequential and parallel execution on a single computer, as well as on-cluster execution with MapReduce computing engines. We evaluate our library with experiments on large data that show its efficiency, scalability, and usefulness.
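The naive semantics of the generate-test-aggregate pattern can be pinned down in a few lines (illustrative Python with hypothetical knapsack-style data; the GTA theory's whole point is fusing this exponential pipeline into an efficient MapReduce program, which this sketch does not do):

```python
from itertools import combinations, chain

# Naive Generate-Test-Aggregate semantics: enumerate all candidates,
# filter with a predicate, aggregate the survivors. This is only the
# specification the GTA theory optimizes, not the fused program.

def generate(items):                     # all subsets of the input
    return chain.from_iterable(combinations(items, r)
                               for r in range(len(items) + 1))

def gta(items, test, aggregate):
    return aggregate(c for c in generate(items) if test(c))

# Example: knapsack-style query, best total value under a weight cap
# of 5 (hypothetical (weight, value) data).
items = [(2, 3), (3, 4), (4, 5)]
best = gta(items,
           test=lambda c: sum(w for w, _ in c) <= 5,
           aggregate=lambda cs: max((sum(v for _, v in c) for c in cs),
                                    default=0))
# best == 7  (taking items (2, 3) and (3, 4))
```

Users of the library write only the three components; the fusion that avoids materializing the candidate set happens behind the interface.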
Systematic Development of Correct Bulk Synchronous Parallel Programs
Abstract — With the current generalisation of parallel architectures arises the concern of applying formal methods to parallelism. The complexity of parallel programs, compared to sequential ones, makes them more error-prone and difficult to verify. Bulk Synchronous Parallelism (BSP) is a model of computation which offers a high degree of abstraction like PRAM models and yet a realistic cost model based on structured parallelism. We propose a framework for refining a sequential specification toward a functional BSP program, the whole process being done with the help of the Coq proof assistant. To do so we define BH, a new homomorphic skeleton, which captures the essence of BSP computation at the algorithmic level and also serves as a bridge in mapping from high-level specifications to low-level BSP parallel programs.