Results 1-9 of 9
Parallelization in Calculational Forms
In 25th ACM Symposium on Principles of Programming Languages, 1998
Abstract - Cited by 33 (25 self)
The problems involved in developing efficient parallel programs have proved harder than those in developing efficient sequential ones, both for programmers and for compilers. Although program calculation has been found to be a promising way to solve these problems in the sequential world, we believe that much more effort is needed to study its effective use in the parallel world. In this paper, we propose a calculational framework for the derivation of efficient parallel programs with two main innovations: (1) we propose a novel inductive synthesis lemma, based on which an elementary but powerful parallelization theorem is developed; (2) we make the first attempt to construct a calculational algorithm for parallelization, deriving associative operators from data type definitions and making full use of existing fusion and tupling calculations. Being more constructive, our method is not only helpful in the design of efficient parallel programs in general but also promising in the construc...
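To give a flavor of the kind of associative operator such a tupling calculation can derive (the polynomial-evaluation example and all names below are illustrative sketches, not taken from the paper): pairing a segment's partial value with a scaling factor turns sequential Horner evaluation into an associative combine that a parallel reduction can use.

```python
from functools import reduce

def horner(coeffs, x):
    """Sequential leftward evaluation of a0 + a1*x + a2*x^2 + ..."""
    value, power = 0, 1
    for a in coeffs:
        value += power * a
        power *= x
    return value

def summarize(chunk, x):
    """Tupled summary of a segment: (partial value, x ** len(chunk))."""
    value, power = 0, 1
    for a in chunk:
        value += power * a
        power *= x
    return (value, power)

def combine(left, right):
    """Associative operator obtained by tupling: the summary of a
    concatenation depends only on the summaries of its two halves."""
    lv, lp = left
    rv, rp = right
    return (lv + lp * rv, lp * rp)

def parallel_horner(chunks, x):
    """Each chunk could be summarized on a separate processor,
    then the summaries are merged by a reduction with `combine`."""
    return reduce(combine, (summarize(c, x) for c in chunks))[0]
```

For example, `parallel_horner([[1, 2], [3, 4]], 2)` agrees with `horner([1, 2, 3, 4], 2)`; the extra `power` component is exactly the auxiliary information that makes the operator associative.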
(De)Composition Rules for Parallel Scan and Reduction
In Proc. 3rd Int. Working Conf. on Massively Parallel Programming Models (MPPM'97), 1998
Abstract - Cited by 8 (1 self)
We study the use of well-defined building blocks for SPMD programming of machines with distributed memory. Our general framework is based on homomorphisms, functions that capture the idea of data-parallelism and have a close correspondence with collective operations of the MPI standard, e.g., scan and reduction. We prove two composition rules: under certain conditions, a composition of a scan and a reduction can be transformed into one reduction, and a composition of two scans into one scan. As an example of decomposition, we transform a segmented reduction into a composition of a partial reduction and an allgather. The performance gain and overhead of the proposed composition and decomposition rules are assessed analytically for the hypercube and compared with the estimates for some other parallel models.
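A small sketch of the flavor of such a composition rule (my own rendering, not the paper's formulation): summing the prefix sums of a list is a reduction composed with a scan, but summarizing each segment as (segment sum, sum of prefix sums, length) yields a single associative reduction over the summaries.

```python
from functools import reduce
from itertools import accumulate

def scan_then_reduce(xs):
    """Reference version: reduction (+) composed with scan (+)."""
    return sum(accumulate(xs))

def summarize(chunk):
    """(segment sum, sum of prefix sums within the segment, length)."""
    return (sum(chunk), sum(accumulate(chunk)), len(chunk))

def combine(left, right):
    # Every prefix sum in the right segment grows by the left segment's
    # sum, so the right contributes rt + rn * ls to the total.
    ls, lt, ln = left
    rs, rt, rn = right
    return (ls + rs, lt + rt + rn * ls, ln + rn)

def fused(chunks):
    """One reduction over per-segment summaries replaces scan + reduce."""
    return reduce(combine, map(summarize, chunks))[1]
```

On a distributed-memory machine this corresponds to replacing a scan collective followed by a reduction collective with a single reduction over slightly larger elements, which is where the analytical performance gain comes from.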
Automatic Inversion Generates Divide-and-Conquer Parallel Programs
Abstract - Cited by 7 (5 self)
Divide-and-conquer algorithms are suitable for modern parallel machines, tending to have large amounts of inherent parallelism and to work well with caches and deep memory hierarchies. Among others, list homomorphisms are a class of recursive functions on lists that match very well with the divide-and-conquer paradigm. However, direct programming with list homomorphisms is a challenge for many programmers. In this paper, we propose and implement a novel system that can automatically derive cost-optimal list homomorphisms from a pair of sequential programs, based on the third homomorphism theorem. Our idea is to reduce the extraction of list homomorphisms to the derivation of weak right inverses. We show that a weak right inverse always exists and can be automatically generated from a wide class of sequential programs. We demonstrate our system with several nontrivial examples, including the maximum prefix sum problem, the prefix sum computation, the maximum segment sum problem, and the line-of-sight problem. The experimental results show the practical efficiency of our automatic parallelization algorithm and good speedups of the generated parallel programs.
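A hand sketch of the weak-right-inverse idea on the maximum prefix sum problem the abstract mentions (function names are mine; the convention here is that the empty prefix counts, so the result is always at least 0 and at least the total sum):

```python
from functools import reduce

def mps_sum(xs):
    """Leftward pass: (maximum prefix sum incl. empty prefix, total sum)."""
    best, total = 0, 0
    for x in xs:
        total += x
        best = max(best, total)
    return (best, total)

def weak_right_inverse(pair):
    """A short list that mps_sum maps back to `pair`.

    Works because mps >= 0 and mps >= sum hold for every list."""
    m, s = pair
    return [m, s - m]

def combine(left, right):
    # The divide-and-conquer combine obtained in the style of the third
    # homomorphism theorem: re-run the cheap sequential pass on the two
    # short inverted summaries instead of on the full segments.
    return mps_sum(weak_right_inverse(left) + weak_right_inverse(right))

def parallel_mps(chunks):
    """Each chunk is summarized independently, then summaries are merged."""
    return reduce(combine, map(mps_sum, chunks))[0]
```

The point is that `combine` is derived mechanically from the sequential program plus its weak right inverse; no associative operator has to be invented by hand.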
An Analytical Method For Parallelization Of Recursive Functions, 2001
Abstract - Cited by 7 (0 self)
Programming with parallel skeletons is an attractive framework because it encourages programmers to develop efficient and portable parallel programs. However, extracting parallelism from sequential specifications and constructing efficient parallel programs using the skeletons are still difficult tasks. In this paper, we propose an analytical approach to transforming recursive functions on general recursive data structures into compositions of parallel skeletons. Using static slicing, we have defined a classification of subexpressions based on their data-parallelism. Then, skeleton-based parallel programs are generated from the classification. To extend the scope of parallelization, we have adopted more general parallel skeletons which do not require the associativity of argument functions. In this way, our analytical method can parallelize recursive functions with complex data flows.
Keywords: data parallelism, parallelization, functional languages, parallel skeletons, data flow analysis, static slice
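As a loose illustration of what "a composition of parallel skeletons" means (a toy sketch of mine, not the paper's analysis): a recursive sum-of-squares over a list splits into a map skeleton for the data-parallel subexpression and a reduce skeleton for the accumulation.

```python
from functools import reduce

def sum_squares_rec(xs):
    """Sequential recursive specification."""
    if not xs:
        return 0
    return xs[0] * xs[0] + sum_squares_rec(xs[1:])

# Skeleton versions: in a real skeleton library each of these would have
# an efficient parallel implementation behind the same interface.
def map_skel(f, xs):
    return [f(x) for x in xs]

def reduce_skel(op, xs, unit):
    return reduce(op, xs, unit)

def sum_squares_skel(xs):
    """The recursion re-expressed as map composed with reduce."""
    return reduce_skel(lambda a, b: a + b, map_skel(lambda x: x * x, xs), 0)
```

The analysis problem the paper addresses is deciding, for more tangled recursions than this one, which subexpressions belong to which skeleton.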
Interprocedural Dependence Analysis of Higher-Order Programs via Stack Reachability
Abstract - Cited by 5 (0 self)
We present a small-step abstract interpretation for the A-Normal Form λ-calculus (ANF). This abstraction has been instrumented to find data-dependence conflicts for expressions and procedures. Our goal is parallelization: when two expressions have no dependence conflicts, it is safe to evaluate them in parallel. The underlying principle for discovering dependences is Harrison's principle: whenever a resource is accessed or modified, every procedure that has a frame live on the stack has a dependence upon that resource. The abstract interpretation models the stack of a modified CESK machine by mimicking heap-allocation of continuations. Abstractions of continuation marks are employed so that the abstract semantics retain proper tail-call optimization without sacrificing dependence information.
The Third Homomorphism Theorem on Trees: Downward & Upward Lead to Divide-and-Conquer
Abstract - Cited by 2 (1 self)
Parallel programs on lists have been intensively studied. It is well known that associativity provides a good characterization of divide-and-conquer parallel programs. In particular, the third homomorphism theorem is not only useful for the systematic development of parallel programs on lists but also suitable for automatic parallelization. The theorem states that if two sequential programs iterate the same list leftward and rightward, respectively, and compute the same value, then there exists a divide-and-conquer parallel program that computes the same value as the sequential programs. While there have been many studies on lists, few have addressed the characterization and development of parallel programs on trees. Naive divide-and-conquer programs, which divide a tree at the root and compute independent subtrees in parallel, take time proportional to the height of the input tree and scale poorly with the number of processors when the input tree is ill-balanced. In this paper, we develop a method for systematically constructing scalable divide-and-conquer parallel programs on trees, in which two sequential programs lead to a scalable divide-and-conquer parallel program. We focus on paths instead of trees so as to utilize the rich results on lists, and we demonstrate that associativity provides a good characterization of scalable divide-and-conquer parallel programs on trees. Moreover, we generalize the third homomorphism theorem from lists to trees. We demonstrate the effectiveness of our method with various examples. Our results, being generalizations of known results for lists, are generic in the sense that they work well for all polynomial data structures.
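The naive divide-and-conquer scheme the abstract criticizes can be sketched as follows (a toy version of mine; the two recursive calls at each internal node are independent and could run in parallel, but the parallel span is the tree's height, hence the poor scalability on ill-balanced trees):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tree:
    value: int
    left: Optional["Tree"] = None
    right: Optional["Tree"] = None

def reduce_tree(t, op, unit):
    """Naive divide-and-conquer reduction over a binary tree.

    `op` is assumed associative with identity `unit`. The two recursive
    calls are independent, so a parallel runtime could fork them; for a
    degenerate (list-shaped) tree the span is still O(n)."""
    if t is None:
        return unit
    return op(t.value, op(reduce_tree(t.left, op, unit),
                          reduce_tree(t.right, op, unit)))
```

The paper's path-based construction exists precisely to avoid this height dependence.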
An Analytical Method For Parallelization Of Recursive Functions, 2000
Abstract
Programming with parallel skeletons is an attractive framework because it encourages programmers to develop efficient and portable parallel programs. However, extracting parallelism from sequential specifications and constructing efficient parallel programs using the skeletons are still difficult tasks. In this paper, we propose an analytical approach to transforming recursive functions on general recursive data structures into compositions of parallel skeletons. Using static slicing, we have defined a classification of subexpressions based on their data-parallelism. Then, skeleton-based parallel programs are generated from the classification. To extend the scope of parallelization, we have adopted more general parallel skeletons which do not require the associativity of argument functions. In this way, our analytical method can parallelize recursive functions with complex data flows.
Keywords: data parallelism, parallelization, functional languages, parallel skeletons, data flow analysis, static slice
Analysis of Parallelism in Recursive Functions on Recursive Data Structures
Abstract
In functional languages, iterative operations on data collections are naturally expressed using recursive functions on recursive data structures. In this paper, we present a method to extract data parallelism from recursive functions and generate data parallel programs.
On Indexed Data Structures and Functional Matrix Algorithms, 1997
Abstract
At first sight, scientific computing seems an application area ideally suited for functional programming: scientific programs are described by a constructive input/output specification. However, scientific programmers remain reluctant to consider the functional approach. The difficulties lie partly in missing applications and reluctance to change, but in our view technically in the features and performance properties that current functional programming languages offer. We study two examples: reflexive transitive closure and LU decomposition. For each, we offer what we believe to be a natural functional solution and examine its strengths and weaknesses. From this, we try to draw conclusions about the shape of scientific problems suitable for functional programming and state properties which would make a functional language more suitable for scientific programming. 1 Introduction Despite the rise in the popularity of functional programming in the last decade, scientifi...
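One of the two examples the abstract names, reflexive transitive closure, admits a compact declarative rendering; the sketch below (representation and names mine, not the paper's) works over an edge set rather than an indexed matrix.

```python
def reflexive_transitive_closure(nodes, edges):
    """Warshall-style closure over a set of (u, v) edge pairs.

    Adds (u, u) for every node, and (u, w) whenever a path
    u -> ... -> w exists through the given edges."""
    closure = set(edges) | {(n, n) for n in nodes}
    for k in nodes:
        # Connect i -> j whenever i -> k and k -> j are already present.
        closure |= {(i, j)
                    for (i, mid) in closure if mid == k
                    for (mid2, j) in closure if mid2 == k}
    return closure
```

An indexed-matrix formulation of the same algorithm, updating row and column k in place, is exactly the kind of program whose functional expression the paper examines for strengths and weaknesses.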