Results 1 - 3 of 3
Automatic Inversion Generates Divide-and-Conquer Parallel Programs
"... Divideandconquer algorithms are suitable for modern parallel machines, tending to have large amounts of inherent parallelism and working well with caches and deep memory hierarchies. Among others, list homomorphisms are a class of recursive functions on lists, which match very well with the divide ..."
Abstract

Cited by 7 (5 self)
Divide-and-conquer algorithms are suitable for modern parallel machines, tending to have large amounts of inherent parallelism and working well with caches and deep memory hierarchies. Among others, list homomorphisms are a class of recursive functions on lists which match very well with the divide-and-conquer paradigm. However, direct programming with list homomorphisms is a challenge for many programmers. In this paper, we propose and implement a novel system that can automatically derive cost-optimal list homomorphisms from a pair of sequential programs, based on the third homomorphism theorem. Our idea is to reduce the extraction of list homomorphisms to the derivation of weak right inverses. We show that a weak right inverse always exists and can be automatically generated from a wide class of sequential programs. We demonstrate our system with several non-trivial examples, including the maximum prefix sum problem, the prefix sum computation, the maximum segment sum problem, and the line-of-sight problem. The experimental results show the practical efficiency of our automatic parallelization algorithm and good speedups of the generated parallel programs.
B[count++] = A[i];
"... / * copy all bigger elements from A[1..n] into B[] */ count = 0; for (i=0; i<n; i++) { sumAfter = 0; for (j=i+1; j<n; j++) { sumAfter + = A[j]; if (A[i]> sumAfter) ..."
Abstract
/* copy all bigger elements from A[1..n] into B[] */
count = 0;
for (i = 0; i < n; i++) {
    sumAfter = 0;
    for (j = i + 1; j < n; j++)
        sumAfter += A[j];
    if (A[i] > sumAfter)
        B[count++] = A[i];
}
“Inherently Sequential” Nested Loop Programs
, 2011
"... Most automatic parallelizers are based on detection of independent operations, and most of them cannot do anything if there is a true dependence between operations. However, there exists a class of programs, for which this can be surmounted based on the nature of the operations. The standard and obv ..."
Abstract
Most automatic parallelizers are based on the detection of independent operations, and most of them cannot do anything if there is a true dependence between operations. However, there exists a class of programs for which this can be surmounted based on the nature of the operations. The standard and obvious cases are reductions and scans (prefix computations), which normally occur within loops. We present a method for automatically parallelizing such “inherently” sequential programs. Our method is based on exact dependence analysis in the polyhedral model and matrix multiplication over a semiring. It handles both single loops and arbitrarily nested loops. We also deal with mutually dependent variables in the loop. Finally, we present some optimizations for the code parallelization. Although the experimental results are preliminary, they show that scan and reduction parallelizations are effective on practical applications.