Results 1 - 4 of 4
The Static Parallelization of Loops and Recursions
In Proc. 11th Int. Symp. on High Performance Computing Systems (HPCS'97), 1997
Abstract

Cited by 3 (2 self)
We demonstrate approaches to the static parallelization of loops and recursions on the example of the polynomial product. Phrased as a loop nest, the polynomial product can be parallelized automatically by applying a space-time mapping technique based on linear algebra and linear programming. One can choose a parallel program that is optimal with respect to some objective function like the number of execution steps, processors, channels, etc. However, at best, linear execution time complexity can be attained. Through phrasing the polynomial product as a divide-and-conquer recursion, one can obtain a parallel program with sublinear execution time. In this case, the target program is not derived by an automatic search but given as a program skeleton, which can be deduced by a sequence of equational program transformations. We discuss the use of such skeletons, compare and assess the models in which loops and divide-and-conquer recursions are parallelized and comment on the performance pr...
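To make the two phrasings in the abstract concrete, here is a minimal sequential sketch in Python. It is not taken from the paper: the function names, the coefficient-list representation (lowest degree first), and the plain four-subproduct split are all assumptions made for illustration, and the paper's actual skeleton may differ. The loop nest is the form a space-time mapping would take as input; in the recursive form the four subproducts are independent, which is the source of the parallelism in a divide-and-conquer formulation.

```python
def poly_mul_loops(a, b):
    """Polynomial product as a doubly nested loop.

    a and b are coefficient lists, lowest degree first.
    """
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


def poly_add(p, q):
    """Coefficient-wise sum of two polynomials of possibly different length."""
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]


def poly_mul_dc(a, b):
    """Divide-and-conquer product (illustrative four-way split).

    Assumes len(a) == len(b) and that the length is a power of two.
    """
    n = len(a)
    if n == 1:
        return [a[0] * b[0]]
    h = n // 2
    a_lo, a_hi = a[:h], a[h:]
    b_lo, b_hi = b[:h], b[h:]
    # The four subproducts do not depend on each other, so a parallel
    # implementation could evaluate them simultaneously.
    lo = poly_mul_dc(a_lo, b_lo)
    mid = poly_add(poly_mul_dc(a_lo, b_hi), poly_mul_dc(a_hi, b_lo))
    hi = poly_mul_dc(a_hi, b_hi)
    # Combine: a*b = lo + mid * x^h + hi * x^(2h), where 2h == n.
    c = [0] * (2 * n - 1)
    for k, v in enumerate(lo):
        c[k] += v
    for k, v in enumerate(mid):
        c[k + h] += v
    for k, v in enumerate(hi):
        c[k + n] += v
    return c
```

Both functions compute the same coefficients; they differ only in the dependence structure that a parallelizer would exploit.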
ASIC/FPGA CAD Tool for Automated Systolic Algorithm Mapping
Abstract
A specialized ASIC/FPGA CAD tool is described that will take a user's high level code description of an algorithm and automatically generate abstract latency-optimal systolic arrays. Several new systolic mapping examples of the Lyapunov matrix equation (find X, given AX+XB=C) obtained using this CAD tool are described.
Automatic Generation of Systolic Array Designs For Reconfigurable Computing
Abstract
The problem of rapidly generating optimal parallel circuit implementations from high level, formal descriptions of affinely indexed algorithms is addressed here in the context of reconfigurable FPGA-based computing. A specialized software tool, SPADE, is described that will take a user's high level code description of his algorithms and automatically generate an abstract latency-optimal, locally-connected parallel array of elemental processing elements. A design example, the Faddeev algorithm, is used to illustrate the tool's capabilities.
Automatic Latency-Optimal Design of FPGA-based Systolic Arrays
Abstract
"Systolic" algorithms have been shown to be suitable for a very large range of structured problems (e.g., linear algebra, graph theory, computational geometry, number-theoretic algorithms, string matching, sorting/searching, dynamic programming,