Results 1 - 5 of 5
Extracting and Implementing List Homomorphisms in Parallel Program Development
Science of Computer Programming, 1997
"... this paper, we study functions called list homomorphisms, which represent a particular pattern of parallelism. ..."
Abstract

Cited by 12 (0 self)
In this paper, we study functions called list homomorphisms, which represent a particular pattern of parallelism.
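As a rough illustration of the pattern this abstract names: a list homomorphism is a function h satisfying h(x ++ y) = combine(h(x), h(y)) for an associative operator, which means it can be evaluated by divide and conquer. The sketch below is not from the paper; the names `hom`, `combine`, and `unit` are illustrative assumptions.

```python
def hom(combine, unit, f, xs):
    """Evaluate a list homomorphism by divide and conquer.

    combine: associative binary operator with identity `unit`
    f:       function applied to each singleton element
    """
    if not xs:
        return unit
    if len(xs) == 1:
        return f(xs[0])
    mid = len(xs) // 2
    # The two halves are independent, so on a parallel machine
    # they could be reduced concurrently.
    return combine(hom(combine, unit, f, xs[:mid]),
                   hom(combine, unit, f, xs[mid:]))

# Example: summation is the homomorphism with combine = +, f = identity.
print(hom(lambda a, b: a + b, 0, lambda x: x, [1, 2, 3, 4]))  # 10
```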
Compiler Technology for Parallel Scientific Computation
, 1994
"... There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving largescale problems in science and engineering. Yet, the use of paralle ..."
Abstract

Cited by 2 (1 self)
There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with the Equational Programming Language (EPL).
Simultaneous Parallel Reduction on SIMD Machines
"... Proper distribution of operations among parallel processors in a large scientific computation executed on a distributedmemory machine can significantly reduce the total computation time. In this paper we propose an operation, called simultaneous parallel reduction(SPR), that is amenable to such ..."
Abstract
Proper distribution of operations among parallel processors in a large scientific computation executed on a distributed-memory machine can significantly reduce the total computation time. In this paper we propose an operation, called simultaneous parallel reduction (SPR), that is amenable to such optimization. SPR performs reduction operations in parallel, each operation reducing a one-dimensional consecutive section of a distributed array. Each element of the distributed array is used as an operand in many reductions executed concurrently over overlapping sections of the array. SPR is distinct from the more commonly considered parallel reduction, which concurrently evaluates a single reduction. In this paper we consider SPR on Single Instruction Multiple Data (SIMD) machines with different interconnection networks. We focus on SPR over sections whose size is not a power of 2, with the result shifted relative to the arguments. Several algorithms achieving some of the lower bounds on SPR complexity are presented under various assumptions about the properties of the binary operator of the reduction and of the communication cost of the target architectures.
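A minimal sequential sketch of the semantics this abstract describes, assuming SPR amounts to reducing every length-w consecutive section of the array with an associative operator; the function name `spr` and the window parameter `w` are illustrative, and the paper's parallel SIMD algorithms are not reproduced here.

```python
from functools import reduce

def spr(op, xs, w):
    """Reference (sequential) semantics of simultaneous parallel reduction:
    reduce every consecutive section of length w in xs with the associative
    operator op. Each element of xs participates in up to w overlapping
    reductions, which a parallel implementation would evaluate concurrently.
    """
    return [reduce(op, xs[i:i + w]) for i in range(len(xs) - w + 1)]

# Example: sums over all consecutive sections of length 3.
print(spr(lambda a, b: a + b, [1, 2, 3, 4, 5], 3))  # [6, 9, 12]
```

A parallel version would exploit the associativity of `op` to share partial results between the overlapping sections rather than recomputing each window from scratch.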