Results 1 -
2 of
2
Communication Operations on Coarse-Grained Mesh Architectures
- Parallel Computing
, 1994
"... In this paper we consider three frequently arising communication operations, one-to-all, all-to-one, and all-to-all. We describe architecture-independent solutions for each operation, as well as solutions tailored towards the mesh architecture. We show how the relationship among the parameters of a ..."
Abstract
-
Cited by 9 (5 self)
- Add to MetaCart
In this paper we consider three frequently arising communication operations, one-to-all, all-to-one, and all-to-all. We describe architecture-independent solutions for each operation, as well as solutions tailored towards the mesh architecture. We show how the relationship among the parameters of a parallel machine and the relationship of these parameters to the message size determines the best solution. We discuss performance and scalability issues of our solutions on the Intel Touchstone Delta. Our results show that in order to cover a broad range of scalability for a particular operation, multiple solutions should be employed. Keywords: Parallel processing, coarse-grained machines, communication operations, scalability. Research supported in part by ARPA under contract DABT63-92-C-0022ONR. The views and conclusions contained in this paper are those of the authors and should not be interpreted as representing official policies, expressed or implied, of the U.S. government. 1 In...
Scalable s-to-p broadcasting on message-passing mpps
- IEEE Transactions on Parallel and Distributed Systems
, 1998
"... Abstract—In s-to-p broadcasting, s processors in a p-processor machine contain a message to be broadcast to all the processors, 1 ≤ s ≤ p. We present a number of different broadcasting algorithms that handle all ranges of s. We show how the performance of each algorithm is influenced by the distribu ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Abstract—In s-to-p broadcasting, s processors in a p-processor machine contain a message to be broadcast to all the processors, 1 ≤ s ≤ p. We present a number of different broadcasting algorithms that handle all ranges of s. We show how the performance of each algorithm is influenced by the distribution of the s source processors and by the relationships between the distribution and the characteristics of the interconnection network. For the Intel Paragon we show that for each algorithm and machine dimension there exist ideal distributions and distributions on which the performance degrades. For the Cray T3D we also demonstrate dependencies between distributions and machine sizes. To reduce the dependence of the performance on the distribution of sources, we propose a repositioning approach. In this approach, the initial distribution is turned into an ideal distribution of the target broadcasting algorithm. We report experimental results for the Intel Paragon and Cray T3D and discuss scalability and performance. Index Terms—Broadcasting, communication operations, message-passing MPPs, scalability.

