Bandwidth-efficient Collective Communication for Clustered Wide Area Systems (1999)
| Venue: | In Proc. International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun |
| Citations: | 24 - 3 self |
BibTeX
@INPROCEEDINGS{Kielmann99bandwidth-efficientcollective,
author = {Thilo Kielmann and Henri E. Bal and Sergei Gorlatch},
title = {Bandwidth-efficient Collective Communication for Clustered Wide Area Systems},
booktitle = {Proc. International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun},
year = {1999},
pages = {492--499},
publisher = {IEEE}
}
Abstract
Metacomputing infrastructures couple multiple clusters (or MPPs) via wide-area networks and thus allow parallel programs to run on geographically distributed resources. A major problem in programming such wide-area parallel applications is the difference in communication costs inside and between clusters. Latency and bandwidth of WANs are often orders of magnitude worse than those of local networks. Our MagPIe library eases wide-area parallel programming by providing an efficient implementation of MPI's collective communication operations. MagPIe exploits the hierarchical structure of clustered wide-area systems and minimizes the communication overhead over the WAN links. In this paper, we present improved algorithms for collective communication that achieve shorter completion times by simultaneously using the aggregate bandwidth of the available wide-area links. Our new algorithms split messages into multiple segments that are sent in parallel over different WAN links, thus resulting ...
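The segmentation idea in the abstract can be illustrated with a toy cost model. The sketch below is not MagPIe's algorithm or API. It assumes that each WAN link has the same latency and bandwidth, that a transfer costs latency plus size divided by bandwidth, and that segments on different links do not interfere with one another. All function and parameter names (`split_into_segments`, `transfer_time`, `parallel_links_time`) are hypothetical and chosen only for illustration.

```python
def split_into_segments(message: bytes, num_links: int) -> list[bytes]:
    """Split a message into num_links contiguous, near-equal segments."""
    if num_links < 1:
        raise ValueError("num_links must be at least 1")
    base, extra = divmod(len(message), num_links)
    segments = []
    start = 0
    for i in range(num_links):
        # The first `extra` segments carry one additional byte each.
        size = base + (1 if i < extra else 0)
        segments.append(message[start:start + size])
        start += size
    return segments


def transfer_time(num_bytes: int, latency_s: float, bandwidth_bps: float) -> float:
    """Assumed linear cost model: latency plus size over bandwidth (bytes/s)."""
    return latency_s + num_bytes / bandwidth_bps


def parallel_links_time(message: bytes, num_links: int,
                        latency_s: float, bandwidth_bps: float) -> float:
    """Completion time when each segment travels over its own WAN link.

    The links are assumed independent, so the slowest segment determines
    when the whole message has arrived.
    """
    segments = split_into_segments(message, num_links)
    return max(transfer_time(len(s), latency_s, bandwidth_bps) for s in segments)


if __name__ == "__main__":
    msg = bytes(1_000_000)  # 1 MB payload (illustrative value)
    one_link = transfer_time(len(msg), latency_s=0.01, bandwidth_bps=1_000_000)
    four_links = parallel_links_time(msg, 4, latency_s=0.01, bandwidth_bps=1_000_000)
    print(f"single link: {one_link:.2f} s, four links: {four_links:.2f} s")
```

Under these assumptions the bandwidth term shrinks with the number of links while the latency term stays fixed, so the benefit is largest for long messages. Real wide-area links differ in latency and bandwidth, which this sketch deliberately ignores.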







