ECO: Efficient Collective Operations for Communication on Heterogeneous Networks
In International Parallel Processing Symposium, 1995
PVM and other distributed computing systems have enabled the use of networks of workstations for parallel computation, but their approach of treating a network as a collection of point-to-point connections does not promote efficient communication, particularly collective communication. ECO is a package that solves this problem with programs that analyze the network and establish efficient communication patterns, which are then used by a library of collective operations. The analysis is done offline, so that after paying the one-time cost of analyzing the network, the execution of application programs is not delayed. This paper gives performance results from using ECO to implement the collective communication in CHARMM, a widely used macromolecular dynamics package. ECO facilitates the development of data parallel applications by providing a simple interface to routines that use the available heterogeneous networks efficiently. This approach gives a naive programmer the abili...
Multicast Communication in Multicomputer Networks
IEEE Transactions on Parallel and Distributed Systems, 1990
Efficient routing of messages is the key to the performance of multicomputers. Multicast communication refers to the delivery of the same message from a source node to an arbitrary number of destination nodes. While multicast communication is in high demand in many applications, it is not directly supported by all existing multicomputers; rather, it is indirectly supported by multiple one-to-one or broadcast communications, which result in more network traffic and a waste of system resources. In this paper, we study routing evaluation criteria for multicast communication under different communication paradigms. Multicast communication in multicomputers is formulated as a graph-theoretical problem. Depending on the evaluation criteria and communication mechanisms, we study three optimal multicast communication problems, which are equivalent to finding the following three subgraphs: optimal multicast path, optimal multicast cycle, and minimal Steiner tree, where the interconnectio...
On Multicast Wormhole Routing in Multicomputer Networks
In Symposium on Parallel and Distributed Processing, 1994
We show that deadlocks due to dependencies on consumption channels are a fundamental problem in multicast wormhole routing. This issue of deadlocks has not been addressed in many previously proposed multicast algorithms. We also show that deadlocks on consumption channels can be avoided by using multiple classes of consumption channels and restricting the use of consumption channels by multicast messages. In addition, we present a new multicast routing algorithm, column-path, which uses the well-known e-cube algorithm for multicast routing. Therefore, this algorithm could be implemented in existing multicomputers with minimal hardware support. We present a simulation study comparing the performance of Hamilton-path-based multicast algorithms with the proposed column-path algorithm. Our simulations indicate that the simplistic scheme of sending one copy of a multicast message to each of its destinations exhibits good performance and that the new column-path algorithm offers higher through...
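The "simplistic scheme" mentioned in the abstract, one unicast copy per destination routed with e-cube, can be illustrated with a minimal sketch. It assumes a binary hypercube whose nodes are integer labels and corrects differing address bits from the lowest dimension upward; the function names `ecube_path` and `multi_unicast` are hypothetical, and this is not the column-path algorithm from the paper.

```python
def ecube_path(src, dst, dims):
    """Return the node sequence of an e-cube (dimension-order) route.

    Assumption: nodes are integers in a dims-dimensional binary hypercube,
    and dimensions are corrected in increasing order.
    """
    path = [src]
    cur = src
    for d in range(dims):
        if (cur ^ dst) & (1 << d):   # addresses differ in dimension d
            cur ^= 1 << d            # traverse the link along dimension d
            path.append(cur)
    return path


def multi_unicast(src, dests, dims):
    """Send one separate copy per destination, each routed with e-cube."""
    return {d: ecube_path(src, d, dims) for d in dests}


if __name__ == "__main__":
    for dest, route in multi_unicast(0b000, [0b101, 0b011], 3).items():
        print(f"{dest:03b}: {[f'{n:03b}' for n in route]}")
```

Because every copy follows the same fixed dimension order, each individual unicast is deadlock-free in the usual e-cube sense; the sketch does not model consumption channels, which are the subject of the paper.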
Flow Control for Limited Buffer Multicast
IEEE Transactions on Software Engineering, 1993
This paper analyzes a multiround flow control algorithm that attempts to minimize the time required to multicast a message to a group of recipients and receive responses directly from each group member. Such a flow control algorithm may be necessary because the flurry of responses to the multicast can overflow the buffer space of the process that issued the multicast. The condition that each recipient respond directly to the multicast prevents the use of reliable multicast protocols based on software combining trees or negative acknowledgements. The flow control algorithm analyzed here directs the responding processes to hold their responses for some period of time, called the backoff time, before sending them to the originator. The backoff time depends on the number of recipients that will respond, the originator's available buffer space and buffer service time distribution, and the number of times that the originator is willing to retransmit its message. This paper develops an appro...
Resource Deadlocks and Performance of Wormhole Multicast Routing Algorithms
IEEE Trans. Parallel and Distributed Systems, 1998
We show that deadlocks due to dependencies on consumption channels are a fundamental problem in wormhole multicast routing. This type of resource deadlock has not been addressed in many previously proposed wormhole multicast algorithms. We also show that deadlocks on consumption channels can be avoided by using multiple classes of consumption channels and restricting the use of consumption channels by multicast messages. We provide upper bounds for the number of consumption channels required to avoid deadlocks. In addition, we present a new multicast routing algorithm, column-path, which is based on the well-known dimension-order routing used in many multicomputers and multiprocessors. Therefore, this algorithm could be implemented in existing multicomputers with simple changes to the hardware. Using simulations, we compare the performance of the proposed column-path algorithm with the previously proposed Hamiltonian-path-based multipath and an e-cube-based multicast routing a...
Evaluation Of Multicast Routing Algorithms For Multimedia Streams
In Proceedings of IEEE International Telecommunications Symposium, 1994
Multimedia applications place new requirements on networks as compared to traditional data applications: (i) they require relatively high bandwidths on a continuous basis for long periods of time; (ii) they involve multipoint communications and thus are expected to make heavy use of multicasting; and (iii) they tend to be interactive and thus require low latency. These requirements must be taken into account when routing multimedia traffic in a network. This report presents a performance evaluation of routing algorithms in the multimedia environment, where the requirements of multipoint communications, bandwidth, and latency must be satisfied. We present an exact solution to the optimum multicast routing problem, based on integer programming, and use this solution as a benchmark to evaluate existing heuristic algorithms, considering both performance and cost of implementation (as measured by the average run time), under realistic network and traffic scenarios.
Design and Implementation of Multicast Operations for ATM-Based High Performance Computing
Proceedings of Supercomputing '94 Conference, 1994
This paper presents the results of an investigation into the efficient implementation of multicast operations for cluster-based parallel computing on Asynchronous Transfer Mode (ATM) networks. Both software- and hardware-based multicast operations have been implemented and studied on a three-switch ATM network testbed. Performance measurements are presented that illustrate how software approaches can best take advantage of switch-based network architectures, and what additional advantage can be gained from using underlying hardware support.
A Fast Parallel Algorithm to Recognize P4-sparse Graphs
Discrete Appl. Math
A number of problems in computational semantics, group-based collaboration, automated theorem proving, networking, scheduling, and cluster analysis suggested the study of graphs featuring certain "local density" characteristics. Typically, the notion of local density is equated with the absence of chordless paths of length three or more. Recently, a new metric for local density has been proposed, allowing a number of such induced paths to occur. More precisely, a graph G is called P4-sparse if no set of five vertices in G induces more than one chordless path of length three. P4-sparse graphs generalize the well-known class of cographs, which corresponds to a more stringent local density metric. One remarkable feature of P4-sparse graphs is that they admit a tree representation unique up to isomorphism. In this work we present a parallel algorithm to recognize P4-sparse graphs and show how the data structures returned by the recognition algorithm can be used to construct the corresponding tr...
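The definition of a P4-sparse graph in the abstract can be checked directly on small graphs. The following is a minimal brute-force sketch of that definition only, not the parallel recognition algorithm the paper presents; the helper names (`make_graph`, `is_induced_p4`, `is_p4_sparse`) and the adjacency-set representation are assumptions made for illustration.

```python
from itertools import combinations


def make_graph(vertices, edges):
    """Build an undirected graph as a dict mapping vertex -> set of neighbours."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_induced_p4(adj, quad):
    """True if the four vertices induce a chordless path of length three.

    Among graphs on four vertices, only the path P4 has the degree
    sequence (1, 1, 2, 2), so checking induced degrees is sufficient.
    """
    degrees = sorted(sum(1 for v in quad if v != u and v in adj[u]) for u in quad)
    return degrees == [1, 1, 2, 2]


def is_p4_sparse(adj):
    """True if no set of five vertices induces more than one P4."""
    for five in combinations(adj, 5):
        p4_count = sum(1 for quad in combinations(five, 4) if is_induced_p4(adj, quad))
        if p4_count > 1:
            return False
    return True


if __name__ == "__main__":
    path5 = make_graph(range(5), [(0, 1), (1, 2), (2, 3), (3, 4)])
    print(is_p4_sparse(path5))  # the five-vertex path contains two induced P4s
```

The check enumerates every five-vertex subset, so it is only practical for small graphs; it is meant to make the definition concrete rather than to be efficient.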
Some Heuristics and Experiments for Building a Multicasting Tree in a High-Speed Network
1997
In this paper, we propose three strategies for building a multicasting tree in a high-speed network. These strategies can be used in any network topology. The first one is based on voting, the second on constructing a minimum spanning tree, and the third on repeatedly constructing multiple minimum spanning trees. To demonstrate the effectiveness of these strategies, we show how to apply them to hypercubes, star graphs, and star graphs with some faults. Experimental results are reported to evaluate the performance of these solutions.
Keywords: ATM network, collective communication, fault tolerance, high-speed network, hypercube, multicasting, star graph.
1 Introduction
In a network environment, it is essential for computers to be able to communicate with each other. Generally speaking, the communication patterns can be classified as unicast (one-to-one), broadcast (one-to-all), and multicast (one-to-many). The first two communication patterns are regular and are comparative...
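The abstract does not spell out how the minimum-spanning-tree strategy works, so the following is only a generic illustration of the idea, not the authors' heuristic. It assumes a hypercube with integer node labels, uses Hamming distance as the cost between two multicast members, and runs Prim's algorithm over the source and destinations; `hamming` and `mst_multicast_tree` are hypothetical names.

```python
def hamming(a, b):
    """Hop distance between two hypercube nodes (number of differing bits)."""
    return bin(a ^ b).count("1")


def mst_multicast_tree(source, destinations):
    """Prim's algorithm on the complete graph over source and destinations.

    Assumption: edge weight is the Hamming distance between two members.
    Returns a list of (parent, child) pairs describing which member
    forwards the message to which other member.
    """
    in_tree = [source]
    remaining = set(destinations) - {source}
    tree_edges = []
    while remaining:
        # Pick the cheapest link from any node already in the tree
        # to any destination not yet reached.
        _, parent, child = min(
            (hamming(p, c), p, c) for p in in_tree for c in remaining
        )
        tree_edges.append((parent, child))
        in_tree.append(child)
        remaining.remove(child)
    return tree_edges


if __name__ == "__main__":
    for parent, child in mst_multicast_tree(0b000, [0b001, 0b011, 0b111]):
        print(f"{parent:03b} -> {child:03b}")
```

Each tree edge stands for a path of `hamming(parent, child)` hops in the underlying hypercube; the sketch does not choose the intermediate nodes on those paths or account for faults.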