MULTIPROCESSOR SCHEDULING TO ACCOUNT FOR INTERPROCESSOR COMMUNICATION
, 1991
Cited by 67 (11 self)
Interprocessor communication (IPC) overheads have emerged as the major performance limitation in parallel processing systems, due to the transmission delays, synchronization overheads, and conflicts for shared communication resources created by data exchange. Accounting for these overheads is essential for attaining efficient hardware utilization. This thesis introduces two new compile-time heuristics for scheduling precedence graphs onto multiprocessor architectures, which account for interprocessor communication overheads and interconnection constraints in the architecture. These algorithms perform scheduling and routing simultaneously to account for irregular interprocessor interconnections, and schedule all communications as well as all computations to eliminate shared resource contention. The first technique, called dynamic-level scheduling, modifies the classical HLFET list scheduling strategy to account for IPC and synchronization overheads. By using dynamically changing priorities to match nodes and processors at each step, this technique attains an equitable tradeoff between load balancing and interprocessor communication cost. This method is fast, flexible, widely targetable, and displays promising performance. The second technique, called declustering, establishes a parallelism hierarchy upon the precedence graph using graph-analysis techniques which explicitly address the tradeoff between exploiting parallelism and incurring communication cost. By systematically decomposing this hierarchy, the declustering process exposes parallelism instances in order of importance, assuring efficient use of the available processing resources. In contrast with traditional clustering schemes, this technique can adjust the level of cluster granularity to suit the characteristics of the specified architecture, leading to a more effective solution.
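The idea of matching nodes and processors with dynamically changing priorities can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes identical processors, a fully connected machine with no link contention or routing, and a simplified priority equal to the HLFET static level minus the earliest possible start time on a given processor. The function name `dynamic_level_schedule` and the data layout are hypothetical.

```python
from functools import lru_cache


def dynamic_level_schedule(exec_time, comm, num_procs):
    """Simplified list scheduler with a (node, processor) priority.

    exec_time: dict node -> execution time
    comm:      dict (u, v) -> communication cost, paid only when u and v
               run on different processors (assumption of this sketch)
    Returns dict node -> (processor, start, finish).
    """
    succs = {n: [] for n in exec_time}
    preds = {n: [] for n in exec_time}
    for u, v in comm:
        succs[u].append(v)
        preds[v].append(u)

    @lru_cache(maxsize=None)
    def static_level(n):
        # HLFET-style level: longest execution-time path from n to an exit node.
        return exec_time[n] + max((static_level(s) for s in succs[n]), default=0)

    proc_free = [0] * num_procs
    placed = {}
    while len(placed) < len(exec_time):
        best = None
        for n in exec_time:
            # Only nodes whose predecessors are all scheduled are ready.
            if n in placed or any(u not in placed for u in preds[n]):
                continue
            for p in range(num_procs):
                # Time at which all input data can be present on processor p.
                data_ready = max(
                    (placed[u][2] + (0 if placed[u][0] == p else comm[(u, n)])
                     for u in preds[n]),
                    default=0,
                )
                start = max(data_ready, proc_free[p])
                level = static_level(n) - start
                if best is None or level > best[0]:
                    best = (level, n, p, start)
        _, n, p, start = best
        placed[n] = (p, start, start + exec_time[n])
        proc_free[p] = placed[n][2]
    return placed
```

On a small fork-join graph, cheap communication spreads the two parallel branches over both processors, while expensive communication keeps every node on one processor, which is the load-balancing versus communication tradeoff the abstract describes.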
A Unified Theory Of Interconnection Network Structure
 Theoretical Computer Science
, 1986
Cited by 42 (0 self)
The relationship between the topology of interconnection networks and their functional properties is examined. Graph-theoretical characterizations are derived for delta networks, which have a simple routing scheme, and for bidelta networks, which have the delta property in both directions. Delta networks are shown to have a recursive structure. Bidelta networks are shown to have a unique topology. The definition of bidelta network is used to derive in a uniform manner the labeling schemes that define the omega networks, indirect binary cube networks, flip networks, baseline networks, modified data manipulators, and two new networks; these schemes are generalized to arbitrary radices. The labeling schemes are used to characterize networks with simple routing. In another paper, we characterize the networks with optimal performance/cost ratio. Only the multistage shuffle-exchange networks have both optimal performance/cost ratio and simple routing. This helps explain why few fundamentally...
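As an illustration of what a simple routing scheme looks like in practice, the sketch below applies standard destination-tag routing to an omega network with 2^n inputs built from 2x2 switches: each stage performs a perfect shuffle, and the switch then selects its output port from one bit of the destination address. The function name `omega_route` and the tuple layout are assumptions made for this example, not notation from the paper.

```python
def omega_route(src, dst, n):
    """Destination-tag route from input src to output dst.

    Returns a list of (stage, switch_index, output_port) tuples for an
    omega network with 2**n inputs and n stages of 2x2 switches.
    """
    mask = (1 << n) - 1
    pos = src
    hops = []
    for stage in range(n):
        # Perfect shuffle between stages: rotate the n-bit position left by one.
        pos = ((pos << 1) | (pos >> (n - 1))) & mask
        # The switch uses one destination bit per stage, most significant first.
        bit = (dst >> (n - 1 - stage)) & 1
        # Leaving on port `bit` overwrites the lowest bit of the position.
        pos = (pos & ~1) | bit
        hops.append((stage, pos >> 1, bit))
    return hops


# Example: route input 5 to output 2 in an 8-input network.
for stage, switch, port in omega_route(5, 2, 3):
    print(f"stage {stage}: switch {switch}, output port {port}")
```

The route depends only on the destination bits, so no global state or table lookup is needed at the switches.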
Reconfiguration With Time Division Multiplexed MINs for Multiprocessor Communications
 IEEE Transactions on Parallel and Distributed Systems
, 1994
Cited by 35 (29 self)
In this paper, time-division multiplexed multistage interconnection networks (TDM-MINs) are proposed for multiprocessor communications. Connections required by an application are partitioned into a number of subsets called mappings, such that connections in each mapping can be established in a MIN without conflict. Switch settings for establishing connections in each mapping are determined and stored in shift registers. By repeatedly changing switch settings, connections in each mapping are established for a time slot in a round-robin fashion. Thus, all connections required by an application may be established in a MIN in a time-division multiplexed way. TDM-MINs can emulate a completely connected network using N time slots. They can also emulate regular networks such as rings, meshes, Cube-Connected-Cycles (CCC), binary trees and n-dimensional hypercubes using 2, 4, 3, 4 and n time slots, respectively. The problem of partitioning an arbitrary set of requests into a minimal ...
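The partitioning step can be sketched as follows, assuming the MIN is an omega network with 2^n inputs and destination-tag routing, and using a greedy first-fit heuristic. The paper's own partitioning method is not given in this excerpt, so the heuristic and the names `omega_links` and `partition_into_mappings` are assumptions for illustration; first-fit does not guarantee a minimal number of mappings.

```python
def omega_links(src, dst, n):
    """Set of (stage, wire) labels used by src -> dst in a 2**n omega network.

    After stage i the wire label is the low (n - i) bits of src followed by
    the high i bits of dst; i = 0 is the input and i = n is the output.
    """
    mask = (1 << n) - 1
    return {(i, ((src << i) | (dst >> (n - i))) & mask) for i in range(n + 1)}


def partition_into_mappings(connections, n):
    """Greedy first-fit split of connections into conflict-free mappings."""
    mappings = []  # list of (connections in this mapping, wires already in use)
    for src, dst in connections:
        links = omega_links(src, dst, n)
        for conns, used in mappings:
            if used.isdisjoint(links):
                conns.append((src, dst))
                used |= links
                break
        else:
            # No existing mapping can carry this connection: open a new time slot.
            mappings.append(([(src, dst)], set(links)))
    return [conns for conns, _ in mappings]


# Example: the bit-reversal permutation on 8 nodes needs more than one slot.
bit_reversal = [(s, int(format(s, "03b")[::-1], 2)) for s in range(8)]
for slot, conns in enumerate(partition_into_mappings(bit_reversal, 3)):
    print(f"time slot {slot}: {conns}")
```

Each returned mapping corresponds to one time slot; cycling through the slots in round-robin order establishes every requested connection once per cycle.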
Generalized connection networks for parallel processor intercommunication
 IEEE Trans. Comput
, 1978
Cited by 17 (0 self)
A generalized connection network (GCN) is a switching network with N inputs and N outputs that can be set to pass any of the $N^N$ mappings of inputs onto outputs. This paper demonstrates an intimate connection between the problems of GCN construction, message routing on SIMD computers, and "resource partitioning." A GCN due to Ofman [7] is here improved to use less than 7.6N log N contact pairs, making it the minimal known construction. Any GCN construction leads to a new algorithm for the broadcast of messages among processing elements of an SIMD computer when each processing element is to receive one message. Previous approaches to message broadcasting have not handled the problem in its full generality. The algorithm arising from this paper's GCN takes 8 log N (or $13N^{1/2}$) routing steps on an N-element processor of the perfect shuffle (or mesh-type) variety. If each resource in a multiprocessing environment is assigned one output of a GCN, private buses may be provided for any number of disjoint subsets of the resources. The partitioning construction derived from this paper's GCN has 5.7N log N switches, providing an alternative to "banyan networks" with O(N log N) switches but incomplete functionality. Index Terms: Array processors, connection networks, message broadcasting, parallel algorithms, parallel processing, resource partitioning, SIMD machines.
Permutation capability of optical multistage interconnection network
 JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, PP. 60
, 2000
Cited by 12 (6 self)
In this paper, we study optical multistage interconnection networks (MINs). Advances in electro-optic technologies have made optical communication a promising networking choice to meet the increasing demands for high channel bandwidth and low communication latency of high-performance computing/communication applications. Although optical MINs hold great promise and have demonstrated advantages over their electronic counterparts, they also hold their own challenges. Due to the unique properties of optics, crosstalk in optical switches should be avoided to make them work properly. Most of the research described in the literature is for electronic MINs, and hence, crosstalk is not considered. In this paper, we introduce a new concept, semipermutation, to analyze the permutation capability of optical MINs under the constraint of avoiding crosstalk, and apply it to two examples of optical MINs, the banyan network and the Benes network. For the blocking banyan network, we show that not all semipermutations are realizable in one pass, and give the number of realizable semipermutations. For the rearrangeable Benes network, we show that any semipermutation is realizable in one pass and any permutation is realizable in two passes under the constraint of avoiding crosstalk. A routing algorithm for realizing a semipermutation in a Benes network is also presented. With the speed and bandwidth provided by current optical technology, an optical MIN clearly demonstrates a superior overall performance over its electronic MIN counterpart.
On the Communication Throughput of Buffered Multistage Interconnection Networks
 PROC. OF THE 8TH ANNUAL ACM SYMPOSIUM ON PARALLEL ALGORITHMS AND ARCHITECTURES
, 1996
Cited by 6 (0 self)
Multistage networks (MINs) are used as interconnection structures in a large number of applications. Their performance is mainly determined by their communication throughput which, in most cases, has to be investigated by time-consuming simulations or approximated by simple models. In this paper, we investigate the steady-state throughput of single-buffered multistage interconnection networks using the so-called relaxed blocking model, where a message is deleted if the receiving buffer is occupied. We derive upper and lower bounds on the throughput of MINs of arbitrary height and show that the throughput of single-buffered networks is an order of magnitude higher than the throughput of non-buffered MINs. In detail, we show that the throughput is $\Theta(n/\sqrt{\log n})$ if n is the size of the network. Because the time dynamics of finite-buffered MINs defy any Markov or semi-Markov approach, we analyze the equilibrium situation of the network and give tight upper and lower bounds on t...
Multicommodity Flows in Simple Multistage Networks
 Networks
, 1994
Cited by 6 (0 self)
In this paper we consider the integral multicommodity flow problem on directed graphs underlying two classes of multistage interconnection networks. In one direction, we consider 3-stage networks. Using existing results on (g, f)-factors of bipartite graphs, we show sufficient and necessary conditions for the existence of a solution when the network has at most 2 secondary switches. In contrast, the problem is shown to be NP-complete if the network has 3 or more secondaries. In a second direction, we introduce a recursive class of networks that includes multistage hypercubic networks (such as the omega network, the indirect binary n-cube, and the generalized cube network) as a proper subset. Networks in the new class may have an arbitrary number of stages. Moreover, each stage may contain identical switches of any arbitrary size. The notion of extra-stage networks is extended to the new class, and the problem is shown to have polynomial time solutions on r-stage networks where r = 3, o...
A universal performance factor for multicriteria evaluation of multistage interconnection networks
, 2006
Fast Packet Switching for Integrated Services
 University of Cambridge Computer Laboratory
, 1988
Cited by 4 (1 self)
As the communications industry continues to expand, two current trends are becoming apparent: the desire to support an increasing diversity of communications services (voice, video, image, text, etc.) and the consequent requirement for increased network capacity to handle the expected growth in such multiservice traffic. This dissertation describes the design, performance and implementation of a high capacity switch which uses fast packet switching to offer the integrated support of multiservice traffic. Applications for this switch are considered within the public network, in the emerging metropolitan area network and within local area networks. The Cambridge Fast Packet Switch is based upon a non-buffered, multipath switch fabric with packet buffers situated at the input ports of the switch. This results in a very simple implementation suitable for construction in current gate array technology. A simulation study of the throughput at saturation of the switch is first presented to select th...
Strictly nonblocking f-cast d-ary multilog networks under fanout and crosstalk constraints
 IEEE Transactions on Communications
, 2007
Cited by 4 (4 self)
We derive conditions which are both necessary and sufficient for the d-ary multilog switching networks to be f-cast strictly nonblocking under all combinations of fanout and crosstalk constraints. The fanout constraint tells us which stage(s) of the networks has fanout capability. The crosstalk constraint tells us whether or not two connection routes are allowed to share a link (relevant to electronic switches), or are allowed to share a switching element (crosstalk-free or not, relevant to optical switches). Thus, for any given d and f, we completely characterize the d-ary multilog network under the f-cast strictly nonblocking constraint, the link/node-blocking constraints, and the fanout constraints. The most novel contribution of this paper is the analytical technique, which combines an algebraic view of the d-ary multilog network with the max-flow min-cut theorem. Our results are more general than previously known results on several fronts: (a) d-ary networks are more general than binary networks, (b) f-cast covers both unicast (f = 1) and broadcast (f = N), (c) both link-blocking and node-blocking are considered in a unified manner, and (d) all combinations of fanout constraints are considered.