Results 1 - 10
of
56
Limits on Interconnection Network Performance
- IEEE Transactions on Parallel and Distributed Systems
, 1991
"... As the performance of interconnection networks becomes increasingly limited by physical constraints in high-speed multiprocessor systems, the parameters of high-performance network design must be reevaluated, starting with a close examination of assumptions and requirements. This paper models networ ..."
Abstract
-
Cited by 166 (4 self)
- Add to MetaCart
As the performance of interconnection networks becomes increasingly limited by physical constraints in high-speed multiprocessor systems, the parameters of high-performance network design must be reevaluated, starting with a close examination of assumptions and requirements. This paper models network latency, taking both switch and wire delays into account. A simple closed form expression for contention in buffered, direct networks is derived and is found to agree closely with simulations. The model includes the effects of packet size and communication locality. Network analysis under various constraints (such as fixed bisection width, fixed channel width, and fixed node size) and under different workload parameters (such as packet size, degree of communication locality, and network request rate) reveals that performance is highly sensitive to these constraints and workloads. A twodimensional network has the lowest latency only when switch delays and network contention are ignored, but...
Programming Parallel Algorithms
, 1996
"... In the past 20 years there has been treftlendous progress in developing and analyzing parallel algorithftls. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some ofthese algorithms are efficient only in a th ..."
Abstract
-
Cited by 163 (7 self)
- Add to MetaCart
In the past 20 years there has been treftlendous progress in developing and analyzing parallel algorithftls. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some ofthese algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding ofparallelism but in several cases has led to improvements in sequential algorithms. Unf:ortunately there has been less success in developing good languages f:or prograftlftling parallel algorithftls, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
Static Scheduling Algorithms for Allocating Directed Task Graphs to Multiprocessors
, 1999
"... Devices]: Modes of Computation---Parallelism and concurrency General Terms: Algorithms, Design, Performance, Theory Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs This research was supported ..."
Abstract
-
Cited by 142 (4 self)
- Add to MetaCart
Devices]: Modes of Computation---Parallelism and concurrency General Terms: Algorithms, Design, Performance, Theory Additional Key Words and Phrases: Automatic parallelization, DAG, multiprocessors, parallel processing, software tools, static scheduling, task graphs This research was supported by the Hong Kong Research Grants Council under contract numbers HKUST 734/96E, HKUST 6076/97E, and HKU 7124/99E. Authors' addresses: Y.-K. Kwok, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong; email: ykwok@eee.hku.hk; I. Ahmad, Department of Computer Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. Permission to make digital / hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and / or a fee. 2000 ACM 0360-0300/99/1200--0406 $5.00 ACM Computing Surveys, Vol. 31, No. 4, December 1999 1.
Performance Tradeoffs In Multithreaded Processors
- IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
, 1991
"... ... utilization. By maintaining multiple process contexts in hardware and switching among them in a few cycles, multithreaded processors can overlap computation with memory accesses and reduce processor idle time. This paper presents an analytical performance model for multithreaded processors th ..."
Abstract
-
Cited by 111 (5 self)
- Add to MetaCart
... utilization. By maintaining multiple process contexts in hardware and switching among them in a few cycles, multithreaded processors can overlap computation with memory accesses and reduce processor idle time. This paper presents an analytical performance model for multithreaded processors that includes cache interference, network contention, context-switching overhead, and data-sharing effects. The model is validated through our own simulations and by comparison with previously published simulation results. Our results indicate that processors can substantially benefit from multithreading, even in systems with small caches. Large caches yield close to full processor utilization with as few as two to four contexts, while small caches may require up to four times as many contexts. Increased network contention due to multithreading has a major effect on performance. The available network bandwidth and the context-switching overhead limits the best possible utilization.
A Unified Theory Of Interconnection Network Structure
- Theoretical Computer Science
, 1986
"... The relationship between the topology of interconnection networks and their functional properties is examined. Graph theoretical characterizations are derived for delta networks, which have a simple routing scheme, and for bidelta networks, which have the delta property in both directions. Delta net ..."
Abstract
-
Cited by 41 (0 self)
- Add to MetaCart
The relationship between the topology of interconnection networks and their functional properties is examined. Graph theoretical characterizations are derived for delta networks, which have a simple routing scheme, and for bidelta networks, which have the delta property in both directions. Delta networks are shown to have a recursive structure. Bidelta networks are shown to have a unique topology. The definition of bidelta network is used to derive in a uniform manner the labeling schemes that define the omega networks, indirect binary cube networks, flip networks, baseline networks, modified data manipulators, and two new networks; these schemes are generalized to arbitrary radices. The labeling schemes are used to characterize networks with simple routing. In another paper, we characterize the networks with optimal performance/cost ratio. Only the multistage shuffle-exchange networks have both optimal performance/cost ratio and simple routing. This helps explain why few fundamentally...
Routing in Multi-hop Packet Switching Networks: Gbps Challenge
- IEEE Network
, 1995
"... The paper is a survey of networking solutions that have been proposed for high-speed packet-switched applications. Using these solutions as examples, we identify the specific problems resulting from very high transmission rates and explain how these problems influence the design of high-speed networ ..."
Abstract
-
Cited by 24 (0 self)
- Add to MetaCart
The paper is a survey of networking solutions that have been proposed for high-speed packet-switched applications. Using these solutions as examples, we identify the specific problems resulting from very high transmission rates and explain how these problems influence the design of high-speed networks and protocols. We conclude that the solutions based on deflection routing are the most promising ones and we suggest a number of directions for their evolution. 1 Introduction Not so long ago, computer networks with high transmission rates (e.g. several Mb/s) were naturally confined to local domains. Although such (and higher) transmission rates were available in telephony on long distances, they were used on a point-to-point basis. Concepts of highly-connected fast networks spanning geographical areas larger than the acreage typically covered by a single institution are relatively new and, besides the emerging atm technology, there are no standard commercially available solutions that c...
An Optical Multi-Mesh Hypercube: A Scalable Optical Interconnection Network for Massively Parallel Computing
- IEEE-OSA Journal of Lightwave Technology
, 1994
"... A new interconnection network for massively parallel computing is introduced. This network is called an Optical Multi-Mesh Hypercube (OMMH) network. The OMMH integrates positive features of both hypercube (small diameter, high connectivity, symmetry, simple control and routing, fault tolerance, etc. ..."
Abstract
-
Cited by 24 (11 self)
- Add to MetaCart
A new interconnection network for massively parallel computing is introduced. This network is called an Optical Multi-Mesh Hypercube (OMMH) network. The OMMH integrates positive features of both hypercube (small diameter, high connectivity, symmetry, simple control and routing, fault tolerance, etc.) and mesh (constant node degree and scalability) topologies and at the same time circumvents their limitations (e.g., the lack of scalability of hypercubes, and the large diameter of meshes). The OMMH can maintain a constant node degree regardless of the increase in the network size. In addition, the flexibility of the OMMH network makes it well-suited for optical implementations. This paper presents the OMMH topology, analyzes its architectural properties and potentials for massively parallel computing, and compares it to the hypercube. Moreover, it also presents a three-dimensional optical design methodology based on free-space optics. The proposed optical implementation has totally space...
Least Common Ancestor Networks
, 1993
"... Least Common Ancestor Networks (LCANs) are introduced and shown to be a class of networks that include fat-trees, baseline networks, SW-banyans and the router networks of the TRAC 1.1 and 2.0, and the CM-5. Some LCAN properties are stated and the permutation routing capabilities of an important subc ..."
Abstract
-
Cited by 12 (2 self)
- Add to MetaCart
Least Common Ancestor Networks (LCANs) are introduced and shown to be a class of networks that include fat-trees, baseline networks, SW-banyans and the router networks of the TRAC 1.1 and 2.0, and the CM-5. Some LCAN properties are stated and the permutation routing capabilities of an important subclass are analyzed. Simulation results for three permutation classes verify the accuracy of an iterative solution for a randomized routing strategy.
A Spanning Multichannel Linked Hypercube: A Gradually Scalable Optical Interconnection Network for Massively Parallel Computing
- IEEE Trans. Parallel and Distributed Systems
, 1998
"... A new, scalable interconnection topology called the Spanning Multichannel Linked Hypercube (SMLH) is proposed. This proposed network is very suitable to massively parallel systems and is highly amenable to optical implementation. The SMLH uses the hypercube topology as a basic building block and c ..."
Abstract
-
Cited by 11 (3 self)
- Add to MetaCart
A new, scalable interconnection topology called the Spanning Multichannel Linked Hypercube (SMLH) is proposed. This proposed network is very suitable to massively parallel systems and is highly amenable to optical implementation. The SMLH uses the hypercube topology as a basic building block and connects such building blocks using two-dimensional multichannel links (similar to spanning buses). In doing so, the SMLH combines positive features of both the hypercube (small diameter, high connectivity, symmetry, simple routing, and fault tolerance) and the spanning bus hypercube (SBH) (constant node degree, scalability, and ease of physical implementation), while at the same time circumventing their disadvantages. The SMLH topology supports many communication patterns found in different classes of computation, such as bus-based, mesh-based, and tree-based problems, as well as hypercube-based problems. A very attractive feature of the SMLH network is its ability to support a large n...

