The Generalized Dimension Exchange Method for Load Balancing in k-ary n-cubes and Variants
, 1995
Cited by 42 (9 self)
The Generalized Dimension Exchange (GDE) method is a fully distributed load balancing method that operates in a relaxation fashion for multicomputers with a direct communication network. It is parameterized by an exchange parameter that governs the splitting of load between a pair of directly connected processors during load balancing. An optimal exchange parameter would lead to the fastest convergence of the balancing process. Previous work has resulted in the optimal parameter for the binary n-cubes. In this paper, we derive the optimal parameters for the k-ary n-cube network and its variants: the ring, the torus, the chain, and the mesh. We establish the relationships between the optimal convergence rates of the method when applied to these structures, and conclude that the GDE method favors high dimensional k-ary n-cubes. We also reveal the superiority of the GDE method to another relaxation-based method, the diffusion method. We further show through statistical simulations that the optimal parameters do speed up the GDE...
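As a rough illustration of the exchange step this abstract describes, the following Python sketch runs one sweep of a dimension-exchange style update on a binary n-cube. It is not taken from the paper: the function name `gde_sweep`, the parameter name `lam`, and the update rule (each node keeps a fraction `1 - lam` of its own load and takes a fraction `lam` of its neighbor's) are assumptions chosen for illustration.

```python
def gde_sweep(load, n, lam=0.5):
    """One sweep over all n dimensions of a binary n-cube with 2**n nodes.

    Assumed update rule (illustrative, not quoted from the paper):
    for each pair (i, j) joined by an edge of the current dimension,
        new_i = (1 - lam) * old_i + lam * old_j
        new_j = (1 - lam) * old_j + lam * old_i
    """
    load = list(load)  # work on a copy so the caller's list is untouched
    for d in range(n):
        for i in range(len(load)):
            j = i ^ (1 << d)  # neighbor of i across dimension d
            if i < j:         # handle each edge exactly once
                wi, wj = load[i], load[j]
                load[i] = (1 - lam) * wi + lam * wj
                load[j] = (1 - lam) * wj + lam * wi
    return load


# All load starts on node 0 of a 3-cube (8 nodes).
loads = [8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
balanced = gde_sweep(loads, n=3)       # lam = 0.5: equal splitting
skewed = gde_sweep(loads, n=3, lam=0.3)
print(balanced)
print(sum(skewed))
```

With `lam = 0.5` each pair averages its load, so this example ends with one unit on every node after a single sweep. Other values of `lam` still conserve the total load but leave a residual imbalance on this topology, which is why the choice of parameter matters for convergence speed.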
Parallel Logic Programming Systems
- Computing Surveys
, 1994
Cited by 29 (0 self)
Parallelizing logic programming has attracted much interest in the research community, because of the intrinsic OR- and AND-parallelisms of logic programs. One research stream aims at transparent exploitation of parallelism in existing logic programming languages such as Prolog, while the family of concurrent logic languages develops language constructs allowing programmers to express the concurrency (that is, the communication and synchronization between parallel processes) within their algorithms. This article concentrates mainly on transparent exploitation of parallelism and surveys the most mature solutions to the problems to be solved in order to obtain efficient implementations. These solutions have been implemented, and the most efficient parallel logic programming systems reach effective speedups over state-of-the-art sequential Prolog implementations. The article also addresses current and prospective research issues in extending the applicability and the efficiency of existing systems, such as models merging the transparent parallelism and the concurrent logic languages approaches, combination of constraint logic programming with parallelism, and use of highly parallel architectures.
Nearest Neighbor Algorithms for Load Balancing in Parallel Computers
, 1995
Cited by 18 (2 self)
With nearest neighbor load balancing algorithms, a processor makes balancing decisions based on localized workload information and manages workload migrations within its neighborhood. This paper compares a couple of fairly well-known nearest neighbor algorithms, the dimension-exchange (DE, for short) and the diffusion (DF, for short) methods and their several variants---the average dimension-exchange (ADE), the optimally-tuned dimension-exchange (ODE), the local average diffusion (ADF) and the optimally-tuned diffusion (ODF). The measures of interest are their efficiency in driving any initial workload distribution to a uniform distribution and their ability in controlling the growth of the variance among the processors' workloads. The comparison is made with respect to both one-port and all-port communication architectures and in consideration of various implementation strategies including synchronous/asynchronous invocation policies and static/dynamic random workload behaviors. It t...
Recursively Scalable Fat-Trees as Interconnection Networks
- In Proceedings of the Thirteenth IEEE International Phoenix Conference on Computers and Communications
, 1994
Cited by 7 (0 self)
We introduce orthogonal fat-trees as a type of interconnection network for parallel computers, and show how they can be used to maximize the number of processors in a massively parallel computer when the degree of the internal nodes and the diameter of the network are physically constrained. The basic building block of these orthogonal fat-trees is a two-level fat-tree that is obtained from a complete set of mutually orthogonal Latin Squares. As a practical application of orthogonal fat-trees, we propose a new interconnection network for a massively parallel computer based on the QR0001 Data Stream Controller Interface, an integrated circuit produced by National Semiconductor, which can sustain a throughput of up to 180 MBytes/sec. The network consists of multiple interconnected rings and is constrained to have at most 16 nodes per ring and a maximum diameter of four. Our solution yields a maximum of 51,984 processors.
1 Introduction
The properties of a parallel computer depend on the...
An Efficient 3D Optical Implementation of Binary de Bruijn Networks with Applications to Massively Parallel Computing
- Proceedings of the Second International Conference on Massively Parallel Processing Using Optical Interconnections (MPPOI) ’95
Cited by 1 (0 self)
The interconnection network structure can be the deciding and limiting factor in cost and performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault-tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it very attractive for parallel computers. Unfortunately, the hypercube possesses a major drawback, which is the complexity of its node structure: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently been receiving much attention. The BdB not only provides a logarithmic diameter, fault-tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB network is a constant node degree: the number...
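The contrast in node degree described above can be checked with a small Python sketch. It assumes the common shift-based definition of the binary de Bruijn graph on 2^n nodes (node x is linked to 2x mod 2^n and 2x + 1 mod 2^n, with links treated as undirected); the function names are hypothetical, and the papers' exact network definition may differ in detail.

```python
def de_bruijn_neighbors(x, n):
    """Undirected neighbors of node x in a binary de Bruijn graph on 2**n nodes.

    Assumed definition: x links to its two shift successors and its two
    shift predecessors; self-loops (e.g. at node 0) are dropped.
    """
    size = 1 << n
    successors = {(2 * x) % size, (2 * x + 1) % size}
    predecessors = {x >> 1, (x >> 1) | (1 << (n - 1))}
    return (successors | predecessors) - {x}


def hypercube_neighbors(x, n):
    """Neighbors of node x in a binary n-cube: flip each of the n bits."""
    return {x ^ (1 << d) for d in range(n)}


for n in (3, 6, 10):
    nodes = range(1 << n)
    bdb_degree = max(len(de_bruijn_neighbors(x, n)) for x in nodes)
    cube_degree = max(len(hypercube_neighbors(x, n)) for x in nodes)
    print(n, bdb_degree, cube_degree)
```

Under this definition the largest de Bruijn degree stays at 4 for every size tried, while the hypercube degree equals n and so grows with the network.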
Alternative Analysis for Computational Holon Architectures
, 1994
Simulator
Appendix E. Examples of Human Performance Process Hierarchical Decomposition
Appendix F. Scalable Coherent Interfaces
Appendix G. Synopses of Selected High Performance Parallel Machines
Appendix H. Glossary of Acronyms
References
List of Figures
1.1 A Holarchy
2.1 Possible Paths for Human Performance Process Model Creation
6.1 Numerical Aerodynamics Simulation Results for Embarrassingly Parallel Benchmarks
6.2 CM2: Numerical Aerodynamics Simulation Benchmark Results
6.3 Human Performance Process and Architectures
8.1 Heterogeneous Computing Environment
9.1 High Performance Systems Metrics...
Optical Binary de Bruijn Networks for Massively Parallel Computing: Design Methodology and Feasibility Study
- Appl. Opt
, 1995
The interconnection network structure can be the deciding and limiting factor in cost and performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault-tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it very attractive for parallel computers. Unfortunately, the hypercube possesses a major drawback, which is the complexity of its node structure: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently been receiving much attention. The BdB not only provides a logarithmic diameter, fault-tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB network is a constant node degree:...

