The Generalized Dimension Exchange Method for Load Balancing in k-ary n-cubes and Variants
, 1995
Cited by 44 (9 self)
The Generalized Dimension Exchange (GDE) method is a fully distributed load balancing method that operates in a relaxation fashion for multicomputers with a direct communication network. It is parameterized by an exchange parameter that governs the splitting of load between a pair of directly connected processors during load balancing. An optimal exchange parameter would lead to the fastest convergence of the balancing process. Previous work has resulted in the optimal exchange parameter for the binary n-cubes. In this paper, we derive the optimal exchange parameters for the k-ary n-cube network and its variants: the ring, the torus, the chain, and the mesh. We establish the relationships between the optimal convergence rates of the method when applied to these structures, and conclude that the GDE method favors high-dimensional k-ary n-cubes. We also reveal the superiority of the GDE method to another relaxation-based method, the diffusion method. We further show through statistical simulations that the optimal exchange parameters do speed up the GDE...
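The role of the exchange parameter can be illustrated with a minimal Python sketch on a binary n-cube. The update rule used here (each processor keeps a fraction `1 - lam` of its own load and takes a fraction `lam` of its partner's load, one dimension at a time) is an assumption about the general form of the method, and the names `gde_sweep`, `sweeps_to_balance`, and `lam` are hypothetical, not taken from the paper.

```python
def gde_sweep(loads, n_dims, lam):
    """One sweep over all dimensions of a binary n-cube (assumed update rule).

    In dimension d, node i is paired with node i XOR 2**d. Each node keeps
    (1 - lam) of its own load and takes lam of its partner's load.
    """
    w = list(loads)
    for d in range(n_dims):
        nxt = list(w)
        for i in range(len(w)):
            j = i ^ (1 << d)  # neighbor across dimension d
            nxt[i] = (1.0 - lam) * w[i] + lam * w[j]
        w = nxt
    return w


def sweeps_to_balance(loads, n_dims, lam, tol=1e-9, max_sweeps=1000):
    """Count sweeps until every load is within tol of the mean load."""
    mean = sum(loads) / len(loads)
    w = list(loads)
    for sweep in range(1, max_sweeps + 1):
        w = gde_sweep(w, n_dims, lam)
        if max(abs(x - mean) for x in w) <= tol:
            return sweep
    return max_sweeps


# Hypothetical test case: a 3-cube with all load initially on node 0.
initial = [80.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print("sweeps with lam = 0.5:", sweeps_to_balance(initial, 3, 0.5))
print("sweeps with lam = 0.3:", sweeps_to_balance(initial, 3, 0.3))
```

In this sketch, plain pairwise averaging (`lam = 0.5`) balances the binary cube after a single sweep, while other values need more sweeps. The sketch covers only the binary case and says nothing about the parameter values the paper derives for other topologies.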
Parallel Logic Programming Systems
 Computing Surveys
, 1994
Cited by 35 (0 self)
Parallelizing logic programming has attracted much interest in the research community, because of the intrinsic OR- and AND-parallelisms of logic programs. One research stream aims at transparent exploitation of parallelism in existing logic programming languages such as Prolog, while the family of concurrent logic languages develops language constructs allowing programmers to express the concurrency (that is, the communication and synchronization between parallel processes) within their algorithms. This article concentrates mainly on transparent exploitation of parallelism and surveys the most mature solutions to the problems to be solved in order to obtain efficient implementations. These solutions have been implemented, and the most efficient parallel logic programming systems reach effective speedups over state-of-the-art sequential Prolog implementations. The article also addresses current and prospective research issues in extending the applicability and the efficiency of existing systems, such as models merging the transparent parallelism and the concurrent logic languages approaches, combination of constraint logic programming with parallelism, and use of highly parallel architectures.
Nearest Neighbor Algorithms for Load Balancing in Parallel Computers
, 1995
Cited by 21 (2 self)
With nearest neighbor load balancing algorithms, a processor makes balancing decisions based on localized workload information and manages workload migrations within its neighborhood. This paper compares two fairly well-known nearest neighbor algorithms, the dimension-exchange (DE, for short) and the diffusion (DF, for short) methods, and their several variants: the average dimension-exchange (ADE), the optimally-tuned dimension-exchange (ODE), the local average diffusion (ADF) and the optimally-tuned diffusion (ODF). The measures of interest are their efficiency in driving any initial workload distribution to a uniform distribution and their ability to control the growth of the variance among the processors' workloads. The comparison is made with respect to both one-port and all-port communication architectures and in consideration of various implementation strategies including synchronous/asynchronous invocation policies and static/dynamic random workload behaviors. It t...
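As a rough illustration of the diffusion side of this comparison, the following Python sketch runs a synchronous diffusion step in which every processor exchanges load with all of its neighbors at once, and tracks the workload variance. The first-order update rule, the ring topology, the parameter `alpha = 0.25`, and all function names are assumptions chosen for the example, not details taken from the paper.

```python
def diffusion_step(loads, neighbors, alpha):
    """Synchronous diffusion: along every edge, move alpha times the load difference."""
    return [
        w + alpha * sum(loads[j] - w for j in neighbors[i])
        for i, w in enumerate(loads)
    ]


def variance(loads):
    """Population variance of the workloads, the imbalance measure used here."""
    mean = sum(loads) / len(loads)
    return sum((w - mean) ** 2 for w in loads) / len(loads)


def ring(n):
    """Adjacency lists of an n-node ring (hypothetical test topology)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


# Hypothetical test case: 8 processors, all load initially on processor 0.
loads = [64.0] + [0.0] * 7
nbrs = ring(8)
history = [variance(loads)]
for _ in range(20):
    loads = diffusion_step(loads, nbrs, alpha=0.25)
    history.append(variance(loads))

print("initial variance:", history[0])
print("variance after 20 steps:", history[-1])
```

The total load is conserved at every step, and the variance shrinks toward zero as the distribution approaches uniform. A dimension-exchange variant would instead pair each processor with one neighbor at a time, which is the distinction between the one-port and all-port settings.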
Recursively Scalable Fat-Trees as Interconnection Networks
 In Proceedings of the Thirteenth IEEE International Phoenix Conference on Computers and Communications
, 1994
Cited by 10 (0 self)
We introduce orthogonal fat-trees as a type of interconnection network for parallel computers, and show how they can be used to maximize the number of processors in a massively parallel computer when the degree of the internal nodes and the diameter of the network are physically constrained. The basic building block of these orthogonal fat-trees is a two-level fat-tree that is obtained from a complete set of mutually orthogonal Latin Squares. As a practical application of orthogonal fat-trees, we propose a new interconnection network for a massively parallel computer based on the QR0001 Data Stream Controller Interface, an integrated circuit produced by National Semiconductor, which can sustain a throughput of up to 180 MBytes/sec. The network consists of multiple interconnected rings and is constrained to have at most 16 nodes per ring and a maximum diameter of four. Our solution yields a maximum of 51,984 processors. 1 Introduction The properties of a parallel computer depend on the...
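For readers unfamiliar with the combinatorial object behind this construction, the Python sketch below builds a complete set of mutually orthogonal Latin squares for a prime order using the standard formula `(k * row + col) mod p`. It only illustrates what such a set is; how the paper maps the squares onto a two-level fat-tree is not shown, and the function names are hypothetical.

```python
from itertools import combinations


def mols(p):
    """Complete set of p - 1 mutually orthogonal Latin squares, p prime.

    Square k (1 <= k < p) has entry (k * row + col) mod p.
    """
    return [
        [[(k * r + c) % p for c in range(p)] for r in range(p)]
        for k in range(1, p)
    ]


def is_latin(square):
    """Every symbol must appear exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """Superimposing a and b must produce every ordered symbol pair exactly once."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n


squares = mols(5)
print("number of squares:", len(squares))
print("all Latin:", all(is_latin(s) for s in squares))
print("pairwise orthogonal:",
      all(are_orthogonal(a, b) for a, b in combinations(squares, 2)))
```

The formula relies on every nonzero `k` being invertible modulo a prime, so this sketch does not cover composite orders.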
An Efficient 3D Optical Implementation of Binary de Bruijn Networks with Applications to Massively Parallel Computing
 Proceedings of the Second International Conference on Massively Parallel Processing Using Optical Interconnections (MPPOI) ’95
Cited by 2 (0 self)
The interconnection network structure can be the deciding and limiting factor in cost and performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it very attractive for parallel computers. Unfortunately, the hypercube possesses a major drawback, which is the complexity of its node structure: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently received much attention. The BdB not only provides a logarithmic diameter, fault tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB network is a constant node degree: the number...
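The degree and link-count comparison can be checked with a short Python sketch. It assumes the usual shift-based definition of the binary de Bruijn graph (node `x` links to the nodes obtained by shifting its n-bit label left and appending 0 or 1), treats links as undirected, and drops self-loops; the function names are hypothetical.

```python
def hypercube_links(n):
    """Undirected links of the binary n-cube: labels differing in exactly one bit."""
    size = 1 << n
    return {frozenset((x, x ^ (1 << d))) for x in range(size) for d in range(n)}


def de_bruijn_links(n):
    """Undirected links of the binary de Bruijn graph on 2**n nodes.

    Self-loops are dropped and links that occur in both directions are merged.
    """
    size = 1 << n
    links = set()
    for x in range(size):
        for bit in (0, 1):
            y = ((x << 1) | bit) % size  # shift left, append bit, keep n bits
            if x != y:
                links.add(frozenset((x, y)))
    return links


def max_degree(links, size):
    """Largest number of links incident to any single node."""
    degree = [0] * size
    for link in links:
        for node in link:
            degree[node] += 1
    return max(degree)


for n in (4, 6, 8, 10):
    size = 1 << n
    cube, bdb = hypercube_links(n), de_bruijn_links(n)
    print(n, len(cube), max_degree(cube, size), len(bdb), max_degree(bdb, size))
```

Under these assumptions the hypercube degree equals `n` and grows with the network, while the de Bruijn degree stays at most 4. The link-count advantage appears only once the network is large enough; very small networks do not show it.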
Alternative Analysis for Computational Holon Architectures
, 1994
Table of contents excerpt (page numbers omitted):
... Simulator
Appendix E. Examples of Human Performance Process Hierarchical Decomposition
Appendix F. Scalable Coherent Interfaces
Appendix G. Synopses of Selected High Performance Parallel Machines
Appendix H. Glossary of Acronyms
References
List of Figures:
1.1 A Holarchy
2.1 Possible Paths for Human Performance Process Model Creation
6.1 Numerical Aerodynamics Simulation Results for Embarrassingly Parallel Benchmarks
6.2 CM2: Numerical Aerodynamics Simulation Benchmark Results
6.3 Human Performance Process and Architectures
8.1 Heterogeneous Computing Environment
9.1 High Performance Systems Metrics...
Optical Binary de Bruijn Networks for Massively Parallel Computing: Design Methodology and Feasibility Study
 Appl. Opt
, 1995
The interconnection network structure can be the deciding and limiting factor in cost and performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it very attractive for parallel computers. Unfortunately, the hypercube possesses a major drawback, which is the complexity of its node structure: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently received much attention. The BdB not only provides a logarithmic diameter, fault tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB network is a constant node degree:...
unknown title
In the past few years, microprocessor architectures have undergone a fundamental change. Driven by a variety of factors, leading designs have transitioned from single monolithic processors to “multicore” configurations. In this paper, we survey prior work on parallel processing systems, and discuss the enthusiasm for multicore designs from a psychological perspective. We argue that the semiconductor industry faces a difficult challenge. There is wide agreement that single-core processing rates have peaked, and that any further significant progress is unlikely. The shift towards parallel architectures is not necessarily a solution, however: parallel software and applications are fundamentally different from their serial counterparts, and the market for parallel computing has never been particularly large. Without a high-volume, high-profit product such as the consumer microprocessor, it is unclear where revenue will come from to drive forward with Moore’s law scaling.