A Dynamic Distributed Load Balancing Algorithm with Provable Good Performance
 In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1993
Cited by 47 (5 self)
The overall efficiency of parallel algorithms is most decisively affected by the strategy applied for the mapping of workload. Strategies for balancing dynamically generated workload on a processor network which are also useful for practical applications have been intensively investigated by simulations and by direct applications. This paper presents the complete theoretical analysis of a dynamically distributed load balancing strategy. The algorithm is adaptive by nature and is therefore useful for a broad range of applications. A similar algorithmic principle has already been implemented for a number of applications in the areas of combinatorial optimization, parallel programming languages and graphical animation. The algorithm performed convincingly for all these applications. In our analysis we will prove that the expected number of packets on each processor varies only by a constant factor compared with that on any other processor, independent of the generation and consumption of ...
Load Balancing in Large Networks: A Comparative Study (Extended Abstract)
 In Proceedings of the 3rd IEEE Symposium on Parallel and Distributed Processing
, 1991
Cited by 43 (7 self)
R. Luling, B. Monien, F. Ramme
Department of Mathematics and Computer Science, University of Paderborn, Germany
email: rl@uni-paderborn.de, bm@uni-paderborn.de, ram@uni-paderborn.de
Abstract: In this paper we compare six well-known and two new load balancing strategies on torus and ring topologies of different sizes and under different workload characteristics. Through simulations on a large transputer network, we show that all strategies behave differently under the workload of process and data migration. The two new algorithms based on the gradient model method are shown to be robust to both kinds of workloads. Thus, these new algorithms are good candidates for distributed operating systems running on large networks, where the workload characteristics cannot be determined in advance.
1 Introduction: We study load balancing algorithms on large MIMD multiprocessor systems. The systems we consider are homogeneous and consist of autonomous processing elements (324 transputers in our case), which...
Load Balancing for Distributed Branch & Bound Algorithms
, 1992
Cited by 35 (7 self)
In this paper, we present a new load balancing algorithm and its application to distributed branch & bound algorithms. We demonstrate the efficiency of this scheme by solving some NP-complete problems on a network of up to 256 Transputers. The parallelization of our branch & bound algorithm is fully distributed. Every processor performs the same algorithm but on a different part of the solution tree. In this case, it is necessary to distribute subproblems among the processors to achieve a well-balanced workload. We present a load balancing method which overcomes the problem of search overhead and idle times by an appropriate load model and avoids thrashing effects by a feedback control strategy. To show the performance of our strategy, we solved the Vertex Cover and the weighted Vertex Cover problem for graphs of up to 150 nodes, using highly efficient branch and bound algorithms. Although the computing times were very short on a 256 processor network, we were able to achieve a speedup ...
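For readers unfamiliar with the technique named in this abstract, the following is a minimal sequential sketch of branch & bound for the unweighted Vertex Cover problem. It is an illustration only and not the authors' algorithm: the function names, the greedy matching bound, and the branching rule are assumptions chosen for brevity, and no load balancing is shown. In a distributed setting of the kind the abstract describes, the (remaining edges, chosen vertices) pairs passed to `branch` would be the subproblems exchanged between processors.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def matching_lower_bound(edges: List[Edge]) -> int:
    """Greedy maximal matching size: every cover needs one vertex per matched edge."""
    used: Set[int] = set()
    size = 0
    for u, v in edges:
        if u not in used and v not in used:
            used.update((u, v))
            size += 1
    return size


def min_vertex_cover(edges: List[Edge]) -> Set[int]:
    """Return a minimum vertex cover (hypothetical helper, illustration only)."""
    # Incumbent: the trivial cover consisting of every endpoint.
    best: Set[int] = {v for e in edges for v in e}

    def branch(remaining: List[Edge], chosen: Set[int]) -> None:
        nonlocal best
        if not remaining:
            # All edges covered: keep the solution if it improves the incumbent.
            if len(chosen) < len(best):
                best = set(chosen)
            return
        # Bound: prune if the optimistic estimate cannot beat the incumbent.
        if len(chosen) + matching_lower_bound(remaining) >= len(best):
            return
        # Branch: one endpoint of the first uncovered edge must be in the cover.
        u, v = remaining[0]
        for pick in (u, v):
            rest = [e for e in remaining if pick not in e]
            branch(rest, chosen | {pick})

    branch(list(edges), set())
    return best
```

Calling `min_vertex_cover` on the edge list of a 5-cycle returns a cover of three vertices. The matching-based bound is one common choice; the bound and load model used in the paper itself are not given in this abstract.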
Load Balancing Strategies For Distributed Memory Machines
 Multi-Scale Phenomena and Their Simulation
, 1997
Cited by 33 (1 self)
Load balancing in large parallel systems with distributed memory is a difficult task that often influences the overall efficiency of applications substantially. A number of efficient distributed load balancing strategies have been developed in recent years. Although they are currently not generally available as part of parallel operating systems, it is often not difficult to integrate them into applications. This paper gives a classification of different load balancing problems based on application characteristics. For applications from the field of scientific computing, useful methods are described in more detail.
Combining Helpful Sets and Parallel Simulated Annealing for the Graph-Partitioning Problem
 INT. J. PARALLEL ALGORITHMS AND APPLICATIONS
, 1996
Cited by 14 (4 self)
In this paper we present a new algorithm for the k-partitioning problem which achieves an improved solution quality compared to known heuristics. We apply the principle of so-called "helpful sets", which has been shown to be very efficient for graph bisection, to the direct k-partitioning problem. The principle is extended in several ways. We introduce a new abstraction technique which shrinks the graph dynamically during runtime, leading to shorter computation times and improved solution quality. The use of stochastic methods provides further improvements in terms of solution quality. Additionally we present a parallel implementation of the new heuristic. The parallel algorithm delivers the same solution quality as the sequential one while providing reasonable parallel efficiency on MIMD systems of moderate size. All results are verified by experiments for various graphs and processor numbers.
Embedding Ladders and Caterpillars into the Hypercube
, 1998
Cited by 11 (1 self)
We present embeddings of generalized ladders as subgraphs into the hypercube. By embedding caterpillars into ladders, we obtain embeddings of caterpillars into the hypercube. In this way we obtain almost all known results concerning the embeddings of caterpillars into the hypercube. In addition we construct embeddings for some new types of caterpillars.
A Realizable Efficient Parallel Architecture
 IN PROCEEDINGS OF THE FIRST INTERNATIONAL HEINZ NIXDORF SYMPOSIUM: PARALLEL ARCHITECTURES AND THEIR EFFICIENT USE
, 1992
Cited by 10 (5 self)
The near future will present large-scale parallel computers, able to provide computing power of more than one TFlop per second. It is commonly agreed that these systems will be based on the model of asynchronous processors connected by a point-to-point network. A number of different network architectures have been presented in the past. In this paper we present an architectural principle that combines efficiency, realizability for very large systems, and the inherent reliability needed for such large parallel processing systems. The Fat Mesh of Clos network principle presented here can be scaled in many ways to fulfill the special requirements of a system design. Two realizations of this principle are presented: One is based on static switches combined to form a fully reconfigurable system. This architecture has been realized for systems containing up to 320 processors. The other realization uses dynamic routing switches. By combining wormhole routing with randomized and local adaptive ...
Distributed Combinatorial Optimization
 PROC. OF SOFSEM'93, CZECH REPUBLIC
, 1993
Cited by 10 (6 self)
This paper reports on research projects of the University of Paderborn in the field of distributed combinatorial optimization. We give an introduction to combinatorial optimization and a brief definition of some important applications. As a first exact solution method we describe branch & bound and present the results of our work on its distributed implementation. Results of our distributed implementation of iterative deepening conclude the first part about exact methods. In the second part we give an introduction to simulated annealing as a heuristic method and present results of its parallel implementation. This part is concluded with a brief description of genetic algorithms and some other heuristic methods together with some results of their distributed implementation.
A novel approach for execution of distributed tasks on mobile ad hoc networks
 In Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC)
, 2002
Cited by 10 (6 self)
Mobile ad hoc networks (MANETs) have received significant attention in the recent past owing to the proliferation in the numbers of tetherless portable devices, and rapid growth in popularity of wireless networking. Most of the MANET research community has remained focused on developing lower layer mechanisms (such as channel access and routing) for making MANETs operational. However, little attention has been paid to higher layer issues, especially application modeling. In this paper, we present a novel distributed application framework based on task graphs that enables a large class of resource-discovery-based applications on MANETs. A distributed application is represented as a complex task comprised of smaller subtasks that need to be performed on different classes of computing devices with specialized roles. Execution of a particular task on a MANET involves several logical patterns of data flow between classes of such specialized devices. These data flow patterns induce dependencies between the different classes of devices that need to cooperate to execute the application. Such dependencies yield a task graph representation of the application. We focus on the problem of executing distributed tasks on a MANET by means of ...
An optimized reconfigurable architecture for Transputer networks
 PROC. OF 25TH HAWAII INT. CONF. ON SYSTEM SCIENCES (HICSS 92)
, 1992
Cited by 10 (3 self)
This paper presents the architecture of a fully reconfigurable distributed memory computing system. It is assumed that the processors communicate via message passing on an application-specific regular network of degree four. To realize any network of this class, we use a special multistage Clos network which is built from a minimal number of equal-sized switches. These switches can be configured to realize any connection between input and output ports. To map a network onto the architecture, the process graph has to be partitioned into a number of subsets. We prove that the number of external edges between the subsets can be bounded. For that reason, it is possible to minimize the number of links and switches in our architecture without losing the ability to realize any regular network of degree four. Moreover, any user-specific network can be mapped efficiently onto the architecture. This implies an efficient configuration of the system. The multistage structure of the architecture ma...