Results 1-10 of 28
A Practical Approach to Dynamic Load Balancing
, 1995
Cited by 69 (7 self)
algorithm for load balancing. The following sections elaborate on each step in the above algorithm, presenting various design decisions that one encounters. 2.1 Load Evaluation The efficacy of any load balancing scheme is directly dependent on the quality of load evaluation. Good load measurement is necessary both to determine that a load imbalance exists and to calculate how much work should be transferred to alleviate that imbalance. One can determine the load associated with a given task analytically, empirically or by a combination of those two methods. 2.1.1 Analytic Load Evaluation The load for a task is estimated based on knowledge of the time complexity of the algorithm(s) that task is executing along with the data structures on which it is operating. For example, if one knew that a task involved merge sorting a list of 64 elements, one might estimate the load to be 384, since merge sort is an O(N log2 N) sorting algorithm, and since 64 log2(64) ...
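The merge sort estimate quoted above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the excerpt; the function name `analytic_load` is an assumption made for illustration and does not come from the paper.

```python
import math


def analytic_load(n: int) -> float:
    """Estimate the load of merge sorting n elements as n * log2(n).

    Hypothetical helper: it encodes only the O(N log2 N) cost model
    mentioned in the excerpt, with the constant factor taken as 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * math.log2(n)


# 64 elements: log2(64) = 6, so the estimate is 64 * 6 = 384.
print(analytic_load(64))
```

For 64 elements this prints 384.0, matching the figure in the text.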
Mesh Partitioning: a Multilevel Balancing and Refinement Algorithm
, 1998
Cited by 56 (22 self)
Multilevel algorithms are a successful class of optimisation techniques which address the mesh partitioning problem. They usually combine a graph contraction algorithm together with a local optimisation method which refines the partition at each graph level. In this paper we present an enhancement of the technique which uses imbalance to achieve higher quality partitions. We also present a formulation of the Kernighan-Lin partition optimisation algorithm which incorporates load-balancing. The resulting algorithm is tested against a different but related state-of-the-art partitioner and shown to provide improved results. Keywords: graph-partitioning, mesh partitioning, load-balancing, multilevel algorithms. 1 Introduction The need for mesh partitioning arises naturally in many finite element (FE) and finite volume (FV) applications. Meshes composed of elements such as triangles or tetrahedra are often better suited than regularly structured grids for representing completely general ge...
An Optimal Dynamic Load Balancing Algorithm
 Daresbury Laboratory
, 1995
Cited by 40 (0 self)
The problem of redistributing work load on parallel computers is considered. An optimal redistribution algorithm, which minimises the Euclidean norm of the migrating load, is derived. The problem is further studied by modelling with the unsteady heat conduction equation. Relationship between this algorithm and other dynamic load balancing algorithms is discussed. Convergence of the algorithm for special graphs is studied. Finally numerical results on randomly generated graphs are given to demonstrate the effectiveness of the algorithm. 1. Introduction To achieve good performance on a parallel computer, it is essential to maintain a balanced work load among all the processors of the computer. Sometimes the load can be balanced statically. However in many cases the load on each processor can not be predicted a priori. One example that demonstrates the need for both static and dynamic load balancing strategies, which is also the main motivation for this paper, is in the parallel finite e...
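The excerpt states the objective (minimise the Euclidean norm of the migrating load) but not the derivation. As a hedged illustration, the sketch below uses the standard Lagrange-multiplier solution of that minimisation: solve the graph Laplacian system L * lam = load - average, then move lam[i] - lam[j] along each edge (i, j). Whether this matches the paper's own formulation in every detail is an assumption; the function name `min_norm_flow`, the edge-list input, and the use of exact fractions are all illustrative choices.

```python
from fractions import Fraction


def min_norm_flow(n, edges, load):
    """Return {(i, j): amount to move from i to j} for each edge.

    Assumes a connected graph on nodes 0..n-1 and integer loads.
    A negative amount means load moves from j to i.
    """
    avg = Fraction(sum(load), n)
    # Build the graph Laplacian L and the right-hand side b = load - average.
    L = [[Fraction(0)] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1
        L[j][j] += 1
        L[i][j] -= 1
        L[j][i] -= 1
    b = [Fraction(x) - avg for x in load]
    # L is singular (constant vectors are in its null space), so pin
    # lam[n-1] = 0 and solve the remaining (n-1) x (n-1) system by
    # Gauss-Jordan elimination on the augmented matrix.
    m = n - 1
    A = [L[r][:m] + [b[r]] for r in range(m)]
    for c in range(m):
        p = next(r for r in range(c, m) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        for r in range(m):
            if r != c and A[r][c] != 0:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * y for x, y in zip(A[r], A[c])]
    lam = [A[r][m] / A[r][r] for r in range(m)] + [Fraction(0)]
    return {(i, j): lam[i] - lam[j] for i, j in edges}


# Path 0 - 1 - 2 with all load on node 0: node 0 sends 4, node 1 passes on 2.
print(min_norm_flow(3, [(0, 1), (1, 2)], [6, 0, 0]))
```

On a path there is only one feasible flow, so the example is easy to check by hand. On graphs with cycles the same system spreads the migration across alternative routes, which is where the least-squares objective matters.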
Nearest Neighbor Algorithms for Load Balancing in Parallel Computers
, 1995
Cited by 19 (2 self)
With nearest neighbor load balancing algorithms, a processor makes balancing decisions based on localized workload information and manages workload migrations within its neighborhood. This paper compares a couple of fairly well-known nearest neighbor algorithms, the dimension-exchange (DE, for short) and the diffusion (DF, for short) methods and their several variants: the average dimension-exchange (ADE), the optimally-tuned dimension-exchange (ODE), the local average diffusion (ADF) and the optimally-tuned diffusion (ODF). The measures of interest are their efficiency in driving any initial workload distribution to a uniform distribution and their ability in controlling the growth of the variance among the processors' workloads. The comparison is made with respect to both one-port and all-port communication architectures and in consideration of various implementation strategies including synchronous/asynchronous invocation policies and static/dynamic random workload behaviors. It t...
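The excerpt names the diffusion and dimension-exchange methods without defining them. The sketch below shows the basic forms as they are commonly described in the load balancing literature, not the tuned variants (ADE, ODE, ADF, ODF) the paper compares. The function names and the parameters `alpha` and `lam` are assumptions made for illustration.

```python
def diffusion_step(load, neighbors, alpha):
    """One synchronous diffusion sweep.

    Each node i shifts alpha * (load[j] - load[i]) with every neighbor j.
    Assumes the neighbor lists are symmetric, so total load is conserved.
    """
    return [w + alpha * sum(load[j] - w for j in neighbors[i])
            for i, w in enumerate(load)]


def dimension_exchange_sweep(load, dims, lam=0.5):
    """One dimension-exchange sweep on a hypercube with 2**dims nodes.

    In dimension d, node i is paired with node i ^ (1 << d) and the pair
    moves lam times their load difference; lam = 0.5 averages the pair.
    """
    load = list(load)
    for d in range(dims):
        for i in range(len(load)):
            j = i ^ (1 << d)
            if i < j:  # handle each pair once
                diff = lam * (load[j] - load[i])
                load[i] += diff
                load[j] -= diff
    return load


# Ring of 4 nodes, all load on node 0, one diffusion sweep.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(diffusion_step([4, 0, 0, 0], ring, alpha=0.25))

# 3-dimensional hypercube, all load on node 0, one full sweep.
print(dimension_exchange_sweep([8, 0, 0, 0, 0, 0, 0, 0], dims=3))
```

With pairwise averaging, a single sweep over all dimensions of a hypercube leaves every node with the average load, whereas diffusion only approaches the uniform distribution over repeated sweeps.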
JOSTLE: Partitioning of Unstructured Meshes for Massively Parallel Machines
 Parallel Computational Fluid Dynamics: New Algorithms and Applications
, 1995
Cited by 19 (2 self)
In this paper we discuss the mesh partitioning problem in the light of the coming generation of massively parallel machines and the resulting implications for such algorithms.
Fast Priority Queues for Parallel Branch-and-Bound
 In Workshop on Algorithms for Irregularly Structured Problems, number 980 in LNCS
, 1995
Cited by 15 (2 self)
Currently used parallel best first branch-and-bound algorithms either suffer from contention at a centralized priority queue or can only approximate the best first strategy. Bottleneck free algorithms for parallel priority queues are known but they cannot be implemented very efficiently on contemporary machines. We present quite simple randomized algorithms for parallel priority queues on distributed memory machines. For branch-and-bound they are asymptotically as efficient as previously known PRAM algorithms with high probability. The simplest versions require not much more communication than the approximated branch-and-bound algorithm of Karp and Zhang. Keywords: Analysis of randomized algorithms, distributed memory, load balancing, median selection, parallel best first branch-and-bound, parallel priority queue. 1 Introduction Branch-and-bound search is an important technique for many combinatorial optimization problems. Since it can be a quite time consuming technique, paralleli...
A Parallelisable Algorithm for Optimising Unstructured Mesh Partitions
 Math. Res. Rep., Univ. of
, 1995
Cited by 12 (4 self)
A new method is described for optimising graph partitions which arise in mapping unstructured mesh calculations to parallel computers. The method employs a combination of iterative techniques to both evenly balance the workload and minimise the number and volume of interprocessor communications. It is designed to work efficiently in parallel as well as sequentially and can be applied directly to dynamically refined meshes. In addition, when combined with a fast direct partitioning technique (such as the Greedy algorithm) to give an initial partition, the resulting two-stage process proves itself to be both a powerful and flexible solution to the static graph-partitioning problem. A clustering technique can also be employed to speed up the whole process. Experiments, on graphs with up to a million nodes, indicate that the resulting code is up to an order of magnitude faster than existing state-of-the-art techniques such as Multilevel Recursive Spectral Bisection, whilst providing partitions of equivalent quality.
A Load Balancing Technique for Multiphase Computations
 Proc. of High Performance Computing '97
, 1997
Cited by 11 (5 self)
Parallel computations comprised of multiple, tightly interwoven phases of computation may require a different approach to dynamic load balancing than single-phase computations. This paper presents a load sharing method based on the view of load as a vector, rather than as a scalar. This approach allows multiphase computations to achieve higher efficiency on large-scale multicomputers than possible with traditional techniques. Results are presented for two large-scale particle simulations running on 128 nodes of an Intel Paragon and on 256 processors of a Cray T3D, respectively. INTRODUCTION Load balancing techniques already in the literature have concentrated entirely on single-phase computations (Boillat 1990; Cybenko 1989; Evans and Butt 1993; Heirich and Taylor 1995; Horton 1993; Kohring 1995; Lin and Keller 1987; Muniz and Zaluska 1995; Song 1994; Walshaw and Berzins 1995; Watts et al. 1996; Willebeek-LeMair and Reeves 1993; Williams 1991; Xu and Lau 1997). That is, they work only ...
Practical Dynamic Load Balancing for Irregular Problems
 In Parallel Algorithms for Irregularly Structured Problems: IRREGULAR '96 Proceedings
, 1996
Cited by 10 (6 self)
In this paper, we present a cohesive, practical load balancing framework that addresses many shortcomings of existing strategies. These techniques are portable to a broad range of prevalent architectures, including massively parallel machines such as the Cray T3D and Intel Paragon, shared memory systems such as the SGI Power Challenge, and networks of workstations. This scheme improves on earlier work in this area and can be analyzed using well-understood techniques. The algorithm operates using nearest-neighbor communication and inherently maintains existing locality in the application. A simple software interface allows the programmer to use load balancing with very little effort. Unlike many previous efforts in this arena, the techniques have been applied to large-scale industrial applications, one of which is described herein. 1 Introduction A number of trends in computational science and engineering have increased the need for effective dynamic load balancing techniques. In par...