Results 1-10 of 18
Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes
Journal of Parallel and Distributed Computing, 1997
Cited by 65 (7 self)
For a large class of irregular mesh applications, the structure of the mesh changes from one phase of the computation to the next. Eventually, as the mesh evolves, the adapted mesh has to be repartitioned to ensure good load balance. If this new graph is partitioned from scratch, it may lead to an excessive migration of data among processors. In this paper, we present schemes for computing repartitionings of adaptively refined meshes that perform diffusion of
Multilevel hypergraph partitioning
Applications in VLSI design, ACM/IEEE Design Automation Conference, 1997
Cited by 64 (2 self)
Traditional hypergraph partitioning algorithms compute a bisection of a graph such that the number of hyperedges that are cut by the partitioning is minimized and each partition has an equal number of vertices. The task of minimizing the cut can be considered as the objective and the requirement that the partitions will be of the same size can be considered as the constraint. In this paper we extend the partitioning problem by incorporating an arbitrary number of balancing constraints. In our formulation, a vector of weights is assigned to each vertex, and the goal is to produce a bisection such that the partitioning satisfies a balancing constraint associated with each weight, while attempting to minimize the cut. We present new multiconstraint hypergraph partitioning algorithms that are based on the multilevel partitioning paradigm. We experimentally evaluate the effectiveness of our multiconstraint partitioners on a variety of synthetically generated problems.
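The multiconstraint formulation in the abstract above can be made concrete with a small checker. This is only a sketch of how a given bisection might be evaluated, not the partitioning algorithm from the paper: the function names, the dictionary-based data layout, and the 5% tolerance are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def hyperedge_cut(hyperedges: Sequence[Sequence[int]], side: Dict[int, int]) -> int:
    """Count hyperedges that have vertices on both sides of the bisection."""
    return sum(1 for edge in hyperedges if len({side[v] for v in edge}) > 1)


def constraint_imbalances(weights: Dict[int, Sequence[float]],
                          side: Dict[int, int]) -> List[float]:
    """For each weight component, return (heavier side total) / (ideal half)."""
    ncon = len(next(iter(weights.values())))
    ratios = []
    for c in range(ncon):
        totals = [0.0, 0.0]
        for v, w in weights.items():
            totals[side[v]] += w[c]
        ideal = sum(totals) / 2
        # A component with zero total weight is trivially balanced.
        ratios.append(max(totals) / ideal if ideal > 0 else 1.0)
    return ratios


def is_feasible(weights: Dict[int, Sequence[float]], side: Dict[int, int],
                tolerance: float = 1.05) -> bool:
    """A bisection is feasible if every constraint is within the tolerance."""
    return all(r <= tolerance for r in constraint_imbalances(weights, side))


# Hypothetical example: four vertices, two weights per vertex.
weights = {0: (1, 2), 1: (1, 0), 2: (1, 0), 3: (1, 2)}
hyperedges = [(0, 1), (2, 3), (0, 1, 2)]
good = {0: 0, 1: 0, 2: 1, 3: 1}   # balances both weight components
bad = {0: 0, 3: 0, 1: 1, 2: 1}    # balances component 0 only
print(hyperedge_cut(hyperedges, good), is_feasible(weights, good))
print(hyperedge_cut(hyperedges, bad), is_feasible(weights, bad))
```

The second bisection has equal vertex counts on each side, yet it puts all of the second weight component on one side, which is exactly the situation a single-constraint partitioner cannot detect.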
Efficient Schemes for Nearest Neighbor Load Balancing
1998
Cited by 46 (13 self)
We design a general mathematical framework to analyze the properties of nearest neighbor balancing algorithms of the diffusion type. Within this framework we develop a new optimal polynomial scheme (OPS) which we show to terminate within a finite number m of steps, where m only depends on the graph and not on the initial load distribution. We show that all existing diffusion load balancing algorithms, including OPS, determine a flow of load on the edges of the graph which is uniquely defined, independent of the method and minimal in the l_2 norm. This result can be extended to edge weighted graphs. The l_2 minimality is achieved only if a diffusion algorithm is used as preprocessing and the real movement of load is performed in a second step. Thus, it is advisable to split the balancing process into the two steps of first determining a balancing flow and afterwards moving the load. We introduce the problem of scheduling a flow and present some first results on its complexity and the ...
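The two-step idea in the abstract above (first determine a balancing flow, then move the load) can be sketched with the simplest diffusion variant. The code below assumes a plain first-order diffusion update with a fixed parameter alpha; it is not the OPS scheme of the paper, and the function names and the example graph are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def diffusion_flow(edges: Sequence[Edge], load: Sequence[float],
                   alpha: float, iterations: int) -> Tuple[Dict[Edge, float], List[float]]:
    """Step 1: iterate first-order diffusion on a copy of the load and
    accumulate the net flow sent along each edge (positive means i -> j)."""
    w = list(load)
    flow = {e: 0.0 for e in edges}
    for _ in range(iterations):
        delta = [0.0] * len(w)
        for (i, j) in edges:
            y = alpha * (w[i] - w[j])   # amount exchanged in this iteration
            flow[(i, j)] += y
            delta[i] -= y
            delta[j] += y
        w = [wi + di for wi, di in zip(w, delta)]
    return flow, w


def apply_flow(load: Sequence[float], flow: Dict[Edge, float]) -> List[float]:
    """Step 2: move the real load once, according to the accumulated flow."""
    moved = list(load)
    for (i, j), y in flow.items():
        moved[i] -= y
        moved[j] += y
    return moved


# Hypothetical example: a path of three processors, all load on processor 0.
edges = [(0, 1), (1, 2)]
flow, _ = diffusion_flow(edges, [6.0, 0.0, 0.0], alpha=1 / 3, iterations=200)
print({e: round(y, 6) for e, y in flow.items()})
print([round(x, 6) for x in apply_flow([6.0, 0.0, 0.0], flow)])
```

Separating the two steps means the iteration works on numbers only, so no load is shipped back and forth while the scheme converges.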
Dynamic load distribution in the borealis stream processor
In ICDE, 2005
Cited by 46 (5 self)
Distributed and parallel computing environments are becoming cheap and commonplace. The availability of large numbers of CPUs makes it possible to process more data at higher speeds. Stream-processing systems are also becoming more important, as broad classes of applications require results in real-time. Since load can vary in unpredictable ways, exploiting the abundant processor cycles requires effective dynamic load distribution techniques. Although load distribution has been extensively studied for the traditional pull-based systems, it has not yet been fully studied in the context of push-based continuous query processing. In this paper, we present a correlation-based load distribution algorithm that aims at avoiding overload and minimizing end-to-end latency by minimizing load variance and maximizing load correlation. While finding the optimal solution for such a problem is NP-hard, our greedy algorithm can find reasonable solutions in polynomial time. We present both a global algorithm for initial load distribution and a pairwise algorithm for dynamic load migration.
Parallel Dynamic Graph-Partitioning for Unstructured Meshes
1997
Cited by 13 (4 self)
A parallel method for the dynamic partitioning of unstructured meshes is described. The method introduces a new iterative optimisation technique known as relative gain optimisation which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of a quality equivalent to or higher than that of static partitioners (which do not reuse the existing partition), and does so much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the amount of data migration compared to the static partitioners.
A Comparison of Some Dynamic Load-Balancing Algorithms for a Parallel Adaptive Flow Solver
Parallel Computing, 2000
Cited by 12 (8 self)
In this paper we contrast the performance of three different parallel dynamic load-balancing algorithms when used in conjunction with a particular parallel, adaptive, time-dependent, 3-d flow solver that has recently been developed at Leeds. An overview of this adaptive solver is given along with a description of a new dynamic load-balancing algorithm. The effectiveness of this algorithm is then assessed when it is coupled with the solver to tackle a model 3-d flow problem in parallel. Two alternative parallel dynamic load-balancing algorithms are also described and tested on the same flow problem.
1 Introduction
The use of distributed memory parallel computers for the solution of large, complex computational mechanics problems has great potential for both significant increases in mesh sizes and the significant reduction of solution times. For transient problems accuracy and efficiency constraints also require the use of mesh adaptation since solution features on different length scal...
Optimal and Alternating-Direction Loadbalancing Schemes
1999
Cited by 12 (5 self)
We discuss iterative nearest neighbor load balancing schemes on processor networks which are represented by a Cartesian product of graphs, e.g. tori or hypercubes. By the use of the Alternating-Direction Loadbalancing scheme, the number of load balance iterations decreases by a factor of 2 for this type of graph. The resulting flow is analyzed theoretically and it can be very high for certain cases. Therefore, we furthermore present the Mixed-Direction scheme which needs the same number of iterations but results in a much smaller flow. Apart from that, we present a simple optimal diffusion scheme for general graphs which calculates a minimal balancing flow in the l_2 norm. The scheme is based on the spectrum of the graph representing the network and needs only m - 1 iterations in order to balance the load with m being the number of distinct eigenvalues.
1 Introduction
We consider the load balancing problem in a synchronous, distributed processor network. Each node of the ne...
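The claim that m - 1 iterations suffice, with m the number of distinct eigenvalues, can be checked on a small hypercube. The sketch below assumes the update w <- w - (1/lambda_k) L w, applied once for each distinct nonzero Laplacian eigenvalue lambda_k; this form is consistent with the abstract but is an assumption here, as are the function names. It also relies on the known fact that the Laplacian of a d-dimensional hypercube has the distinct eigenvalues 0, 2, 4, ..., 2d.

```python
from typing import Dict, List, Sequence


def hypercube_neighbors(d: int) -> Dict[int, List[int]]:
    """Adjacency lists of the d-dimensional hypercube (nodes differ in one bit)."""
    return {v: [v ^ (1 << b) for b in range(d)] for v in range(2 ** d)}


def spectral_balance(neighbors: Dict[int, List[int]], load: Sequence[float],
                     nonzero_eigenvalues: Sequence[float]) -> List[float]:
    """Apply one step w <- w - (1/lam) * L w per distinct nonzero eigenvalue."""
    w = list(load)
    for lam in nonzero_eigenvalues:
        # (L w)_i = deg(i) * w_i - sum of w over the neighbors of i
        lw = [len(neighbors[i]) * w[i] - sum(w[j] for j in neighbors[i])
              for i in range(len(w))]
        w = [wi - x / lam for wi, x in zip(w, lw)]
    return w


d = 3
neighbors = hypercube_neighbors(d)
eigenvalues = [2.0 * k for k in range(1, d + 1)]   # 2, 4, 6: m - 1 = 3 steps
start = [8.0] + [0.0] * (2 ** d - 1)               # all load on node 0
print(spectral_balance(neighbors, start, eigenvalues))
```

Intermediate vectors in this iteration can contain negative entries, so it is best read as a way to compute the balanced state (and, by tracking exchanges, a flow) rather than as a schedule for moving real load directly.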
Multilevel Algorithms for Generating Coarse Grids for Multigrid Method
2001
Cited by 8 (2 self)
Geometric Multigrid methods have gained widespread acceptance for solving large systems of linear equations, especially for structured grids. One of the
Dynamic Re-Allocation of Meshes for Parallel Finite Element Applications
1998
Cited by 3 (3 self)
... with adaptive meshing (as in local mesh refinement and coarsening) or adaptive remeshing. However, as will be seen when considering the applications included within the DRAMA project, a need for dynamic load balancing arises in applications with fixed meshes where computational and/or communications costs vary greatly as the simulation progresses. Major advances have been made in recent years in the two areas which form the starting point for the project activities: the development of parallel mesh-partitioning algorithms suitable for dynamic repartitioning (re-allocation of submeshes to processors at runtime); the migration and optimisation of industrial-strength simulation codes to HPC platforms using the message-passing paradigm. However, most industrial-strength parallel simulations using large processor numbers are performed with static partitioning and non-adaptive meshing, or, when adaptive meshing is used, with a sequentialised repartitioning phase which greatly reduces the para
Parallel Multilevel Diffusion Algorithms for Repartitioning of Adaptive Meshes
1997
Cited by 3 (0 self)
Graph partitioning has been shown to be an effective way to divide a large computation over an arbitrary number of processors. A good partitioning can ensure load balance and minimize the communication overhead of the computation by partitioning an irregular mesh into p equal parts while minimizing the number of edges cut by the partition. For a large class of irregular mesh applications, the structure of the graph changes from one phase of the computation to the next. Eventually, as the graph evolves, the adapted mesh has to be repartitioned to ensure good load balance. Failure to do so will lead to higher parallel run time. This repartitioning needs to maintain a low edge-cut in order to minimize communication overhead in the follow-on computation. It also needs to minimize the time for physically migrating data from one processor to another since this time can dominate overall run time. Finally, it must be fast and scalable since it may be necessary to repartition frequently...