Results 1 - 10 of 18
Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes
Journal of Parallel and Distributed Computing, 1997
Cited by 56 (7 self)
For a large class of irregular mesh applications, the structure of the mesh changes from one phase of the computation to the next. Eventually, as the mesh evolves, the adapted mesh has to be repartitioned to ensure good load balance. If this new graph is partitioned from scratch, it may lead to an excessive migration of data among processors. In this paper, we present schemes for computing repartitionings of adaptively refined meshes that perform diffusion of
Multilevel Hypergraph Partitioning
2002
Cited by 55 (2 self)
Introduction. Hypergraph partitioning is an important problem with extensive application to many areas, including VLSI design [Alpert and Kahng, 1995], efficient storage of large databases on disks [Shekhar and Liu, 1996], and data mining [Mobasher et al., 1996, Karypis et al., 1999b]. The problem is to partition the vertices of a hypergraph into k equal-size parts, such that the number of hyperedges connecting vertices in different parts is minimized. During the course of VLSI circuit design and synthesis, it is important to be able to divide the system specification into clusters so that the inter-cluster connections are minimized. This step has many applications including design packaging, HDL-based synthesis, design optimization, rapid prototyping, simulation, and testing. Many rapid prototyping systems use partitioning to map a complex circuit onto hundreds of interconnected FPGAs. Such partitioning instances are challenging because the timing, area, and I/O resource utilization
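The objective stated in the abstract above (k equal-size parts, as few hyperedges as possible spanning more than one part) can be made concrete with a small sketch. The function names and the toy hypergraph are illustrative assumptions, not taken from the paper, and the code only evaluates a given partition; it does not compute one.

```python
from collections import Counter

def hyperedge_cut(hyperedges, part):
    """Count hyperedges whose vertices lie in more than one part."""
    return sum(1 for edge in hyperedges if len({part[v] for v in edge}) > 1)

def part_sizes(part, k):
    """Return the number of vertices assigned to each of the k parts."""
    counts = Counter(part.values())
    return [counts.get(p, 0) for p in range(k)]

# Hypothetical toy hypergraph: 6 vertices, 4 hyperedges.
hyperedges = [(0, 1, 2), (2, 3), (3, 4, 5), (0, 5)]
# Hypothetical 2-way partition: vertex -> part index.
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

cut = hyperedge_cut(hyperedges, part)
sizes = part_sizes(part, 2)
print("cut hyperedges:", cut, "part sizes:", sizes)
```

A partitioner would search over assignments to reduce `cut` while keeping the entries of `sizes` equal or nearly equal.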
Efficient Schemes for Nearest Neighbor Load Balancing
1998
Cited by 37 (13 self)
We design a general mathematical framework to analyze the properties of nearest neighbor balancing algorithms of the diffusion type. Within this framework we develop a new optimal polynomial scheme (OPS) which we show to terminate within a finite number m of steps, where m only depends on the graph and not on the initial load distribution. We show that all existing diffusion load balancing algorithms, including OPS, determine a flow of load on the edges of the graph which is uniquely defined, independent of the method and minimal in the $l_2$-norm. This result can be extended to edge weighted graphs. The $l_2$-minimality is achieved only if a diffusion algorithm is used as preprocessing and the real movement of load is performed in a second step. Thus, it is advisable to split the balancing process into the two steps of first determining a balancing flow and afterwards moving the load. We introduce the problem of scheduling a flow and present some first results on its complexity and the ...
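The two-step idea in the abstract above (first determine a balancing flow by diffusion, then move the load) can be illustrated with a minimal sketch of a first-order diffusion iteration. The graph, the initial load, and the parameter `alpha` are assumptions chosen for illustration; this is not the paper's OPS scheme, only the basic diffusion step that accumulates a flow per edge.

```python
def diffuse(edges, load, alpha, steps):
    """First-order diffusion on an undirected graph.

    In every step, edge (i, j) carries alpha * (w[i] - w[j]) from i to j,
    computed from the loads at the start of the step. The accumulated
    per-edge totals form the balancing flow.
    """
    w = list(load)
    flow = {e: 0.0 for e in edges}
    for _ in range(steps):
        # Compute all transfers first so the update is synchronous.
        delta = {(i, j): alpha * (w[i] - w[j]) for (i, j) in edges}
        for (i, j), d in delta.items():
            w[i] -= d
            w[j] += d
            flow[(i, j)] += d
    return w, flow

# Hypothetical example: a ring of 4 processors, all load on processor 0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
final_load, flow = diffuse(ring, [8.0, 0.0, 0.0, 0.0], alpha=0.25, steps=60)

# Squared l2 norm of the accumulated flow.
l2_squared = sum(f * f for f in flow.values())
print(final_load, flow, l2_squared)
```

A negative entry in `flow` means the load travels against the edge's listed orientation. In this example the loads converge to 2 on every processor, and only the accumulated flow (not the intermediate transfers) would need to be executed in the second step.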
Dynamic Load Distribution in the Borealis Stream Processor
In ICDE, 2005
Cited by 31 (4 self)
Distributed and parallel computing environments are becoming cheap and commonplace. The availability of large numbers of CPUs makes it possible to process more data at higher speeds. Stream-processing systems are also becoming more important, as broad classes of applications require results in real time. Since load can vary in unpredictable ways, exploiting the abundant processor cycles requires effective dynamic load distribution techniques. Although load distribution has been extensively studied for traditional pull-based systems, it has not yet been fully studied in the context of push-based continuous query processing. In this paper, we present a correlation-based load distribution algorithm that aims at avoiding overload and minimizing end-to-end latency by minimizing load variance and maximizing load correlation. While finding the optimal solution for such a problem is NP-hard, our greedy algorithm can find reasonable solutions in polynomial time. We present both a global algorithm for initial load distribution and a pairwise algorithm for dynamic load migration.
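The abstract does not spell out the objective in detail, so the following is only a sketch of the underlying statistics, under the assumption that each operator has a load time series and a node's load is the sum of the series of the operators placed on it. All names and data are hypothetical; this is not the paper's greedy algorithm. It shows that co-locating operators whose loads are negatively correlated lowers the variance of the node's total load.

```python
from statistics import pvariance

def node_load_variance(series_list):
    """Variance over time of the summed load of all operators on one node."""
    total = [sum(values) for values in zip(*series_list)]
    return pvariance(total)

def correlation(x, y):
    """Pearson correlation coefficient of two equally long load series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return cov / (pvariance(x) * pvariance(y)) ** 0.5

# Hypothetical load series: a and c peak together, b peaks when a is idle.
a = [1.0, 3.0, 1.0, 3.0]
b = [3.0, 1.0, 3.0, 1.0]
c = [1.0, 3.0, 1.0, 3.0]

var_anticorrelated = node_load_variance([a, b])  # a and b share a node
var_correlated = node_load_variance([a, c])      # a and c share a node
print(correlation(a, b), var_anticorrelated, var_correlated)
```

Under these assumptions, a placement heuristic could compare candidate co-locations by the variance they produce, which is one plausible reading of "minimizing load variance".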
Parallel Dynamic Graph-Partitioning for Unstructured Meshes
1997
Cited by 12 (4 self)
A parallel method for the dynamic partitioning of unstructured meshes is described. The method introduces a new iterative optimisation technique known as relative gain optimisation which both balances the workload and attempts to minimise the interprocessor communications overhead. Experiments on a series of adaptively refined meshes indicate that the algorithm provides partitions of equivalent or higher quality than those of static partitioners (which do not reuse the existing partition), and does so much more rapidly. Perhaps more importantly, the algorithm results in only a small fraction of the data migration required by the static partitioners.
Optimal and Alternating-Direction Loadbalancing Schemes
1999
Cited by 11 (5 self)
We discuss iterative nearest neighbor load balancing schemes on processor networks which are represented by a Cartesian product of graphs, e.g. tori or hypercubes. By the use of the Alternating-Direction Loadbalancing scheme, the number of load balance iterations decreases by a factor of 2 for this type of graph. The resulting flow is analyzed theoretically and it can be very high for certain cases. Therefore, we furthermore present the Mixed-Direction scheme, which needs the same number of iterations but results in a much smaller flow. Apart from that, we present a simple optimal diffusion scheme for general graphs which calculates a minimal balancing flow in the $l_2$ norm. The scheme is based on the spectrum of the graph representing the network and needs only $m - 1$ iterations in order to balance the load, with m being the number of distinct eigenvalues. 1 Introduction. We consider the load balancing problem in a synchronous, distributed processor network. Each node of the ne...
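The claim that a spectrum-based scheme balances the load in $m - 1$ iterations can be checked on a small case. The sketch below assumes the scheme applies one step $w \leftarrow (I - \frac{1}{\lambda} L)\,w$ for each distinct nonzero eigenvalue $\lambda$ of the graph Laplacian $L$; this form is an assumption for illustration, since the abstract does not give the update rule. The example graph is a ring of 4 nodes, whose Laplacian has the distinct eigenvalues 0, 2 and 4, so m = 3 and two steps suffice.

```python
def laplacian_times(edges, w):
    """Return L w for the Laplacian L of an undirected graph given by its edges."""
    out = [0.0] * len(w)
    for i, j in edges:
        out[i] += w[i] - w[j]
        out[j] += w[j] - w[i]
    return out

def spectral_balance(edges, load, nonzero_eigenvalues):
    """Apply one step w <- w - (1/lam) * L w per distinct nonzero eigenvalue."""
    w = list(load)
    history = [list(w)]
    for lam in nonzero_eigenvalues:
        lw = laplacian_times(edges, w)
        w = [wi - li / lam for wi, li in zip(w, lw)]
        history.append(list(w))
    return w, history

# Ring of 4 nodes; distinct nonzero Laplacian eigenvalues are 2 and 4.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
balanced, history = spectral_balance(ring, [8.0, 0.0, 0.0, 0.0], [2.0, 4.0])
print(history)
```

Each step removes the load component belonging to one eigenvalue, which is why the number of steps depends only on the graph and not on the initial load. The eigenvalues are hard-coded here; for a general graph they would have to be computed first.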
A Comparison of Some Dynamic Load-Balancing Algorithms for a Parallel Adaptive Flow Solver
Parallel Computing, 2000
Cited by 9 (6 self)
In this paper we contrast the performance of three different parallel dynamic load-balancing algorithms when used in conjunction with a particular parallel, adaptive, time-dependent, 3-d flow solver that has recently been developed at Leeds. An overview of this adaptive solver is given along with a description of a new dynamic load-balancing algorithm. The effectiveness of this algorithm is then assessed when it is coupled with the solver to tackle a model 3-d flow problem in parallel. Two alternative parallel dynamic load-balancing algorithms are also described and tested on the same flow problem. 1 Introduction. The use of distributed memory parallel computers for the solution of large, complex computational mechanics problems has great potential for both significant increases in mesh sizes and significant reductions in solution times. For transient problems, accuracy and efficiency constraints also require the use of mesh adaptation since solution features on different length scal...
Multilevel Algorithms for Generating Coarse Grids for Multigrid Method
2001
Cited by 7 (2 self)
Geometric Multigrid methods have gained widespread acceptance for solving large systems of linear equations, especially for structured grids. One of the
Dynamic Re-Allocation of Meshes for Parallel Finite Element Applications
1998
Cited by 3 (3 self)
... with adaptive meshing (as in local mesh refinement and coarsening) or adaptive re-meshing. However, as will be seen when considering the applications included within the DRAMA project, a need for dynamic load balancing arises in applications with fixed meshes where computational and/or communications costs vary greatly as the simulation progresses. Major advances have been made in recent years in the two areas which form the starting point for the project activities: the development of parallel mesh-partitioning algorithms suitable for dynamic repartitioning (re-allocation of sub-meshes to processors at run-time); and the migration and optimisation of industrial-strength simulation codes to HPC platforms using the message-passing paradigm. However, most industrial-strength parallel simulations using large processor numbers are performed with static partitioning and non-adaptive meshing - or, when adaptive meshing is used, with a sequentialised repartitioning phase which greatly reduces the para
Toward Optimal Diffusion Matrices
In Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS’02), page 67 (CD). IEEE Computer Society, 2002
Cited by 2 (1 self)
Efficient load balancing algorithms are the key to many efficient parallel applications. Until now, research in this area has mainly focused on homogeneous schemes. However, observations show that the convergence rate of diffusion algorithms can be improved using edge weighted graphs without deteriorating the flow's quality. In this paper we consider common interconnection topologies and demonstrate how optimal edge weights can be calculated for the First and Second Order Diffusion Schemes. Using theoretical analysis and practical experiments, we show what improvements can be achieved on selected networks.

