Results 1 - 10 of 23
Parallel Optimisation Algorithms for Multilevel Mesh Partitioning
- Parallel Comput
, 2000
Cited by 37 (14 self)
Three parallel optimisation algorithms, for use in the context of multilevel graph partitioning of unstructured meshes, are described. The first, interface optimisation, reduces the computation to a set of independent optimisation problems in interface regions. The next, alternating optimisation, is a restriction of this technique in which mesh entities are only allowed to migrate between subdomains in one direction. The third treats the gain as a potential field and uses the concept of relative gain for selecting appropriate vertices to migrate. The results are compared and seen to produce very high global quality partitions, very rapidly. The results are also compared with another partitioning tool and shown to be of higher quality although taking longer to compute. © 2000 Elsevier Science B.V. All rights reserved.
Parallel Implementation and Practical Use of Sparse Approximate Inverse Preconditioners With a Priori Sparsity Patterns
- Int. J. High Perf. Comput. Appl
, 2001
Cited by 19 (1 self)
This paper describes and tests a parallel, message passing code for constructing sparse approximate inverse preconditioners using Frobenius norm minimization. The sparsity patterns of the preconditioners are chosen as patterns of powers of sparsified matrices. Sparsification is necessary when powers of a matrix have a large number of nonzeros, making the approximate inverse computation expensive. For our test problems, the minimum solution time is achieved with approximate inverses with fewer than twice the number of nonzeros of the original matrix. Additional accuracy is not compensated by the increased cost per iteration. The results lead to further understanding of how to use these methods and how well these methods work in practice. In addition, this paper describes programming techniques required for high performance, including one-sided communication, local coordinate numbering, and load repartitioning.
Multilevel algorithms for partitioning power-law graphs
- IEEE INTERNATIONAL PARALLEL & DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS). IN
, 2006
Cited by 17 (0 self)
Graph partitioning is an enabling technology for parallel processing as it allows for the effective decomposition of unstructured computations whose data dependencies correspond to a large sparse and irregular graph. Even though the problem of computing high-quality partitionings of graphs arising in scientific computations is to a large extent well understood, this is far from being true for emerging HPC applications whose underlying computation involves graphs whose degree distribution follows a power-law curve. This paper presents new multilevel graph partitioning algorithms that are specifically designed for partitioning such graphs. It presents new clustering-based coarsening schemes that identify and collapse together groups of vertices that are highly connected. An experimental evaluation of these schemes on 10 different graphs shows that the proposed algorithms consistently and significantly
Dynamic load balancing of finite element applications with the DRAMA library
- APPLIED MATHEMATICAL MODELLING 25 (2000) 83–98
, 2000
Cited by 8 (1 self)
The DRAMA library, developed within the European Commission funded (ESPRIT) project DRAMA, supports dynamic load-balancing for parallel (message-passing) mesh-based applications. The target applications are those with dynamic and solution-adaptive features. The focus within the DRAMA project was on finite element simulation codes for structural mechanics. An introduction to the DRAMA library will illustrate that the very general cost model and the interface designed specifically for application requirements provide simplified and effective access to a range of parallel partitioners. The main body of the paper will demonstrate the ability to provide dynamic load-balancing for parallel FEM problems that include adaptive meshing, re-meshing, and the need for multi-phase partitioning.
Communication Support for Adaptive Computation
, 2001
Cited by 6 (4 self)
This memory cannot be utilized in subsequent phases, decreasing the total memory which is usable for communication, thus potentially increasing the number of phases. Instead, another processor can temporarily move some of its data to this processor to free up space for messages. An example is illustrated in Fig. 3. In this simple example, the top two processors want to exchange 100 units of data, but each has only one unit of available memory. A simplistic approach will require 100 phases. However, the third processor has 100 units of free memory. By parking data on this third processor (i.e. transferring free memory to another processor), the number of phases can be reduced to three.
Multi-constraint mesh partitioning for contact/impact computations
- in: Proc. SC2003, ACM
, 2003
Cited by 5 (0 self)
We present a novel approach for decomposing contact/impact computations in which the mesh elements come in contact with each other during the course of the simulation. Effective decomposition of these computations poses a number of challenges as it needs to both balance the computations and minimize the amount of communication that is performed during the finite element and the contact search phase. Our approach achieves the first goal by partitioning the underlying mesh such that it simultaneously balances both the work that is performed during the finite element phase and that performed during contact search phase, while producing subdomains whose boundaries consist of piecewise axes-parallel lines or planes. The second goal is achieved by using a decision tree to decompose the space into rectangular or box-shaped regions that contain contact points from a single partition. Our experimental evaluation on a sequence of 100 meshes shows that this new approach can significantly reduce the communication overhead over existing algorithms.
Partitioning sparse matrices for parallel preconditioned iterative methods
- SIAM Journal on Scientific Computing
, 2004
Cited by 5 (4 self)
This paper addresses the parallelization of the preconditioned iterative methods that use explicit preconditioners such as approximate inverses. Parallelizing a full step of these methods requires the coefficient and preconditioner matrices to be well partitioned. We first show that different methods impose different partitioning requirements for the matrices. Then we develop hypergraph models to meet those requirements. In particular, we develop models that enable us to obtain partitionings on the coefficient and preconditioner matrices simultaneously. Experiments on a set of unsymmetric sparse matrices show that the proposed models yield effective partitioning results. A parallel implementation of the right preconditioned BiCGStab method on a PC cluster verifies that the theoretical gains obtained by the models hold in practice.
Dynamic Mesh Partitioning & Load-Balancing for Parallel Computational Mechanics Codes
- Parallel & Distributed Processing for Computational Mechanics. Saxe-Coburg Publications
, 1999
Cited by 4 (1 self)
We discuss the load-balancing issues arising in parallel mesh-based computational mechanics codes for which the processor loading changes during the run. We briefly touch on geometric repartitioning ideas and then focus on different ways of using a graph both to solve the load-balancing problem and the optimisation problem, both locally and globally. We also briefly discuss whether repartitioning is always valid. Sample illustrative results are presented and we conclude that repartitioning is an attractive option if the load changes are not too dramatic and that there is a certain trade-off between partition quality and volume of data that the underlying application needs to migrate.
Partitioning and Dynamic Load Balancing for the Numerical Solution of Partial Differential Equations
- Numerical Solution of Partial Differential Equations on Parallel Computers
, 2005
Cited by 4 (1 self)
element methods, have workloads that are unpredictable or change during the computation, requiring dynamic load balancers that adjust the decomposition as the computation proceeds. Partitioning approaches attempt to distribute computational work equally, while minimizing interprocessor communication costs. Communication costs are governed by the amount of data to be shared by cooperating processes (communication volume) and the number of partitions sharing the data (number of messages). Dynamic load-balancing procedures should also operate in parallel on distributed data, execute quickly, and minimize data movement by making the new data distribution as similar as possible to the existing one. The partitioning problem is defined in more detail in Section 1. Numerous partitioning strategies have been developed. The various strategies are distinguished by trade-offs between partition quality, amount of data movement, and partitioning speed. Characteristics of an application (e.g., computat
Graph Partitioning in Scientific Simulations: Multilevel Schemes versus Space-Filling Curves
Cited by 3 (1 self)
Using space-filling curves to partition unstructured finite element meshes is a widely applied strategy when it comes to distributing load among several computation nodes. Compared to more elaborate graph partitioning packages, this geometric approach is relatively easy to implement and very fast. However, results are not expected to be as good as those of the latter, but no detailed comparison has ever been published. In this paper we will...

