Parallel Optimisation Algorithms for Multilevel Mesh Partitioning
Parallel Comput., 2000
Cited by 45 (14 self)
Three parallel optimisation algorithms, for use in the context of multilevel graph partitioning of unstructured meshes, are described. The first, interface optimisation, reduces the computation to a set of independent optimisation problems in interface regions. The next, alternating optimisation, is a restriction of this technique in which mesh entities are only allowed to migrate between subdomains in one direction. The third treats the gain as a potential field and uses the concept of relative gain for selecting appropriate vertices to migrate. The results are compared and seen to produce very high global quality partitions, very rapidly. The results are also compared with another partitioning tool and shown to be of higher quality although taking longer to compute. © 2000 Elsevier Science B.V. All rights reserved.
Multilevel algorithms for partitioning power-law graphs
IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2006
Cited by 26 (0 self)
Graph partitioning is an enabling technology for parallel processing as it allows for the effective decomposition of unstructured computations whose data dependencies correspond to a large sparse and irregular graph. Even though the problem of computing high-quality partitionings of graphs arising in scientific computations is to a large extent well understood, this is far from being true for emerging HPC applications whose underlying computation involves graphs whose degree distribution follows a power-law curve. This paper presents new multilevel graph partitioning algorithms that are specifically designed for partitioning such graphs. It presents new clustering-based coarsening schemes that identify and collapse together groups of vertices that are highly connected. An experimental evaluation of these schemes on 10 different graphs shows that the proposed algorithms consistently and significantly
Parallel Implementation and Practical Use of Sparse Approximate Inverse Preconditioners With a Priori Sparsity Patterns
Int. J. High Perf. Comput. Appl., 2001
Cited by 20 (1 self)
This paper describes and tests a parallel, message-passing code for constructing sparse approximate inverse preconditioners using Frobenius norm minimization. The sparsity patterns of the preconditioners are chosen as patterns of powers of sparsified matrices. Sparsification is necessary when powers of a matrix have a large number of nonzeros, making the approximate inverse computation expensive. For our test problems, the minimum solution time is achieved with approximate inverses with fewer than twice the number of nonzeros of the original matrix. Additional accuracy is not compensated by the increased cost per iteration. The results lead to further understanding of how to use these methods and how well these methods work in practice. In addition, this paper describes programming techniques required for high performance, including one-sided communication, local coordinate numbering, and load repartitioning.
Partitioning sparse matrices for parallel preconditioned iterative methods
SIAM Journal on Scientific Computing, 2004
Cited by 15 (9 self)
This paper addresses the parallelization of the preconditioned iterative methods that use explicit preconditioners such as approximate inverses. Parallelizing a full step of these methods requires the coefficient and preconditioner matrices to be well partitioned. We first show that different methods impose different partitioning requirements for the matrices. Then we develop hypergraph models to meet those requirements. In particular, we develop models that enable us to obtain partitionings on the coefficient and preconditioner matrices simultaneously. Experiments on a set of unsymmetric sparse matrices show that the proposed models yield effective partitioning results. A parallel implementation of the right-preconditioned BiCGStab method on a PC cluster verifies that the theoretical gains obtained by the models hold in practice.
Dynamic load balancing of finite element applications with the DRAMA library
Applied Mathematical Modelling 25 (2000) 83-98
Cited by 9 (2 self)
The DRAMA library, developed within the European Commission funded (ESPRIT) project DRAMA, supports dynamic load-balancing for parallel (message-passing) mesh-based applications. The target applications are those with dynamic and solution-adaptive features. The focus within the DRAMA project was on finite element simulation codes for structural mechanics. An introduction to the DRAMA library will illustrate that the very general cost model and the interface designed specifically for application requirements provide simplified and effective access to a range of parallel partitioners. The main body of the paper will demonstrate the ability to provide dynamic load-balancing for parallel FEM problems that include adaptive meshing, remeshing, and the need for multiphase partitioning.
Communication Support for Adaptive Computation
2001
Cited by 6 (4 self)
This memory cannot be utilized in subsequent phases, decreasing the total memory which is usable for communication, thus potentially increasing the number of phases. Instead, another processor can temporarily move some of its data to this processor to free up space for messages. An example is illustrated in Fig. 3. In this simple example, the top two processors want to exchange 100 units of data, but each has only one unit of available memory. A simplistic approach will require 100 phases. However, the third processor has 100 units of free memory. By parking data on this third processor (i.e. transferring free memory to another processor), the number of phases can be reduced to three.
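The arithmetic in this example can be checked with a small simulation. The sketch below is only an illustration under stated assumptions, not the paper's algorithm: the processor names `A`, `B`, `C`, the function `count_phases`, and the three-step schedule are hypothetical, and the model assumes that a receiver must fit all data arriving in a phase into the memory it had free at the start of that phase, and that sending data frees the same amount of memory on the sender.

```python
def count_phases(free, phases):
    """Validate a communication schedule and return how many phases it uses.

    free   -- dict mapping processor name to units of free memory (assumed model)
    phases -- list of phases; each phase is a list of (src, dst, units) transfers
    """
    free = dict(free)  # copy so the caller's table is not modified
    for number, phase in enumerate(phases, start=1):
        # Total data arriving at each receiver during this phase.
        incoming = {}
        for src, dst, units in phase:
            incoming[dst] = incoming.get(dst, 0) + units
        # A receiver may only accept what fits in its free memory.
        for dst, units in incoming.items():
            if units > free[dst]:
                raise ValueError(f"phase {number}: {dst} cannot hold {units} units")
        # Sending frees memory on the source and consumes it on the destination.
        for src, dst, units in phase:
            free[src] += units
            free[dst] -= units
    return len(phases)


# A and B each hold 100 units for the other but have 1 free unit; C has 100 free.
free_memory = {"A": 1, "B": 1, "C": 100}

# Simplistic approach: swap one unit per phase.
naive = [[("A", "B", 1), ("B", "A", 1)] for _ in range(100)]

# Parking (hypothetical schedule): A parks its data on C, B sends to A,
# then C forwards the parked data to B.
parked = [[("A", "C", 100)], [("B", "A", 100)], [("C", "B", 100)]]

naive_phases = count_phases(free_memory, naive)
parked_phases = count_phases(free_memory, parked)
print(naive_phases, parked_phases)  # prints "100 3"
```

The schedule is only validated against the memory limits; the sketch does not track which data items end up where.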
Dynamic Mesh Partitioning & Load-Balancing for Parallel Computational Mechanics Codes
Parallel & Distributed Processing for Computational Mechanics. Saxe-Coburg Publications, 1999
Cited by 6 (1 self)
We discuss the load-balancing issues arising in parallel mesh-based computational mechanics codes for which the processor loading changes during the run. We briefly touch on geometric repartitioning ideas and then focus on different ways of using a graph both to solve the load-balancing problem and the optimisation problem, both locally and globally. We also briefly discuss whether repartitioning is always valid. Sample illustrative results are presented and we conclude that repartitioning is an attractive option if the load changes are not too dramatic and that there is a certain trade-off between partition quality and volume of data that the underlying application needs to migrate.
Graph Partitioning in Scientific Simulations: Multilevel Schemes versus Space-Filling Curves
Cited by 6 (3 self)
Using space-filling curves to partition unstructured finite element meshes is a widely applied strategy when it comes to distributing load among several computation nodes. Compared to more elaborate graph partitioning packages, this geometric approach is relatively easy to implement and very fast. However, results are not expected to be as good as those of the latter, but no detailed comparison has ever been published. In this paper we will...
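To illustrate why the geometric approach is considered easy to implement, here is a minimal sketch of space-filling-curve partitioning. It is not taken from the paper: it assumes a Morton (Z-order) curve, 2D element centroids already quantised to non-negative integer grid coordinates, and equal element weights. The names `morton_key` and `sfc_partition` are hypothetical.

```python
def morton_key(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into a Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key


def sfc_partition(points, num_parts, bits=16):
    """Return a part index for each point.

    Points are sorted along the curve and the sorted sequence is cut into
    num_parts contiguous chunks of (nearly) equal size.
    """
    order = sorted(range(len(points)),
                   key=lambda i: morton_key(points[i][0], points[i][1], bits))
    parts = [0] * len(points)
    for rank, idx in enumerate(order):
        parts[idx] = rank * num_parts // len(points)
    return parts


# Usage: 16 centroids on a 4 x 4 grid, split into 4 parts.
points = [(x, y) for x in range(4) for y in range(4)]
parts = sfc_partition(points, 4)
print(parts)
```

On this small grid each part is one 2 x 2 quadrant, because consecutive Morton keys stay within a quadrant before moving to the next. The sketch balances element counts only and ignores mesh connectivity, which is the quality limitation the abstract alludes to.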
Multi-constraint mesh partitioning for contact/impact computations
In: Proc. SC2003, ACM, 2003
Cited by 5 (0 self)
We present a novel approach for decomposing contact/impact computations in which the mesh elements come in contact with each other during the course of the simulation. Effective decomposition of these computations poses a number of challenges as it needs to both balance the computations and minimize the amount of communication that is performed during the finite element and the contact search phase. Our approach achieves the first goal by partitioning the underlying mesh such that it simultaneously balances both the work that is performed during the finite element phase and that performed during contact search phase, while producing subdomains whose boundaries consist of piecewise axes-parallel lines or planes. The second goal is achieved by using a decision tree to decompose the space into rectangular or box-shaped regions that contain contact points from a single partition. Our experimental evaluation on a sequence of 100 meshes shows that this new approach can significantly reduce the communication overhead over existing algorithms.
Behavioral Simulations in MapReduce
Cited by 5 (4 self)
In many scientific domains, researchers are turning to large-scale behavioral simulations to better understand real-world phenomena. While there has been a great deal of work on simulation tools from the high-performance computing community, behavioral simulations remain challenging to program and automatically scale in parallel environments. In this paper we present BRACE (Big Red Agent-based Computation Engine), which extends the MapReduce framework to process these simulations efficiently across a cluster. We can leverage spatial locality to treat behavioral simulations as iterated spatial joins and greatly reduce the communication between nodes. In our experiments we achieve nearly linear scale-up on several realistic simulations. Though processing behavioral simulations in parallel as iterated spatial joins can be very efficient, it can be much simpler for the domain scientists to program the behavior of a single agent. Furthermore, many simulations include a considerable amount of complex computation and message passing between agents, which makes it important to optimize the performance of a single node and the communication across nodes. To address both of these challenges, BRACE includes a high-level language called BRASIL (the Big Red Agent SImulation Language). BRASIL has object-oriented features for programming simulations, but can be compiled to a dataflow representation for automatic parallelization and optimization. We show that by using various optimization techniques, we can achieve both scalability and single-node performance similar to that of a hand-coded simulation.