Permuting Sparse Rectangular Matrices into Block-Diagonal Form
SIAM Journal on Scientific Computing, 2002
We investigate the problem of permuting a sparse rectangular matrix into block diagonal form. Block diagonal form of a matrix grants an inherent parallelism for solving the deriving problem, as recently investigated in the context of mathematical programming, LU factorization and QR factorization. We propose bipartite graph and hypergraph models to represent the nonzero structure of a matrix, which reduce the permutation problem to those of graph partitioning by vertex separator and hypergraph partitioning, respectively. Our experiments on a wide range of matrices, using the state-of-the-art graph and hypergraph partitioning tools MeTiS and PaToH, revealed that the proposed methods yield very effective solutions both in terms of solution quality and runtime.
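As a rough illustration of the hypergraph route described in this abstract, the Python sketch below turns a given column partition into row and column orderings. It assumes a row-net convention (rows are nets, columns are vertices) and collects rows that touch more than one part into a trailing coupling border. Neither detail is spelled out in the abstract, and the function and parameter names are hypothetical. In practice the partition would come from a partitioner such as PaToH rather than being written by hand.

```python
def block_diagonal_permutation(nonzeros, col_part, num_parts):
    """Derive row/column orderings from a column partition (illustrative sketch).

    nonzeros : iterable of (row, col) positions of the sparse matrix
    col_part : dict mapping each column to its part id in range(num_parts)
    Returns (row_order, col_order, border_rows).
    """
    # Record which parts each row touches through its nonzero columns.
    parts_of_row = {}
    for i, j in nonzeros:
        parts_of_row.setdefault(i, set()).add(col_part[j])

    # Columns are grouped by part, so each diagonal block is contiguous.
    col_order = sorted(col_part, key=lambda j: (col_part[j], j))

    # Rows touching a single part join that block; the rest form the border.
    internal = [[] for _ in range(num_parts)]
    border = []
    for i in sorted(parts_of_row):
        parts = parts_of_row[i]
        if len(parts) == 1:
            internal[next(iter(parts))].append(i)
        else:
            border.append(i)

    row_order = [i for block in internal for i in block] + border
    return row_order, col_order, border


# Hypothetical 4x4 example: row 2 couples the two column parts.
nonzeros = [(0, 0), (0, 2), (1, 1), (1, 3), (2, 0), (2, 1), (3, 2)]
col_part = {0: 0, 1: 1, 2: 0, 3: 1}
row_order, col_order, border = block_diagonal_permutation(nonzeros, col_part, 2)
```

Under these assumptions, fewer cut nets in the partition means fewer coupling rows in the border.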
Encapsulating Multiple Communication-Cost Metrics in Partitioning Sparse Rectangular Matrices for Parallel Matrix-Vector Multiplies
This paper addresses the problem of one-dimensional partitioning of structurally unsymmetric square and rectangular sparse matrices for parallel matrix-vector and matrix-transpose-vector multiplies. The objective is to minimize the communication cost while maintaining the balance on computational loads of processors. Most of the existing partitioning models consider only the total message volume, hoping that minimizing this communication-cost metric is likely to reduce other metrics. However, the total message latency (start-up time) may be more important than the total message volume. Furthermore, the maximum message volume and latency handled by a single processor are also important metrics. We propose a two-phase approach that encapsulates all these four communication-cost metrics. The objective in the first phase is to minimize the total message volume while maintaining the computational-load balance. The objective in the second phase is to encapsulate the remaining three communication-cost metrics. We propose communication-hypergraph and partitioning models for the second phase. We then present several methods for partitioning communication hypergraphs. Experiments on a wide range of test matrices show that the proposed approach yields very effective partitioning results. A parallel implementation on a PC cluster verifies that the theoretical improvements shown by partitioning results hold in practice.
A fine-grain hypergraph model for 2D decomposition of sparse matrices
in: Proceedings of the 15th International Parallel and Distributed Processing Symposium, 2001, p. 118. C. Aykanat
We propose a new hypergraph model for the decomposition of irregular computational domains. This work focuses on the decomposition of sparse matrices for parallel matrix-vector multiplication. However, the proposed model can also be used to decompose computational domains of other parallel reduction problems. We propose a “fine-grain” hypergraph model for two-dimensional decomposition of sparse matrices. In the proposed fine-grain hypergraph model, vertices represent nonzeros and hyperedges represent sparsity patterns of rows and columns of the matrix. By partitioning the fine-grain hypergraph into equally weighted vertex parts (processors) so that hyperedges are split among as few processors as possible, the model correctly minimizes communication volume while maintaining computational-load balance. Experimental results on a wide range of realistic sparse matrices confirm the validity of the proposed model, by achieving up to 50 percent better decompositions than the existing models, in terms of total communication volume.
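The fine-grain construction in this abstract can be sketched in a few lines of Python. Each nonzero becomes a vertex, and each row and each column contributes one net. The cost function below uses the connectivity-minus-one cut metric, which is a common choice in this literature but is an assumption here, since the abstract does not name the metric. All function names are hypothetical.

```python
from collections import defaultdict


def fine_grain_hypergraph(nonzeros):
    """Build the row nets and column nets; vertex v is the v-th nonzero."""
    row_nets = defaultdict(list)
    col_nets = defaultdict(list)
    for v, (i, j) in enumerate(nonzeros):
        row_nets[i].append(v)  # net of row i contains every nonzero in row i
        col_nets[j].append(v)  # net of column j contains every nonzero in column j
    return row_nets, col_nets


def connectivity_minus_one(nets, part):
    """Sum over nets of (number of distinct parts the net touches - 1)."""
    return sum(len({part[v] for v in pins}) - 1 for pins in nets.values())


def communication_volume(nonzeros, part):
    """Cut cost of a nonzero-to-processor assignment (assumed metric)."""
    row_nets, col_nets = fine_grain_hypergraph(nonzeros)
    return connectivity_minus_one(row_nets, part) + connectivity_minus_one(col_nets, part)


# Hypothetical 3x3 matrix with five nonzeros split over two processors.
nonzeros = [(0, 0), (0, 1), (1, 1), (2, 0), (2, 2)]
part = [0, 0, 1, 1, 1]  # part[v] is the processor owning nonzero v
volume = communication_volume(nonzeros, part)
```

In this example no row is split, while columns 0 and 1 each span both processors, so only the column nets contribute to the cost.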
Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations
Adaptive scientific computations require that periodic repartitioning (load balancing) occur dynamically to maintain load balance. Hypergraph partitioning is a successful model for minimizing communication volume in scientific computations, and partitioning software for the static case is widely available. In this paper, we present a new hypergraph model for the dynamic case, where we minimize the sum of communication in the application plus the migration cost to move data, thereby reducing total execution time. The new model can be solved using hypergraph partitioning with fixed vertices. We describe an implementation of a parallel multilevel repartitioning algorithm within the Zoltan load-balancing toolkit, which to our knowledge is the first code for dynamic load balancing based on hypergraph partitioning. Finally, we present experimental results that demonstrate the effectiveness of our approach on a Linux cluster with up to 64 processors. Our new algorithm compares favorably to the widely used ParMETIS partitioning software in terms of quality, and would have reduced total execution time in most of our test cases.
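The objective stated in this abstract, application communication plus migration cost, can be written down directly. The Python sketch below evaluates that sum for a candidate repartition. The weighting factor `alpha` (how heavily application communication counts relative to one-off data movement), the per-vertex `size` values, and the connectivity-minus-one communication metric are all assumptions for illustration. None of the names correspond to Zoltan's API.

```python
def comm_volume(nets, part):
    """Connectivity-minus-one cost of a partition (assumed metric)."""
    return sum(len({part[v] for v in pins}) - 1 for pins in nets)


def migration_cost(old_part, new_part, size):
    """Total size of the vertices whose owner changes."""
    return sum(size[v] for v in range(len(old_part)) if old_part[v] != new_part[v])


def repartition_cost(nets, old_part, new_part, size, alpha=1.0):
    """alpha * communication under the new partition + migration to reach it."""
    return alpha * comm_volume(nets, new_part) + migration_cost(old_part, new_part, size)


# Hypothetical example: moving the large vertex 2 does not pay off.
nets = [[0, 1], [1, 2], [2, 3]]
size = [1, 1, 5, 1]
old_part = [0, 0, 1, 1]
new_part = [0, 0, 0, 1]
cost_move = repartition_cost(nets, old_part, new_part, size, alpha=10)
cost_stay = repartition_cost(nets, old_part, old_part, size, alpha=10)
```

Here both partitions cut exactly one net, so the candidate that migrates vertex 2 is strictly worse by the size of the moved data.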
A hypergraph-partitioning approach for coarse-grain decomposition
in: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing, 2001
We propose a new two-phase method for the coarse-grain decomposition of irregular computational domains. This work focuses on the 2D partitioning of sparse matrices for parallel matrix-vector multiplication. However, the proposed model can also be used to decompose computational domains of other parallel reduction problems. This work also introduces the use of multi-constraint hypergraph partitioning for solving the decomposition problem. The proposed method explicitly models the minimization of communication volume while enforcing the upper bound of p + q - 2 on the maximum number of messages handled by a single processor, for a parallel system with P = p × q processors. Experimental results on a wide range of realistic sparse matrices confirm the validity of the proposed methods, by achieving up to 25 percent better partitions than the standard graph model, in terms of total communication volume, and 59 percent better partitions in terms of number of messages, on the overall average.
On two-dimensional sparse matrix partitioning: Models, methods, and a recipe
SIAM J. Sci. Comput., 2010
Abstract. We consider two-dimensional partitioning of general sparse matrices for parallel sparse matrix-vector multiply operation. We present three hypergraph-partitioning-based methods, each having unique advantages. The first one treats the nonzeros of the matrix individually and hence produces fine-grain partitions. The other two produce coarser partitions, where one of them imposes a limit on the number of messages sent and received by a single processor, and the other trades that limit for a lower communication volume. We also present a thorough experimental evaluation of the proposed two-dimensional partitioning methods together with the hypergraph-based one-dimensional partitioning methods, using an extensive set of public domain matrices. Furthermore, for the users of these partitioning methods, we present a partitioning recipe that chooses one of the partitioning methods according to some matrix characteristics.
Revisiting hypergraph models for sparse matrix partitioning
SIAM Review, 2007
Abstract. We provide an exposition of hypergraph models for parallelizing sparse matrix-vector multiplies. Our aim is to emphasize the expressive power of hypergraph models. First, we set forth an elementary hypergraph model for the parallel matrix-vector multiply based on one-dimensional (1D) matrix partitioning. In the elementary model, the vertices represent the data of a matrix-vector multiply, and the nets encode dependencies among the data. We then apply a recently proposed hypergraph transformation operation to devise models for 1D sparse matrix partitioning. The resulting 1D partitioning models are equivalent to the previously proposed computational hypergraph models and are not meant to be replacements for them. Nevertheless, the new models give us insights into the previous ones and help us explain a subtle requirement, known as the consistency condition, of hypergraph partitioning models. Later, we demonstrate the flexibility of the elementary model on a few 1D partitioning problems that are hard to solve using the previously proposed models. We also discuss extensions of the proposed elementary model to two-dimensional matrix partitioning. Key words. parallel computing, sparse matrix-vector multiply, hypergraph models
Partitioning sparse matrices for parallel preconditioned iterative methods
SIAM Journal on Scientific Computing, 2004
Abstract. This paper addresses the parallelization of the preconditioned iterative methods that use explicit preconditioners such as approximate inverses. Parallelizing a full step of these methods requires the coefficient and preconditioner matrices to be well partitioned. We first show that different methods impose different partitioning requirements for the matrices. Then we develop hypergraph models to meet those requirements. In particular, we develop models that enable us to obtain partitionings on the coefficient and preconditioner matrices simultaneously. Experiments on a set of unsymmetric sparse matrices show that the proposed models yield effective partitioning results. A parallel implementation of the right preconditioned BiCGStab method on a PC cluster verifies that the theoretical gains obtained by the models hold in practice.
HYPERGRAPH PARTITIONING-BASED FILL-REDUCING ORDERING
2009
A typical first step of a direct solver for linear system Mx = b is reordering of symmetric matrix M to improve execution time and space requirements of the solution process. In this work, we propose a novel nested-dissection-based ordering approach that utilizes hypergraph partitioning. Our approach is based on formulation of graph partitioning by vertex separator (GPVS) problem as a hypergraph partitioning problem. This new formulation is immune to deficiency of GPVS in a multilevel framework, hence enables better orderings. In matrix terms, our method relies on the existence of a structural factorization of the input M matrix in the form of M = AA^T (or M = AD^2A^T). We show that the partitioning of the row-net hypergraph representation of rectangular matrix A induces a GPVS of the standard graph representation of matrix M. In the absence of such factorization, we also propose simple, yet effective structural factorization techniques that are based on finding an edge clique cover of the standard graph representation of matrix M, and hence applicable to any arbitrary symmetric matrix M. Our experimental evaluation has shown that the proposed method achieves better ordering in comparison to state-of-the-art graph-based ordering tools even for symmetric matrices where structural M = AA^T factorization is not provided as an input. For matrices coming from linear programming problems, our method enables even faster and better orderings.
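The claim that a row-net partition of A induces a vertex separator in the graph of M = AA^T can be checked on a small example. In the Python sketch below, two rows of A are adjacent in the graph of M when they share a nonzero column, which is the purely structural reading that ignores numerical cancellation. Rows whose columns fall into more than one part are taken as the separator. The function names and the example matrix are hypothetical.

```python
from itertools import combinations


def structural_aat_edges(nonzeros):
    """Edges of the graph of M = A A^T: rows i, k are adjacent iff they share a column."""
    rows_in_col = {}
    for i, j in nonzeros:
        rows_in_col.setdefault(j, set()).add(i)
    edges = set()
    for rows in rows_in_col.values():
        edges.update(combinations(sorted(rows), 2))  # each column induces a clique
    return edges


def separator_from_column_partition(nonzeros, col_part):
    """Cut row nets form the separator; every other row inherits its part."""
    parts_of_row = {}
    for i, j in nonzeros:
        parts_of_row.setdefault(i, set()).add(col_part[j])
    separator = {i for i, ps in parts_of_row.items() if len(ps) > 1}
    row_part = {i: next(iter(ps)) for i, ps in parts_of_row.items() if len(ps) == 1}
    return separator, row_part


def is_vertex_separator(edges, row_part):
    """True if no edge joins two non-separator rows lying in different parts."""
    return all(row_part[u] == row_part[v]
               for u, v in edges if u in row_part and v in row_part)


# Hypothetical 4x4 matrix A; columns {0, 1} and {2, 3} form the two parts.
nonzeros = [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
col_part = {0: 0, 1: 0, 2: 1, 3: 1}
separator, row_part = separator_from_column_partition(nonzeros, col_part)
edges = structural_aat_edges(nonzeros)
valid = is_vertex_separator(edges, row_part)
```

The check always succeeds for this construction: two rows internal to different parts cannot share a column, so they cannot be adjacent in the graph of M.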