Permuting Sparse Rectangular Matrices into Block-Diagonal Form
SIAM Journal on Scientific Computing, 2002
Cited by 58 (19 self)
We investigate the problem of permuting a sparse rectangular matrix into block diagonal form. Block diagonal form of a matrix grants an inherent parallelism for solving the deriving problem, as recently investigated in the context of mathematical programming, LU factorization and QR factorization. We propose bipartite graph and hypergraph models to represent the nonzero structure of a matrix, which reduce the permutation problem to those of graph partitioning by vertex separator and hypergraph partitioning, respectively. Our experiments on a wide range of matrices, using state-of-the-art graph and hypergraph partitioning tools MeTiS and PaToH, revealed that the proposed methods yield very effective solutions both in terms of solution quality and runtime.
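As a rough illustration of the hypergraph model mentioned in this abstract, the sketch below treats each column as a vertex and each row as a net (a row-net view, assumed here rather than taken from the abstract). For a given column-to-block assignment it reports the rows that touch more than one block; such rows would form the coupling border of a bordered block-diagonal form. The function name `coupling_rows` and the toy matrix are hypothetical, and a real workflow would obtain the assignment from a partitioner rather than by hand.

```python
from collections import defaultdict

def coupling_rows(nonzeros, col_part):
    """Return rows whose nonzeros span more than one column block.

    nonzeros: iterable of (row, col) positions of the sparse matrix.
    col_part: dict mapping each column index to a block id.
    """
    blocks_touched = defaultdict(set)
    for row, col in nonzeros:
        blocks_touched[row].add(col_part[col])
    # A row (net) is "cut" when it connects columns in different blocks.
    return sorted(r for r, blocks in blocks_touched.items() if len(blocks) > 1)

# Toy 4x4 pattern: rows 0-1 use columns 0-1, row 2 uses columns 2-3,
# and row 3 uses columns 1 and 2, so it straddles both blocks.
pattern = [(0, 0), (0, 1), (1, 1), (2, 2), (2, 3), (3, 1), (3, 2)]
assignment = {0: 0, 1: 0, 2: 1, 3: 1}
border = coupling_rows(pattern, assignment)
print(border)
```

Fewer cut rows mean a smaller border, which is why the permutation problem can be phrased as a partitioning problem that minimizes the number of cut nets.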
Hypergraph-based Dynamic Load Balancing for Adaptive Scientific Computations
2007
Cited by 39 (6 self)
Adaptive scientific computations require that periodic repartitioning (load balancing) occur dynamically to maintain load balance. Hypergraph partitioning is a successful model for minimizing communication volume in scientific computations, and partitioning software for the static case is widely available. In this paper, we present a new hypergraph model for the dynamic case, where we minimize the sum of communication in the application plus the migration cost to move data, thereby reducing total execution time. The new model can be solved using hypergraph partitioning with fixed vertices. We describe an implementation of a parallel multilevel repartitioning algorithm within the Zoltan load-balancing toolkit, which to our knowledge is the first code for dynamic load balancing based on hypergraph partitioning. Finally, we present experimental results that demonstrate the effectiveness of our approach on a Linux cluster with up to 64 processors. Our new algorithm compares favorably to the widely used ParMETIS partitioning software in terms of quality, and would have reduced total execution time in most of our test cases.
Encapsulating Multiple Communication-Cost Metrics in Partitioning Sparse Rectangular Matrices for Parallel Matrix-Vector Multiplies
Cited by 35 (21 self)
This paper addresses the problem of one-dimensional partitioning of structurally unsymmetric square and rectangular sparse matrices for parallel matrix-vector and matrix-transpose-vector multiplies. The objective is to minimize the communication cost while maintaining the balance on computational loads of processors. Most of the existing partitioning models consider only the total message volume, hoping that minimizing this communication-cost metric is likely to reduce other metrics. However, the total message latency (start-up time) may be more important than the total message volume. Furthermore, the maximum message volume and latency handled by a single processor are also important metrics. We propose a two-phase approach that encapsulates all these four communication-cost metrics. The objective in the first phase is to minimize the total message volume while maintaining the computational-load balance. The objective in the second phase is to encapsulate the remaining three communication-cost metrics. We propose communication-hypergraph and partitioning models for the second phase. We then present several methods for partitioning communication hypergraphs. Experiments on a wide range of test matrices show that the proposed approach yields very effective partitioning results. A parallel implementation on a PC cluster verifies that the theoretical improvements shown by partitioning results hold in practice.
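To make the four communication-cost metrics named in this abstract concrete, here is a minimal sketch that computes them from an explicit list of messages. The triple format, the function name `communication_metrics`, and the choice to count only sent (not received) traffic per processor are assumptions made for illustration; the paper may define "handled by a single processor" differently.

```python
from collections import defaultdict

def communication_metrics(messages):
    """Summarize a communication pattern.

    messages: iterable of (sender, receiver, words) triples, one per message.
    Returns total volume, total message count, and the per-processor maxima
    of sent volume and sent message count (sends only, by assumption).
    """
    sent_volume = defaultdict(int)
    sent_count = defaultdict(int)
    for sender, _receiver, words in messages:
        sent_volume[sender] += words
        sent_count[sender] += 1
    return {
        "total_volume": sum(sent_volume.values()),
        "total_messages": sum(sent_count.values()),
        "max_volume": max(sent_volume.values(), default=0),
        "max_messages": max(sent_count.values(), default=0),
    }

# Hypothetical pattern: processor 0 sends two messages, 1 and 2 send one each.
msgs = [(0, 1, 5), (0, 2, 3), (1, 2, 4), (2, 0, 1)]
metrics = communication_metrics(msgs)
print(metrics)
```

Two partitions with the same total volume can differ in the other three values, which is the gap the second phase described above is meant to address.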
On two-dimensional sparse matrix partitioning: Models, methods, and a recipe
SIAM J. Sci. Comput., 2010
Cited by 35 (18 self)
We consider two-dimensional partitioning of general sparse matrices for parallel sparse matrix-vector multiply operation. We present three hypergraph-partitioning-based methods, each having unique advantages. The first one treats the nonzeros of the matrix individually and hence produces fine-grain partitions. The other two produce coarser partitions, where one of them imposes a limit on the number of messages sent and received by a single processor, and the other trades that limit for a lower communication volume. We also present a thorough experimental evaluation of the proposed two-dimensional partitioning methods together with the hypergraph-based one-dimensional partitioning methods, using an extensive set of public domain matrices. Furthermore, for the users of these partitioning methods, we present a partitioning recipe that chooses one of the partitioning methods according to some matrix characteristics.
A fine-grain hypergraph model for 2D decomposition of sparse matrices
in: Proceedings of the 15th International Parallel and Distributed Processing Symposium, 2001, p. 118. C. Aykanat
Cited by 33 (9 self)
We propose a new hypergraph model for the decomposition of irregular computational domains. This work focuses on the decomposition of sparse matrices for parallel matrix-vector multiplication. However, the proposed model can also be used to decompose computational domains of other parallel reduction problems. We propose a “fine-grain” hypergraph model for two-dimensional decomposition of sparse matrices. In the proposed fine-grain hypergraph model, vertices represent nonzeros and hyperedges represent sparsity patterns of rows and columns of the matrix. By partitioning the fine-grain hypergraph into equally weighted vertex parts (processors) so that hyperedges are split among as few processors as possible, the model correctly minimizes communication volume while maintaining computational-load balance. Experimental results on a wide range of realistic sparse matrices confirm the validity of the proposed model, by achieving up to 50 percent better decompositions than the existing models, in terms of total communication volume.
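The fine-grain model in this abstract can be sketched in a few lines: every nonzero is a vertex, and every row and every column is a hyperedge over the nonzeros it contains. The snippet below scores a given nonzero-to-processor assignment, assuming the connectivity-minus-one cost that is commonly used with such models (a hyperedge spanning k processors costs k - 1). The function name `fine_grain_volume` and the toy assignments are hypothetical.

```python
from collections import defaultdict

def fine_grain_volume(nonzero_part):
    """Communication volume of a nonzero-to-processor assignment.

    nonzero_part: dict mapping (row, col) of each nonzero to a processor id.
    Each row and each column is a net; a net spanning k processors
    contributes k - 1 to the volume (connectivity-minus-one, by assumption).
    """
    row_procs = defaultdict(set)
    col_procs = defaultdict(set)
    for (row, col), proc in nonzero_part.items():
        row_procs[row].add(proc)
        col_procs[col].add(proc)
    nets = list(row_procs.values()) + list(col_procs.values())
    return sum(len(procs) - 1 for procs in nets)

# 2x2 dense toy matrix split between two processors in two different ways.
by_rows = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}
checker = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(fine_grain_volume(by_rows), fine_grain_volume(checker))
```

Both assignments are perfectly balanced (two nonzeros each), yet the row-wise split cuts only the column nets while the checkerboard split cuts every net, so the cost function distinguishes them.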
A hypergraph-partitioning approach for coarse-grain decomposition
in: Proceedings of the 2001 ACM/IEEE Conference on Supercomputing, 2001
Cited by 33 (16 self)
We propose a new two-phase method for the coarse-grain decomposition of irregular computational domains. This work focuses on the 2D partitioning of sparse matrices for parallel matrix-vector multiplication. However, the proposed model can also be used to decompose computational domains of other parallel reduction problems. This work also introduces the use of multi-constraint hypergraph partitioning for solving the decomposition problem. The proposed method explicitly models the minimization of communication volume while enforcing the upper bound of p + q - 2 on the maximum number of messages handled by a single processor, for a parallel system with P = p × q processors. Experimental results on a wide range of realistic sparse matrices confirm the validity of the proposed methods, by achieving up to 25 percent better partitions than the standard graph model, in terms of total communication volume, and 59 percent better partitions in terms of number of messages, on the overall average.
Uniformization and Hypergraph Partitioning for the Distributed Computation of Response Time Densities in Very Large Markov Models
Journal of Parallel and Distributed Computing, 2004
Cited by 26 (6 self)
Fast response times and the satisfaction of response time quantile targets are important performance criteria for almost all transaction processing and computer-communication systems. We present a distributed uniformization-based technique for obtaining response time densities from very large unstructured Markov models. Our method utilises hypergraph partitioning to minimise inter-processor communication while maintaining a good load balance. The resulting algorithm scales well on a distributed-memory parallel computer and, unusually for a problem of this nature, also produces near-linear speedups on a network of commodity PCs linked by 100 Mbps Ethernet. We demonstrate our approach by calculating passage time densities in a 1.6 million state Markov chain derived from a Generalised Stochastic Petri net model and a 10.8 million state Markov chain derived from a closed tree-like queueing network. We compare the accuracy of our results with simulation and known analytical solutions and contrast the run-time performance of our technique with an approach based on numerical Laplace transform inversion.
Revisiting hypergraph models for sparse matrix partitioning
SIAM Review, 2007
Cited by 20 (11 self)
Abstract. We provide an exposition of hypergraph models for parallelizing sparse matrix-vector multiplies. Our aim is to emphasize the expressive power of hypergraph models. First, we set forth an elementary hypergraph model for the parallel matrix-vector multiply based on one-dimensional (1D) matrix partitioning. In the elementary model, the vertices represent the data of a matrix-vector multiply, and the nets encode dependencies among the data. We then apply a recently proposed hypergraph transformation operation to devise models for 1D sparse matrix partitioning. The resulting 1D partitioning models are equivalent to the previously proposed computational hypergraph models and are not meant to be replacements for them. Nevertheless, the new models give us insights into the previous ones and help us explain a subtle requirement, known as the consistency condition, of hypergraph partitioning models. Later, we demonstrate the flexibility of the elementary model on a few 1D partitioning problems that are hard to solve using the previously proposed models. We also discuss extensions of the proposed elementary model to two-dimensional matrix partitioning. Key words. parallel computing, sparse matrix-vector multiply, hypergraph models
On computing inverse entries of a sparse matrix in an out-of-core environment
2010
Cited by 16 (5 self)
Abstract. The inverse of an irreducible sparse matrix is structurally full, so that it is impractical to think of computing or storing it. However, there are several applications where a subset of the entries of the inverse is required. Given a factorization of the sparse matrix held in out-of-core storage, we show how to compute such a subset efficiently, by accessing only parts of the factors. When there are many inverse entries to compute, we need to guarantee that the overall computation scheme has reasonable memory requirements, while minimizing the cost of loading the factors. This leads to a partitioning problem that we prove is NP-complete. We also show that we cannot get a close approximation to the optimal solution in polynomial time. We thus need to develop heuristic algorithms, and we propose: (i) a lower bound on the cost of an optimum solution; (ii) an exact algorithm for a particular case; (iii) two other heuristics for a more general case; and (iv) hypergraph partitioning models for the most general setting. We illustrate the performance of our algorithms in practice using the MUMPS software package on a set of real-life problems as well as some standard test matrices. We show that our techniques can improve the execution time by a factor of 50. Key words. Sparse matrices, direct methods for linear systems and matrix inversion, multifrontal method, graphs and hypergraphs. AMS subject classifications. 05C50, 05C65, 65F05, 65F50