Results 1  10
of
95
METIS  Unstructured Graph Partitioning and Sparse Matrix Ordering System, Version 2.0
, 1995
"... this paper is organized as follows: Section 2 briefly describes the various ideas and algorithms implemented in METIS. Section 3 describes the user interface to the METIS graph partitioning and sparse matrix ordering packages. Sections 4 and 5 describe the formats of the input and output files used ..."
Abstract

Cited by 122 (5 self)
 Add to MetaCart
this paper is organized as follows: Section 2 briefly describes the various ideas and algorithms implemented in METIS. Section 3 describes the user interface to the METIS graph partitioning and sparse matrix ordering packages. Sections 4 and 5 describe the formats of the input and output files used by METIS. Section 6 describes the standalone library that implements the various algorithms implemented in METIS. Section 7 describes the system requirements for the METIS package. Appendix A describes and compares various graph partitioning algorithms that are extensively used.
Analysis of multilevel graph partitioning
, 1995
"... Recently, a number of researchers have investigated a class of algorithms that are based on multilevel graph partitioning that have moderate computational complexity, and provide excellent graph partitions. However, there exists little theoretical analysis that could explain the ability of multileve ..."
Abstract

Cited by 92 (13 self)
 Add to MetaCart
Recently, a number of researchers have investigated a class of algorithms that are based on multilevel graph partitioning that have moderate computational complexity, and provide excellent graph partitions. However, there exists little theoretical analysis that could explain the ability of multilevel algorithms to produce good partitions. In this paper we present such an analysis. We show under certain reasonable assumptions that even if no refinement is used in the uncoarsening phase, a good bisection of the coarser graph is worse than a good bisection of the finer graph by at most a small factor. We also show that the size of a good vertexseparator of the coarse graph projected to the finer graph (without performing refinement in the uncoarsening phase) is higher than the size of a good vertexseparator of the finer graph by at most a small factor.
Analyzing Scalability of Parallel Algorithms and Architectures
 Journal of Parallel and Distributed Computing
, 1994
"... The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithmarchitecture combination for a problem under different constraints on the growth of ..."
Abstract

Cited by 90 (18 self)
 Add to MetaCart
The scalability of a parallel algorithm on a parallel architecture is a measure of its capacity to effectively utilize an increasing number of processors. Scalability analysis may be used to select the best algorithmarchitecture combination for a problem under different constraints on the growth of the problem size and the number of processors. It may be used to predict the performance of a parallel algorithm and a parallel architecture for a large number of processors from the known performance on fewer processors. For a fixed problem size, it may be used to determine the optimal number of processors to be used and the maximum possible speedup that can be obtained. The objective of this paper is to critically assess the state of the art in the theory of scalability analysis, and motivate further research on the development of new and more comprehensive analytical tools to study the scalability of parallel algorithms and architectures. We survey a number of techniques and formalisms t...
SuperLU DIST: A scalable distributedmemory sparse direct solver for unsymmetric linear systems
 ACM Trans. Mathematical Software
, 2003
"... We present the main algorithmic features in the software package SuperLU DIST, a distributedmemory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the software’s parallel performance and sc ..."
Abstract

Cited by 88 (18 self)
 Add to MetaCart
We present the main algorithmic features in the software package SuperLU DIST, a distributedmemory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the software’s parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication patterns, which lets us exploit techniques used in parallel sparse Cholesky algorithms to better parallelize both LU decomposition and triangular solution on largescale distributed machines.
Fast and Effective Algorithms for Graph Partitioning and Sparse Matrix Ordering
 IBM JOURNAL OF RESEARCH AND DEVELOPMENT
, 1996
"... Graph partitioning is a fundamental problem in several scientific and engineering applications. In this paper, we describe heuristics that improve the stateoftheart practical algorithms used in graphpartitioning software in terms of both partitioning speed and quality. An important use of graph ..."
Abstract

Cited by 55 (11 self)
 Add to MetaCart
Graph partitioning is a fundamental problem in several scientific and engineering applications. In this paper, we describe heuristics that improve the stateoftheart practical algorithms used in graphpartitioning software in terms of both partitioning speed and quality. An important use of graphpartitioning is in ordering sparse matrices for obtaining direct solutions to sparse systems of linear equations arising in engineering and optimization applications. The experiments reported in this paper show that the use of these heuristics results in a considerable improvement in the quality of sparsematrix orderings over conventional ordering methods, especially for sparse matrices arising in linear programming problems. In addition, our graphpartitioningbased ordering algorithm is more parallelizable than minimumdegreebased ordering algorithms, and it renders the ordered matrix more amenable to parallel factorization.
Algorithm 887: Cholmod, supernodal sparse cholesky factorization and update/downdate
 ACM Transactions on Mathematical Software
, 2008
"... CHOLMOD is a set of routines for factorizing sparse symmetric positive definite matrices of the form A or A A T, updating/downdating a sparse Cholesky factorization, solving linear systems, updating/downdating the solution to the triangular system Lx = b, and many other sparse matrix functions for b ..."
Abstract

Cited by 55 (7 self)
 Add to MetaCart
CHOLMOD is a set of routines for factorizing sparse symmetric positive definite matrices of the form A or A A T, updating/downdating a sparse Cholesky factorization, solving linear systems, updating/downdating the solution to the triangular system Lx = b, and many other sparse matrix functions for both symmetric and unsymmetric matrices. Its supernodal Cholesky factorization relies on LAPACK and the Level3 BLAS, and obtains a substantial fraction of the peak performance of the BLAS. Both real and complex matrices are supported. CHOLMOD is written in ANSI/ISO C, with both C and MATLAB TM interfaces. It appears in MATLAB 7.2 as x=A\b when A is sparse symmetric positive definite, as well as in several other sparse matrix functions.