Results 1-10 of 68
A supernodal approach to sparse partial pivoting
SIAM Journal on Matrix Analysis and Applications, 1999
Cited by 192 (23 self)
We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. To perform most of the numerical computation in dense matrix kernels, we introduce the notion of unsymmetric supernodes. To better exploit the memory hierarchy, we introduce unsymmetric supernode-panel updates and two-dimensional data partitioning. To speed up symbolic factorization, we use Gilbert and Peierls's depth-first search with Eisenstat and Liu's symmetric structural reductions. We have implemented a sparse LU code using all these ideas. We present experiments demonstrating that it is significantly faster than earlier partial pivoting codes. We also compare performance with UMFPACK, which uses a multifrontal approach; our code is usually faster.
Solving unsymmetric sparse systems of linear equations with PARDISO
Journal of Future Generation Computer Systems, 2004
Cited by 95 (8 self)
Supernode partitioning for unsymmetric matrices together with complete block diagonal supernode pivoting and asynchronous computation can achieve high gigaflop rates for parallel sparse LU factorization on shared memory parallel computers. The progress in weighted graph matching algorithms helps to extend these concepts further, and unsymmetric prepermutation of rows is used to place large matrix entries on the diagonal. Complete block diagonal supernode pivoting allows dynamical interchanges of columns and rows during the factorization process. The level-3 BLAS efficiency is retained, and an advanced two-level left–right looking scheduling scheme results in good speedup on SMP machines. These algorithms have been integrated into the recent unsymmetric version of the PARDISO solver. Experiments demonstrate that a wide set of unsymmetric linear systems can be solved and high performance is consistently achieved for large sparse unsymmetric matrices from real-world applications. Key words: Computational sciences, numerical linear algebra, direct solver, unsymmetric linear systems
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems
ACM Trans. Mathematical Software, 2003
Cited by 88 (18 self)
We present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the software’s parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication patterns, which lets us exploit techniques used in parallel sparse Cholesky algorithms to better parallelize both LU decomposition and triangular solution on large-scale distributed machines.
A Combined Unifrontal/Multifrontal Method for Unsymmetric Sparse Matrices
ACM Transactions on Mathematical Software, 1995
Cited by 72 (13 self)
We discuss the organization of frontal matrices in multifrontal methods for the solution of large sparse sets of unsymmetric linear equations. In the multifrontal method, work on a frontal matrix can be suspended, the frontal matrix can be stored for later reuse, and a new frontal matrix can be generated. There are thus several frontal matrices stored during the factorization, and one or more of these are assembled (summed) when creating a new frontal matrix. Although this means that arbitrary sparsity patterns can be handled efficiently, extra work is required to sum the frontal matrices together, and this can be costly because indirect addressing is required. The (uni)frontal method avoids this extra work by factorizing the matrix with a single frontal matrix. Rows and columns are added to the frontal matrix, and pivot rows and columns are removed. Data movement is simpler, but higher fill-in can result if the matrix cannot be permuted into a variable-band form with small profile...
Implementation of Interior Point Methods for Large Scale Linear Programming
in Interior Point Methods in Mathematical Programming, 1996
Cited by 68 (20 self)
In the past 10 years the interior point methods (IPM) for linear programming have gained extraordinary interest as an alternative to the sparse simplex-based methods. This has initiated a fruitful competition between the two types of algorithms which has led to very efficient implementations on both sides. The significant difference between interior point and simplex-based methods is reflected not only in the theoretical background but also in the practical implementation. In this paper we give an overview of the most important characteristics of advanced implementations of interior point methods. First, we present the infeasible-primal-dual algorithm which is widely considered the most efficient general purpose IPM. Our discussion includes various algorithmic enhancements of the basic algorithm. The only shortcoming of the "traditional" infeasible-primal-dual algorithm is to detect a possible primal or dual infeasibility of the linear program. We discuss how this problem can be solve...
A column preordering strategy for the unsymmetric-pattern multifrontal method
ACM Transactions on Mathematical Software, 2004
Cited by 58 (4 self)
A new method for sparse LU factorization is presented that combines a column preordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization. The column ordering is selected to give a good a priori upper bound on fill-in and then refined during numerical factorization (while preserving the bound). Pivot rows are selected to maintain numerical stability and to preserve sparsity. The method analyzes the matrix and automatically selects one of three preordering and pivoting strategies. The number of nonzeros in the LU factors computed by the method is typically less than or equal to those found by a wide range of unsymmetric sparse LU factorization methods, including left-looking methods and prior multifrontal methods.
On the solution of equality constrained quadratic programming problems arising . . .
1998
Stability of Block Algorithms with Fast Level 3 BLAS
ACM Trans. Math. Soft., 1992
Cited by 37 (15 self)
Block algorithms are becoming increasingly popular in matrix computations. Since their basic unit of data is a submatrix rather than a scalar, they have a higher level of granularity than point algorithms, and this makes them well-suited to high-performance computers. The numerical stability of the block algorithms in the new linear algebra program library LAPACK is investigated here. It is shown that these algorithms have backward error analyses in which the backward error bounds are commensurate with the error bounds for the underlying level 3 BLAS (BLAS3). One implication is that the block algorithms are as stable as the corresponding point algorithms when conventional BLAS3 are used. A second implication is that the use of BLAS3 based on fast matrix multiplication techniques affects the stability only insofar as it increases the constant terms in the normwise backward error bounds. For linear equation solvers employing LU factorization it is shown that fixed precision iterative re...
Backward Error and Condition of Structured Linear Systems
SIMAX, 1992