Results 1 - 10
of
60
A supernodal approach to sparse partial pivoting
- SIAM Journal on Matrix Analysis and Applications
, 1999
"... We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. To perform most of the numerical computation in dense matrix kernels, we introduce the notion of unsymmetric supernodes. To better exploit the memory ..."
Abstract
-
Cited by 158 (20 self)
- Add to MetaCart
We investigate several ways to improve the performance of sparse LU factorization with partial pivoting, as used to solve unsymmetric linear systems. To perform most of the numerical computation in dense matrix kernels, we introduce the notion of unsymmetric supernodes. To better exploit the memory hierarchy, weintroduce unsymmetric supernode-panel updates and two-dimensional data partitioning. To speed up symbolic factorization, we use Gilbert and Peierls's depth- rst search with Eisenstat and Liu's symmetric structural reductions. We have implemented a sparse LU code using all these ideas. We present experiments demonstrating that it is signi cantly faster than earlier partial pivoting codes. We also compare performance with Umfpack, which uses a multifrontal approach; our code is usually faster.
SuperLU DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems
- ACM Trans. Mathematical Software
, 2003
"... We present the main algorithmic features in the software package SuperLU DIST, a distributedmemory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the software’s parallel performance and sc ..."
Abstract
-
Cited by 68 (14 self)
- Add to MetaCart
We present the main algorithmic features in the software package SuperLU DIST, a distributedmemory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with a focus on scalability issues, and demonstrate the software’s parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication patterns, which lets us exploit techniques used in parallel sparse Cholesky algorithms to better parallelize both LU decomposition and triangular solution on large-scale distributed machines.
Solving unsymmetric sparse systems of linear equations with PARDISO
- Journal of Future Generation Computer Systems
, 2004
"... Supernode partitioning for unsymmetric matrices together with complete block diagonal supernode pivoting and asynchronous computation can achieve high gigaflop rates for parallel sparse LU factorization on shared memory parallel computers. The progress in weighted graph matching algorithms helps to ..."
Abstract
-
Cited by 62 (5 self)
- Add to MetaCart
Supernode partitioning for unsymmetric matrices together with complete block diagonal supernode pivoting and asynchronous computation can achieve high gigaflop rates for parallel sparse LU factorization on shared memory parallel computers. The progress in weighted graph matching algorithms helps to extend these concepts further and unsymmetric prepermutation of rows is used to place large matrix entries on the diagonal. Complete block diagonal supernode pivoting allows dynamical interchanges of columns and rows during the factorization process. The level-3 BLAS efficiency is retained and an advanced two-level left–right looking scheduling scheme results in good speedup on SMP machines. These algorithms have been integrated into the recent unsymmetric version of the PARDISO solver. Experiments demonstrate that a wide set of unsymmetric linear systems can be solved and high performance is consistently achieved for large sparse unsymmetric matrices from real world applications. Key words: Computational sciences, numerical linear algebra, direct solver, unsymmetric linear systems
A Combined Unifrontal/Multifrontal Method for Unsymmetric Sparse Matrices
- ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE
, 1995
"... We discuss the organization of frontal matrices in multifrontal methods for the solution of large sparse sets of unsymmetric linear equations. In the multifrontal method, work on a frontal matrix can be suspended, the frontal matrix can be stored for later reuse, and a new frontal matrix can be g ..."
Abstract
-
Cited by 59 (11 self)
- Add to MetaCart
We discuss the organization of frontal matrices in multifrontal methods for the solution of large sparse sets of unsymmetric linear equations. In the multifrontal method, work on a frontal matrix can be suspended, the frontal matrix can be stored for later reuse, and a new frontal matrix can be generated. There are thus several frontal matrices stored during the factorization and one or more or these are assembled (summed) when creating a new frontal matrix. Although this means that arbitrary sparsity patterns can be handled efficiently, extra work is required to sum the frontal matrices together and can be costly because indirect addressing is required. The (uni-)frontal method avoids this extra work by factorizing the matrix with a single frontal matrix. Rows and columns are added to the frontal matrix, and pivot rows and columns are removed. Data movement is simpler, but higher fill-in can result if the matrix cannot be permuted into a variable-band form with small profile...
Implementation of Interior Point Methods for Large Scale Linear Programming
- in Interior Point Methods in Mathematical Programming
, 1996
"... In the past 10 years the interior point methods (IPM) for linear programming have gained extraordinary interest as an alternative to the sparse simplex based methods. This has initiated a fruitful competition between the two types of algorithms which has lead to very efficient implementations on bot ..."
Abstract
-
Cited by 56 (18 self)
- Add to MetaCart
In the past 10 years the interior point methods (IPM) for linear programming have gained extraordinary interest as an alternative to the sparse simplex based methods. This has initiated a fruitful competition between the two types of algorithms which has lead to very efficient implementations on both sides. The significant difference between interior point and simplex based methods is reflected not only in the theoretical background but also in the practical implementation. In this paper we give an overview of the most important characteristics of advanced implementations of interior point methods. First, we present the infeasible-primal-dual algorithm which is widely considered the most efficient general purpose IPM. Our discussion includes various algorithmic enhancements of the basic algorithm. The only shortcoming of the "traditional" infeasible-primal-dual algorithm is to detect a possible primal or dual infeasibility of the linear program. We discuss how this problem can be solve...
SuperLU Users' Guide
, 1999
"... This document describes a collection of three related ANSI C subroutine libraries for solving sparse linear systems of equations AX = B. Here A is a square, nonsingular, n \Theta n sparse matrix, and X and B are dense n \Theta nrhs matrices, where nrhs is the number of right-hand sides and solution ..."
Abstract
-
Cited by 43 (2 self)
- Add to MetaCart
This document describes a collection of three related ANSI C subroutine libraries for solving sparse linear systems of equations AX = B. Here A is a square, nonsingular, n \Theta n sparse matrix, and X and B are dense n \Theta nrhs matrices, where nrhs is the number of right-hand sides and solution vectors. Matrix A need not be symmetric or definite; indeed, SuperLU is particularly appropriate for matrices with very unsymmetric structure. All three libraries use variations of Gaussian elimination optimized to take advantage both of sparsity and the computer architecture, in particular memory hierarchies (caches) and parallelism. In this introduction we refer to all three libraries collectively as SuperLU. The three libraries within SuperLU are as follows. Detailed references are also given (see also [19]).
A column pre-ordering strategy for the unsymmetric-pattern multifrontal method
- ACM Transactions on Mathematical Software
, 2004
"... A new method for sparse LU factorization is presented that combines a column pre-ordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization. The column ordering is selected to give a good a priori upper bound on fill-in and then refined during numerical factoriza ..."
Abstract
-
Cited by 36 (2 self)
- Add to MetaCart
A new method for sparse LU factorization is presented that combines a column pre-ordering strategy with a right-looking unsymmetric-pattern multifrontal numerical factorization. The column ordering is selected to give a good a priori upper bound on fill-in and then refined during numerical factorization (while preserving the bound). Pivot rows are selected to maintain numerical stability and to preserve sparsity. The method analyzes the matrix and automatically selects one of three pre-ordering and pivoting strategies. The number of nonzeros in the LU factors computed by the method is typically less than or equal to those found by a wide range of unsymmetric sparse LU factorization methods, including left-looking methods and prior multifrontal methods.
Backward Error and Condition of Structured Linear Systems
- SIMAX
, 1992
"... Reports available from: ..."
Sparse Gaussian Elimination on High Performance Computers
, 1996
"... This dissertation presents new techniques for solving large sparse unsymmetric linear systems on high performance computers, using Gaussian elimination with partial pivoting. The efficiencies of the new algorithms are demonstrated for matrices from various fields and for a variety of high performan ..."
Abstract
-
Cited by 33 (5 self)
- Add to MetaCart
This dissertation presents new techniques for solving large sparse unsymmetric linear systems on high performance computers, using Gaussian elimination with partial pivoting. The efficiencies of the new algorithms are demonstrated for matrices from various fields and for a variety of high performance machines. In the first part we discuss optimizations of a sequential algorithm to exploit the memory hierarchies that exist in most RISC-based superscalar computers. We begin with the left-looking supernode-column algorithm by Eisenstat, Gilbert and Liu, which includes Eisenstat and Liu's symmetric structural reduction for fast symbolic factorization. Our key contribution is to develop both numeric and symbolic schemes to perform supernodepanel updates to achieve better data reuse in cache and floating-point register...
Stability of Block Algorithms with Fast Level 3 BLAS
- ACM Trans. Math. Soft
, 1992
"... . Block algorithms are becoming increasingly popular in matrix computations. Since their basic unit of data is a submatrix rather than a scalar they have a higher level of granularity than point algorithms, and this makes them well-suited to high-performance computers. The numerical stability of the ..."
Abstract
-
Cited by 33 (14 self)
- Add to MetaCart
. Block algorithms are becoming increasingly popular in matrix computations. Since their basic unit of data is a submatrix rather than a scalar they have a higher level of granularity than point algorithms, and this makes them well-suited to high-performance computers. The numerical stability of the block algorithms in the new linear algebra program library LAPACK is investigated here. It is shown that these algorithms have backward error analyses in which the backward error bounds are commensurate with the error bounds for the underlying level 3 BLAS (BLAS3). One implication is that the block algorithms are as stable as the corresponding point algorithms when conventional BLAS3 are used. A second implication is that the use of BLAS3 based on fast matrix multiplication techniques affects the stability only insofar as it increases the constant terms in the normwise backward error bounds. For linear equation solvers employing LU factorization it is shown that fixed precision iterative re...

