Results 1–10 of 16
Multifrontal QR factorization in a multiprocessor environment
, 1994
Abstract

Cited by 29 (9 self)
We describe the design and implementation of a parallel QR decomposition algorithm for a large sparse matrix A. The algorithm is based on the multifrontal approach and makes use of Householder transformations. The tasks are distributed among processors according to an assembly tree which is built from the symbolic factorization of the matrix AᵀA. Uniprocessor issues are first addressed. We then discuss the multiprocessor implementation of the method. Parallelization of both the factorization phase and the solve phase are considered. We use relaxation of the sparsity structure of both the original matrix and the frontal matrices to improve the performance. We show that, in this case, the use of Level 3 BLAS can lead to very significant performance improvement. The eight-processor Alliant FX/80 is used to illustrate our discussion. [1] ENSEEIHT-IRIT (Toulouse, France), amestoy@enseeiht.fr. [2] CERFACS (Toulouse, France), also Rutherford Appleton Lab. (England), duff@cerfac...
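The per-front kernel the abstract above refers to is dense Householder QR on a frontal matrix. A minimal unblocked sketch (my own illustration, not the paper's code; the paper's Level 3 BLAS variant would use blocked updates instead of the rank-1 update shown here):

```python
import numpy as np

def householder_qr(F):
    """Dense Householder QR of a frontal matrix F (m >= n).
    Returns the unit Householder vectors V (lower trapezoidal)
    and the upper-triangular factor R."""
    m, n = F.shape
    R = F.astype(float).copy()
    V = np.zeros((m, n))
    for k in range(n):
        x = R[k:, k]
        nx = np.linalg.norm(x)
        v = x.copy()
        v[0] += (1.0 if x[0] >= 0 else -1.0) * nx
        nv = np.linalg.norm(v)
        if nv > 0:
            v /= nv
            # rank-1 (Level 2) update; blocked codes batch these via Level 3 BLAS
            R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        V[k:, k] = v
    return V, np.triu(R[:n, :])
```

Applying the stored reflectors in order reconstructs Q, so the factorization can be checked by verifying QᵀF = [R; 0].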
Incomplete Factorization Preconditioning For Linear Least Squares Problems
, 1994
Abstract

Cited by 18 (4 self)
this paper is the modified version of Gram-Schmidt orthogonalization with a rejection test applied right after the formation of the off-diagonal elements of the factor R. For a given rejection parameter ψ, 0 ≤ ψ ≤ 1, the rejection test is: if |r_ij| ≤ ψ ‖a ...
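The snippet above is truncated, so the exact threshold is not recoverable; a generic incomplete modified Gram-Schmidt with a drop test of that shape can be sketched as follows (the parameter name psi and the threshold ψ·‖a_j‖ are my assumptions, not the paper's exact test):

```python
import numpy as np

def incomplete_mgs(A, psi=0.1):
    """Incomplete modified Gram-Schmidt: an off-diagonal r_ij is kept only
    if it passes a rejection test right after it is formed; dropped entries
    are simply not applied, yielding an incomplete factor R."""
    m, n = A.shape
    Q = A.astype(float).copy()
    R = np.zeros((n, n))
    norms = np.linalg.norm(A, axis=0)
    for j in range(n):
        for i in range(j):
            r = Q[:, i] @ Q[:, j]
            if abs(r) >= psi * norms[j]:   # rejection test (assumed form)
                R[i, j] = r
                Q[:, j] -= r * Q[:, i]
        R[j, j] = np.linalg.norm(Q[:, j])
        Q[:, j] /= R[j, j]
    return Q, R
```

With psi = 0 nothing is dropped and the routine reduces to exact modified Gram-Schmidt, which is a convenient sanity check.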
Multifrontal multithreaded rank-revealing sparse QR factorization
"... SuiteSparseQR is a sparse QR factorization package based on the multifrontal method. Within each frontal matrix, LAPACK and the multithreaded BLAS enable the method to obtain high performance on multicore architectures. Parallelism across different frontal matrices is handled with Intel’s Threading ..."
Abstract

Cited by 15 (2 self)
SuiteSparseQR is a sparse QR factorization package based on the multifrontal method. Within each frontal matrix, LAPACK and the multithreaded BLAS enable the method to obtain high performance on multicore architectures. Parallelism across different frontal matrices is handled with Intel’s Threading Building Blocks library. The symbolic analysis and ordering phase pre-eliminates singletons by permuting the input matrix into the form [R11 R12; 0 A22], where R11 is upper triangular with diagonal entries above a given tolerance. Next, the fill-reducing ordering, column elimination tree, and frontal matrix structures are found without requiring the formation of the pattern of AᵀA. Rank detection is performed within each frontal matrix using Heath’s method, which does not require column pivoting. The resulting sparse QR factorization obtains a substantial fraction of the theoretical peak performance of a multicore computer.
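The idea behind Heath-style rank detection without column pivoting can be sketched on a dense matrix: during Householder elimination, a column whose remaining norm falls below a tolerance is declared dead and skipped, and the pivot row does not advance (a simplified illustration under my own naming; SuiteSparseQR's actual kernel is more involved):

```python
import numpy as np

def qr_rank_heath(A, tol=1e-8):
    """Householder QR with rank detection in the style of Heath's method:
    no column pivoting; a column with negligible remaining norm is
    declared dead and left uneliminated."""
    m, n = A.shape
    R = A.astype(float).copy()
    row = 0                          # next pivot row
    dead = []
    for j in range(n):
        x = R[row:, j]
        nx = np.linalg.norm(x)
        if nx <= tol:
            R[row:, j] = 0.0         # zero out the dead column's tail
            dead.append(j)
            continue
        v = x.copy()
        v[0] += (1.0 if x[0] >= 0 else -1.0) * nx
        v /= np.linalg.norm(v)
        R[row:, j:] -= 2.0 * np.outer(v, v @ R[row:, j:])
        row += 1
    return np.triu(R[:min(m, n)]), row, dead
```

For example, a matrix whose second column duplicates the first yields rank 2 and dead column index 1.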
Multifrontal Computation with the Orthogonal Factors of Sparse Matrices
 SIAM Journal on Matrix Analysis and Applications
, 1994
Abstract

Cited by 10 (0 self)
This paper studies the solution of the linear least squares problem for a large and sparse m by n matrix A with m ≥ n by QR factorization of A and transformation of the right-hand side vector b to Qᵀb. A multifrontal-based method for computing Qᵀb using Householder factorization is presented. A theoretical operation count for the K by K unbordered grid model problem and problems defined on graphs with √n-separators shows that the proposed method requires O(N_R) storage and multiplications to compute Qᵀb, where N_R = O(n log n) is the number of nonzeros of the upper triangular factor R of A. In order to introduce BLAS-2 operations, Schreiber and Van Loan's Storage-Efficient WY Representation [SIAM J. Sci. Stat. Computing, 10 (1989), pp. 53-57] is applied for the orthogonal factor Q_i of each frontal matrix F_i. If this technique is used, the bound on storage increases to O(n (log n)²). Some numerical results for the grid model problems as well as Harwell-Boeing problems...
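The compact WY representation cited above writes a product of Householder reflectors H_j = I - τ_j v_j v_jᵀ as Q = I - V T Vᵀ with T upper triangular, so that Qᵀb becomes a few matrix-vector products. A sketch of the standard recurrence for T (my own minimal implementation of the technique, not the paper's code):

```python
import numpy as np

def compact_wy(V, taus):
    """Build the triangular T of the compact WY form Q = I - V T V^T
    from Householder vectors (columns of V) and scalars tau."""
    n = V.shape[1]
    T = np.zeros((n, n))
    for j in range(n):
        T[j, j] = taus[j]
        if j > 0:
            # new column: T[:j, j] = -tau_j * T[:j,:j] @ (V[:,:j]^T v_j)
            T[:j, j] = -taus[j] * (T[:j, :j] @ (V[:, :j].T @ V[:, j]))
    return T

def apply_qt(V, T, b):
    """Q^T b = b - V T^T V^T b, using only matrix-vector kernels."""
    return b - V @ (T.T @ (V.T @ b))
```

The result can be checked against applying the reflectors one at a time in sequence.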
On the row merge tree for sparse LU factorization with partial pivoting
 BIT
Abstract

Cited by 5 (2 self)
We consider the problem of structure prediction for sparse LU factorization with partial pivoting. In this context, it is well known that the column elimination tree plays an important role for matrices satisfying an irreducibility condition, called the strong Hall property. Our primary goal in this paper is to address the structure prediction problem for matrices satisfying a weaker assumption, which is the Hall property. For this we consider the row merge matrix, an upper bound that contains the nonzeros in L and U for all possible row permutations that can later appear in the numerical factorization due to partial pivoting. We discuss the row merge tree, a structure that represents information obtained from the row merge matrix; that is, information on the dependencies among the columns in Gaussian elimination with partial pivoting and on structural upper bounds of the factors L and U. We present new theoretical results that show that the nonzero structure of the row merge matrix can be described in terms of branches and subtrees of the row merge tree. These results lead to an efficient algorithm for the computation of the row merge tree that uses as input the structure of A alone and has a time complexity almost linear in the number of nonzeros in A. We also investigate experimentally the usage of the row merge tree for structure prediction purposes on a set of matrices that satisfy only the Hall property. We analyze in particular the size of upper bounds of the structure of L and U, as well as the reordering of the matrix based on a postorder traversal and its impact on the factorization runtime. We show experimentally that for some matrices, the row merge tree is a preferred alternative to the column elimination tree.
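The column elimination tree mentioned above is the elimination tree of AᵀA. A naive sketch that forms the pattern of AᵀA explicitly (production codes avoid that; this is only an illustration of the tree itself):

```python
import numpy as np

def column_etree(A):
    """Column elimination tree of A: the elimination tree of A^T A,
    computed naively from the explicit pattern of A^T A.
    Returns parent[j] for each column j, -1 for roots."""
    n = A.shape[1]
    S = (np.abs(A).T @ np.abs(A)) > 0      # pattern of A^T A
    parent = [-1] * n
    ancestor = list(range(n))              # virtual-forest links

    def find_root(i):
        while ancestor[i] != i:
            i = ancestor[i]
        return i

    for j in range(n):
        for i in range(j):
            if S[i, j]:                    # nonzero above the diagonal
                r = find_root(i)
                if r != j:                 # link the subtree's root under j
                    parent[r] = j
                    ancestor[r] = j
    return parent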
Dealing with Dense Rows in the Solution of Sparse Linear Least Squares Problems
, 1995
Abstract

Cited by 5 (0 self)
Sparse linear least squares problems containing a few relatively dense rows occur frequently in practice. Straightforward solution of these problems can cause catastrophic fill and deliver extremely poor performance. This paper studies a scheme for solving such problems efficiently by handling dense rows and sparse rows separately. How a sparse matrix is partitioned into dense rows and sparse rows determines the efficiency of the overall solution process. A new algorithm is proposed to find a partition of a sparse matrix which leads to satisfactory or even optimal performance. Extensive numerical experiments are performed to demonstrate the effectiveness of the proposed scheme. A MATLAB implementation is included. [1] This work was supported in part by the Cornell Theory Center, which receives funding from members of its Corporate Research Institute, the National Science Foundation (NSF), the Advanced Research Projects Agency (ARPA), the National Institutes of Health (NIH), New York S...
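The first step of such a scheme is the row partition itself. A crude nonzero-count threshold can stand in for the paper's partitioning algorithm (which optimizes the split; the `ratio` parameter here is my own simplification):

```python
import numpy as np
import scipy.sparse as sp

def split_dense_rows(A, ratio=0.5):
    """Split a sparse matrix into sparse-row and dense-row blocks by a
    simple nnz-per-row threshold (a heuristic stand-in for the paper's
    partitioning algorithm)."""
    A = A.tocsr()
    counts = np.diff(A.indptr)             # nonzeros per row
    dense = counts > ratio * A.shape[1]
    return A[~dense], A[dense]
```

The two blocks can then be handled separately, e.g. factoring the sparse block and folding the dense rows in by updating.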
Sparse Householder QR Factorization on a Mesh
 In Fourth Euromicro Workshop on Parallel and Distributed Processing
, 1996
Parallel Multifrontal Solution of Sparse Linear Least Squares Problems on Distributed-Memory Multiprocessors
 Advanced Computing Research Institute, Center for Theory and Simulation in Science and Engineering, Cornell
, 1994
Abstract

Cited by 3 (0 self)
We describe the issues involved in the design and implementation of efficient parallel algorithms for solving sparse linear least squares problems on distributed-memory multiprocessors. We consider both the QR factorization method due to Golub and the method of corrected seminormal equations due to Björck. The major tasks involved are sparse QR factorization, sparse triangular solution, and sparse matrix-vector multiplication. The sparse QR factorization is accomplished by a recently introduced parallel multifrontal scheme. New parallel algorithms for solving the related sparse triangular systems and for performing sparse matrix-vector multiplications are proposed. The arithmetic and communication complexities of our algorithms on regular grid problems are presented. Experimental results on an Intel iPSC/860 machine are described. Key words. parallel algorithms, sparse matrix, orthogonal factorization, multifrontal method, least squares problems, triangular solution, distributed-me...
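The corrected seminormal equations approach solves RᵀRx = Aᵀb using only the triangular factor R from a QR factorization of A, followed by one step of iterative refinement. A dense numpy sketch of the idea (the sparse, parallel version is the paper's subject; this only shows the arithmetic):

```python
import numpy as np

def csne(A, b):
    """Corrected seminormal equations (after Björck): solve
    R^T R x = A^T b with R from QR of A, then one correction step."""
    R = np.linalg.qr(A, mode='r')
    def semi_solve(rhs):
        # R^T R x = A^T rhs via two triangular solves
        y = np.linalg.solve(R.T, A.T @ rhs)
        return np.linalg.solve(R, y)
    x = semi_solve(b)
    x += semi_solve(b - A @ x)     # one step of iterative refinement
    return x
```

The correction step is what distinguishes CSNE from plain seminormal equations and recovers much of the accuracy lost to the squared condition number.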
Computing sparse orthogonal factors in MATLAB
, 1998
Abstract

Cited by 2 (0 self)
In this report a new version of the multifrontal sparse QR factorization routine sqr, originally by Matstoms, for general sparse matrices is described and evaluated. In the previous version the orthogonal factor Q is discarded due to storage considerations. The new version provides Q and uses the multifrontal structure to store this orthogonal factor in a compact way. A new data class with overloaded operators is implemented in Matlab to provide easy usage of the compact orthogonal factors. This implicit way of storing the orthogonal factor also results in faster computation and application of Q and Qᵀ. Examples are given where the new version is up to four times faster than the built-in function qr in Matlab when computing only R, and up to 1000 times faster when computing both Q and R. The sqr package is available at URL: http://www.mai.liu.se/~milun/sls/. Key words: QR factorization, sparse problems, multifrontal method, orthogonal factorization. 1 Introduction. Let A ∈ ℝ...
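The "data class with overloaded operators" idea, i.e. storing Q implicitly as Householder vectors and applying it on demand, can be sketched in Python rather than Matlab (class name, layout, and operator choice are illustrative, not sqr's):

```python
import numpy as np

class ImplicitQ:
    """Orthogonal factor stored implicitly as unit Householder vectors,
    applied on demand via an overloaded @ operator; Q is never formed."""
    def __init__(self, V, transposed=False):
        self.V = V                 # column j holds Householder vector v_j
        self.transposed = transposed

    @property
    def T(self):                   # Q^T shares the same stored vectors
        return ImplicitQ(self.V, not self.transposed)

    def __matmul__(self, b):
        x = np.array(b, dtype=float)
        cols = range(self.V.shape[1])
        order = cols if self.transposed else reversed(cols)
        for j in order:            # apply reflectors H_j = I - 2 v_j v_j^T
            v = self.V[:, j]
            x = x - 2.0 * v * (v @ x)
        return x
```

Since Q is orthogonal, `Q.T @ (Q @ b)` should return b, which makes a convenient test of the implicit product.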
Exact Prediction of QR Fill-In by Row-Merge Trees
Abstract

Cited by 2 (0 self)
Row-merge trees for forming the QR factorization of a sparse matrix A are closely related to elimination trees for the Cholesky factorization of AᵀA. Row-merge trees predict the exact fill-in (assuming no numerical cancellation) provided A satisfies the strong Hall property, but overestimate the fill-in in general. However, here a fast and simple postprocessing step for row-merge trees is presented that predicts the exact fill-in for sparse QR factorization using Householder reflectors, for general matrices. Key words. row-merge trees, elimination trees, QR factorization. 1. Introduction. Matrix factorizations of sparse matrices typically result in creating further nonzero entries, or fill-in. If this fill-in can be accurately predicted in advance, then the factorization can be performed in less time, as the additional memory needed can be allocated once in advance. Notice that fill-in can be reduced with some matrix reordering algorithms. After that, the algorithms presented ...