Computing Rank-Revealing QR Factorizations of Dense Matrices
Argonne Preprint ANL-MCS-P559-0196, Argonne National Laboratory, 1996
Cited by 19 (1 self)
this paper, and we give only a brief synopsis here. For details, the reader is referred to the code. Test matrices 1 through 5 were designed to exercise column pivoting. Matrix 6 was designed to test the behavior of the condition estimation in the presence of clusters for the smallest singular value. For the other cases, we employed the LAPACK matrix generator xLATMS, which generates random symmetric matrices by multiplying a diagonal matrix with prescribed singular values by random orthogonal matrices from the left and right. For the break1 distribution, all singular values are 1.0 except for one. In the arithmetic and geometric distributions, they decay from 1.0 to a specified smallest singular value in an arithmetic and geometric fashion, respectively. In the "reversed" distributions, the order of the diagonal entries was reversed. For test cases 7 through 12, we used xLATMS to generate a matrix of order
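The singular value distributions named in this abstract can be illustrated with a small sketch. This is not the xLATMS code: the function name `singular_values`, its parameters, and the choice to place the single deviating value of break1 last are assumptions made for illustration only.

```python
def singular_values(kind, n, sigma_min, reverse=False):
    """Return n prescribed singular values between sigma_min and 1.0.

    Hypothetical helper: 'kind' is one of 'break1', 'arithmetic', 'geometric'.
    """
    if n < 2:
        raise ValueError("need at least two singular values")
    if kind == "break1":
        # All values are 1.0 except one (assumed here to be the last).
        values = [1.0] * (n - 1) + [sigma_min]
    elif kind == "arithmetic":
        # Equal differences between consecutive values, from 1.0 to sigma_min.
        step = (1.0 - sigma_min) / (n - 1)
        values = [1.0 - i * step for i in range(n)]
    elif kind == "geometric":
        # Equal ratios between consecutive values, from 1.0 to sigma_min.
        ratio = sigma_min ** (1.0 / (n - 1))
        values = [ratio ** i for i in range(n)]
    else:
        raise ValueError("unknown distribution: " + kind)
    # The "reversed" variants list the same diagonal entries in opposite order.
    return values[::-1] if reverse else values
```

A test matrix would then be formed by placing these values on a diagonal and multiplying by random orthogonal matrices, which this sketch leaves out.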
Parallel Performance of a Symmetric Eigensolver based on the Invariant Subspace Decomposition Approach
1994
Cited by 15 (0 self)
In this paper, we discuss work in progress on a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). We describe a recently developed acceleration technique that substantially reduces the overall work required by this algorithm and review the algorithmic highlights of a distributed-memory implementation of this approach. These include a fast matrix-matrix multiplication algorithm, a new approach to parallel band reduction and tridiagonalization, and a harness for coordinating the divide-and-conquer parallelism in the problem. We present performance results for the dominant kernel, dense matrix multiplication, as well as for the overall SYISDA implementation on the Intel Touchstone Delta and the Intel Paragon. 1. Introduction Computation of eigenvalues and eigenvectors is an essential kernel in many applications, and several promising parallel algorithms have been investigated [26, 3, 28, 22, 25, 6]. The work presented in t...
Sparse Multifrontal Rank Revealing QR Factorization
SIAM J. Matrix Anal. Appl., 1995
Cited by 14 (0 self)
We describe an algorithm to compute a rank revealing sparse QR factorization. We augment a basic sparse multifrontal QR factorization with an incremental condition estimator to provide an estimate of the least singular value and vector for each successive column of R. We remove a column from R as soon as the condition estimate exceeds a tolerance, using the approximate singular vector to select a suitable column. Removing columns, or pivoting, requires a dynamic data structure and necessarily degrades sparsity. But most of the additional work fits naturally into the multifrontal factorization's use of efficient dense vector kernels, minimizing overall cost. Further, pivoting as soon as possible reduces the cost of pivot selection and data access. We present a theoretical analysis that shows that our use of approximate singular vectors does not degrade the quality of our rank-revealing factorization; we achieve an exponential bound like methods that use exact singular vectors. We prov...
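To make the idea of an incremental condition estimator concrete, here is a minimal dense sketch of one common formulation. It is not the sparse multifrontal algorithm of the paper. It assumes a lower triangular matrix that grows one row at a time and a unit vector x with ||L x|| = sest. The names `ice_step` and `estimate_smallest_singular_value` are hypothetical.

```python
import math

def ice_step(sest, alpha, gamma):
    """One bordering step for Lhat = [[L, 0], [w^T, gamma]], alpha = w^T x.

    Returns (sestpr, s, c) so that xhat = [s*x, c] is a unit vector that
    minimizes ||Lhat xhat|| within this two-parameter family.
    """
    # ||Lhat xhat||^2 = [s c] M [s c]^T with M = [[a, b], [b, d]].
    a = sest * sest + alpha * alpha
    b = alpha * gamma
    d = gamma * gamma
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    if mean + radius == 0.0:
        return 0.0, 0.0, 1.0
    # det(M) = (sest*gamma)^2; dividing by the largest eigenvalue gives the
    # smallest one without subtractive cancellation.
    lam = (sest * gamma) ** 2 / (mean + radius)
    # Two algebraically equivalent eigenvectors; keep the better scaled one.
    s, c = max((b, lam - a), (lam - d, b), key=lambda v: math.hypot(*v))
    norm = math.hypot(s, c)
    if norm == 0.0:  # M is a multiple of the identity
        return math.sqrt(lam), 0.0, 1.0
    return math.sqrt(lam), s / norm, c / norm

def estimate_smallest_singular_value(lower):
    """Process a lower triangular matrix (list of rows) one row at a time."""
    x = [1.0]
    sest = abs(lower[0][0])
    for j in range(1, len(lower)):
        w, gamma = lower[j][:j], lower[j][j]
        alpha = sum(wi * xi for wi, xi in zip(w, x))
        sest, s, c = ice_step(sest, alpha, gamma)
        x = [s * xi for xi in x] + [c]
    return sest, x
```

Because x is a unit vector, the returned estimate is an upper bound on the true smallest singular value. Each step costs one inner product plus a 2-by-2 eigenproblem, which is what makes the estimate cheap to update column by column.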
A Parallel Implementation of the Invariant Subspace Decomposition Algorithm for Dense Symmetric Matrices
1993
Cited by 12 (2 self)
We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of SYISDA on the Intel Delta. Our implementation utilizes an optimized parallel matrix multiplication implementation we have developed. Load balancing in the costly early stages of the algorithm is accomplished without redistribution of data between stages through the use of the block scattered decomposition. Computation of the invariant subspaces at each stage is done using a new tridiagonalization scheme due to Bischof and Sun. 1. Introduction Computation of all the eigenvalues and eigenvectors of a dense symmetric matrix is an essential kernel in many applications. The ever-increasing computational power available from parallel computers offers the potential for solving much larger problems than could have been contemplated previously. Hardware scalability of parallel machines is freque...
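The load-balancing remark about the block scattered decomposition can be illustrated with a short sketch. It assumes the usual reading of the term: square blocks dealt out cyclically over a two-dimensional process grid, also known as a block-cyclic layout. The function names and parameters are hypothetical and do not come from the SYISDA code.

```python
def owner(i, j, block, prow, pcol):
    """Grid coordinates of the process that owns matrix entry (i, j), 0-based."""
    return (i // block) % prow, (j // block) % pcol

def entries_per_process(n, block, prow, pcol, start=0):
    """Count entries of the trailing submatrix A[start:n, start:n] per process."""
    counts = {(p, q): 0 for p in range(prow) for q in range(pcol)}
    for i in range(start, n):
        for j in range(start, n):
            counts[owner(i, j, block, prow, pcol)] += 1
    return counts
```

For an 8-by-8 matrix with 2-by-2 blocks on a 2-by-2 grid, every process owns 16 entries of the full matrix and 4 entries of the trailing 4-by-4 submatrix, so work on a shrinking subproblem stays spread over all processes without moving data.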
A BLAS-3 version of the QR factorization with column pivoting
SIAM J. Sci. Comput., 1995
Cited by 12 (2 self)
The QR factorization with column pivoting (QRP), originally suggested by Golub and Businger in 1965, is a popular approach to computing rank-revealing factorizations. Using BLAS Level 1, it was implemented in LINPACK, and, using BLAS Level 2, in LAPACK. While the BLAS Level 2 version delivers, in general, superior performance, it may result in worse performance for large matrix sizes due to cache effects. We introduce a modification of the QRP algorithm which allows the use of BLAS Level 3 kernels while maintaining the numerical behavior of the LINPACK and LAPACK implementations. Experimental comparisons of this approach with the LINPACK and LAPACK implementations on IBM RS/6000, SGI R8000, and DEC Alpha platforms show considerable performance improvements.
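For readers unfamiliar with QRP, the following is a minimal sketch of the classical column-by-column algorithm using Householder reflections. It is not the BLAS Level 3 variant introduced in the paper, and it recomputes column norms at every step instead of updating them, which is an assumption made to keep the sketch short. The function name is hypothetical.

```python
import math

def qr_column_pivoting(a):
    """Householder QR with column pivoting for a dense matrix (list of rows).

    Returns (r, perm): r is the upper triangular factor and perm[k] is the
    original index of the column moved to position k, so A[:, perm] = Q R.
    """
    m, n = len(a), len(a[0])
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    perm = list(range(n))
    for k in range(min(m, n)):
        # Pivot: move the remaining column with the largest trailing norm to k.
        norms = [math.sqrt(sum(v * v for v in cols[j][k:])) for j in range(k, n)]
        best = max(range(len(norms)), key=norms.__getitem__)
        p = k + best
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        if norms[best] == 0.0:
            break  # all remaining columns are zero
        # Householder vector v = x - alpha*e1 with alpha = -sign(x0)*||x||.
        x = cols[k][k:]
        alpha = -math.copysign(norms[best], x[0])
        v = x[:]
        v[0] -= alpha
        vtv = sum(t * t for t in v)
        # Apply H = I - 2 v v^T / (v^T v) to the trailing columns.
        for j in range(k + 1, n):
            tail = cols[j][k:]
            scale = 2.0 * sum(vi * ti for vi, ti in zip(v, tail)) / vtv
            cols[j][k:] = [ti - scale * vi for vi, ti in zip(v, tail)]
        # The pivot column becomes alpha*e1 exactly.
        cols[k][k:] = [alpha] + [0.0] * (m - k - 1)
    r = [[cols[j][i] for j in range(n)] for i in range(m)]
    return r, perm
```

The magnitudes of the diagonal entries of `r` are non-increasing, and a sharp drop suggests numerical rank deficiency. Each step touches one column at a time, which is the access pattern that limits cache reuse in the Level 1 and Level 2 formulations.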
The PRISM Project: Infrastructure and Algorithms for Parallel Eigensolvers
1994
Cited by 10 (6 self)
The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. We are currently investigating a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). After briefly reviewing SYISDA, we discuss the algorithmic highlights of a distributed-memory implementation of this approach. These include a fast matrix-matrix multiplication algorithm, a new approach to parallel band reduction and tridiagonalization, and a harness for coordinating the divide-and-conquer parallelism in the problem. We also present performance results of these kernels as well as the overall SYISDA implementation on the Intel Touchstone Delta prototype. 1. Introduction Computation of eigenvalues and eigenvectors is an essential kernel in many applications, and several promising parallel algorithms have been investigated [29, 24, 3, 27, 21]. The work presented in this paper is part of the PRI...
On Orthogonal Block Elimination
Cited by 1 (1 self)
We consider the block elimination problem $Q \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} = \begin{pmatrix} -C \\ 0 \end{pmatrix}$, where, given a matrix $A \in \mathbb{R}^{m \times k}$, $A_1 \in \mathbb{R}^{k \times k}$, we try to find a matrix $C$ with $C^T C = A^T A$ and an orthogonal matrix $Q$ that eliminates $A_2$. Sun and Bischof recently showed that any orthogonal matrix can be represented in the so-called basis-kernel representation $Q = Q(Y, S) = I - Y S Y^T$. Applying this framework to the block elimination problem, we show that there is considerable freedom in solving the block elimination problem and that, depending on $A$ and $C$, we can find $Y \in \mathbb{R}^{m \times r}$, $S \in \mathbb{R}^{r \times r}$, where $r$ is between $\operatorname{rank}(A_2)$ and $k$, to solve the block elimination problem. We then introduce the canonical basis $Y = \begin{pmatrix} A_1 + C \\ A_2 \end{pmatrix}$ and the canonical kernel $S = (A_1 + C)^{\dagger} C^{-T}$, which can be determined easily once $C$ has been computed, and relate this view to previously suggested approaches for computing block orthogonal matrices. We also show that th...
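For a single column (k = 1) the canonical basis and kernel described in this abstract reduce to a Householder reflection, which gives a quick check of the formulas. The sketch below assumes that special case only: C is the scalar c with c^2 = a^T a, and its sign is chosen to match the first entry of a so that a_1 + c is nonzero. The function names are hypothetical.

```python
import math

def canonical_basis_kernel(a):
    """Return (y, s, c) for a nonzero vector a, with Q = I - s * y y^T."""
    norm = math.sqrt(sum(t * t for t in a))
    if norm == 0.0:
        raise ValueError("a must be nonzero")
    c = math.copysign(norm, a[0])      # c^2 = a^T a, same sign as a[0]
    y = [a[0] + c] + list(a[1:])       # canonical basis (a_1 + c, a_2)
    s = 1.0 / ((a[0] + c) * c)         # canonical kernel for k = 1
    return y, s, c

def apply_q(y, s, x):
    """Compute (I - s * y y^T) x without forming Q."""
    dot = sum(yi * xi for yi, xi in zip(y, x))
    return [xi - s * dot * yi for yi, xi in zip(y, x)]
```

For a = (3, 4) this gives c = 5, y = (8, 4), s = 1/40, and applying Q to a yields (-5, 0) up to rounding, matching the right-hand side (-C, 0) of the block elimination problem.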

