Results 1  10
of
13
Computing the Singular Value Decomposition with High Relative Accuracy
 Linear Algebra Appl
, 1997
"... We analyze when it is possible to compute the singular values and singular vectors of a matrix with high relative accuracy. This means that each computed singular value is guaranteed to have some correct digits, even if the singular values have widely varying magnitudes. This is in contrast to the a ..."
Abstract

Cited by 55 (12 self)
 Add to MetaCart
We analyze when it is possible to compute the singular values and singular vectors of a matrix with high relative accuracy. This means that each computed singular value is guaranteed to have some correct digits, even if the singular values have widely varying magnitudes. This is in contrast to the absolute accuracy provided by conventional backward stable algorithms, whichin general only guarantee correct digits in the singular values with large enough magnitudes. It is of interest to compute the tiniest singular values with several correct digits, because in some cases, such as #nite element problems and quantum mechanics, it is the smallest singular values that havephysical meaning, and should be determined accurately by the data. Many recent papers have identi#ed special classes of matrices where high relative accuracy is possible, since it is not possible in general. The perturbation theory and algorithms for these matrix classes have been quite di#erent, motivating us to seek a co...
Computing RankRevealing QR Factorizations of Dense Matrices
 Argonne Preprint ANLMCSP5590196, Argonne National Laboratory
, 1996
"... this paper, and we give only a brief synopsis here. For details, the reader is referred to the code. Test matrices 1 through 5 were designed to exercise column pivoting. Matrix 6 was designed to test the behavior of the condition estimation in the presence of clusters for the smallest singular value ..."
Abstract

Cited by 26 (1 self)
 Add to MetaCart
this paper, and we give only a brief synopsis here. For details, the reader is referred to the code. Test matrices 1 through 5 were designed to exercise column pivoting. Matrix 6 was designed to test the behavior of the condition estimation in the presence of clusters for the smallest singular value. For the other cases, we employed the LAPACK matrix generator xLATMS, which generates random symmetric matrices by multiplying a diagonal matrix with prescribed singular values by random orthogonal matrices from the left and right. For the break1 distribution, all singular values are 1.0 except for one. In the arithmetic and geometric distributions, they decay from 1.0 to a specified smallest singular value in an arithmetic and geometric fashion, respectively. In the "reversed" distributions, the order of the diagonal entries was reversed. For test cases 7 though 12, we used xLATMS to generate a matrix of order
Fast linear algebra is stable
 In preparation
, 2006
"... In [23] we showed that a large class of fast recursive matrix multiplication algorithms is stable in a normwise sense, and that in fact if multiplication of nbyn matrices can be done by any algorithm in O(n ω+η) operations for any η> 0, then it can be done stably in O(n ω+η) operations for any η> ..."
Abstract

Cited by 25 (15 self)
 Add to MetaCart
In [23] we showed that a large class of fast recursive matrix multiplication algorithms is stable in a normwise sense, and that in fact if multiplication of nbyn matrices can be done by any algorithm in O(n ω+η) operations for any η> 0, then it can be done stably in O(n ω+η) operations for any η> 0. Here we extend this result to show that essentially all standard linear algebra operations, including LU decomposition, QR decomposition, linear equation solving, matrix inversion, solving least squares problems, (generalized) eigenvalue problems and the singular value decomposition can also be done stably (in a normwise sense) in O(n ω+η) operations. 1
A BLAS3 version of the QR factorization with column pivoting
 SIAM J. SCI. COMPUT
, 1995
"... The QR factorization with column pivoting (QRP), originally suggested by Golub and Businger in 1965, is a popular approach to computing rankrevealing factorizations. Using BLAS Level 1, it was implemented in LINPACK, and, using BLAS Level 2, in LAPACK. While the BLAS Level2version delivers, in gen ..."
Abstract

Cited by 17 (2 self)
 Add to MetaCart
The QR factorization with column pivoting (QRP), originally suggested by Golub and Businger in 1965, is a popular approach to computing rankrevealing factorizations. Using BLAS Level 1, it was implemented in LINPACK, and, using BLAS Level 2, in LAPACK. While the BLAS Level2version delivers, in general, superior performance, it may result in worse performance for large matrix sizes due to cache e ects. We introduce a modi cation of the QRP algorithm which allows the use of BLAS Level 3 kernels while maintaining the numerical behavior of the LINPACK and LAPACK implementations. Experimental comparisons of this approach with the LINPACK and LAPACK implementations on IBM RS/6000, SGI R8000, and DEC Alpha platforms show considerable performance improvements.
Parallel Performance of a Symmetric Eigensolver based on the Invariant Subspace Decomposition Approach
, 1994
"... In this paper, we discuss work in progress on a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). We describe a recently developed acceleration technique that substantially reduces the overall work required by this algorithm and revie ..."
Abstract

Cited by 15 (0 self)
 Add to MetaCart
In this paper, we discuss work in progress on a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). We describe a recently developed acceleration technique that substantially reduces the overall work required by this algorithm and review the algorithmic highlights of a distributedmemory implementation of this approach. These include a fast matrixmatrix multiplication algorithm, a new approach to parallel band reduction and tridiagonalization, and a harness for coordinating the divideandconquer parallelism in the problem. We present performance results for the dominant kernel, dense matrix multiplication, as well as for the overall SYISDA implementation on the Intel Touchstone Delta and the Intel Paragon. 1. Introduction Computation of eigenvalues and eigenvectors is an essential kernel in many applications, and several promising parallel algorithms have been investigated [26, 3, 28, 22, 25, 6]. The work presented in t...
A Parallel Implementation of the Invariant Subspace Decomposition Algorithm for Dense Symmetric Matrices
, 1993
"... . We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of SYISDA on the Intel Delta. Our implementation utilizes an optimized parallel matrix multiplication ..."
Abstract

Cited by 12 (2 self)
 Add to MetaCart
. We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of SYISDA on the Intel Delta. Our implementation utilizes an optimized parallel matrix multiplication implementation we have developed. Load balancing in the costly early stages of the algorithm is accomplished without redistribution of data between stages through the use of the block scattered decomposition. Computation of the invariant subspaces at each stage is done using a new tridiagonalization scheme due to Bischof and Sun. 1. Introduction Computation of all the eigenvalues and eigenvectors of a dense symmetric matrix is an essential kernel in many applications. The everincreasing computational power available from parallel computers offers the potential for solving much larger problems than could have been contemplated previously. Hardware scalability of parallel machines is freque...
The PRISM Project: Infrastructure and Algorithms for Parallel Eigensolvers
, 1994
"... The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. We are currently investigating a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). After briefly revie ..."
Abstract

Cited by 12 (6 self)
 Add to MetaCart
The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. We are currently investigating a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). After briefly reviewing SYISDA, we discuss the algorithmic highlights of a distributedmemory implementation of this approach. These include a fast matrixmatrix multiplication algorithm, a new approach to parallel band reduction and tridiagonalization, and a harness for coordinating the divideandconquer parallelism in the problem. We also present performance results of these kernels as well as the overall SYISDA implementation on the Intel Touchstone Delta prototype. 1. Introduction Computation of eigenvalues and eigenvectors is an essential kernel in many applications, and several promising parallel algorithms have been investigated [29, 24, 3, 27, 21]. The work presented in this paper is part of the PRI...
Accurate SVDs of structured matrices
 SIAM J. Mat. Anal. Appl
, 1999
"... We present new O(n 3) algorithms to compute very accurate SVDs of Cauchy matrices, Vandermonde matrices, and related \unitdisplacementrank " matrices. These algorithms compute all the singular values with guaranteed relative accuracy, independent of their dynamic range. In contrast, previous O(n 3 ..."
Abstract

Cited by 8 (4 self)
 Add to MetaCart
We present new O(n 3) algorithms to compute very accurate SVDs of Cauchy matrices, Vandermonde matrices, and related \unitdisplacementrank " matrices. These algorithms compute all the singular values with guaranteed relative accuracy, independent of their dynamic range. In contrast, previous O(n 3) algorithms can potentially lose all relative accuracy in the tiniest singular values.
Computing Symmetric RankRevealing Decompositions Via Triangular Factorization
 SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS
, 2001
"... We present a family of algorithms for computing symmetric rankrevealing VSV decompositions, based on triangular factorization of the matrix. The VSV decomposition consists of a middle symmetric matrix that reveals the numerical rank in having three blocks with small norm, plus an orthogonal matrix ..."
Abstract

Cited by 7 (2 self)
 Add to MetaCart
We present a family of algorithms for computing symmetric rankrevealing VSV decompositions, based on triangular factorization of the matrix. The VSV decomposition consists of a middle symmetric matrix that reveals the numerical rank in having three blocks with small norm, plus an orthogonal matrix whose columns span approximations to the numerical range and null space. We show that for semidefinite matrices the VSV decomposition should be computed via the ULV decomposition, while for indefinite matrices it must be computed via a URVlike decomposition that involves hypernormal rotations.