Results 1–10 of 13
Applied Numerical Linear Algebra
 Society for Industrial and Applied Mathematics
, 1997
Cited by 532 (26 self)
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then, we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, and the singular value decomposition. We consider dense, band, and sparse matrices.
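The matrix-multiplication illustration the survey mentions rests on blocking: partition the matrices so that each block of the product can be computed from a small set of block-level multiplies, which is what makes the operation easy to distribute. A minimal serial sketch of that idea (this is illustrative only, not the book's parallel implementation; the block size `bs` is an arbitrary choice):

```python
import numpy as np

def blocked_matmul(A, B, bs=2):
    """Multiply square matrices block by block. On a parallel machine each
    (i, j) block of C could be assigned to a different processor; here we
    simply loop over the blocks serially."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(0, n, bs):
        for j in range(0, n, bs):
            for k in range(0, n, bs):
                C[i:i+bs, j:j+bs] += A[i:i+bs, k:k+bs] @ B[k:k+bs, j:j+bs]
    return C

rng = np.random.default_rng(0)
A, B = rng.standard_normal((6, 6)), rng.standard_normal((6, 6))
print(np.allclose(blocked_matmul(A, B), A @ B))  # True
```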
Solving A Polynomial Equation: Some History And Recent Progress
, 1997
Cited by 85 (16 self)
The classical problem of solving an nth degree polynomial equation has substantially influenced the development of mathematics throughout the centuries and still has several important applications to the theory and practice of present-day computing. We briefly recall the history of the algorithmic approach to this problem and then review some successful solution algorithms. We end by outlining some algorithms of 1995 that solve this problem at a surprisingly low computational cost.
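One standard numerical approach to this problem, which connects polynomial root-finding back to the eigenvalue algorithms discussed in the other entries, is to compute the roots as eigenvalues of the polynomial's companion matrix (this is what `numpy.roots` does internally; it is a generic technique, not one of the 1995 algorithms the abstract alludes to):

```python
import numpy as np

def companion_roots(a):
    """Roots of the monic polynomial x^n + a[n-1]*x^(n-1) + ... + a[0],
    computed as the eigenvalues of its companion matrix."""
    n = len(a)
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)     # ones on the subdiagonal
    C[:, -1] = -np.asarray(a)      # last column carries the coefficients
    return np.linalg.eigvals(C)

# x^2 - 5x + 6 = (x - 2)(x - 3)
print(np.sort(companion_roots([6.0, -5.0]).real))  # [2. 3.]
```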
Design of a Parallel Nonsymmetric Eigenroutine Toolbox, Part I
, 1993
Cited by 63 (14 self)
The dense nonsymmetric eigenproblem is one of the hardest linear algebra problems to solve effectively on massively parallel machines. Rather than trying to design a "black box" eigenroutine in the spirit of EISPACK or LAPACK, we propose building a toolbox for this problem. The tools are meant to be used in different combinations on different problems and architectures. In this paper, we will describe these tools, which include basic block matrix computations, the matrix sign function, two-dimensional bisection, and spectral divide-and-conquer using the matrix sign function to find selected eigenvalues. We also outline how we deal with ill-conditioning and potential instability. Numerical examples are included. A future paper will discuss error analysis in detail and extensions to the generalized eigenproblem.
A Parallel Implementation of the Nonsymmetric QR Algorithm for Distributed Memory Architectures
 SIAM J. Sci. Comput
, 2002
Cited by 36 (3 self)
One approach to solving the nonsymmetric eigenvalue problem in parallel is to parallelize the QR algorithm. Not long ago, this was widely considered to be a hopeless task. Recent efforts have led to significant advances, although the methods proposed up to now have suffered from scalability problems. This paper discusses an approach to parallelizing the QR algorithm that greatly improves scalability. A theoretical analysis indicates that the algorithm is ultimately not scalable, but the nonscalability does not become evident until the matrix dimension is enormous. Experiments on the Intel Paragon system, the IBM SP2 supercomputer, the SGI Origin 2000, and the Intel ASCI Option Red supercomputer are reported.
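For reference, the serial kernel being parallelized is the QR iteration: factor T = QR, form RQ, and repeat, which drives the matrix toward triangular form with the eigenvalues on the diagonal. A toy unshifted version (production codes, including DLAHQR, first reduce to Hessenberg form and use shifts for speed):

```python
import numpy as np

def qr_iteration(A, iters=200):
    """Unshifted QR iteration: with T_k = Q_k R_k, set T_{k+1} = R_k Q_k.
    Each T_k is similar to A; for real eigenvalues of distinct magnitude,
    T_k converges to upper triangular form, eigenvalues on the diagonal."""
    T = A.astype(float).copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(T)
        T = R @ Q
    return T

A = np.array([[3.0, 1.0],
              [2.0, 2.0]])    # trace 5, det 4 -> eigenvalues 4 and 1
T = qr_iteration(A)
print(np.round(np.diag(T)))   # [4. 1.]
```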
The spectral decomposition of nonsymmetric matrices on distributed memory parallel computers
 SIAM J. Sci. Comput
, 1997
Cited by 31 (11 self)
The implementation and performance of a class of divide-and-conquer algorithms for computing the spectral decomposition of nonsymmetric matrices on distributed memory parallel computers are studied in this paper. After presenting a general framework, we focus on a spectral divide-and-conquer (SDC) algorithm with Newton iteration. Although the algorithm requires several times as many floating point operations as the best serial QR algorithm, it can be simply constructed from a small set of highly parallelizable matrix building blocks within Level 3 basic linear algebra subroutines (BLAS). Efficient implementations of these building blocks are available on a wide range of machines. In some ill-conditioned cases, the algorithm may lose numerical stability, but this can easily be detected and compensated for. The algorithm reached 31% efficiency with respect to the underlying PUMMA matrix multiplication and 82% efficiency with respect to the underlying ScaLAPACK matrix inversion on a 256-processor Intel Touchstone Delta system, and 41% efficiency with respect to the matrix multiplication in CMSSL on a 32-node Thinking Machines CM-5 with vector units. Our performance model predicts the performance reasonably accurately. To take advantage of the geometric nature of SDC algorithms, we have designed a graphical user interface to let the user choose the spectral decomposition according to specified regions in the complex plane.
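The core SDC step can be sketched in a few lines: a Newton iteration computes the matrix sign function, from which a projector onto the invariant subspace for eigenvalues left of the imaginary axis is formed, and an orthogonal basis for that subspace block-triangularizes the matrix. This is a minimal serial sketch splitting across the imaginary axis; the paper's version is built from parallel BLAS/ScaLAPACK kernels and uses a rank-revealing factorization rather than the SVD used here for simplicity:

```python
import numpy as np

def sdc_split(A, iters=60):
    """One spectral divide-and-conquer step via the matrix sign function:
    block-triangularize A, with the eigenvalues having negative real part
    in the leading k-by-k block."""
    n = A.shape[0]
    S = A.astype(float).copy()
    for _ in range(iters):                 # Newton: S <- (S + S^{-1}) / 2
        S = 0.5 * (S + np.linalg.inv(S))
    P = 0.5 * (np.eye(n) - S)              # projector onto the Re < 0 subspace
    k = int(round(np.trace(P)))            # rank = # eigenvalues with Re < 0
    U, _, _ = np.linalg.svd(P)             # U[:, :k] spans the invariant subspace
    B = U.T @ A @ U
    return B, k                            # B[k:, :k] ~ 0

D = np.diag([-2.0, -1.0, 3.0, 4.0])
T = np.tril(np.ones((4, 4)))               # unit lower triangular, invertible
A = T @ D @ np.linalg.inv(T)               # eigenvalues -2, -1, 3, 4
B, k = sdc_split(A)
print(k, np.allclose(B[k:, :k], 0, atol=1e-6))  # 2 True
```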
Trading off Parallelism and Numerical Stability
, 1992
Cited by 12 (5 self)
The fastest parallel algorithm for a problem may be significantly less stable numerically than the fastest serial algorithm. We illustrate this phenomenon by a series of examples drawn from numerical linear algebra. We also show how some of these instabilities may be mitigated by better floating point arithmetic.
Progress in the numerical solution of the nonsymmetric eigenvalue problem
, 1993
Cited by 11 (1 self)
With the growing demands from disciplinary and interdisciplinary fields of science and engineering for the numerical solution of the nonsymmetric eigenvalue problem, competitive new techniques have been developed for solving the problem. In this paper we examine the state of the art of the algorithmic techniques and the software scene for the problem. Some current developments are also outlined. Key words: nonsymmetric matrices; sparse matrices; eigenvalue problem; EISPACK; LAPACK.
Homotopy Method For The Large Sparse Real Nonsymmetric Eigenvalue Problem
, 1996
Cited by 10 (0 self)
A homotopy method to compute the eigenpairs, i.e., the eigenvectors and eigenvalues, of a given real matrix A1 is presented. From the eigenpairs of some real matrix A0, the eigenpairs of A(t) = (1 - t)A0 + tA1 are followed at successive "times" from t = 0 to t = 1 using continuation. At t = 1, the eigenpairs of the desired matrix A1 are found. The following phenomena are present when following the eigenpairs of a general nonsymmetric matrix: bifurcation, ill-conditioning due to nonorthogonal eigenvectors, and jumping of eigenpaths. These can present considerable computational difficulties. Since each eigenpair can be followed independently, this algorithm is ideal for concurrent computers. The homotopy method has the potential to compete with other algorithms for computing a few eigenvalues of large sparse matrices. It may be a useful tool for determining the stability of a solution of a PDE. Some numerical results will be presented. Key words: eigenvalues, homo...
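The continuation idea above can be sketched by stepping t from 0 to 1 and matching each eigenvalue of A(t) to the nearest one from the previous step. This toy version recomputes the full spectrum at each step and ignores the bifurcation and path-jumping issues the abstract warns about; the actual method follows individual eigenpairs with predictor-corrector steps:

```python
import numpy as np

def follow_eigenvalues(A0, A1, steps=50):
    """Follow the eigenvalues of A(t) = (1 - t)*A0 + t*A1 from t = 0 to
    t = 1, greedily matching each eigenvalue at step k+1 to the nearest
    eigenvalue at step k (naive continuation, for illustration only)."""
    path = np.linalg.eigvals(A0)
    for t in np.linspace(0.0, 1.0, steps + 1)[1:]:
        current = list(np.linalg.eigvals((1.0 - t) * A0 + t * A1))
        nxt = []
        for prev in path:
            j = int(np.argmin([abs(c - prev) for c in current]))
            nxt.append(current.pop(j))
        path = np.array(nxt)
    return path            # eigenvalues of A1, ordered along the paths

A0 = np.diag([1.0, 2.0, 3.0])          # trivial starting matrix
A1 = np.array([[2.0, 1.0, 0.0],
               [0.0, 3.0, 1.0],
               [1.0, 0.0, 4.0]])
ev = follow_eigenvalues(A0, A1)
print(np.allclose(np.sort_complex(ev), np.sort_complex(np.linalg.eigvals(A1))))
```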
Continuation Methods For The Computation Of Zeros Of Szegő Polynomials
, 1995
Cited by 10 (4 self)
Let {φ_j}, j = 0, 1, 2, ..., be a family of monic polynomials that are orthogonal with respect to an inner product on the unit circle. The polynomials φ_j arise in time series analysis and are often referred to as Szegő polynomials or Levinson polynomials. Knowledge about the location of their zeros is important for frequency analysis of time series and for filter implementation. We present fast algorithms for computing the zeros of the polynomials φ_n based on the observation that the zeros are eigenvalues of a rank-one modification of a unitary upper Hessenberg matrix H_n(0) of order n. The algorithms first determine the spectrum of H_n(0) by one of several available schemes that require only O(n^2) arithmetic operations. The eigenvalues of the rank-one perturbation are then determined from the eigenvalues of H_n(0) by a continuation method. The computation of the n zeros of φ_n in this manner typically requires only O(n^2) arithmetic operations. The algorithms have a structure that ...
Parallelizing the QR Algorithm for the Unsymmetric Algebraic Eigenvalue Problem: Myths and Reality
 SIAM J. Sci. Comput
, 1996
Cited by 7 (2 self)
Over the last few years, it has been suggested that the popular QR algorithm for the unsymmetric eigenvalue problem does not parallelize. In this paper, we present both positive and negative results on this subject: In theory, asymptotically perfect speedup can be obtained. In practice, reasonable speedup can be obtained on a MIMD distributed memory computer for a relatively small number of processors. However, we also show theoretically that it is impossible for the standard QR algorithm to be scalable. Performance of a parallel implementation of the LAPACK DLAHQR routine on the Intel Paragon system is reported.