## Parallel ScaLAPACK-style Algorithms for Solving Continuous-Time Sylvester Equations (2003)

Venue: | In Euro-Par 2003 Parallel Processing, H. Kosch et al., Eds. Lecture Notes in Computer Science |

Citations: | 8 - 7 self |

### BibTeX

@INPROCEEDINGS{Granat03parallelscalapack-style,

author = {Robert Granat and Peter Poromaa},

title = {Parallel ScaLAPACK-style Algorithms for Solving Continuous-Time Sylvester Equations},

booktitle = {Euro-Par 2003 Parallel Processing, H. Kosch et al., Eds. Lecture Notes in Computer Science},

year = {2003},

pages = {800--809},

publisher = {Springer}

}

### Abstract

An implementation of a parallel ScaLAPACK-style solver for the general Sylvester equation, op(A)X − X op(B) = C, where op(A) denotes A or its transpose A^T, is presented. The parallel algorithm is based on explicit blocking of the Bartels–Stewart method. An initial transformation of the coefficient matrices A and B to Schur form leads to a reduced triangular matrix equation. We use different matrix traversing strategies to handle the transposes in the problem to solve, leading to different new parallel wave-front algorithms. We also present a strategy to handle the problem when 2 × 2 diagonal blocks of the matrices in Schur form, corresponding to complex conjugate pairs of eigenvalues, are split between several blocks in the block partitioned matrices. Finally, the solution of the reduced matrix equation is transformed back to the original coordinate system. The implementation acts in a ScaLAPACK environment using 2-dimensional block cyclic mapping of the matrices onto a rectangular grid of processes. Real performance results are presented which verify that our parallel algorithms are reliable and scalable.

Keywords: Sylvester matrix equation, continuous-time, Bartels–Stewart
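As a rough illustration of the reduced triangular equation mentioned in the abstract, the following Python sketch solves AX − XB = C when A and B are already upper triangular with only 1 × 1 diagonal blocks (real eigenvalues). It is an unblocked, serial toy version and not the authors' parallel solver; the function names `solve_triangular_sylvester` and `residual` and the sample matrices are assumptions made for this example.

```python
def solve_triangular_sylvester(A, B, C):
    """Solve A X - X B = C for upper triangular A (M x M) and B (N x N).

    Hypothetical helper: handles only 1 x 1 diagonal blocks, no transposes.
    """
    M, N = len(A), len(B)
    X = [[0.0] * N for _ in range(M)]
    for j in range(N):                      # columns, left to right
        for i in range(M - 1, -1, -1):      # rows, bottom to top
            rhs = C[i][j]
            # Move already computed entries of column j (below row i) to the right-hand side.
            rhs -= sum(A[i][k] * X[k][j] for k in range(i + 1, M))
            # Move already computed entries of row i (left of column j) to the right-hand side.
            rhs += sum(X[i][k] * B[k][j] for k in range(j))
            denom = A[i][i] - B[j][j]       # zero only if A and B share an eigenvalue
            if denom == 0.0:
                raise ZeroDivisionError("A and B share an eigenvalue")
            X[i][j] = rhs / denom
    return X


def residual(A, B, C, X):
    """Largest absolute entry of A X - X B - C."""
    M, N = len(A), len(B)
    worst = 0.0
    for i in range(M):
        for j in range(N):
            ax = sum(A[i][k] * X[k][j] for k in range(M))
            xb = sum(X[i][k] * B[k][j] for k in range(N))
            worst = max(worst, abs(ax - xb - C[i][j]))
    return worst


# Illustrative data: eigenvalues {4, 5} of A and {1, 2, 3} of B are disjoint.
A = [[4.0, 1.0], [0.0, 5.0]]
B = [[1.0, 2.0, 0.0], [0.0, 2.0, 1.0], [0.0, 0.0, 3.0]]
C = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
X = solve_triangular_sylvester(A, B, C)
```

Calling `residual(A, B, C, X)` afterwards gives a quick check that the computed X satisfies the equation up to rounding.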

### Citations

789 | A set of level-3 basic linear algebra subprograms
- Dongarra, Croz, et al.
- 1990
Citation Context ...carry out Step 1 we use the QR-algorithm [2]. The updates in Step 2 and the back-transformation in Step 4 are carried out using ordinary GEMM-operations C ← βC + α op(A) op(B), where α and β are scalars [5, 13, 14]. Our focus is on Step 3. Using the Kronecker product notation, ⊗, we can rewrite the triangular Sylvester equation as a linear system of equations Zx = y (2), where Z = I_N ⊗ op(A) − op(B)^T ⊗ I_M is a ma... |
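The Kronecker form Zx = y quoted in this context can be checked on a tiny example. The Python sketch below assumes the non-transposed case op(A) = A, op(B) = B and column-major stacking for vec; the helper names `kronecker_sylvester_matrix`, `vec`, and `matvec` and the sample matrices are made up for illustration.

```python
def kronecker_sylvester_matrix(A, B):
    """Build Z = I_N (x) A - B^T (x) I_M for A (M x M) and B (N x N)."""
    M, N = len(A), len(B)
    Z = [[0] * (M * N) for _ in range(M * N)]
    for j in range(N):
        for i in range(M):
            row = i + j * M                  # position of X[i][j] in vec(X)
            for k in range(M):               # I_N (x) A contributes A[i][k] * X[k][j]
                Z[row][k + j * M] += A[i][k]
            for l in range(N):               # B^T (x) I_M contributes X[i][l] * B[l][j]
                Z[row][i + l * M] -= B[l][j]
    return Z


def vec(X):
    """Stack the columns of X into one list (column-major order)."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]


def matvec(Z, x):
    """Plain matrix-vector product."""
    return [sum(z * v for z, v in zip(row, x)) for row in Z]


# Illustrative integer data so that the comparison is exact.
A = [[1, 2], [0, 3]]
B = [[5, 1], [0, 7]]
X = [[1, 2], [3, 4]]
# Form C = A X - X B directly, then compare vec(C) with Z vec(X).
C = [[sum(A[i][k] * X[k][j] for k in range(2))
      - sum(X[i][k] * B[k][j] for k in range(2))
      for j in range(2)] for i in range(2)]
Z = kronecker_sylvester_matrix(A, B)
```

Here `matvec(Z, vec(X))` equals `vec(C)`. Since Z has MN rows and columns, it is only practical for small subsystems, which fits the use in small kernels mentioned in the contexts further down.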

381 | ScaLAPACK Users’ Guide
- Blackford, Choi, et al.
- 1997
Citation Context ...nally, in Section 4, we present experimental results and discuss the performance of our general ScaLAPACK-style solver. Our parallel implementations mainly adopt the ScaLAPACK software conventions [3]. The P processors (or virtual processes) are viewed as a rectangular processor grid Pr × Pc, with Pr ≥ 1 processor rows and Pc ≥ 1 processor columns such that P = Pr · Pc. The data layout of dense mat... |
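To make the Pr × Pc grid and the block cyclic layout concrete, here is a minimal Python sketch of the usual two-dimensional block cyclic index mapping. It assumes 0-based global indices and that the first block lives on process row 0 and process column 0; the function names `owner_and_local` and `block_cyclic_2d` are hypothetical and not ScaLAPACK routines.

```python
def owner_and_local(g, nb, nprocs):
    """Map a 0-based global index g to (owning process, local index).

    Blocks of nb consecutive indices are dealt out round-robin to nprocs
    processes, starting with process 0 (an assumption of this sketch).
    """
    block = g // nb
    return block % nprocs, (block // nprocs) * nb + g % nb


def block_cyclic_2d(i, j, mb, nb, pr, pc):
    """Locate global entry (i, j) on a pr x pc grid with mb x nb blocks."""
    prow, li = owner_and_local(i, mb, pr)
    pcol, lj = owner_and_local(j, nb, pc)
    return (prow, pcol), (li, lj)


# Entry (5, 2) with 2 x 2 blocks on a 2 x 2 process grid.
location = block_cyclic_2d(5, 2, 2, 2, 2, 2)
```

For this input the entry ends up on process (0, 1) at local position (3, 0): row 5 lies in row block 2, which wraps around to process row 0 as its second local block.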

91 | GEMM-Based Level 3 BLAS: High-Performance Model Implementations and Performance Evaluation Benchmark
- Kågström, Ling, et al.
- 1995
Citation Context ...carry out Step 1 we use the QR-algorithm [2]. The updates in Step 2 and the back-transformation in Step 4 are carried out using ordinary GEMM-operations C ← βC + α op(A) op(B), where α and β are scalars [5, 13, 14]. Our focus is on Step 3. Using the Kronecker product notation, ⊗, we can rewrite the triangular Sylvester equation as a linear system of equations Zx = y (2), where Z = I_N ⊗ op(A) − op(B)^T ⊗ I_M is a ma... |

49 | Recursive Blocked Algorithms for Solving Triangular Systems: Part I: One-Sided and Coupled Sylvester-type Matrix Equations
- Jonsson, Kågström
- 2002
Citation Context ...a combined backward/forward substitution process [1]. In blocked algorithms, the explicit Kronecker matrix representation Zx = y is used in kernels for solving small-sized matrix equations (e.g., see [11, 12, 15]). The rest of the paper is organized as follows: In Section 2, we give a brief overview of blocked algorithms for solving the triangular SYCT equation. Section 3 is devoted to parallel algorithms focu... |

37 | A parallel implementation of the nonsymmetric QR algorithm for distributed memory architectures. LAPACK Working Note 121
- Henry, Watkins, et al.
- 1997
Citation Context ...nes PDGEHRD, PDLAHQR and PDGEMM [3]. The first two routines are used in Step 1 to compute the Schur decompositions of A and B (reduction to upper Hessenberg form followed by the parallel QR algorithm [9, 8]). PDGEMM is the parallel implementation of the level 3 BLAS DGEMM operation and is used in Steps 2 and 4 for doing the two-sided matrix multiply updates. To carry out Step 3 in parallel, we traverse ... |

34 | Parallelizing the QR algorithm for the unsymmetric algebraic eigenvalue problem: myths and reality
- Henry, van de Geijn
- 1997
Citation Context ...nes PDGEHRD, PDLAHQR and PDGEMM [3]. The first two routines are used in Step 1 to compute the Schur decompositions of A and B (reduction to upper Hessenberg form followed by the parallel QR algorithm [9, 8]). PDGEMM is the parallel implementation of the level 3 BLAS DGEMM operation and is used in Steps 2 and 4 for doing the two-sided matrix multiply updates. To carry out Step 3 in parallel, we traverse ... |

25 | Distributed and shared memory block algorithms for the triangular Sylvester equation with sep^-1 estimators
- Kågström, Poromaa
- 1992
Citation Context ...nt spectra. The Sylvester equation appears naturally in several applications. Examples include block-diagonalization of a matrix in Schur form and condition estimation of eigenvalue problems (e.g., see [15, 10, 16]). Our method for solving SYCT (1) is based on the Bartels–Stewart method [1]: 1. Transform A and B to upper (quasi)triangular form T_A and T_B, respectively, using orthogonal similarity transformations... |

22 | Perturbation theory and backward error for
- Higham
- 1993
Citation Context ...nt spectra. The Sylvester equation appears naturally in several applications. Examples include block-diagonalization of a matrix in Schur form and condition estimation of eigenvalue problems (e.g., see [15, 10, 16]). Our method for solving SYCT (1) is based on the Bartels–Stewart method [1]: 1. Transform A and B to upper (quasi)triangular form T_A and T_B, respectively, using orthogonal similarity transformations... |

14 | Parallel Algorithms for Triangular Sylvester Equations: Design, Scheduling and Scalability Issues
- Poromaa
- 1998
Citation Context ...nt spectra. The Sylvester equation appears naturally in several applications. Examples include block-diagonalization of a matrix in Schur form and condition estimation of eigenvalue problems (e.g., see [15, 10, 16]). Our method for solving SYCT (1) is based on the Bartels–Stewart method [1]: 1. Transform A and B to upper (quasi)triangular form T_A and T_B, respectively, using orthogonal similarity transformations... |

10 | Algorithm 432: Solution of the Equation
- Bartels, Stewart
- 1972
Citation Context ... include block-diagonalization of a matrix in Schur form and condition estimation of eigenvalue problems (e.g., see [15, 10, 16]). Our method for solving SYCT (1) is based on the Bartels–Stewart method [1]: 1. Transform A and B to upper (quasi)triangular form T_A and T_B, respectively, using orthogonal similarity transformations: Q^T A Q = T_A, P^T B P = T_B. 2. Update the matrix C with respect to the transf... |

10 | GEMM-based level 3 BLAS: Portability and optimization issues
- Kågström, Ling, et al.
- 1998
Citation Context ...carry out Step 1 we use the QR-algorithm [2]. The updates in Step 2 and the back-transformation in Step 4 are carried out using ordinary GEMM-operations C ← βC + α op(A) op(B), where α and β are scalars [5, 13, 14]. Our focus is on Step 3. Using the Kronecker product notation, ⊗, we can rewrite the triangular Sylvester equation as a linear system of equations Zx = y (2), where Z = I_N ⊗ op(A) − op(B)^T ⊗ I_M is a ma... |

9 | LAPACK Users’ Guide, Third Edition
- Blackford, Demmel, et al.
- 1999
Citation Context ... the matrix is upper block triangular with 1 × 1 and 2 × 2 diagonal blocks, corresponding to real and complex conjugate pairs of eigenvalues, respectively. To carry out Step 1 we use the QR-algorithm [2]. The updates in Step 2 and the back-transformation in Step 4 are carried out using ordinary GEMM-operations C ← βC + α op(A) op(B), where α and β are scalars [5, 13, 14]. Our focus is on Step 3. Using t... |

6 | An Hierarchical Approach for Performance Analysis of ScaLAPACK-based Routines Using the Distributed Linear Algebra Machine
- Dackland, Kågström
- 1996
Citation Context ...ks of A and B can be expressed as D_a = ⌈M/MB⌉ and D_b = ⌈N/NB⌉, respectively. Then Equation (3) can be rewritten in block-partitioned form:

$$A_{ii}X_{ij} - X_{ij}B_{jj} = C_{ij} - \Big(\sum_{k=i+1}^{D_a} A_{ik}X_{kj} - \sum_{k=1}^{j-1} X_{ik}B_{kj}\Big), \qquad (4)$$

where i = 1, 2, ..., D_a and j = 1, 2, ..., D_b. Based on this summation formula, a serial blocked algorithm can be formulated, see Figure 1. for j=1, Db for i=Da, 1, -1 {Solve the (i, j)th subsy... |
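The block formula (4) in this context implies that block X_ij can be computed once every X_kj with k > i and every X_ik with k < j is available. The Python sketch below groups the blocks into anti-diagonal wavefronts whose members do not depend on each other. It only illustrates this dependency structure for the non-transposed case and is not the authors' scheduling; `wavefronts` and `respects_dependencies` are hypothetical names.

```python
def wavefronts(Da, Db):
    """Group 1-based block indices (i, j) into levels that can be solved concurrently.

    Level 0 holds the bottom-left block (Da, 1); each later level is the next
    anti-diagonal of the Da x Db block grid.
    """
    levels = [[] for _ in range(Da + Db - 1)]
    for j in range(1, Db + 1):            # same loop order as the serial algorithm
        for i in range(Da, 0, -1):
            levels[(Da - i) + (j - 1)].append((i, j))
    return levels


def respects_dependencies(levels, Da):
    """Check that every block only needs blocks from strictly earlier levels."""
    level_of = {blk: w for w, lvl in enumerate(levels) for blk in lvl}
    for (i, j), w in level_of.items():
        needed = [(k, j) for k in range(i + 1, Da + 1)] + [(i, k) for k in range(1, j)]
        if any(level_of[d] >= w for d in needed):
            return False
    return True


schedule = wavefronts(2, 2)
```

For a 2 × 2 block grid the schedule is `[[(2, 1)], [(1, 1), (2, 2)], [(1, 2)]]`, so at most two subsystems are independent at any time. In general, no level holds more than min(Da, Db) blocks.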

6 | A Web Computing Environment for the SLICOT Library
- Elmroth, Johansson, et al.
- 2001
Citation Context ....4 3.1 0.60 0.78 Table 1. Performance of PDTRSY solving AX − XB = C and AX − XB^T = C. Our software is designed for integration in state-of-the-art software libraries such as ScaLAPACK [3] and SLICOT [17, 6]. Acknowledgements This research was conducted using the resources of the High Performance Computing Center North (HPC2N). M = N MB Pr Pl Time (sec.) Sp Ep #Ext Abs. residual 1024 64 1 1 696 1.0 1.00 ... |

4 | A Parallel ScaLAPACK-style Sylvester Solver
- Granat
- 2003
Citation Context ...AX − XB = C on a 2 × 2 processor grid. shown that this gives a theoretical limit for the speedup of the triangular solver as max(Pr, Pc) for solving the subsystems, and as Pr · Pc for the GEMM-updates [7, 16]. A high-level parallel block algorithm for solving the general triangular SYCT equation (1) is presented in Figure 5. 3.1 The 2 × 2 diagonal block split problem When entering the triangular solve... |
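The 2 × 2 diagonal block split problem named at the end of this context can at least be detected with a few lines of code. The Python sketch below assumes a quasi-triangular matrix stored as a list of rows, 0-based indexing, and a uniform block size; how the authors resolve a split is not visible in this excerpt, so the sketch only reports which block boundaries cut through a 2 × 2 diagonal block. The name `split_boundaries` and the sample matrix are made up for illustration.

```python
def split_boundaries(T, mb):
    """Return the 0-based row indices r where a block boundary splits a 2 x 2 block.

    A boundary sits just before rows mb, 2*mb, ... A nonzero subdiagonal
    entry T[r][r-1] means rows r-1 and r belong to the same 2 x 2 diagonal
    block, so a boundary before row r would separate them.
    """
    n = len(T)
    return [r for r in range(mb, n, mb) if T[r][r - 1] != 0]


# Illustrative quasi-triangular matrix with one 2 x 2 diagonal block in rows 1 and 2.
T = [
    [1, 2, 3, 4],
    [0, 5, 6, 7],
    [0, -8, 5, 9],
    [0, 0, 0, 10],
]
splits_mb2 = split_boundaries(T, 2)
splits_mb3 = split_boundaries(T, 3)
```

With block size 2 the boundary before row 2 splits the block, so `splits_mb2` is `[2]`; with block size 3 the only boundary falls before row 3 and nothing is split.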