## Performance Evaluation of a New Parallel Preconditioner (1995)

### Download Links

- [www.cs.cmu.edu]
- [reports-archive.adm.cs.cmu.edu]
- DBLP

### Other Repositories/Bibliography

Venue: In Proceedings of the Ninth International Parallel Processing Symposium

Citations: 21 (2 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Gremban95performanceevaluation,
  author    = {Keith D. Gremban and Gary L. Miller and Marco Zagha},
  title     = {Performance Evaluation of a New Parallel Preconditioner},
  booktitle = {Proceedings of the Ninth International Parallel Processing Symposium},
  year      = {1995}
}
```

### Abstract

The linear systems associated with large, sparse, symmetric, positive definite matrices are often solved iteratively using the preconditioned conjugate gradient method. We have developed a new class of preconditioners, support tree preconditioners, that are based on the connectivity of the graphs corresponding to the matrices and are well-structured for parallel implementation. In this paper, we evaluate the performance of support tree preconditioners by comparing them against two common types of preconditioners: diagonal scaling, and incomplete Cholesky. Support tree preconditioners require less overall storage and less work per iteration than incomplete Cholesky preconditioners. In terms of total execution time, support tree preconditioners outperform both diagonal scaling and incomplete Cholesky preconditioners.
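The method being evaluated is preconditioned conjugate gradients. As a concrete reference point, here is a minimal NumPy sketch of PCG with diagonal scaling, the simplest of the preconditioners compared in the paper; the function names and the small test system are illustrative, not the paper's code:

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradients for a symmetric positive
    definite A. M_inv applies the inverse preconditioner to a vector."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for it in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it + 1
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Diagonal scaling: precondition by the inverse of diag(A).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
d = np.diag(A)
x, iters = pcg(A, b, lambda r: r / d)
```

In exact arithmetic CG on an n-by-n system converges in at most n iterations; a preconditioner's job is to cut that count well below n for large systems.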

### Citations

517 | Partitioning sparse matrices with eigenvectors of graphs - Pothen, Simon, et al. - 1990
Citation Context: "... have constructed support trees using variants of dual tree bisection [7], and recursive coordinate bisection [23]. In the near future, we will construct separator trees using spectral separators [16][21][23], and geometric separators [20] as part of our research on the relationship between the method of partitioning and the performance of the corresponding support tree preconditioner. As stated previ..."

345 | Random walks and electric networks - Doyle, Snell - 1984

305 | Partitioning of Unstructured Problems for Parallel Processing - Simon - 1991
Citation Context: "... any method for graph partitioning may be used to construct support trees. For example, we have constructed support trees using variants of dual tree bisection [7], and recursive coordinate bisection [23]. In the near future, we will construct separator trees using spectral separators [16][21][23], and geometric separators [20] as part of our research on the relationship between the method of partitio..."

265 | An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix - Meijerink, van der Vorst - 1977
Citation Context: "...e preconditioners as a posteriori, since they depend only on the coefficient matrix and not on the details of the process used to construct the linear system. Diagonal scaling and incomplete Cholesky [19] are two examples of algebraic preconditioners. Multilevel preconditioners are less general in that they depend on some knowledge of the differential equation or of the discretization process [13]; we..."

214 | Finite Element Solution of Boundary Value Problems - Axelsson, Barker - 1984
Citation Context: "...of preconditioned conjugate gradients (PCG). Much research has focused on the development of good preconditioners. Three criteria should be met by a good preconditioner B for the coefficient matrix A [4][25]: • Preconditioning with B should reduce the number of iterations required to converge. • B should be easy to compute. That is, the cost of constructing the preconditioner should be small compared..."

188 | An Improved Spectral Graph Partitioning Algorithm for Mapping Parallel Computations - Hendrickson, Leland - 1995
Citation Context: "..., we have constructed support trees using variants of dual tree bisection [7], and recursive coordinate bisection [23]. In the near future, we will construct separator trees using spectral separators [16][21][23], and geometric separators [20] as part of our research on the relationship between the method of partitioning and the performance of the corresponding support tree preconditioner. As stated p..."

142 | NESL: A Nested Data-Parallel Language - Blelloch - 1993
Citation Context: "... towards constructing a version of STCG that is optimized from end to end. Currently, the code used to generate support tree preconditioners is written in NESL, an experimental data-parallel language [5]. The various implementations of PCG were written in Fortran. We made no attempt to go beyond the obvious optimizations to improve the performance of ICCG. Numerous other authors have reported on the..."

77 | Automatic mesh partitioning - Miller, Teng, et al. - 1993
Citation Context: "...ng variants of dual tree bisection [7], and recursive coordinate bisection [23]. In the near future, we will construct separator trees using spectral separators [16][21][23], and geometric separators [20] as part of our research on the relationship between the method of partitioning and the performance of the corresponding support tree preconditioner. As stated previously, the exact form and weighting..."

72 | Solving linear systems on vector and shared memory computers - Dongarra, Duff, et al. - 1991
Citation Context: "...to convergence. b) Total execution time for iterative process on a Cray C-90 (msecs). ...verge, but is easily parallelized, yielding very high computational rates on vector and parallel architectures [8], [13], [17], [24]. In fact, the computational rates achievable by DSCG can often make up for the high number of iterations, making DSCG the iterative method of choice in many cases [8], [17]. Incompl..."

69 | Efficient implementation of a class of preconditioned conjugate gradient methods - Eisenstat - 1981
Citation Context: "...mount of work per iteration; determination of orderings to increase the amount of parallelism; use of factored inverses. We discuss each of these approaches in the paragraphs below. Eisenstat [11] reported an efficient implementation of ICCG for cases in which the preconditioner can be represented in the form K = (L + D)D⁻¹(D + Lᵀ). By rescaling the original system ... to obtain ..., where ..., ..., and ..."

45 | High performance preconditioning - van der Vorst - 1989
Citation Context: "...preconditioned conjugate gradients (PCG). Much research has focused on the development of good preconditioners. Three criteria should be met by a good preconditioner B for the coefficient matrix A [4][25]: • Preconditioning with B should reduce the number of iterations required to converge. • B should be easy to compute. That is, the cost of constructing the preconditioner should be small compared to..."

39 | Stopping criteria for iterative solvers - Arioli, Duff, et al. - 1992
Citation Context: "... u(1, y) = u(x, 0) = u(x, 1) = 0. For our initial experiments, we used the same forcing function as Greenbaum et al.: f(x, y) = −2x(1−x) − 2y(1−y). Our starting vector was x₀ = 0. We used as our stopping criterion the condition reported to be superior by Arioli et al. [3]. We halted when ω₂ ≤ 1.0 × 10⁻¹⁰. Figure 9a shows the results in terms of number of iterations for convergence. T..."

36 | Optimal parallel solution of sparse triangular systems - Alvarado, Schreiber - 1993
Citation Context: "...rent in performing triangular solves by the usual backward and forward substitution algorithms. An alternative is to use a different algorithm for solving triangular systems. Alvarado and Schreiber [1] presented a method of solving a sparse triangular system by representing the inverse as the product of a few sparse factors, which enables solving the system as a sequence of sparse matrix-vector mul..."

28 | Segmented operations for sparse matrix computation on vector multiprocessors - Blelloch, Heroux, et al. - 1993
Citation Context: "...ted with a single general-purpose sparse matrix multiplication subroutine. On the Cray C-90, we use an algorithm called SEGMV, which accommodates arbitrary row sizes using “segmented scan” operations [6]. Compared to other methods (such as Ellpack/Itpack and Jagged Diagonal), SEGMV performance is comparable for structured matrices, and superior for most irregular matrices. Thus our PCG implementation..."
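The snippet above only names SEGMV. The sketch below illustrates the underlying idea under the assumption of a CSR layout: form all nonzero products in one flat vector, then reduce each row's segment. `np.add.reduceat` stands in for the segmented-scan primitive; all names here are hypothetical, not from the SEGMV code.

```python
import numpy as np

def segmented_spmv(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product via a per-row segmented reduction.

    The matrix is in CSR form; rows may have arbitrary (even zero)
    length, which is the irregular case segmented operations handle well.
    """
    prods = values * x[col_idx]            # all nonzero products, flattened
    n = len(row_ptr) - 1
    y = np.zeros(n)
    starts = row_ptr[:-1]
    nonempty = row_ptr[:-1] < row_ptr[1:]  # reduceat misbehaves on empty segments
    y[nonempty] = np.add.reduceat(prods, starts[nonempty])
    return y

# 3x3 example with an empty middle row (irregular structure):
# [[1, 0, 2],
#  [0, 0, 0],
#  [3, 4, 0]]
values  = np.array([1.0, 2.0, 3.0, 4.0])
col_idx = np.array([0, 2, 0, 1])
row_ptr = np.array([0, 2, 2, 4])
y = segmented_spmv(values, col_idx, row_ptr, np.array([1.0, 1.0, 1.0]))
```

Because the reduction runs over one long flat vector, the work stays vectorizable regardless of how unevenly the nonzeros are distributed across rows, which is the property the snippet attributes to SEGMV.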

25 | The effect of orderings on preconditioned conjugate gradients - Duff, Meurant - 1989
Citation Context: "...en in Fortran. We made no attempt to go beyond the obvious optimizations to improve the performance of ICCG. Numerous other authors have reported on the effects of ordering on ICCG (see, for example, [10]), and on parallel implementations of ICCG (see [8], [24], and [25]). Rather than reproduce their work, we decided to extrapolate values for an optimistic implementation of ICCG. We applied the result..."

22 | Solving sparse triangular linear systems on parallel computers - Anderson, Saad - 1989
Citation Context: "... of the Mflop rate of unpreconditioned CG. The technique of determining independent nodes to evaluate in parallel is known as level scheduling, and was first discussed in general by Anderson and Saad [2]. The effectiveness of ordering nodes for optimal parallel performance is limited by the topology of the original system, however. The excellent results reported above were for regular rectangular gra..."
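A small sketch of the level-scheduling idea described in this snippet, assuming a dense lower-triangular matrix for clarity (a real solver would use a sparse format); the function names are illustrative:

```python
import numpy as np

def level_schedule(L):
    """Group rows of a lower-triangular matrix into levels.

    A row's level is one more than the deepest level it depends on, so
    rows within a level are mutually independent and can be solved
    simultaneously.
    """
    n = L.shape[0]
    level = np.zeros(n, dtype=int)
    for i in range(n):
        deps = [j for j in range(i) if L[i, j] != 0]
        if deps:
            level[i] = 1 + max(level[j] for j in deps)
    return [[i for i in range(n) if level[i] == k]
            for k in range(level.max() + 1)]

def solve_by_levels(L, b):
    """Forward substitution driven by the level schedule."""
    x = np.zeros_like(b, dtype=float)
    for rows in level_schedule(L):
        for i in rows:               # rows in a level are independent
            x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

# Bidiagonal example: every row depends on the previous one, so each
# level holds a single row -- the worst case the snippet alludes to.
L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 4.0]])
b = np.array([2.0, 5.0, 9.0])
x = solve_by_levels(L, b)
```

The dependence-chain structure determines the available parallelism: wide, shallow dependency graphs yield few large levels, while a chain (as in the example) degenerates to sequential substitution.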

17 | List ranking and parallel tree contraction - Reid-Miller, Miller, et al. - 1993
Citation Context: "...ing, since all existing leaves are “raked” off the tree at each step. Parallel node evaluation by leaf raking is a special case of a more general parallel algorithm known as parallel tree contraction [22]. Figure 8 illustrates the process of leaf raking on a simple tree. An analogous process exists for the downward directed tree: expansions from parents to children can be performed independently in pa..."
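The leaf raking described in this snippet can be sketched as a round-based upward accumulation: each round removes every current leaf, and the updates within a round are independent (hence parallelizable). This is an illustrative sequential simulation, not the paper's implementation:

```python
def rake_upward(parent, values):
    """Accumulate values toward the root by repeatedly raking leaves.

    parent[i] is the parent of node i, with parent[root] == root.
    Every current leaf is removed in one round; in a parallel setting
    each round is a constant-depth step.
    """
    n = len(parent)
    acc = list(values)
    alive = [True] * n
    children = [0] * n
    for i in range(n):
        if parent[i] != i:
            children[parent[i]] += 1
    rounds = 0
    while sum(alive) > 1:
        leaves = [i for i in range(n)
                  if alive[i] and children[i] == 0 and parent[i] != i]
        for leaf in leaves:          # independent: done in parallel
            acc[parent[leaf]] += acc[leaf]
            children[parent[leaf]] -= 1
            alive[leaf] = False
        rounds += 1
    return acc, rounds

# Complete binary tree on 7 nodes, unit value at every node:
# two rounds suffice (rake 3,4,5,6; then rake 1,2).
acc, rounds = rake_upward([0, 0, 0, 1, 1, 2, 2], [1] * 7)
```

On a balanced tree the number of rounds is proportional to the tree depth, which is what makes the upward solve with a support tree preconditioner attractive in parallel.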

14 | A parallel preconditioned conjugate gradient package for solving sparse linear systems on a Cray - Heroux, Vu, et al. - 1991
Citation Context: "...nce. b) Total execution time for iterative process on a Cray C-90 (msecs). ...verge, but is easily parallelized, yielding very high computational rates on vector and parallel architectures [8], [13], [17], [24]. In fact, the computational rates achievable by DSCG can often make up for the high number of iterations, making DSCG the iterative method of choice in many cases [8], [17]. Incomplete Cholesky..."

9 | ICCG and Related Methods for 3D Problems on Vector Computers - van der Vorst - 1989
Citation Context: "...) Total execution time for iterative process on a Cray C-90 (msecs). ...verge, but is easily parallelized, yielding very high computational rates on vector and parallel architectures [8], [13], [17], [24]. In fact, the computational rates achievable by DSCG can often make up for the high number of iterations, making DSCG the iterative method of choice in many cases [8], [17]. Incomplete Cholesky (IC)..."

8 | Nested Dissection: A survey and comparison of various nested dissection algorithms - Khaira, Miller, et al. - 1992

7 | Automatic partitioning of unstructured grids into connected components - Dagum - 1993
Citation Context: "...ors used to construct them. In practice, any method for graph partitioning may be used to construct support trees. For example, we have constructed support trees using variants of dual tree bisection [7], and recursive coordinate bisection [23]. In the near future, we will construct separator trees using spectral separators [16][21][23], and geometric separators [20] as part of our research on the re..."

4 | Comparison of linear system solvers applied to diffusion-type finite element equations - Greenbaum, Li, et al. - 1989
Citation Context: "...sky [19] are two examples of algebraic preconditioners. Multilevel preconditioners are less general in that they depend on some knowledge of the differential equation or of the discretization process [13]; we classify these preconditioners as a priori, since they depend on knowledge about the construction of the coefficient matrix, rather than on just the matrix itself. The best performance is achieve..."

3 | Multilevel Preconditioners: Analysis, performance enhancements, and parallel algorithms - Guo - 1992
Citation Context: "...r should be small; thus, on parallel machines it is important that the application of the preconditioner be parallelizable. Preconditioners can be categorized as being either algebraic, or multilevel [15]. Algebraic preconditioners depend only on the algebraic structure of the coefficient matrix A; we classify these preconditioners as a posteriori, since they depend only on the coefficient matrix and..."

1 | Towards the Application of Graph Theory to Finding Parallel Preconditioners for Sparse Symmetric Linear Systems - Gremban, Miller
Citation Context: "... a). Let H be the support tree for G. Let B be the Laplacian matrix corresponding to H. We would like to use B as a preconditioner for A, but B is of order 2n−1, and A is of order n. In another paper [6], we describe the theory proving that B can be used as a preconditioner for A. Here, we present an overview of the theory in order to gain some intuition. Suppose that H is a binary tree with n leaves..."