Results 1–10 of 38
Minimizing Communication in Sparse Matrix Solvers
Cited by 23 (9 self)
Data communication within the memory system of a single processor node and between multiple nodes in a system is the bottleneck in many iterative sparse matrix solvers like CG and GMRES. Here, k iterations of a conventional implementation perform k sparse matrix-vector multiplications and Ω(k) vector operations like dot products, resulting in communication that grows by a factor of Ω(k) in both the memory system and the network. By reorganizing the sparse-matrix kernel to compute a set of matrix-vector products at once and reorganizing the rest of the algorithm accordingly, we can perform k iterations by sending O(log P) messages instead of O(k · log P) messages on a parallel machine, and by reading the matrix A from DRAM to cache just once, instead of k times, on a sequential machine. This reduces communication to the minimum possible. We combine these techniques to form a new variant of GMRES. Our shared-memory implementation on an 8-core Intel Clovertown gets speedups of up to 4.3× over standard GMRES, without sacrificing convergence rate or numerical stability.
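The reorganized kernel computes the whole block of matrix-vector products [b, Ab, ..., A^k b] before any orthogonalization. A minimal NumPy/SciPy sketch of what is computed (the reference loop below reads A once per product; the paper's contribution is an implementation that reads A only once in total, which a few lines cannot reproduce); names and the toy matrix are mine, not the authors':

```python
import numpy as np
from scipy.sparse import random as sprandom, eye

def matrix_powers(A, b, k):
    """Return V = [b, A b, A^2 b, ..., A^k b] as columns.

    Reference loop: reads A from memory k times. The
    communication-avoiding kernel computes the same columns
    in a single pass over A (and over the network).
    """
    V = np.empty((b.size, k + 1))
    V[:, 0] = b
    for j in range(k):
        V[:, j + 1] = A @ V[:, j]
    return V

# invented toy problem: a random sparse matrix shifted to be well scaled
n, k = 50, 4
A = (sprandom(n, n, density=0.1, random_state=0) + 2 * eye(n)).tocsr()
b = np.ones(n)
V = matrix_powers(A, b, k)
```

The rest of the algorithm (here, GMRES) is then rearranged to consume all k basis vectors per communication round instead of one.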
Deflated iterative methods for linear equations with multiple right-hand sides
, 2004
Cited by 16 (6 self)
A new approach is discussed for solving large nonsymmetric systems of linear equations with multiple right-hand sides. The first system is solved with a deflated GMRES method that generates eigenvector information at the same time that the linear equations are solved. Subsequent systems are solved by combining restarted GMRES with a projection over the previously determined eigenvectors. This approach offers an alternative to block methods, and it can also be combined with a block method. It is useful when there are a limited number of small eigenvalues that slow the convergence. An example is given showing significant improvement for a problem from quantum chromodynamics. The second and subsequent right-hand sides are solved much more quickly than without the deflation. This new approach is relatively simple to implement and is very efficient compared to other deflation methods.
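The projection step for the later right-hand sides can be sketched in a few lines: given a basis U of approximate eigenvectors saved from the first solve, a Galerkin projection removes the corresponding residual components before restarted GMRES takes over. A hedged NumPy sketch of the general idea (not the authors' exact algorithm; all names and the toy problem are mine):

```python
import numpy as np

def deflation_correction(A, b, x0, U):
    """Galerkin projection over a stored basis U (n-by-k) of
    approximate eigenvectors: removes the residual components
    along span(U) before the restarted GMRES cycles begin."""
    r = b - A @ x0
    y = np.linalg.solve(U.conj().T @ (A @ U), U.conj().T @ r)  # small k-by-k solve
    return x0 + U @ y

# invented toy example: symmetric A with two known small eigenvalues
rng = np.random.default_rng(0)
n = 40
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.01, 1.0, n)) @ Q.T
b = rng.standard_normal(n)
U = Q[:, :2]                     # eigenvectors of the two smallest eigenvalues
x1 = deflation_correction(A, b, np.zeros(n), U)
r1 = b - A @ x1                  # residual is now orthogonal to span(U)
```

With the small eigenvalues deflated this way, the restarted iteration no longer has to resolve them, which is where the speedup for the second and later right-hand sides comes from.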
Recycling Subspace Information for Diffuse Optical Tomography
 SIAM J. Sci. Comput
, 2004
Cited by 13 (2 self)
We discuss the efficient solution of a large sequence of slowly varying linear systems arising in computations for diffuse optical tomographic imaging. In particular, we analyze a number of strategies for recycling Krylov subspace information for the most efficient solution. We reconstruct three-dimensional...
Parallel domain decomposition methods for stochastic elliptic equations
 SIAM J. Sci. Comput
Cited by 8 (1 self)
We present parallel Schwarz-type domain decomposition preconditioned recycling Krylov subspace methods for the numerical solution of stochastic elliptic problems, whose coefficients are assumed to be a random field with finite variance. Karhunen-Loève (KL) expansion and double orthogonal polynomials are used to reformulate the stochastic elliptic problem into a large number of related, but uncoupled, deterministic equations. The key to an efficient algorithm lies in “recycling computed subspaces”. Based on a careful analysis of the KL expansion we propose and test a grouping algorithm that tells us when to recycle and when to recompute some components of the expensive computation. We show theoretically and experimentally that the Schwarz preconditioned recycling GMRES method is optimal for the entire family of linear systems. A fully parallel implementation is provided and scalability results are reported in the paper.
Key words: stochastic elliptic equations, domain decomposition, recycling Krylov subspace method, parallel scalability
DEFLATED AND RESTARTED SYMMETRIC LANCZOS METHODS FOR EIGENVALUES AND LINEAR EQUATIONS WITH MULTIPLE RIGHT-HAND SIDES
, 2008
Cited by 7 (3 self)
A deflated restarted Lanczos algorithm is given for both solving symmetric linear equations and computing eigenvalues and eigenvectors. The restarting limits the storage so that finding eigenvectors is practical. Meanwhile, deflation with the computed eigenvectors allows the linear equations to converge well in spite of the restarting. Some reorthogonalization is necessary to control roundoff error, and several approaches are discussed. The eigenvectors generated while solving the linear equations can be used to help solve systems with multiple right-hand sides. Experiments are given with large matrices from quantum chromodynamics that have many right-hand sides.
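The reorthogonalization issue is easy to demonstrate: in floating point, the three-term Lanczos recurrence loses orthogonality once Ritz values converge, and full reorthogonalization, the simplest of the strategies alluded to above, restores it. A NumPy sketch with a matrix and names of my choosing:

```python
import numpy as np

def lanczos(A, b, m, reorth=False):
    """Symmetric Lanczos returning m basis vectors; `reorth`
    switches on full reorthogonalization against all earlier
    vectors (one of several roundoff-control strategies)."""
    n = b.size
    V = np.zeros((n, m))
    V[:, 0] = b / np.linalg.norm(b)
    beta = 0.0
    for j in range(m - 1):
        w = A @ V[:, j]
        if j > 0:
            w -= beta * V[:, j - 1]          # three-term recurrence
        alpha = V[:, j] @ w
        w -= alpha * V[:, j]
        if reorth:                           # re-project against all of V so far
            w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        beta = np.linalg.norm(w)
        V[:, j + 1] = w / beta
    return V

# widely spread eigenvalues provoke orthogonality loss
n, m = 200, 60
A = np.diag(np.logspace(0, 6, n))
b = np.ones(n)
loss_plain = np.linalg.norm(lanczos(A, b, m).T @ lanczos(A, b, m) - np.eye(m))
V1 = lanczos(A, b, m, reorth=True)
loss_reorth = np.linalg.norm(V1.T @ V1 - np.eye(m))
```

Full reorthogonalization is the most expensive option; the selective and partial schemes the paper weighs trade some of this cost against looser orthogonality.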
Low-Rank Tensor Krylov Subspace Methods for Parametrized Linear Systems
, 2010
Cited by 6 (2 self)
We consider linear systems A(α)x(α) = b(α) depending on possibly many parameters α = (α1,...,αp). Solving these systems simultaneously for a standard discretization of the parameter space would require a computational effort growing exponentially in the number of parameters. We show that this curse of dimensionality can be avoided for sufficiently smooth parameter dependencies. For this purpose, computational methods are developed that benefit from the fact that x(α) can be well approximated by a tensor of low rank. In particular, low-rank tensor variants of short-recurrence Krylov subspace methods are presented. Numerical experiments for deterministic PDEs with parametrized coefficients and stochastic elliptic PDEs demonstrate the effectiveness of our approach.
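The low-rank premise can be checked directly: sample x(α) on a parameter grid for a smoothly parameter-dependent A(α) = A0 + αA1 and the singular values of the stacked solutions decay rapidly. A NumPy illustration of that premise only, for a toy single-parameter case with invented matrices, not the paper's tensor methods:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60
A0 = np.diag(np.linspace(1.0, 2.0, n))     # well-conditioned base matrix
A1 = 0.01 * rng.standard_normal((n, n))    # small, smooth perturbation
b = rng.standard_normal(n)
alphas = np.linspace(0.0, 1.0, 20)

# stack x(alpha) over the grid; smooth dependence => rapid singular-value decay
X = np.column_stack([np.linalg.solve(A0 + a * A1, b) for a in alphas])
s = np.linalg.svd(X, compute_uv=False)
numerical_rank = int(np.sum(s > 1e-8 * s[0]))   # far below the grid size 20
```

The tensor methods in the paper exploit exactly this compressibility, but across many parameters at once, where storing the full solution grid would be impossible.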
Preconditioners for generalized saddle-point problems
Cited by 4 (2 self)
We propose and examine block-diagonal preconditioners and variants of indefinite preconditioners for block two-by-two generalized saddle-point problems. That is, we consider the nonsymmetric, nonsingular case where the (2,2) block is small in norm, and we are particularly concerned with the case where the (1,2) block is different from the transposed (2,1) block. We provide theoretical and experimental analyses of the convergence and eigenvalue distributions of the preconditioned matrices. We also extend the results of [de Sturler and Liesen 2005] to matrices with nonzero (2,2) block and to the use of approximate Schur complements. To demonstrate the effectiveness of these preconditioners we show convergence results, spectra, and eigenvalue bounds for two model Navier-Stokes problems.
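The paper's generalized, nonsymmetric setting does not reduce to a few lines, but the classical special case (symmetric, zero (2,2) block, exact Schur complement in the block-diagonal preconditioner) already shows why such preconditioners work: the preconditioned matrix then has exactly three eigenvalues, 1 and (1 ± √5)/2, a known result of Murphy, Golub, and Wathen. A NumPy check of that special case, with toy matrices of my choosing:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 20, 8
F = rng.standard_normal((n, n))
A = F @ F.T + n * np.eye(n)              # SPD (1,1) block
B = rng.standard_normal((m, n))          # full-rank constraint block
S = B @ np.linalg.solve(A, B.T)          # exact Schur complement

K = np.block([[A, B.T], [B, np.zeros((m, m))]])              # saddle-point matrix
P = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), S]]) # block-diagonal preconditioner
ev = np.linalg.eigvals(np.linalg.solve(P, K))
# spectrum clusters at exactly {1, (1+sqrt(5))/2, (1-sqrt(5))/2}
```

With only three distinct eigenvalues, a Krylov method converges in three iterations; the paper's analysis concerns how far this clustering survives a nonzero (2,2) block, unequal off-diagonal blocks, and approximate Schur complements.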
Deflated GMRES for systems with multiple shifts and multiple right-hand sides
, 2007
Cited by 4 (0 self)
We consider the solution of multiply shifted systems of nonsymmetric linear equations, possibly also with multiple right-hand sides. First, for a single right-hand side, the matrix is shifted by several multiples of the identity. Such problems arise in a number of applications, including lattice quantum chromodynamics, where the matrices are complex and non-Hermitian. Some Krylov iterative methods such as GMRES and BiCGStab have been used to solve multiply shifted systems for about the cost of solving just one system. Restarted GMRES can be improved by deflating eigenvalues for matrices that have a few small eigenvalues. We show that a particular deflated method, GMRES-DR, can be applied to multiply shifted systems. In quantum chromodynamics, it is common to have multiple right-hand sides with multiple shifts for each right-hand side. We develop a method that efficiently solves the multiple right-hand sides by using a deflated version of GMRES and yet keeps costs for all of the multiply shifted systems close to those for one shift. An example is given showing this can be extremely effective with a quantum chromodynamics matrix.
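The fact that makes all of this affordable is the shift invariance of Krylov subspaces: K_m(A + σI, b) = K_m(A, b), so one Arnoldi basis serves every shift. A NumPy sketch verifying this with plain Arnoldi (not GMRES-DR; names and the toy matrix are mine):

```python
import numpy as np

def arnoldi(A, b, m):
    """Modified Gram-Schmidt Arnoldi: orthonormal basis V of
    K_m(A, b) and the (m+1)-by-m Hessenberg matrix H."""
    n = b.size
    V = np.zeros((n, m + 1))
    V[:, 0] = b / np.linalg.norm(b)
    H = np.zeros((m + 1, m))
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(1)
n, m, sigma = 30, 5, 0.7
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
V, H = arnoldi(A, b, m)
Vs, Hs = arnoldi(A + sigma * np.eye(n), b, m)
# same basis; only the Hessenberg diagonal shifts by sigma
```

Because only the small Hessenberg matrix changes with the shift, each additional shifted system costs a small least-squares solve rather than a new Krylov iteration.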
Parametric Model Order Reduction Accelerated by Subspace Recycling
Cited by 4 (2 self)
Many model order reduction methods for parameterized systems need to construct a projection matrix V, which requires computing several moment matrices of the parameterized systems. For computing each moment matrix, the solution of a linear system with multiple right-hand sides is required. Furthermore, the number of linear systems increases with both the number of moment matrices used and the number of parameters in the system. Usually, a considerable number of linear systems has to be solved when the system includes more than two parameters. The standard way of solving these linear systems when sparse direct solvers are not feasible is to use conventional iterative methods such as GMRES or CG. In this paper, a fast recycling algorithm is applied to solve the whole sequence of linear systems and is shown to be much more efficient than the standard iterative solver GMRES as well as the newly proposed recycling method MKRGMRES from [10]. As a result, the computation of the reduced-order model can be significantly accelerated.
Preconditioner updates applied to CFD model problems
Cited by 3 (1 self)
This paper deals with solving sequences of nonsymmetric linear systems with a block structure arising from compressible flow problems. The systems are solved by a preconditioned iterative method. We attempt to improve the overall solution process by sharing a part of the computational effort throughout the sequence. Our approach is fully algebraic and is based on updating preconditioners by a block triangular update. A particular update is computed in a black-box fashion from the known preconditioner of one of the previous matrices and from the difference of the matrices involved. Results for our test compressible flow problems show that the strategy speeds up the entire computation. The acceleration is particularly important in phases of instationary behavior, where we saved about half of the computational time in the supersonic and moderate Mach number cases. In the low Mach number case the updated decompositions were about as effective as the frozen preconditioners.
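For contrast, the "frozen" baseline that the block triangular update improves on is easy to sketch: factor one preconditioner at the start of the sequence and reuse it as the matrices drift. A SciPy sketch of that baseline only (the paper's update formula is not reproduced; the model sequence is invented):

```python
import numpy as np
from scipy.sparse import diags, eye
from scipy.sparse.linalg import spilu, gmres, LinearOperator

# invented model sequence: a 1-D Laplacian that drifts slowly
n = 100
A0 = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)).tocsc()
b = np.ones(n)

ilu = spilu(A0)                                   # factored once, then frozen
M = LinearOperator((n, n), matvec=ilu.solve)      # apply as a preconditioner

residuals = []
for i in range(3):
    Ai = (A0 + 0.01 * i * eye(n)).tocsc()         # slowly varying system
    x, info = gmres(Ai, b, M=M)                   # reuse the frozen factorization
    residuals.append(np.linalg.norm(b - Ai @ x))
```

The paper's update instead folds the difference Ai - A0 into a block triangular correction of the frozen factors, which keeps the preconditioner effective as the drift grows.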