Parallel Numerical Linear Algebra
- Society for Industrial and Applied Mathematics
, 1997
Cited by 418 (24 self)
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then, we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, the singular value decomposition, and generalizations of these to two matrices. We consider dense, band and sparse matrices.
Developments and Trends in the Parallel Solution of Linear Systems
- Parallel Computing
, 1999
Cited by 5 (0 self)
In this review paper, we consider some important developments and trends in algorithm design for the solution of linear systems, concentrating on aspects that involve the exploitation of parallelism. We briefly discuss the solution of dense linear systems, before studying the solution of sparse equations by direct and iterative methods. We consider preconditioning techniques for iterative solvers and discuss some of the present research issues in this field.
Keywords: linear systems, dense matrices, sparse matrices, tridiagonal systems, parallelism, direct methods, iterative methods, Krylov methods, preconditioning.
AMS(MOS) subject classifications: 65F05, 65F50.
1 Introduction
Solution methods for systems of linear equations Ax = b, (1) where A is a coefficient matrix of order n and x and b are n-vectors, are usually grouped into two distinct classes: direct methods and iterative methods. However, ...
CCLRC - Rutherford Appleton Laboratory, Oxfordshire, England and CERFACS, Toulouse,...
On the Portability and Efficiency of Parallel Algorithms and Software
- Delft University of Technology
, 1994
Cited by 4 (3 self)
Parallel software development must face the fact that different architectures require different implementations. Flexibility in modifying parallel methods and software is necessary because the efficiency of algorithms depends on the characteristics of the target computer. Furthermore, different parallel computers require different implementations of data in data structures. The required flexibility is obtained by identifying abstraction levels and development steps in parallel algorithm and software development. The proposed approach ensures that all choices in the design are properly recognised and documented. As a result, it is simple to compare the characteristics of a new parallel computer with the characteristics that are used in the software. In this way the development itself becomes more portable and thus less architecture dependent.
1 Introduction
A main task of parallel software development is to obtain highly efficient and portable software for parallel computers...
Parallel Krylov Methods for Econometric Model Simulation
- Computational Economics
, 2000
Cited by 2 (1 self)
This paper investigates parallel solution methods to simulate large-scale macroeconometric models with forward-looking variables. The method chosen is the Newton-Krylov algorithm. We concentrate on a parallel solution to the sparse linear system arising in the Newton algorithm, and we empirically analyze the scalability of the GMRES method, which belongs to the class of so-called Krylov subspace methods. The results obtained using an implementation of the PETSc 2.0 software library on an IBM SP2 show a near linear scalability for the problem tested.
Keywords: Parallel computing, Newton-Krylov methods, sparse matrices, forward-looking models, GMRES, scalability.
JEL Classification: C63, C88, C30.
1 Introduction
There are many engineering problems for which parallel computing has proven efficient. Economic problems are, however, often quite different in both structure and quantification. This is particularly true for systems of equations representing large economic models, wh...
Execution Time Analysis for Least Squares Problems on Massively Parallel Distributed Memory Computers
- In Proceedings of International Conference on Computational Modeling and Computing (CMCP-96)
, 1996
Cited by 2 (1 self)
In this paper we mainly focus on the parallelization of PCGLS, a basic iterative method whose main idea is to apply the preconditioned conjugate gradient method to the normal equations. Based on the data distribution model, we fully analyze the most suitable communication network topology for solving least squares problems on massively distributed memory computers. A theoretical model of communication phases is presented which allows us to give a detailed execution time complexity analysis and to investigate its usefulness. It is shown that the implementation of PCGLS with a row-block decomposition of the coefficient matrix on a ring communication structure is the most efficient choice. Performance tests of the developed parallel PCGLS algorithm have been carried out on the massively distributed memory system ParsytecGC/PowerPlus, and experimental timing results are compared with the theoretical execution time complexity analysis.
1 Introductio...
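As a rough illustration of the row-block idea in the abstract above, the following sketch runs conjugate gradients on the normal equations with the coefficient matrix split into row blocks, each block standing in for one processor. It is an assumption-laden toy, not the paper's algorithm: the preconditioner is omitted (so this is plain CGLS rather than PCGLS), the "processors" are simulated sequentially, and the name `cgls_row_blocks` and its parameters are hypothetical.

```python
def cgls_row_blocks(blocks, rhs_blocks, n, max_iters=50, tol=1e-12):
    """Unpreconditioned CGLS for min ||Ax - b||_2 with A stored as row blocks.

    blocks[k] holds the rows of A owned by simulated processor k and
    rhs_blocks[k] holds the matching entries of b (hypothetical layout).
    """
    def at_times(vec_blocks):
        # s = A^T v = sum over blocks of A_k^T v_k.
        # On a real machine the sum over k is a global reduction.
        s = [0.0] * n
        for a_k, v_k in zip(blocks, vec_blocks):
            for row, v_i in zip(a_k, v_k):
                for j in range(n):
                    s[j] += row[j] * v_i
        return s

    def a_times(p):
        # q_k = A_k p needs only local rows once p is known everywhere.
        return [[sum(row[j] * p[j] for j in range(n)) for row in a_k]
                for a_k in blocks]

    x = [0.0] * n
    r = [list(b_k) for b_k in rhs_blocks]   # r = b - A x with x = 0
    s = at_times(r)
    p = list(s)
    gamma = sum(v * v for v in s)           # ||A^T r||^2

    for _ in range(max_iters):
        if gamma <= tol:
            break
        q = a_times(p)
        qq = sum(v * v for q_k in q for v in q_k)   # second global reduction
        alpha = gamma / qq
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [[r_i - alpha * q_i for r_i, q_i in zip(r_k, q_k)]
             for r_k, q_k in zip(r, q)]
        s = at_times(r)
        gamma_new = sum(v * v for v in s)
        p = [s_i + (gamma_new / gamma) * p_i for s_i, p_i in zip(s, p)]
        gamma = gamma_new
    return x


# Two simulated processors: the first owns two rows, the second owns one.
solution = cgls_row_blocks([[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]],
                           [[1.0, 2.0], [3.0]], n=2)
```

Each iteration has two reduction points (the sum forming A^T r and the inner product of q with itself), which is where a communication topology such as a ring would come into play on a distributed memory machine.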
Parallel iterative solution methods for linear systems arising from discretized PDE's
- Lecture Notes on Parallel Iterative Methods for discretized PDE's. AGARD Special Course on Parallel Computing in CFD, available from http://www.math.ruu.nl/people/vorst/#lec
, 1995
Cited by 2 (0 self)
In these notes we will present an overview of a number of related iterative methods for the solution of linear systems of equations. These methods are so-called Krylov projection type methods and they include such popular methods as Conjugate Gradients, Bi-Conjugate Gradients, CGS, Bi-CGSTAB, QMR, LSQR and GMRES. We will show how these methods can be derived from simple basic iteration formulas. We will not give convergence proofs, but we will refer for these, as far as available, to the literature. Iterative methods are often used in combination with so-called preconditioning operators (approximations for the inverses of the operator of the system to be solved). Since these preconditioners are not essential in the derivation of the iterative methods, we will not give much attention to them in these notes. However, in most of the actual iteration schemes, we have included them in order to facilitate the use of these schemes in actual computations. For the application of the iterative schemes one usually thinks of sparse linear systems, e.g., those arising in the finite element or finite difference approximations of (systems of) partial differential equations. However, the structure of the operators plays no explicit role in any of these schemes, and these schemes might also successfully be used to solve certain large dense linear systems. Depending on the situation, that might be attractive in terms of numbers of floating point operations. It will turn out that all of the iterative methods are parallelizable in a straightforward manner. However, especially for computers with a memory hierarchy (i.e., cache or vector registers), and for distributed memory computers, the performance can often be improved significantly through rescheduling of the operations. We will discuss parallel implementations, and occasionally we will report on experimental findings.
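To make the role of a preconditioning operator concrete, here is a minimal sketch of preconditioned Conjugate Gradients for a symmetric positive definite system. The preconditioner is assumed to be simple diagonal (Jacobi) scaling, chosen only because it is the easiest approximation to the inverse to write down; the function name `pcg_jacobi` and its arguments are hypothetical and are not taken from the lecture notes.

```python
def pcg_jacobi(A, b, max_iters=100, tol=1e-12):
    """Preconditioned CG for SPD A (dense list of rows), Jacobi preconditioner."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(u_i * v_i for u_i, v_i in zip(u, v))

    def precondition(r):
        # Apply M^{-1} with M = diag(A): a crude approximation to A^{-1}.
        return [r[i] / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                 # residual b - A x for x = 0
    z = precondition(r)
    p = list(z)
    rz = dot(r, z)

    for _ in range(max_iters):
        if dot(r, r) <= tol * tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x_i + alpha * p_i for x_i, p_i in zip(x, p)]
        r = [r_i - alpha * q_i for r_i, q_i in zip(r, Ap)]
        z = precondition(r)
        rz_new = dot(r, z)
        p = [z_i + (rz_new / rz) * p_i for z_i, p_i in zip(z, p)]
        rz = rz_new
    return x


x = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The per-iteration work consists of one matrix-vector product, a few vector updates and two inner products, which are the kernels whose parallel behaviour the notes discuss.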
Data Distribution And Communication Schemes For Least Squares Problems On Massively Distributed Memory Computers
- In Proceedings of International Conference on Computational Modelling
, 1996
Cited by 1 (1 self)
In this paper we study the parallelization of PCGLS, a basic iterative method whose main idea is to apply the preconditioned conjugate gradient method to the normal equations. Two important questions are discussed: what is the best possible data distribution, and which communication network topology is most suitable for solving least squares problems on massively distributed memory computers? A theoretical model of data distribution and communication phases is presented which allows us to give a detailed execution time complexity analysis and to investigate its usefulness. It is shown that the implementation of PCGLS with a row-block decomposition of the coefficient matrix on a ring communication structure is the most efficient choice. Performance tests of the developed parallel PCGLS algorithm have been carried out on the massively distributed memory system ParsytecGC/PowerPlus, and experimental timing results are compared with the theoretical execut...
Communication cost reduction for Krylov methods on parallel computers
- Proc. of High-Performance Computing and Networking Conference
Cited by 1 (0 self)
On large distributed memory parallel computers the global communication cost of inner products seriously limits the performance of Krylov subspace methods [3]. We consider improved algorithms to reduce this communication overhead, and we analyze the performance by experiments on a 400-processor parallel computer and with a simple performance model.
Lecture Notes on Iterative Methods
, 1994
Cited by 1 (0 self)
Introduction
In these notes we will present an overview of a number of related iterative methods for the solution of linear systems of equations. These methods are so-called Krylov projection type methods and they include such popular methods as Conjugate Gradients, Bi-Conjugate Gradients, LSQR and GMRES. We will show how these methods can be derived from simple basic iteration formulas. We will not give convergence proofs, but we will refer for these, as far as available, to the literature. Iterative methods are often used in combination with so-called preconditioning operators (approximations for the inverses of the operator of the system to be solved). Since these preconditioners are not essential in the derivation of these iterative methods, we will not discuss them explicitly in these notes. However, in most of the actual iteration schemes, we have included them in order to facilitate the use of these schemes in actual computations. For the application of the iterative schemes
Implementation Aspects
The inner products, vector updates and matrix-vector product are easily parallelized and vectorized. The more successful preconditionings, i.e., those based upon incomplete LU decomposition, are not easily parallelizable. For that reason one is often satisfied with the use of only diagonal scaling as a preconditioner on highly parallel computers, such as the CM2 [24]. On distributed memory computers we need large grained parallelism in order to reduce synchronization overhead. This can be achieved by combining the work required for a successive number of iteration steps. The idea is to construct first in parallel a straightforward Krylov basis for the search subspace in which an update for the current solution will be determined. Once this basis has been computed, the vectors are orthogonalized, as is done in Krylov subspace methods. The construction as well as the orthogonalization can be done with large grained parallelism, and has a sufficient degree of parallelism in it. This approach has be
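The two-phase idea described above (build a straightforward Krylov basis first, orthogonalize afterwards) can be sketched as follows. This is only a sequential toy under stated assumptions: the basis is the plain monomial one, v, Av, A^2 v, ..., which can become ill-conditioned for longer bases, and modified Gram-Schmidt is used as a stand-in because the abstract does not say which orthogonalization is meant. The names `krylov_basis` and `orthonormalize` are hypothetical.

```python
import math


def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def krylov_basis(A, v, s):
    """Phase 1: s matrix-vector products in a row, with no inner products
    (and hence no global synchronization) in between."""
    basis = [list(v)]
    for _ in range(s - 1):
        basis.append(matvec(A, basis[-1]))
    return basis


def orthonormalize(vectors):
    """Phase 2: modified Gram-Schmidt over the whole block of vectors.
    Assumes the vectors are linearly independent."""
    q = []
    for v in vectors:
        w = list(v)
        for u in q:
            h = dot(u, w)
            w = [w_i - h * u_i for w_i, u_i in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        q.append([w_i / norm for w_i in w])
    return q


A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
K = krylov_basis(A, [1.0, 0.0, 0.0], 3)
Q = orthonormalize(K)
```

Separating the phases means the inner products of the orthogonalization can be grouped together instead of being interleaved with every matrix-vector product, which is the source of the coarser granularity.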

