Results 1–10 of 19
Automatic preconditioning by limited memory quasi-Newton updating
SIAM J. Optim., 1999
Abstract

Cited by 44 (2 self)
The paper proposes a preconditioner for the conjugate gradient method (CG) that is designed for solving systems of equations Ax = bi with different right-hand-side vectors, or for solving a sequence of slowly varying systems Akx = bk. The preconditioner has the form of a limited memory quasi-Newton matrix and is generated using information from the CG iteration. The automatic preconditioner does not require explicit knowledge of the coefficient matrix A and is therefore suitable for problems where only products of A times a vector can be computed. Numerical experiments indicate that the preconditioner has most to offer when these matrix-vector products are expensive to compute, and when low accuracy in the solution is required. The effectiveness of the preconditioner is tested within a Hessian-free Newton method for optimization, and by solving certain linear systems arising in finite element models.
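The idea this abstract describes can be illustrated with a short sketch (this is not the paper's exact algorithm; all names and the memory size m = 5 below are illustrative assumptions): a CG run harvests step/step-image pairs (s, y) with y = As, and an L-BFGS-style two-loop recursion built from those pairs preconditions the next, nearby system.

```python
import numpy as np

def lbfgs_apply(pairs, r):
    """Apply a limited-memory quasi-Newton inverse approximation
    (standard L-BFGS two-loop recursion) to a residual vector r.
    With no pairs stored, this is the identity preconditioner."""
    q = r.copy()
    alphas = []
    for s, y, rho in reversed(pairs):          # newest to oldest
        a = rho * s.dot(q)
        alphas.append(a)
        q -= a * y
    if pairs:
        s, y, _ = pairs[-1]
        q *= s.dot(y) / y.dot(y)               # initial scaling H0 = gamma*I
    for (s, y, rho), a in zip(pairs, reversed(alphas)):  # oldest to newest
        b = rho * y.dot(q)
        q += (a - b) * s
    return q

def pcg(matvec, b, pairs, tol=1e-8, maxiter=200):
    """Preconditioned CG that also harvests (s, y) pairs from its own
    iteration for preconditioning the next, slowly varying system.
    Inside CG, s = alpha*p and y = alpha*A*p satisfy y = A s exactly,
    so no extra matrix-vector products are needed."""
    x = np.zeros_like(b)
    r = b.copy()
    z = lbfgs_apply(pairs, r)
    p = z.copy()
    new_pairs = []
    for _ in range(maxiter):
        Ap = matvec(p)
        alpha = r.dot(z) / p.dot(Ap)
        s, y = alpha * p, alpha * Ap
        new_pairs.append((s, y, 1.0 / y.dot(s)))
        x += s
        r_new = r - y
        if np.linalg.norm(r_new) < tol:
            break
        z_new = lbfgs_apply(pairs, r_new)
        beta = r_new.dot(z_new) / r.dot(z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x, new_pairs[-5:]                   # keep a small memory, e.g. m = 5
```

The second call to `pcg` can then be fed the pairs returned by the first, mimicking the reuse across right-hand sides that the abstract describes.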
Lanczos-type solvers for nonsymmetric linear systems of equations
Acta Numer., 1997
Abstract

Cited by 43 (11 self)
Among the iterative methods for solving large linear systems with a sparse (or, possibly, structured) nonsymmetric matrix, those that are based on the Lanczos process feature short recurrences for the generation of the Krylov space. This means low cost and low memory requirement. This review article introduces the reader not only to the basic forms of the Lanczos process and some of the related theory, but also describes in detail a number of solvers that are based on it, including those that are considered to be the most efficient ones. Possible breakdowns of the algorithms and ways to cure them by look-ahead are also discussed.
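The short recurrences the abstract refers to can be seen in a minimal sketch of the two-sided (nonsymmetric) Lanczos process (a textbook variant, not taken from the article; the normalization choice below is one of several): biorthogonal bases for K_m(A, v0) and K_m(A^T, w0) are built with only three-term recurrences, so each step touches just the two previous vectors.

```python
import numpy as np

def bi_lanczos(A, v0, w0, m):
    """Two-sided Lanczos: builds biorthogonal bases V, W with
    three-term (short) recurrences. The scalars alpha, beta, gamma
    form the tridiagonal projection of A. Breaks on beta = 0
    (a lucky or serious breakdown; look-ahead would cure the latter)."""
    n = v0.size
    V, W = np.zeros((n, m)), np.zeros((n, m))
    alpha, beta, gamma = np.zeros(m), np.zeros(m), np.zeros(m)
    v = v0 / np.linalg.norm(v0)
    w = w0 / w0.dot(v)                    # normalize so that w^T v = 1
    v_prev, w_prev = np.zeros(n), np.zeros(n)
    for j in range(m):
        V[:, j], W[:, j] = v, w
        av = A @ v
        alpha[j] = w.dot(av)
        # short recurrences: only the current and previous vectors appear
        vt = av - alpha[j] * v - (gamma[j - 1] if j else 0.0) * v_prev
        wt = A.T @ w - alpha[j] * w - (beta[j - 1] if j else 0.0) * w_prev
        beta[j] = np.linalg.norm(vt)      # one common normalization choice
        if beta[j] == 0.0:
            break                         # breakdown
        gamma[j] = wt.dot(vt) / beta[j]   # keeps w_{j+1}^T v_{j+1} = 1
        v_prev, w_prev = v, w
        v, w = vt / beta[j], wt / gamma[j]
    return V, W, alpha, beta, gamma
```

In exact arithmetic W^T V = I, which is what lets solvers such as BiCG and QMR run with O(n) memory per step.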
Data distribution schemes of sparse arrays on distributed memory multicomputers
in International Conference on Parallel Processing Workshops, 2002
Abstract

Cited by 4 (2 self)
A data distribution scheme for sparse arrays on a distributed memory multicomputer is, in general, composed of three phases: data partition, data distribution, and data compression. Methods proposed in the literature implement the scheme by performing the data partition phase first, then the data distribution phase, followed by the data compression phase. We call this scheme the Send Followed Compress (SFC) scheme. In this paper, we propose two other data distribution schemes, Compress Followed Send (CFS) and Encoding-Decoding (ED), for sparse array distribution. In the CFS scheme, the data compression phase is performed before the data distribution phase. In the ED scheme, the data compression phase is divided into two steps, encoding and decoding; the encoding step and the decoding step are performed before and after the data distribution phase, respectively. To evaluate the CFS and ED schemes, we compare them with the SFC scheme through both theoretical analysis and experimental tests. In the theoretical analysis, we analyze the SFC, CFS, and ED schemes in terms of the data distribution time and the data compression time. In the experimental tests, we implemented these schemes on an IBM SP2 parallel machine. The experimental results show that, for most test cases, the CFS and ED schemes outperform the SFC scheme, and the ED scheme outperforms the CFS scheme for all test cases.
CRPC Research into Linear Algebra Software for High Performance Computers
1994
Abstract

Cited by 4 (2 self)
In this paper we look at a number of approaches being investigated in the Center for Research on Parallel Computation (CRPC) to develop linear algebra software for high-performance computers. These approaches are exemplified by the LAPACK, templates, and ARPACK projects. LAPACK is a software library for performing dense and banded linear algebra computations, and was designed to run efficiently on high-performance computers. We focus on the design of the distributed memory version of LAPACK, and on an object-oriented interface to LAPACK. The templates project aims at making the task of developing sparse linear algebra software simpler and easier. Reusable software templates are provided that the user can then customize to modify and optimize a particular algorithm, and hence build more complex applications. ARPACK is a software package for solving large-scale eigenvalue problems, and is based on an implicitly restarted variant of the Arnoldi scheme. The paper focuses on issues impact...
Extending a hierarchical tiling arrays library to support sparse data partitioning
The Journal of Supercomputing, 2012
Abstract

Cited by 4 (3 self)
 Add to MetaCart
(Show Context)
Layout methods for dense and sparse data are often seen as two separate problems with their own particular techniques. However, they are based on the same basic concepts. This paper studies how to integrate automatic data-layout and partition techniques for both dense and sparse data structures. In particular, we show how to include support for sparse matrices or graphs in Hitmap, a library for hierarchical tiling and automatic mapping of arrays. The paper shows that it is possible to offer a single interface for working with both dense and sparse data structures. Thus, the programmer can use a single, homogeneous programming style, reducing the development effort and simplifying the use of sparse data structures in parallel computations. Our experimental evaluation shows that this integration of techniques can be done effectively without compromising performance.
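The "single interface over dense and sparse storage" idea can be sketched as follows (a toy illustration only; Hitmap itself is a C library, and the class and method names here are invented for the example): two tile types with different storage share one traversal interface, so algorithm code is written once.

```python
import numpy as np

class DenseTile:
    """Dense storage: a plain 2-D array."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
    def nonzeros(self):
        for i, j in zip(*np.nonzero(self.data)):
            yield (int(i), int(j), self.data[i, j])

class SparseTile:
    """Sparse storage: dictionary of (row, col) -> value keys.
    Same traversal interface as DenseTile."""
    def __init__(self, shape, entries):
        self.shape, self.entries = shape, dict(entries)
    def nonzeros(self):
        for (i, j), v in self.entries.items():
            yield (i, j, v)

def row_sums(tile, nrows):
    """Algorithm code written once against the common interface --
    the homogeneous programming style the abstract describes."""
    out = np.zeros(nrows)
    for i, _, v in tile.nonzeros():
        out[i] += v
    return out
```

The same `row_sums` runs unchanged on either tile, which is the development-effort saving the paper measures at library scale.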
Analyzing Data Structures for Parallel Sparse Direct Solvers: Pivoting and Fill-In
Proceedings of the Sixth Workshop on Compilers for Parallel Computers (CPC'96), special issue of the volume Konferenzen des Forschungszentrums Jülich, Vol. 21, 1996
Abstract

Cited by 2 (2 self)
This paper addresses the problem of parallelizing sparse direct methods for the solution of linear systems on distributed memory multiprocessors. Sparse direct solvers include pivoting operations and suffer from fill-in, problems that turn efficient parallelization into a challenging task. We present data structures for storing the sparse matrices that make it possible to deal with both problems efficiently. These data structures have been evaluated on a Cray T3D, implementing, in particular, LU and QR factorizations as examples of direct solvers. Any of the data representations considered requires the handling of indirections for data accesses, pointer referencing, and dynamic data creation. All of ...
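Why fill-in forces dynamic data structures can be shown with a small sketch (a generic symbolic-elimination illustration, not the paper's data structures): eliminating a variable connects all of its remaining neighbours, so new nonzeros appear in positions the original pattern never had, and their number depends on the elimination order.

```python
def fillin(adj, order):
    """Count the fill-in produced by Gaussian elimination on a sparse
    symmetric pattern given as an adjacency dict {node: set_of_nodes}.
    Eliminating node k connects all of k's not-yet-eliminated
    neighbours; each edge created this way is one fill entry that a
    static storage format could not have preallocated."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    fill = 0
    eliminated = set()
    for k in order:
        nbrs = [v for v in adj[k] if v not in eliminated]
        for a in nbrs:
            for b in nbrs:
                if a < b and b not in adj[a]:
                    adj[a].add(b)
                    adj[b].add(a)
                    fill += 1
        eliminated.add(k)
    return fill
```

On a star graph, eliminating the hub first creates a fill entry for every pair of leaves, while eliminating the leaves first creates none, which is exactly why ordering and dynamic storage dominate sparse direct solver design.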
Serial and Parallel Krylov Methods for Implicit Finite Difference Schemes Arising in Multivariate Option Pricing
2001
Abstract

Cited by 1 (0 self)
This paper investigates computational and implementation issues for the valuation of options on three underlying assets, focusing on the use of finite difference methods. We demonstrate that implicit methods, which have good convergence and stability properties, can now be implemented efficiently thanks to the recent development of techniques that allow the efficient solution of large, sparse linear systems. In the trivariate option valuation problem, we use nonstationary iterative methods (also called Krylov methods) to solve the large, sparse linear systems arising from the implicit methods. Krylov methods are investigated in both serial and parallel implementations. Computational results show that the parallel implementation is particularly efficient if a fine spatial grid is needed.
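The pattern the abstract describes — an implicit time step reduced to a large sparse solve handled matrix-free by a Krylov method — can be sketched on a 1-D model problem (an illustration only: the trivariate pricing operator is nonsymmetric and would need BiCGSTAB or GMRES; here a symmetric heat-equation step is used so that plain CG applies, and the grid parameters are arbitrary).

```python
import numpy as np

def apply_A(u, dt, dx):
    """Matrix-free application of A = I - dt*L for one implicit Euler
    step of u_t = u_xx with zero Dirichlet boundaries. A is SPD and
    tridiagonal, but never formed explicitly -- only its action."""
    r = dt / dx**2
    out = (1 + 2 * r) * u
    out[:-1] -= r * u[1:]     # superdiagonal contribution
    out[1:] -= r * u[:-1]     # subdiagonal contribution
    return out

def cg(matvec, b, tol=1e-10, maxiter=500):
    """Plain conjugate gradients on the matrix-free operator."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r.dot(r)
    for _ in range(maxiter):
        Ap = matvec(p)
        a = rr / p.dot(Ap)
        x += a * p
        r -= a * Ap
        rr_new = r.dot(r)
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# one implicit time step: solve (I - dt*L) u_new = u_old
n, dt, dx = 50, 0.01, 0.1
u_old = np.sin(np.linspace(0, np.pi, n))
u_new = cg(lambda u: apply_A(u, dt, dx), u_old)
```

Because the solver needs only `matvec`, the same structure scales to the three-dimensional pricing grids the paper targets, where forming the matrix would be prohibitive.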
Exploring the performance limits of simultaneous multithreading for memory intensive applications