Results 1 - 4 of 4
Optimizing the performance of sparse matrix-vector multiplication
, 2000
"... Copyright 2000 by EunJin Im ..."
Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries
 Modern Software Tools in Scientific Computing
, 1997
"... Parallel numerical software based on the messagepassing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses objectoriented programming to conceal the details of the messag ..."
Abstract

Cited by 33 (0 self)
Parallel numerical software based on the message-passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. In fact, the programming model used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.

1 Introduction
Currently the only general-purpose, efficient, scalable approach to programming distributed-memory parallel systems is the message-pass...
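The abstract above describes concealing the details of message passing behind an object-oriented interface without concealing the parallelism itself. As a rough illustration of that idea only (this is not PETSc's actual API; the class and method names here are hypothetical), a distributed vector object can own a contiguous slice of a global vector and expose purely local operations, leaving the communication step explicit:

```python
# Hypothetical sketch, not PETSc's API: an object that hides *how* a
# vector is partitioned across processes, while keeping the parallel
# layout visible to callers that need it.

class DistributedVector:
    """Owns a contiguous block of a global vector of length global_size,
    as one process would under a block distribution."""

    def __init__(self, global_size, rank, nprocs):
        chunk = (global_size + nprocs - 1) // nprocs
        self.lo = min(rank * chunk, global_size)      # first owned index
        self.hi = min(self.lo + chunk, global_size)   # one past last owned
        self.local = [0.0] * (self.hi - self.lo)

    def set_from(self, f):
        # Each process fills only the entries it owns -- no communication.
        for i in range(self.lo, self.hi):
            self.local[i - self.lo] = f(i)

    def local_dot(self, other):
        # A global dot product would combine these partial sums across
        # processes (e.g. with an all-reduce); that step is elided here.
        return sum(a * b for a, b in zip(self.local, other.local))

# Simulate two "processes" in one address space, for illustration only.
parts = [DistributedVector(10, rank, 2) for rank in range(2)]
for v in parts:
    v.set_from(lambda i: float(i))
total = sum(v.local_dot(v) for v in parts)
print(total)  # sum of i*i for i in 0..9 -> 285.0
```

The point of the sketch is the division of labor the abstract describes: element access and local arithmetic are hidden inside the object, while the distribution (lo/hi ownership) and the one collective combining step remain visible, so the parallelism is not concealed.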
Model-Based Memory Hierarchy Optimizations for Sparse Matrices
 In Workshop on Profile and Feedback-Directed Compilation
, 1998
"... Sparse matrixvector multiplication is an important computational kernel used in numerical algorithms. It tends to run much more slowly than its dense counterpart, and its performance depends heavily on both the nonzero structure of the sparse matrix and on the machine architecture. In this paper we ..."
Abstract

Cited by 8 (1 self)
Sparse matrix-vector multiplication is an important computational kernel used in numerical algorithms. It tends to run much more slowly than its dense counterpart, and its performance depends heavily on both the nonzero structure of the sparse matrix and on the machine architecture. In this paper we address the problem of optimizing sparse matrix-vector multiplication for the memory hierarchies that exist on modern machines, and how machine-specific or matrix-specific profiling information can be used to decide which optimizations should be applied and what parameters should be used. We also consider a variation of the problem in which a matrix is multiplied by a set of vectors. Performance is measured on a 167 MHz UltraSPARC I, 200 MHz Pentium Pro, and 450 MHz DEC Alpha 21164. Experiments show these optimization techniques to have significant payoff, although the effectiveness of each depends on the matrix structure and machine.

1 Introduction
Matrix-vector multiplication is an importa...
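To make concrete the kernel this abstract is about, here is a minimal sketch of sparse matrix-vector multiplication y = A*x with A stored in compressed sparse row (CSR) form, one common layout for such kernels (the paper does not specify its storage format). The indirect access x[col_idx[j]] is exactly the irregular memory traffic that makes the memory-hierarchy optimizations discussed above matter:

```python
# Minimal CSR sparse matrix-vector multiply: y = A @ x.
# values  holds the nonzeros row by row,
# col_idx holds each nonzero's column,
# row_ptr[i]..row_ptr[i+1] delimits row i's nonzeros.

def spmv_csr(values, col_idx, row_ptr, x):
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[j] * x[col_idx[j]]  # indirect load of x
        y[i] = acc
    return y

# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
values  = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [5.0, 2.0, 8.0]
```

Unlike the dense kernel, the loads of x follow the column pattern of the nonzeros rather than a fixed stride, which is why performance depends on both the matrix's nonzero structure and the machine's cache hierarchy, as the abstract notes.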