On Improving the Performance of Sparse Matrix-Vector Multiplication
 In Proceedings of the International Conference on High-Performance Computing
, 1997
We analyze single-node performance of sparse matrix-vector multiplication by investigating issues of data locality and fine-grained parallelism. We examine the data-locality characteristics of the compressed-sparse-row representation and consider improvements in locality through matrix permutation
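For readers unfamiliar with the compressed-sparse-row (CSR) representation named in this abstract, the following is a minimal sketch of a CSR matrix-vector product. It is not taken from the paper; the function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions. The indirect access `x[col_idx[k]]` is where the data-locality concerns typically arise.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Return y = A @ x for a matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # values and col_idx are read sequentially; x is read indirectly,
            # so its access pattern depends on the sparsity structure.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
y = csr_spmv(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

Permuting rows and columns changes the order of the entries in `col_idx`, which is one way the locality of the accesses to `x` can be influenced.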
Improving Memory-System Performance of Sparse Matrix-Vector Multiplication
 IBM Journal of Research and Development
, 1997
Sparse matrix-vector multiplication is an important kernel that often runs inefficiently on superscalar RISC processors. This paper describes techniques that increase instruction-level parallelism and improve performance. The techniques include reordering to reduce cache misses originally due to Das
Towards a Fast Parallel Sparse Matrix-Vector Multiplication
, 1999
that lead to a fast implementation. It is shown how these optimisations can be incorporated into an efficient parallel implementation using message-passing. We conduct numerical experiments on many different machines and show that our optimisations speed up the sparse matrix-vector multiplication
Sparse Matrix-Vector Multiplication on FPGAs
Floating-point Sparse Matrix-Vector Multiplication (SpMXV) is a key computational kernel in scientific and engineering applications. The poor data locality of sparse matrices significantly reduces the performance of SpMXV on general-purpose processors, which rely heavily on the cache hierarchy to achieve high performance. The abundant hardware resources
Efficient Multithreaded Untransposed, Transposed or Symmetric Sparse Matrix-Vector Multiplication with the Recursive Sparse Blocks Format
, 2014
In earlier work we have introduced the “Recursive Sparse Blocks” (RSB) sparse matrix storage scheme oriented towards cache-efficient matrix-vector multiplication (SpMV) and triangular solution (SpSV) on cache-based shared-memory parallel computers. Both the transposed (SpMV T) and symmetric (Sym
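To illustrate what distinguishes the transposed product from the untransposed one, here is a minimal sketch that computes y = Aᵀx directly from row-oriented storage without building the transpose. It uses plain CSR, not the RSB format the abstract describes, and the names `csr_spmv_transposed` and `n_cols` are assumptions made for this example.

```python
def csr_spmv_transposed(row_ptr, col_idx, values, x, n_cols):
    """Return y = A^T @ x for A in CSR form, without forming A^T (sketch)."""
    y = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):
        xi = x[i]
        # Each nonzero A[i, j] contributes A[i, j] * x[i] to y[j], so the
        # writes to y are scattered instead of the reads from x.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += values[k] * xi
    return y


# Hypothetical 2x3 example: A = [[1, 0, 2], [0, 3, 0]]
y_t = csr_spmv_transposed([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], [10.0, 100.0], 3)
```

Because different rows may update the same entry of `y`, a multithreaded version of this loop would need some form of synchronisation or partitioning of the output, which is one reason the transposed case is usually treated separately.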
A Two-Dimensional Data Distribution Method For Parallel Sparse Matrix-Vector Multiplication
 SIAM REVIEW
A new method is presented for distributing data in sparse matrix-vector multiplication. The method is two-dimensional, tries to minimise the true communication volume, and also tries to spread the computation and communication work evenly over the processors. The method starts with a recursive
Exact Sparse Matrix-Vector Multiplication on GPU’s and Multicore Architectures
 In Proceedings of the 4th International Workshop on Parallel and Symbolic Computation, PASCO ’10
, 2010
We propose different implementations of the sparse matrix–dense vector multiplication (SpMV) for finite fields and rings Z/mZ. We take advantage of graphic card processors (GPU) and multicore architectures. Our aim is to improve the speed of SpMV in the LinBox library, and henceforth the speed
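As a rough illustration of an exact SpMV over Z/mZ, the sketch below accumulates each row with integer arithmetic and reduces modulo m once per row. It is not the LinBox implementation; the function name `csr_spmv_mod` and the CSR array names are assumptions for this example.

```python
def csr_spmv_mod(row_ptr, col_idx, values, x, m):
    """Return y = A @ x over Z/mZ for an integer matrix A in CSR form (sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0] * n_rows
    for i in range(n_rows):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        # Python integers do not overflow, so a single reduction per row is
        # safe here. With fixed-width machine integers the reduction would
        # have to happen before the accumulator can overflow.
        y[i] = acc % m
    return y


# Hypothetical 2x3 example over Z/7Z: A = [[3, 0, 5], [0, 6, 0]]
y_mod = csr_spmv_mod([0, 2, 3], [0, 2, 1], [3, 5, 6], [4, 2, 6], 7)
```

Delaying the reduction in this way trades one modular operation per nonzero for one per row, at the cost of a wider accumulator.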
Encapsulating Multiple Communication-Cost Metrics in Partitioning Sparse Rectangular Matrices for Parallel Matrix-Vector Multiplies
This paper addresses the problem of one-dimensional partitioning of structurally unsymmetric-square and rectangular sparse matrices for parallel matrix-vector and matrix-transpose-vector multiplies. The objective is to minimize the communication cost while maintaining the balance on computational
IIMS Postgraduate Seminar 2009 Parallel sparse matrix algorithms for numerical computing Matrix-vector multiplication
, addition, scalar multiplication, transpose. This report discusses matrix-vector multiplication, one of the most important operations in scientific computing. Because it typically involves an enormous amount of computation, parallel techniques are needed to implement it
Efficient Multicore Sparse Matrix-Vector Multiplication for Finite Element Electromagnetics on the Cell-BE Processor
for efficient sparse matrix-vector multiplication on multicore systems, and show results for a set of finite element matrices that demonstrate its potential.
