Results 1–10 of 549
Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply
In Proceedings of Supercomputing, 2002
"... We consider performance tuning, by code and data structure reorganization, of sparse matrix-vector multiply (SpMV), one of the most important computational kernels in scientific applications. This paper addresses the fundamental questions of what limits exist on such performance tuning, and how ..."
Cited by 57 (10 self)
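The CSR-format SpMV kernel that this line of tuning work targets can be sketched as follows (a minimal pure-Python illustration; the function and variable names are my own, and production code would use a tuned library rather than this loop nest):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a sparse matrix A stored in compressed sparse row (CSR) format.

    values  -- nonzero entries of A, listed row by row
    col_idx -- column index of each nonzero in `values`
    row_ptr -- row_ptr[i]:row_ptr[i+1] delimits row i's slice of `values`
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y

# 3x3 example matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0]))  # [3.0, 3.0, 9.0]
```

The indirect access `x[col_idx[k]]` is what makes SpMV memory-bound and is the target of the reorganization techniques these papers study.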
Benchmarking Sparse Matrix-Vector Multiply
2006
"... We present a benchmark for evaluating the performance of sparse matrix-dense vector multiply (abbreviated as SpMV) on scalar uniprocessor machines. Though SpMV is an important kernel in scientific computation, there are currently no adequate benchmarks for measuring its performance across ..."
Cited by 8 (2 self)
A library for parallel sparse matrix-vector multiplies
2005
"... We provide parallel matrix-vector multiply routines for 1D and 2D partitioned sparse square and rectangular matrices. We clearly give pseudocodes that perform necessary initializations for parallel execution. We show how to maximize overlapping between communication and computation through the pro ..."
Cited by 7 (6 self)
An asynchronous matrix-vector multiplier for discrete cosine transform
In Proceedings of the 2000 International Symposium on Low Power Electronics and Design (ISLPED'00), 2000
"... This paper proposes an efficient asynchronous hardwired matrix-vector multiplier for the two-dimensional discrete cosine transform and inverse discrete cosine transform (DCT/IDCT). The design achieves low power and high performance by taking advantage of the typically large fraction of zero and smal ..."
Cited by 6 (2 self)
Reconfigurable Sparse/Dense Matrix-Vector Multiplier
"... We propose an ANSI/IEEE 754 double precision floating-point matrix-vector multiplier. Its main feature is the capability to process efficiently both Dense Matrix-Vector Multiplications (DMVM) and Sparse Matrix-Vector Multiplications (SMVM). The design is composed of multiple processing elements (PE ..."
Cited by 2 (0 self)
Gaussian Processes and Fast Matrix-Vector Multiplies
"... Gaussian processes (GPs) provide a flexible framework for probabilistic regression. The necessary computations involve standard matrix operations. There have been several attempts to accelerate these operations based on fast kernel matrix-vector multiplications. By focussing on the simplest GP compu ..."
Cited by 8 (1 self)
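The GP computations alluded to here reduce to solving linear systems with the kernel matrix, and iterative solvers such as conjugate gradients need the matrix only through matrix-vector products, which is why fast MVM routines help. A minimal sketch (the function names and the toy 2x2 kernel matrix are illustrative assumptions, not from the paper):

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A, accessing A
    only through the matrix-vector product supplied by `matvec`."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x  (x starts at 0)
    p = list(r)          # search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Toy SPD "kernel matrix", used only through its matvec:
K = [[4.0, 1.0], [1.0, 3.0]]
matvec = lambda v: [sum(K[i][j] * v[j] for j in range(2)) for i in range(2)]
x = conjugate_gradient(matvec, [1.0, 2.0])  # exact solution is [1/11, 7/11]
```

Any fast MVM scheme (structured kernels, tree codes, and so on) can be dropped in as `matvec` without changing the solver.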
Parallel Sparse Matrix-Vector Multiply Software for Matrices with Data Locality
1995
"... In this paper we describe general software utilities for performing unstructured sparse matrix-vector multiplications on distributed-memory message-passing computers. The matrix-vector multiply comprises an important kernel in the solution of large sparse linear systems by iterative methods. Our foc ..."
Cited by 27 (3 self)
Reconfigurable Fixed-Point Dense and Sparse Matrix-Vector Multiply/Add Unit
In Proceedings of the IEEE International Conference on Application-Specific Systems, Architectures, and Processors (ASAP'06), 2006
"... In this paper, we propose a reconfigurable hardware accelerator for fixed-point matrix-vector multiply/add operations, capable of working on dense and sparse matrix formats. The prototyped hardware unit accommodates 4 dense or sparse matrix inputs and performs computations in a space-parallel design ..."
Cited by 4 (1 self)
A BENCHMARK FOR REGISTER-BLOCKED SPARSE MATRIX-VECTOR MULTIPLY
"... We develop a sparse matrix-vector multiply (SMVM) benchmark for block compressed sparse row (BSR) matrices. These occur frequently in linear systems generated by the finite element method (FEM), for example, and are naturally suited for register-blocking optimizations. Unlike current SMVM ..."
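A register-blocked BSR SpMV of the kind this benchmark measures can be sketched as follows (an illustrative pure-Python version with names of my own choosing; real register blocking relies on the compiler keeping each r x c block's accumulators in registers):

```python
def spmv_bsr(block_values, col_idx, row_ptr, x, r=2, c=2):
    """y = A @ x for A in block compressed sparse row (BSR) format.

    block_values -- one list of r*c entries (row-major) per nonzero block
    col_idx      -- block-column index of each nonzero block
    row_ptr      -- row_ptr[bi]:row_ptr[bi+1] delimits block row bi's blocks
    """
    n_block_rows = len(row_ptr) - 1
    y = [0.0] * (n_block_rows * r)
    for bi in range(n_block_rows):
        for k in range(row_ptr[bi], row_ptr[bi + 1]):
            bj = col_idx[k]
            block = block_values[k]
            # Dense r x c multiply: the inner loops are fully unrollable,
            # which is the point of register blocking.
            for ii in range(r):
                acc = 0.0
                for jj in range(c):
                    acc += block[ii * c + jj] * x[bj * c + jj]
                y[bi * r + ii] += acc
    return y

# 4x4 block-diagonal example with two 2x2 blocks:
# [[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 5, 6], [0, 0, 7, 8]]
blocks = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
print(spmv_bsr(blocks, [0, 1], [0, 1, 2], [1.0, 1.0, 1.0, 1.0]))  # [3.0, 7.0, 11.0, 15.0]
```

Compared with CSR, BSR stores one column index per block rather than per nonzero and exposes small dense multiplies, at the cost of explicitly stored zeros when blocks are not full.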