Results 1–10 of 12
A BENCHMARK FOR REGISTER-BLOCKED SPARSE MATRIX-VECTOR MULTIPLY
"... Abstract. We develop a sparse matrix-vector multiply (SMVM) benchmark for block compressed sparse row (BSR) matrices. These occur frequently in linear systems generated by the finite element method (FEM), for example, and are naturally suited for register blocking optimizations. Unlike current SMVM ..."
Abstract
performance on each architecture tested. Our randomly generated test cases successfully predict SMVM performance for FEM and similar matrices encountered in practice. To demonstrate the applicability of the benchmark, we use it to evaluate the effectiveness of the SSE2 SIMD floating-point instruction set.
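For orientation, the register-blocking idea behind the BSR format described above can be sketched as follows. This is an illustrative reference kernel, not code from the paper; it assumes a standard BSR layout with CSR-style block-row pointers (`indptr`, `indices`) and row-major dense blocks, all names being assumptions for the sketch:

```python
def bsr_matvec(br, bc, indptr, indices, blocks, x):
    """y = A @ x for a BSR matrix with br x bc dense blocks.

    indptr/indices index block rows and block columns (CSR over blocks);
    blocks[k] is the k-th br x bc dense block, stored row-major.
    """
    n_brows = len(indptr) - 1
    y = [0.0] * (n_brows * br)
    for ib in range(n_brows):                     # block row
        for k in range(indptr[ib], indptr[ib + 1]):
            jb = indices[k]                       # block column
            blk = blocks[k]
            # The two fixed-trip-count inner loops over the dense block are
            # what register blocking unrolls and keeps in registers.
            for r in range(br):
                for c in range(bc):
                    y[ib * br + r] += blk[r * bc + c] * x[jb * bc + c]
    return y
```

Because `br` and `bc` are compile-time constants in a tuned kernel, the inner block multiply can be fully unrolled, which is the source of the register-blocking speedup the benchmark measures.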
Reconfigurable Sparse/Dense Matrix-Vector Multiplier
"... We propose an ANSI/IEEE-754 double precision floating-point matrix-vector multiplier. Its main feature is the capability to process efficiently both Dense Matrix-Vector Multiplications (DMVM) and Sparse Matrix-Vector Multiplications (SMVM). The design is composed of multiple processing elements (PE ..."
Abstract

Cited by 2 (0 self)
(PE) and is optimized for FPGAs. We investigate theoretically the boundary conditions when the DMVM equals the SMVM performance with respect to the matrix sparsity. Thus, we can determine the most efficient processing mode configuration with respect to the input data sparsity. Furthermore, we evaluate
Block Based Compression Storage Expected Performance
 In Proceedings of HPCS2000, Victoria
, 2000
"... In this paper we present some preliminary performance evaluations of the Block Based Compression Storage (BBCS) scheme, that consists of a sparse matrix representation format and an associated Vector Processor (VP) architectural extension, designed to alleviate the performance degradation experience ..."
Abstract

Cited by 8 (5 self)
. Subsequently, we consider a set of benchmark matrices and report some preliminary performance evaluations by comparing the BBCS scheme with the Jagged Diagonal (JD) scheme. The simulations of the SMVM algorithm core execution indicate the BBCS scheme always performs better than the JD scheme for large VP
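For context on the comparison baseline, the Jagged Diagonal (JD) scheme mentioned above can be sketched roughly as follows. This is an illustrative Python sketch with assumed array names, not the paper's implementation: rows are permuted by decreasing nonzero count, and the d-th "jagged diagonal" packs the d-th nonzero of every row long enough to have one, giving long unit-stride inner loops suited to a vector processor:

```python
def jd_matvec(vals, cols, jd_ptr, perm, x, n_rows):
    """y = A @ x in Jagged Diagonal form (illustrative sketch).

    perm[i] is the original index of the i-th row after sorting rows by
    decreasing nonzero count; jd_ptr delimits each jagged diagonal within
    the flat vals/cols arrays.
    """
    y = [0.0] * n_rows
    for d in range(len(jd_ptr) - 1):
        start, end = jd_ptr[d], jd_ptr[d + 1]
        # This inner loop runs over many rows at once with unit stride,
        # which is what makes JD attractive on vector processors.
        for i in range(end - start):
            y[perm[i]] += vals[start + i] * x[cols[start + i]]
    return y
```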
Sparse matrix-vector multiplication for finite element method matrices on FPGAs
 Field-Programmable Custom Computing Machines, Annual IEEE Symposium on
, 2006
"... We present an architecture and an implementation of an FPGAbased sparse matrixvector multiplier (SMVM) for use in the iterative solution of large, sparse systems of equations arising from Finite Element Method (FEM) applications. The architecture is based on a pipelined linear array of processi ..."
Abstract

Cited by 6 (1 self)
of processing elements (PEs). A hardware-oriented matrix “striping” scheme is developed which reduces the number of required processing elements. Our current 8 PE prototype achieves a peak performance of 1.76 GFLOPS and a sustained performance of 1.5 GFLOPS with 8 GB/s of memory bandwidth. The SMVM
General Terms
"... Large, high density FPGAs with high local distributed memory bandwidth surpass the peak floatingpoint performance of highend, generalpurpose processors. Microprocessors do not deliver near their peak floatingpoint performance on efficient algorithms that use the Sparse MatrixVector Multiply (SM ..."
Abstract
(SMVM) kernel. In fact, it is not uncommon for microprocessors to yield only 10–20% of their peak floating-point performance when computing SMVM. We develop and analyze a scalable SMVM implementation on modern FPGAs and show that it can sustain high-throughput, near-peak floating-point performance
BBCS Based Sparse Matrix-Vector Multiplication: Initial Evaluation
 16th IMACS World Congress on Scientific Computation, Applied Mathematics and Simulation
, 2000
"... This paper presents an evaluation of the BBCS scheme meant to alleviate the performance degradation experienced byVector Processors (VPs) when manipulating sparse matrices. In particular we address the execution of Sparse Matrix Vector Multiplication (SMVM) algorithms on VPs. First weintroduce a B ..."
Abstract

Cited by 5 (3 self)
Direct and Transposed Sparse Matrix-Vector
 in Proceedings of the 2002 Euromicro conference on Massively-parallel computing systems, MPCS-2002
, 2002
"... In this paper we investigate the execution of Ab and A^T b, where A is a sparse matrix and b a dense vector, using the Blocked Based Compression Storage (BBCS) scheme and an Augmented Vector Architecture (AVA). In particular, we demonstrate that by using the BBCS format, we can represent both the di ..."
Abstract

Cited by 1 (1 self)
the direct and the transposed matrix for the purposes of matrixvector multiplication with no additional costs in storage, access time and computation performance. To achieve this, we propose a new instruction and a hardware modification for the AVA. Subsequently we evaluate the performance of the transposed
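The idea of serving both the direct and the transposed product from one stored representation can be illustrated with plain CSR. This is a sketch under assumed naming, not the paper's BBCS/AVA mechanism: the transposed product reuses exactly the same three arrays, switching the inner loop from a gather on `x` to a scatter on `y`:

```python
def csr_matvec(values, col_idx, row_ptr, b):
    """y = A @ b: gather b along each stored row."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * b[col_idx[k]]
    return y

def csr_matvec_transposed(values, col_idx, row_ptr, b, n_cols):
    """y = A^T @ b: identical arrays, but scatter into y instead."""
    y = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[col_idx[k]] += values[k] * b[i]
    return y
```

The storage cost is unchanged; the trade-off moves to the hardware side, since the scatter in the transposed kernel creates write dependences that the gather-based direct kernel does not have.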
A Hardware Accelerator for the OpenFOAM Sparse Matrix-Vector Product
, 2009
"... One of the key kernels in scientific applications is the Sparse Matrix Vector Multiplication (SMVM). Profiling OpenFOAM, a sophisticated scientific Computational Fluid Dynamics tool, proved the SMVM to be its most computational intensive kernel. A traditional way to solve such computationally intens ..."
Abstract

Cited by 1 (0 self)
Row-interleaved streaming data flow implementation of Sparse Matrix-Vector Multiplication in FPGA
"... Abstract. Sparse Matrix-Vector Multiplication (SMVM) is the critical computational kernel of many iterative solvers for systems of sparse linear equations. In this paper we propose an FPGA design for SMVM which interleaves CRS (Compressed Row Storage) format so that just a single floating-point accu ..."
Abstract
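For reference, the baseline CRS (CSR) kernel that such FPGA designs accelerate can be sketched as below. This is a generic illustration, not the paper's interleaved design; the array names follow the usual CRS convention of a flat value array, matching column indices, and row pointers:

```python
def crs_matvec(values, col_idx, row_ptr, x):
    """y = A @ x in CRS form.

    values[k] is the k-th stored nonzero, col_idx[k] its column,
    and row_ptr[i]:row_ptr[i+1] spans the nonzeros of row i.
    """
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        s = 0.0                                  # per-row accumulator
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y
```

The per-row reduction into `s` is the hard part in hardware: a deeply pipelined floating-point adder introduces a loop-carried dependence, which is what motivates row-interleaving schemes that keep the accumulator pipeline full with independent rows.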
Vector ISA Extension for Sparse Matrix-Vector Multiplication
"... In this paper we introduce a vector ISA extension to facilitate sparse matrix manipulation on vector processors (VPs). First we introduce a new Block Based Compressed Storage (BBCS) format for sparse matrix representation and a Blockwise Sparse Matrix-Vector Multiplication approach. Additionally, ..."
Abstract
, we propose two vector instructions, Multiple Inner Product and Accumulate (MIPA) and LoaD Section (LDS), specially tuned to increase the VP performance when executing sparse matrix-vector multiplications. In many areas of scientific computing the manipulation of sparse matrices