Results 1-10 of 78
The University of Florida sparse matrix collection
NA DIGEST, 1997
Cited by 305 (15 self)
The University of Florida Sparse Matrix Collection is a large, widely available, and actively growing set of sparse matrices that arise in real applications. Its matrices cover a wide spectrum of problem domains, both those arising from problems with underlying 2D or 3D geometry (structural engineering, computational fluid dynamics, model reduction, electromagnetics, semiconductor devices, thermodynamics, materials, acoustics, computer graphics/vision, robotics/kinematics, and other discretizations) and those that typically do not have such geometry (optimization, circuit simulation, networks and graphs, economic and financial modeling, theoretical and quantum chemistry, chemical process simulation, mathematics and statistics, and power networks). The collection meets a vital need that artificially generated matrices cannot meet, and is widely used by the sparse matrix algorithms community for the development and performance evaluation of sparse matrix algorithms. The collection includes software for accessing and managing the collection, from MATLAB, Fortran, and C.
A Two-Dimensional Data Distribution Method For Parallel Sparse Matrix-Vector Multiplication
SIAM REVIEW
Cited by 67 (8 self)
A new method is presented for distributing data in sparse matrix-vector multiplication. The method is two-dimensional, tries to minimise the true communication volume, and also tries to spread the computation and communication work evenly over the processors. The method starts with a recursive bipartitioning of the sparse matrix, each time splitting a rectangular matrix into two parts with a nearly equal number of nonzeros. The communication volume caused by the split is minimised. After the matrix partitioning, the input and output vectors are partitioned with the objective of minimising the maximum communication volume per processor. Experimental results of our implementation, Mondriaan, for a set of sparse test matrices show a reduction in communication compared to one-dimensional methods, and in general a good balance in the communication work.
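The quantities named in this abstract can be illustrated with a small sketch. It assumes one common way of counting communication volume (a row or column whose nonzeros are spread over k parts costs k - 1 words) and uses a naive column-wise split as a stand-in; it is not the partitioner implemented in Mondriaan, and the function names `communication_volume` and `split_by_columns` are hypothetical.

```python
from collections import defaultdict


def communication_volume(nonzeros, part_of):
    """Count communication for an assignment of nonzeros to parts.

    nonzeros: list of (row, col) positions.
    part_of:  list of part ids, one per nonzero.
    Assumption: a row or column touched by k parts costs k - 1 words.
    """
    row_parts = defaultdict(set)
    col_parts = defaultdict(set)
    for (i, j), p in zip(nonzeros, part_of):
        row_parts[i].add(p)
        col_parts[j].add(p)
    return (sum(len(s) - 1 for s in row_parts.values())
            + sum(len(s) - 1 for s in col_parts.values()))


def split_by_columns(nonzeros):
    """Naive bipartition: whole columns go to part 0 until it holds
    about half of the nonzeros; the remaining columns go to part 1."""
    counts = defaultdict(int)
    for _, j in nonzeros:
        counts[j] += 1
    half = len(nonzeros) / 2
    part0_cols, load = set(), 0
    for j in sorted(counts):
        if load < half:
            part0_cols.add(j)
            load += counts[j]
    return [0 if j in part0_cols else 1 for _, j in nonzeros]


# A 4x4 example with 8 nonzeros (diagonal plus a cyclic off-diagonal).
nonzeros = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 0)]
parts = split_by_columns(nonzeros)
volume = communication_volume(nonzeros, parts)
```

In this example the split is balanced (four nonzeros per part), and the two rows that hold nonzeros from both parts account for the whole volume. A real partitioner would search for the split that makes this count as small as possible.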
Performance Optimizations and Bounds for Sparse Matrix-Vector Multiply
In Proceedings of Supercomputing, 2002
Cited by 51 (9 self)
We consider performance tuning, by code and data structure reorganization, of sparse matrix-vector multiply (SpMV), one of the most important computational kernels in scientific applications. This paper addresses the fundamental questions of what limits exist on such performance tuning, and how closely tuned code approaches these limits.
Recycling Krylov Subspaces for Sequences of Linear Systems
SIAM J. Sci. Comput., 2004
Cited by 46 (3 self)
Many problems in engineering and physics require the solution of a large sequence of linear systems. We can reduce the cost of solving subsequent systems in the sequence by recycling information from previous systems. We consider two different approaches. For several model problems, we demonstrate that we can reduce the iteration count required to solve a linear system by a factor of two. We consider both Hermitian and non-Hermitian problems, and present numerical experiments to illustrate the effects of subspace recycling.
IPSep-CoLa: An incremental procedure for separation constraint layout of graphs
IEEE TRANSACTIONS ON VISUALISATION AND COMPUTER GRAPHICS, 2006
Cited by 28 (13 self)
We extend the popular force-directed approach to network (or graph) layout to allow separation constraints, which enforce a minimum horizontal or vertical separation between selected pairs of nodes. This simple class of linear constraints is expressive enough to satisfy a wide variety of application-specific layout requirements, including: layout of directed graphs to better show flow; layout with non-overlapping node labels; and layout of graphs with grouped nodes (called clusters). In the stress majorization force-directed layout process, separation constraints can be treated as a quadratic programming problem. We give an incremental algorithm based on gradient projection for efficiently solving this problem. The algorithm is considerably faster than using generic constraint optimization techniques and is comparable in speed to unconstrained stress majorization. We demonstrate the utility of our technique with sample data from a number of practical applications including gene-activation networks, terrorist networks and visualization of high-dimensional data.
A Practical Algorithm for Making Filled Graphs Minimal
THEOR. COMP. SC., 2001
Cited by 23 (13 self)
For an arbitrary filled graph G+ of a given original graph G, we consider the problem of removing fill edges from G+ in order to obtain a graph M that is both a minimal filled graph of G and a subgraph of G+. For G+ with f fill edges and e original edges, we give a simple O(f(e+f)) algorithm which solves the problem and computes a corresponding minimal elimination ordering of G. We report on experiments with an implementation of our algorithm, where we test graphs G corresponding to some real sparse matrix applications and apply well-known and widely used ordering heuristics to find G+. Our findings show the amount of fill that is commonly removed by a minimalization for each of these heuristics, and also indicate that the runtime of our algorithm on these practical graphs is better than the presented worst-case bound.
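For readers unfamiliar with the terms, the sketch below shows how an elimination ordering produces the fill edges of a filled graph, using the standard elimination game (each eliminated vertex turns its remaining neighbours into a clique). It is not the O(f(e+f)) minimalization algorithm of the paper; the function name `elimination_fill` and the example graph are assumptions made for illustration.

```python
def elimination_fill(adj, order):
    """Return the set of fill edges created by eliminating vertices in `order`.

    adj:   dict mapping each vertex to the set of its neighbours (undirected).
    order: elimination ordering containing every vertex exactly once.
    """
    g = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    fill = set()
    for v in order:
        nbrs = list(g[v])
        # Make the remaining neighbours of v pairwise adjacent.
        for idx, a in enumerate(nbrs):
            for b in nbrs[idx + 1:]:
                if b not in g[a]:
                    g[a].add(b)
                    g[b].add(a)
                    fill.add(frozenset((a, b)))
        # Remove v from the graph.
        for n in nbrs:
            g[n].discard(v)
        del g[v]
    return fill


# Star graph: centre 0 joined to leaves 1, 2, 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
fill_centre_first = elimination_fill(star, [0, 1, 2, 3])
fill_leaves_first = elimination_fill(star, [1, 2, 3, 0])
```

Eliminating the centre first joins all three leaves and adds three fill edges, while eliminating the leaves first adds none. The first filled graph is therefore not minimal, which is the situation a minimalization step is meant to repair.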
Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply
In Proceedings of the International Conference on Parallel Processing, 2004
Cited by 18 (4 self)
We present optimizations for sparse matrix-vector multiply (SpMV) and its generalization to multiple vectors, SpMM, when the matrix is symmetric: (1) symmetric storage, (2) register blocking, and (3) vector blocking. Combined with register blocking, symmetry saves more than 50% in matrix storage. We also show performance speedups of 2.1× for SpMV and 2.6× for SpMM, when compared to the best nonsymmetric register blocked implementation. We present an approach for the selection of tuning parameters, based on empirical modeling and search that consists of three steps: (1) Offline benchmark, (2) Runtime search, and (3) Heuristic performance model. This approach generally selects parameters to achieve performance with 85% of that achieved with exhaustive search. We evaluate our implementations with respect to upper bounds on performance. Our model bounds performance by considering only the cost of memory operations and using lower bounds on the number of cache misses. Our optimized codes are within 68% of the upper bounds.
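The idea of symmetric storage can be sketched as follows, assuming a plain compressed sparse row (CSR) layout that keeps only the diagonal and upper triangle. This is a minimal illustration without register or vector blocking, not the tuned implementation evaluated in the paper; the function name `sym_spmv` and the example matrix are hypothetical.

```python
def sym_spmv(n, indptr, indices, data, x):
    """Compute y = A x for a symmetric n-by-n matrix A.

    Only entries with col >= row are stored, in CSR form:
    row i occupies positions indptr[i] .. indptr[i + 1] - 1.
    """
    y = [0.0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            a = data[k]
            y[i] += a * x[j]          # stored entry A[i][j]
            if j != i:
                y[j] += a * x[i]      # mirrored entry A[j][i]
    return y


# A = [[2, 1, 0],
#      [1, 3, 4],
#      [0, 4, 5]]  stored as its upper triangle (5 of 7 nonzeros).
indptr = [0, 2, 4, 5]
indices = [0, 1, 1, 2, 2]
data = [2.0, 1.0, 3.0, 4.0, 5.0]
y = sym_spmv(3, indptr, indices, data, [1.0, 2.0, 3.0])
```

Each stored off-diagonal entry is read once and used twice, so both the storage and the memory traffic for the matrix shrink compared with keeping all nonzeros.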
Using dense storage to solve small sparse linear systems
ACM Trans. Math. Softw., 2007
Cited by 17 (6 self)
A data structure is used to build a linear solver specialized for relatively small sparse systems. The proposed solver, optimized for runtime performance at the expense of memory footprint, outperforms widely used direct and sparse solvers for systems with between 100 and 3000 equations. A multithreaded version of the solver is shown to give some speedups for problems with medium fill-in, while it does not give any benefit for very sparse problems. Categories and Subject Descriptors: G.1.3 [Numerical Analysis]: Numerical Linear Algebra - Linear systems (direct and iterative methods), sparse, structured, and very large systems (direct and iterative methods); G.4 [Mathematical Software]: Algorithm design and analysis; E.1 [Data