Optimization of Sparse Matrix-Vector Multiplication on Emerging Multicore Platforms
 In Proc. SC2007: High performance computing, networking, and storage conference
, 2007
Cited by 136 (23 self)
We are witnessing a dramatic change in computer architecture due to the multicore paradigm shift, as every electronic device from cell phones to supercomputers confronts parallelism of unprecedented scale. To fully unleash the potential of these systems, the HPC community must develop multicore-specific optimization methodologies for important scientific computations. In this work, we examine sparse matrix-vector multiply (SpMV) – one of the most heavily used kernels in scientific computing – across a broad spectrum of multicore designs. Our experimental platform includes the homogeneous AMD dual-core and Intel quad-core designs, the heterogeneous STI Cell, as well as the first scientific study of the highly multithreaded Sun Niagara2. We present several optimization strategies especially effective for the multicore environment, and demonstrate significant performance improvements compared to existing state-of-the-art serial and parallel SpMV implementations. Additionally, we present key insights into the architectural tradeoffs of leading multicore design strategies, in the context of demanding memory-bound numerical algorithms.
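For readers unfamiliar with the kernel, the following is a minimal serial sketch of SpMV over the compressed sparse row (CSR) format. It is an illustration only, not the optimized multicore code studied in the paper; the function name `spmv_csr` and its argument names are assumptions chosen for this example.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i+1] delimits the nonzeros of row i;
    col_idx and values hold their column indices and numeric values.
    (Hypothetical helper for illustration.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Each nonzero is touched once, and x is accessed indirectly
        # through col_idx, which is why the kernel tends to be memory-bound.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
y = spmv_csr(row_ptr, col_idx, values, [1.0, 2.0, 3.0])
```

Because the rows are independent, the outer loop is the natural unit to distribute across cores; how best to do that on each platform is the subject of the paper and is not attempted here.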
Subexponential Parameterized Algorithms on Graphs of Bounded Genus and H-Minor-Free Graphs
, 2003
Cited by 57 (19 self)
We introduce a new framework for designing fixed-parameter algorithms with subexponential running time. Our results apply to a broad family of graph problems, called bidimensional problems, which includes many domination and covering problems such as vertex cover, feedback vertex set, minimum maximal matching, dominating set, edge dominating set, clique-transversal set, and many others restricted to bounded-genus graphs. Furthermore, it is fairly straightforward to prove that a problem is bidimensional. In particular, our framework includes as special cases all previously known problems to have such subexponential algorithms. Previously, these algorithms applied to planar graphs, single-crossing-minor-free graphs, and/or map graphs; we extend these results to apply to bounded-genus graphs as well. In a parallel development of combinatorial results, we establish an upper bound on the treewidth (or branchwidth) of a bounded-genus graph that excludes some planar graph H as a minor. This bound depends linearly on the size |V(H)| of the excluded graph H and the genus g(G) of the graph G, and applies and extends the graph-minors work of Robertson and Seymour. Building on these results...
Graph Sandwich Problems
, 1994
Cited by 55 (8 self)
The graph sandwich problem for property $\Pi$ is defined as follows: Given two graphs $G^1 = (V, E^1)$ and $G^2 = (V, E^2)$ such that $E^1 \subseteq E^2$, is there a graph $G = (V, E)$ such that $E^1 \subseteq E \subseteq E^2$ which satisfies property $\Pi$? Such problems generalize recognition problems and arise in various applications. Concentrating mainly on properties characterizing subfamilies of perfect graphs, we give polynomial algorithms for several properties and prove the NP-completeness of others. We describe
An Algorithm for Coarsening Unstructured Meshes
 Numer. Math
, 1996
Cited by 51 (5 self)
We develop and analyze a procedure for creating a hierarchical basis of continuous piecewise linear polynomials on an arbitrary, unstructured, nonuniform triangular mesh. Using these hierarchical basis functions, we are able to define and analyze corresponding iterative methods for solving the linear systems arising from finite element discretizations of elliptic partial differential equations. We show that such iterative methods perform as well as those developed for the usual case of structured, locally refined meshes. In particular, we show that the generalized condition numbers for such iterative methods are of order $J^2$, where J is the number of hierarchical basis levels. Key words. Finite element, hierarchical basis, multigrid, unstructured mesh. AMS subject classifications. 65F10, 65N20. 1. Introduction. Iterative methods using the hierarchical basis decomposition have proved to be among the most robust for solving broad classes of elliptic partial differential equations, ...
Robust Ordering of Sparse Matrices using Multisection
 Department of Computer Science, York University
, 1996
Cited by 50 (2 self)
In this paper we provide a robust reordering scheme for sparse matrices. The scheme relies on the notion of multisection, a generalization of bisection. The reordering strategy is demonstrated to have consistently good performance in terms of fill reduction when compared with multiple minimum degree and generalized nested dissection. Experimental results show that by using multisection, we obtain an ordering which is consistently as good as or better than both for a wide spectrum of sparse problems. 1. Introduction. It is well recognized that finding a fill-reducing ordering is crucial in the success of the numerical solution of sparse linear systems. For symmetric positive-definite systems, the minimum degree [38] and the nested dissection [11] orderings are perhaps the most popular ordering schemes. They represent two opposite approaches to the ordering problem. However, they share a common undesirable characteristic. Both schemes produce generally good orderings, but the ordering qua...
Predicting Structure In Sparse Matrix Computations
 SIAM J. Matrix Anal. Appl
, 1994
Cited by 50 (5 self)
Many sparse matrix algorithms (for example, solving a sparse system of linear equations) begin by predicting the nonzero structure of the output of a matrix computation from the nonzero structure of its input. This paper is a catalog of ways to predict nonzero structure. It contains known results for problems including various matrix factorizations, and new results for problems including some eigenvector computations. Key words. sparse matrix algorithms, graph theory, matrix factorization, systems of linear equations, eigenvectors. AMS(MOS) subject classifications. 15A18, 15A23, 65F50, 68R10. 1. Introduction. A sparse matrix algorithm is an algorithm that performs a matrix computation in such a way as to take advantage of the zero/nonzero structure of the matrices involved. Usually this means not explicitly storing or manipulating some or all of the zero elements; sometimes sparsity can also be exploited to work on different parts of a matrix problem in parallel. Large sparse matr...
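As a small illustration of what "predicting nonzero structure" means, the sketch below bounds the structure of a product C = AB using only the structures of A and B. This is one elementary instance chosen for this example, not a result quoted from the paper; the function name `predict_product_structure` and the row-set representation are assumptions, and the prediction is an upper bound that is exact only when no numerical cancellation occurs.

```python
def predict_product_structure(a_rows, b_rows):
    """Predict which entries of C = A @ B can be nonzero.

    a_rows[i] is the set of column indices holding nonzeros in row i of A;
    b_rows[k] is the same for row k of B. Row i of C can only be nonzero
    in columns that appear in some row k of B with A[i, k] nonzero.
    (Hypothetical helper for illustration.)
    """
    return [set().union(*(b_rows[k] for k in cols)) for cols in a_rows]


# A has nonzeros at (0,0), (0,1), (1,1); its third row is empty.
a_rows = [{0, 1}, {1}, set()]
# B has nonzeros at (0,0) and (1,2).
b_rows = [{0}, {2}]
c_rows = predict_product_structure(a_rows, b_rows)
```

No numeric values are touched, so storage for the result can be allocated before any arithmetic is performed.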
Tractability of Parameterized Completion Problems on Chordal, Strongly Chordal and Proper Interval Graphs
, 1994
Cited by 49 (5 self)
We study the parameterized complexity of three NP-hard graph completion problems. The MINIMUM FILL-IN problem is to decide if a graph can be triangulated by adding at most k edges. We develop $O(c^k m)$ and $O(k^2 mn + f(k))$ algorithms for this problem on a graph with n vertices and m edges. Here f(k) is exponential in k and the constants hidden by the big-O notation are small and do not depend on k. In particular, this implies that the problem is fixed-parameter tractable (FPT). The PROPER
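To make the fill-in notion concrete, the sketch below plays the standard elimination game: vertices are removed in a given order, and the remaining neighbors of each removed vertex are joined into a clique. The edges added this way triangulate the graph, so their count is an upper bound on the minimum fill-in for that particular order. This is not the parameterized algorithm of the paper; the function name `fill_in` and the adjacency-dictionary input are assumptions made for illustration.

```python
def fill_in(adj, order):
    """Return the fill edges produced by eliminating vertices in `order`.

    adj maps each vertex to the set of its neighbors (undirected graph).
    (Hypothetical helper for illustration.)
    """
    g = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    fill = []
    for v in order:
        nbrs = sorted(g[v])
        # Make the remaining neighbors of v pairwise adjacent.
        for i, a in enumerate(nbrs):
            for b in nbrs[i + 1:]:
                if b not in g[a]:
                    g[a].add(b)
                    g[b].add(a)
                    fill.append((a, b))
        # Remove v from the graph.
        for u in nbrs:
            g[u].discard(v)
        del g[v]
    return fill


# A 4-cycle 0-1-2-3-0 has no chord, so at least one edge must be added.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
added = fill_in(cycle, [0, 1, 2, 3])
```

Different orders can give different fill counts, which is why finding an order with at most k fill edges is the hard part of the problem.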
Complexity classification of some edge modification problems
, 2001
Cited by 47 (2 self)
In an edge modification problem one has to change the edge set of a given graph as little as possible so as to satisfy a certain property. We prove the NP-hardness of a variety of edge modification problems with respect to some well-studied classes of graphs. These include perfect, chordal, chain, comparability, split and asteroidal-triple-free graphs. We show that some of these problems become polynomial when the input graph has bounded degree. We also give a general constant-factor approximation algorithm for deletion and editing problems on bounded-degree graphs with respect to properties that can be characterized by a finite set of forbidden induced subgraphs.
Highly Parallel Sparse Cholesky Factorization
 SIAM Journal on Scientific and Statistical Computing
, 1992
Cited by 46 (1 self)
We develop and compare several fine-grained parallel algorithms to compute the Cholesky factorization of a sparse matrix. Our experimental implementations are on the Connection Machine, a distributed-memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special-purpose algorithms in which the matrix structure conforms to the connection structure of the machine, our focus is on matrices with arbitrary sparsity structure.
Triangulation of Graphs - Algorithms Giving Small Total State Space
, 1990
Cited by 39 (0 self)
The problem of achieving small total state space for triangulated belief graphs (networks) is considered. It is an NP-complete problem to find a triangulation with minimum state space. Our interest