Results 1-10 of 39
Applied Numerical Linear Algebra
Society for Industrial and Applied Mathematics, 1997
Cited by 532 (26 self)
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then, we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, and the singular value decomposition. We consider dense, band and sparse matrices.
Sparse matrices in Matlab: Design and implementation
1991
Cited by 131 (20 self)
We have extended the matrix computation language and environment Matlab to include sparse matrix storage and operations. The only change to the outward appearance of the Matlab language is a pair of commands to create full or sparse matrices. Nearly all the operations of Matlab now apply equally to full or sparse matrices, without any explicit action by the user. The sparse data structure represents a matrix in space proportional to the number of nonzero entries, and most of the operations compute sparse results in time proportional to the number of arithmetic operations on nonzeros.
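As an illustration of storage proportional to the number of nonzeros, here is a minimal Python sketch of a compressed-column layout. The abstract does not name the storage format, so the compressed-column choice and the helper names `to_csc` and `csc_matvec` are assumptions made for this example, not the paper's implementation.

```python
def to_csc(dense):
    """Convert a dense list-of-lists matrix to compressed-column arrays.

    Hypothetical helper: stores only nonzeros, so the arrays hold
    nnz values, nnz row indices, and ncols + 1 column pointers.
    """
    nrows = len(dense)
    ncols = len(dense[0]) if nrows else 0
    col_ptr = [0]   # col_ptr[j]..col_ptr[j+1] delimits column j
    row_idx = []    # row index of each stored nonzero
    values = []     # value of each stored nonzero
    for j in range(ncols):
        for i in range(nrows):
            if dense[i][j] != 0:
                row_idx.append(i)
                values.append(dense[i][j])
        col_ptr.append(len(values))
    return nrows, ncols, col_ptr, row_idx, values


def csc_matvec(csc, x):
    """Compute y = A @ x, touching each stored nonzero at most once."""
    nrows, ncols, col_ptr, row_idx, values = csc
    y = [0] * nrows
    for j in range(ncols):
        if x[j] == 0:
            continue  # a zero in x contributes nothing from column j
        for k in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[k]] += values[k] * x[j]
    return y


A = [[4, 0, 0],
     [0, 0, 5],
     [1, 0, 6]]
csc = to_csc(A)
y = csc_matvec(csc, [1, 2, 3])  # [4, 15, 19]
```

The product loop runs over stored entries only, so its cost grows with the number of nonzeros plus the number of columns rather than with the full matrix size.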
A Shifted Block Lanczos Algorithm For Solving Sparse Symmetric Generalized Eigenproblems
1994
Cited by 86 (7 self)
An "industrial strength" algorithm for solving sparse symmetric generalized eigenproblems is described. The algorithm has its foundations in known techniques in solving sparse symmetric eigenproblems, notably the spectral transformation of Ericsson and Ruhe and the block Lanczos algorithm. However, the combination of these two techniques is not trivial; there are many pitfalls awaiting the unwary implementor. The focus of this paper is on identifying those pitfalls and avoiding them, leading to a "bombproof" algorithm that can live as a black box eigensolver inside a large applications code. The code that results comprises a robust shift selection strategy and a block Lanczos algorithm that is a novel combination of new techniques and extensions of old techniques.
A spectral algorithm for envelope reduction of sparse matrices
ACM/IEEE Conference on Supercomputing, 1993
Cited by 59 (5 self)
The problem of reordering a sparse symmetric matrix to reduce its envelope size is considered. A new spectral algorithm for computing an envelope-reducing reordering is obtained by associating a Laplacian matrix with the given matrix and then sorting the components of a specified eigenvector of the Laplacian. This Laplacian eigenvector solves a continuous relaxation of a discrete problem related to envelope minimization called the minimum 2-sum problem. The permutation vector computed by the spectral algorithm is a closest permutation vector to the specified Laplacian eigenvector. Numerical results show that the new reordering algorithm usually computes smaller envelope sizes than those obtained from the current standards such as the Gibbs-Poole-Stockmeyer (GPS) algorithm or the reverse Cuthill-McKee (RCM) algorithm in SPARSPAK, in some cases reducing the envelope by more than a factor of two.
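The idea of sorting the components of a Laplacian eigenvector can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "specified eigenvector" is taken to be the one for the second-smallest eigenvalue (the Fiedler vector), it is approximated with a dense shifted power iteration rather than whatever eigensolver the paper uses, and all function names are hypothetical.

```python
import math


def laplacian(n, edges):
    """Dense graph Laplacian L = D - A of an undirected graph on n vertices."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L


def fiedler_vector(L, iters=1000):
    """Approximate the second-smallest eigenvector by power iteration on
    (shift*I - L), removing the constant vector at every step."""
    n = len(L)
    shift = 2.0 * max(L[i][i] for i in range(n))  # Gershgorin bound
    x = [i - (n - 1) / 2.0 for i in range(n)]     # assumed start, zero mean
    for _ in range(iters):
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]                 # project out constants
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def spectral_order(n, edges):
    """Order the vertices by increasing Fiedler-vector component."""
    f = fiedler_vector(laplacian(n, edges))
    return sorted(range(n), key=lambda v: f[v])


def envelope_size(n, edges, order):
    """Sum over rows of the distance from the diagonal to the first nonzero."""
    pos = {v: k for k, v in enumerate(order)}
    first = list(range(n))
    for i, j in edges:
        lo, hi = sorted((pos[i], pos[j]))
        first[hi] = min(first[hi], lo)
    return sum(k - first[k] for k in range(n))


# A path graph with scrambled vertex labels.
edges = [(3, 0), (0, 4), (4, 1), (1, 5), (5, 2)]
before = envelope_size(6, edges, list(range(6)))        # 11
after = envelope_size(6, edges, spectral_order(6, edges))  # 5
```

On this small path graph the spectral ordering recovers the path, which gives the smallest possible envelope for a connected graph on six vertices. The dense iteration is for readability only and would not scale to large sparse matrices.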
Highly Parallel Sparse Cholesky Factorization
SIAM Journal on Scientific and Statistical Computing, 1992
Cited by 45 (1 self)
We develop and compare several fine-grained parallel algorithms to compute the Cholesky factorization of a sparse matrix. Our experimental implementations are on the Connection Machine, a distributed-memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special-purpose algorithms in which the matrix structure conforms to the connection structure of the machine, our focus is on matrices with arbitrary sparsity structure.
Improved load distribution in parallel sparse Cholesky factorization
In Proc. of Supercomputing '94, 1994
Cited by 38 (1 self)
Compared to the customary column-oriented approaches, block-oriented, distributed-memory sparse Cholesky factorization benefits from an asymptotic reduction in interprocessor communication volume and an asymptotic increase in the amount of concurrency that is exposed in the problem. Unfortunately, block-oriented approaches (specifically, the block fan-out method) have suffered from poor balance of the computational load. As a result, achieved performance can be quite low. This paper investigates the reasons for this load imbalance and proposes simple block mapping heuristics that dramatically improve it. The result is a roughly 20% increase in realized parallel factorization performance, as demonstrated by performance results from an Intel Paragon system. We have achieved performance of nearly 3.2 billion floating point operations per second with this technique on a 196-node Paragon system.
Spectral Nested Dissection
1992
Cited by 30 (5 self)
We describe a spectral nested dissection algorithm for computing orderings appropriate for parallel factorization of sparse, symmetric matrices. The algorithm makes use of spectral properties of the Laplacian matrix associated with the given matrix to compute separators. We evaluate the quality of the spectral orderings with respect to several measures: fill, elimination tree height, height and weight balances of elimination trees, and clique tree heights. Spectral orderings compare quite favorably with commonly used orderings, outperforming them by a wide margin for some of these measures. These results are confirmed by computing a multifrontal numerical factorization with the different orderings on a Cray Y-MP with eight processors.
Keywords: graph partitioning, graph spectra, Laplacian matrix, ordering algorithms, parallel orderings, parallel sparse Cholesky factorization, sparse matrix, vertex separator
AMS(MOS) subject classifications: 65F50, 65F05, 65F15, 68R10
Domain Decomposition Preconditioning For p-Version Finite Elements With High Aspect Ratios
1991
Cited by 26 (2 self)
A recent domain decomposition type preconditioner for the p-version...
On the Future of Problem Solving Environments

2000
Cited by 16 (2 self)
In this paper we review the current state of the problem solving environment (PSE) field and make projections for the future. First we describe the computing context, the definition of a PSE and the goals of a PSE. The state of the art is summarized along with sources (books, bibliographies, web sites) of more detailed information. The principal components and paradigms for building PSEs are presented. The discussion of the future is given in three parts: future trends, scenarios for 2010/2025, and research