Results 1  10
of
21
Special Purpose Parallel Computing
 Lectures on Parallel Computation
, 1993
"... A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given. 1. Introduction Turing ..."
Abstract

Cited by 77 (5 self)
 Add to MetaCart
A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given. 1. Introduction Turing [365] demonstrated that, in principle, a single general purpose sequential machine could be designed which would be capable of efficiently performing any computation which could be performed by a special purpose sequential machine. The importance of this universality result for subsequent practical developments in computing cannot be overstated. It showed that, for a given computational problem, the additional efficiency advantages which could be gained by designing a special purpose sequential machine for that problem would not be great. Around 1944, von Neumann produced a proposal [66, 389] for a general purpose storedprogram sequential computer which captured the fundamental principles of...
A Parallelizable Eigensolver for Real Diagonalizable Matrices with Real Eigenvalues
, 1991
"... . In this paper, preliminary research results on a new algorithm for finding all the eigenvalues and eigenvectors of a real diagonalizable matrix with real eigenvalues are presented. The basic mathematical theory behind this approach is reviewed and is followed by a discussion of the numerical consi ..."
Abstract

Cited by 25 (6 self)
 Add to MetaCart
. In this paper, preliminary research results on a new algorithm for finding all the eigenvalues and eigenvectors of a real diagonalizable matrix with real eigenvalues are presented. The basic mathematical theory behind this approach is reviewed and is followed by a discussion of the numerical considerations of the actual implementation. The numerical algorithm has been tested on thousands of matrices on both a Cray2 and an IBM RS/6000 Model 580 workstation. The results of these tests are presented. Finally, issues concerning the parallel implementation of the algorithm are discussed. The algorithm's heavy reliance on matrixmatrix multiplication, coupled with the divide and conquer nature of this algorithm, should yield a highly parallelizable algorithm. 1. Introduction. Computation of all the eigenvalues and eigenvectors of a dense matrix is essential for solving problems in many fields. The everincreasing computational power available from modern supercomputers offers the potenti...
A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem on Distributed Memory Architectures
, 1999
"... ..."
A Parallel Implementation of the Invariant Subspace Decomposition Algorithm for Dense Symmetric Matrices
, 1993
"... . We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of SYISDA on the Intel Delta. Our implementation utilizes an optimized parallel matrix multiplication ..."
Abstract

Cited by 12 (2 self)
 Add to MetaCart
. We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of SYISDA on the Intel Delta. Our implementation utilizes an optimized parallel matrix multiplication implementation we have developed. Load balancing in the costly early stages of the algorithm is accomplished without redistribution of data between stages through the use of the block scattered decomposition. Computation of the invariant subspaces at each stage is done using a new tridiagonalization scheme due to Bischof and Sun. 1. Introduction Computation of all the eigenvalues and eigenvectors of a dense symmetric matrix is an essential kernel in many applications. The everincreasing computational power available from parallel computers offers the potential for solving much larger problems than could have been contemplated previously. Hardware scalability of parallel machines is freque...
The PRISM Project: Infrastructure and Algorithms for Parallel Eigensolvers
, 1994
"... The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. We are currently investigating a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). After briefly revie ..."
Abstract

Cited by 12 (6 self)
 Add to MetaCart
The goal of the PRISM project is the development of infrastructure and algorithms for the parallel solution of eigenvalue problems. We are currently investigating a complete eigensolver based on the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA). After briefly reviewing SYISDA, we discuss the algorithmic highlights of a distributedmemory implementation of this approach. These include a fast matrixmatrix multiplication algorithm, a new approach to parallel band reduction and tridiagonalization, and a harness for coordinating the divideandconquer parallelism in the problem. We also present performance results of these kernels as well as the overall SYISDA implementation on the Intel Touchstone Delta prototype. 1. Introduction Computation of eigenvalues and eigenvectors is an essential kernel in many applications, and several promising parallel algorithms have been investigated [29, 24, 3, 27, 21]. The work presented in this paper is part of the PRI...
Trading off Parallelism and Numerical Stability
, 1992
"... The fastest parallel algorithm for a problem may be significantly less stable numerically than the fastest serial algorithm. We illustrate this phenomenon by a series of examples drawn from numerical linear algebra. We also show how some of these instabilities may be mitigated by better floating poi ..."
Abstract

Cited by 12 (5 self)
 Add to MetaCart
The fastest parallel algorithm for a problem may be significantly less stable numerically than the fastest serial algorithm. We illustrate this phenomenon by a series of examples drawn from numerical linear algebra. We also show how some of these instabilities may be mitigated by better floating point arithmetic.
A Scalable Eigenvalue Solver for Symmetric Tridiagonal Matrices
 in Proceedings of the Sixth SIAM Conference on Parallel Processing
, 1994
"... Both massively parallel computers and clusters of workstations are considered promising platforms for numerical scientific computing. This paper describes the first distributedmemory implementation of the splitmerge algorithm, an eigenvalue solver for symmetric tridiagonal matrices that uses La ..."
Abstract

Cited by 10 (9 self)
 Add to MetaCart
Both massively parallel computers and clusters of workstations are considered promising platforms for numerical scientific computing. This paper describes the first distributedmemory implementation of the splitmerge algorithm, an eigenvalue solver for symmetric tridiagonal matrices that uses Laguerre's iteration and exploits the separation property in order to create independent subtasks. Implementations of the splitmerge algorithm on both an nCUBE2 hypercube and a cluster of Sun Sparc10 workstations are described, with emphasis on load balancing, communication overhead, and interaction with other user processes. A performance study demonstrates the advantage of the new algorithm over a parallelization of the wellknown bisection algorithm. A comparison of the performance of the nCUBE2 and cluster implementations supports the claim that workstation clusters offer a costeffective alternative to massively parallel computers for certain scientific applications. This work...
Practical Parallel DivideandConquer Algorithms
, 1997
"... Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested dataparallel languages have relied on a vector PRAM impleme ..."
Abstract

Cited by 6 (2 self)
 Add to MetaCart
Nested data parallelism has been shown to be an important feature of parallel languages, allowing the concise expression of algorithms that operate on irregular data structures such as graphs and sparse matrices. However, previous nested dataparallel languages have relied on a vector PRAM implementation layer that cannot be efficiently mapped to MPPs with high interprocessor latency. This thesis shows that by restricting the problem set to that of dataparallel divideandconquer algorithms I can maintain the expressibility of full nested dataparallel languages while achieving good efficiency on current distributedmemory machines. Specifically, I define
A History Of Inverse Iteration
 Helmut Wielandt, Mathematische Werke, Mathematical Works, volume II: Matrix Theory and Analysis. Walter de Gruyter
, 1995
"... This article describes the efforts undertaken on behalf of inverse iteration ..."
Abstract

Cited by 4 (1 self)
 Add to MetaCart
This article describes the efforts undertaken on behalf of inverse iteration