Software libraries for linear algebra computations on high performance computers
 SIAM Review
, 1995
Cited by 67 (16 self)
This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under development. The importance of block-partitioned algorithms in reducing the frequency of data movement between different levels of hierarchical memory is stressed. The use of such algorithms helps reduce the message startup costs on distributed memory concurrent computers. Other key ideas in our approach are the use of distributed versions of the Level 3 Basic Linear Algebra Subprograms (BLAS) as computational building blocks, and the use of Basic Linear Algebra Communication Subprograms (BLACS) as communication building blocks. Together the distributed BLAS and the BLACS can be used to construct highe...
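As a rough illustration of the block-partitioned idea stressed in this abstract, the sketch below multiplies two dense matrices tile by tile, so that each small tile is reused several times before the next one is touched. It is plain Python written for this listing, not LAPACK or ScaLAPACK code; the function name `blocked_matmul` and the block size `nb` are assumptions made for illustration.

```python
def blocked_matmul(a, b, nb=2):
    """Multiply dense matrices a (m x k) and b (k x n) one tile at a time.

    Hypothetical helper for illustration only; nb is an assumed block size.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, nb):          # tile row of c
        for j0 in range(0, n, nb):      # tile column of c
            for p0 in range(0, k, nb):  # tiles along the inner dimension
                # Update one tile of c from one tile of a and one tile of b.
                for i in range(i0, min(i0 + nb, m)):
                    for p in range(p0, min(p0 + nb, k)):
                        aip = a[i][p]
                        for j in range(j0, min(j0 + nb, n)):
                            c[i][j] += aip * b[p][j]
    return c
```

In a real library the innermost three loops would be a single Level 3 BLAS call on the tile, which is where the reduction in data movement pays off.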
Large Dense Numerical Linear Algebra in 1993: The Parallel Computing Influence
 International Journal Supercomputer Applications
, 1994
Cited by 35 (2 self)
This paper surveys the current state of applications of large dense numerical linear algebra, and the influence of parallel computing. Furthermore, we attempt to crystalize many important ideas that we feel have sometimes been misunderstood in the rush to write fast programs. 1 Introduction This paper represents my continuing efforts to track the status of large dense linear algebra problems. The goal is to shatter the barriers that separate the various interested communities while commenting on the influence of parallel computing. A secondary goal is to crystalize the most important ideas that have all too often been obscured by the details of machines and algorithms. Parallel supercomputing is in the spotlight. In the race towards the proliferation of papers on person X's experiences with machine Y (and why his algorithm runs faster than person Z's), sometimes we have lost sight of the applications for which these algorithms are meant to be useful. This paper concentrates on la...
Algorithms in FastImp: a fast and wideband impedance extraction program for complicated 3D geometries
 ACM/IEEE Design Automation Conference
, 2003
Cited by 22 (11 self)
In this paper, we describe the algorithms used in FastImp, a program for accurate analysis of wideband electromagnetic effects in very complicated geometries of conductors. The program is based on a recently developed surface integral formulation and a precorrected fast Fourier transform (FFT) accelerated iterative method, but includes a new piecewise quadrature panel integration scheme, a new scaling and preconditioning technique as well as a generalized grid interpolation and projection strategy. Computational results are given on a variety of integrated circuit interconnect structures to demonstrate that FastImp is robust and can accurately analyze very complicated geometries of conductors. Index Terms: Fast integral equation solver, panel integration, parasitic extraction, preconditioner, surface integral formulation, wideband analysis.
The Design of Linear Algebra Libraries for High Performance Computers
, 1993
Cited by 16 (1 self)
This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under development. The importance of block-partitioned algorithms in reducing the frequency of data movement between different levels of hierarchical memory is stressed. The use of such algorithms helps reduce the message startup costs on distributed memory concurrent computers. Other key ideas in our approach are the use of distributed versions of the Level 3 Basic Linear Algebra Subprograms (BLAS) as computational building blocks, and the use of Basic Linear Algebra Communication Subprograms (BLACS) as communication building blocks. Together the distributed BLAS and the BLACS can be used to construct ...
Numerical Study of Three-Dimensional Flow using Fast Parallel Particle Algorithms
, 1994
Cited by 12 (1 self)
Numerical studies of turbulent flows have always been prone to crude approximations due to the limitations in computing power. With the advent of supercomputers, new turbulence models and fast particle algorithms, more highly resolved models can now be computed. Vortex Methods are grid-free and so avoid a number of shortcomings of grid-based methods for solving turbulent fluid flow equations; these include such problems as poor resolution and numerical diffusion. In these methods, the continuum vorticity field is discretised into a collection of Lagrangian elements, known as vortex elements, which are free to move in the flow field they collectively induce. The vortex element interaction constitutes an N-body problem, which may be calculated by a direct pairwise summation method, in a time proportional to N^2. This time complexity may be reduced by use of fast particle algorithms. The most common algorithms are known as the N-body Treecodes and have a hierarchical structure. An inde...
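The direct pairwise summation mentioned in this abstract can be sketched for the simplest case. The code below assumes two-dimensional point vortices, which is a deliberate simplification of the three-dimensional vortex elements the paper treats, and uses the standard point-vortex velocity formula. The function name `induced_velocities` and its arguments are hypothetical. The doubly nested loop is what makes the cost grow as N^2.

```python
import math


def induced_velocities(positions, strengths):
    """Velocity induced at each 2D point vortex by all the others.

    Assumed toy model: positions is a list of (x, y) pairs and strengths is a
    list of circulations. A vortex does not act on itself.
    """
    n = len(positions)
    vel = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):                  # target vortex
        xi, yi = positions[i]
        for j in range(n):              # source vortex: N * (N - 1) pairs overall
            if i == j:
                continue
            dx = xi - positions[j][0]
            dy = yi - positions[j][1]
            r2 = dx * dx + dy * dy
            coef = strengths[j] / (2.0 * math.pi * r2)
            vel[i][0] -= coef * dy      # u = -Gamma * dy / (2 pi r^2)
            vel[i][1] += coef * dx      # v = +Gamma * dx / (2 pi r^2)
    return vel
```

A treecode would replace the inner loop over all sources with a traversal that lumps distant groups of vortices together.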
Improving the Robustness of a Surface Integral Formulation for Wideband Impedance Extraction of 3D Structures
 International Conference on Computer-Aided Design
, 2001
Cited by 12 (8 self)
In order for parasitic extraction of high-speed integrated circuit interconnect to be sufficiently efficient, and fit with model-order reduction techniques, a robust wideband surface integral formulation is essential. One recently developed surface integral formulation has shown promise, but was plagued with numerical difficulties of poorly understood origin. In this paper we show that one of that formulation's difficulties was related to the inaccuracy in the approach to evaluate integrals over discretization panels, and we present an accurate approach based on an adapted piecewise quadrature scheme. We also show that the condition number of the original system of integral equations can be reduced by differentiating one of the integral equations. Computational results on a ring and a spiral inductor are used to show that the new quadrature scheme and the differentiated integral formulation improve accuracy and accelerate the convergence of iterative solution methods.
Fast Wavelet Transforms for Matrices Arising From Boundary Element Methods
, 1994
Cited by 7 (0 self)
For many boundary element methods applied to Laplace's equation in two dimensions, the resulting integral equation has both an integral with a logarithmic kernel and an integral with a discontinuous kernel. If standard collocation methods are used to discretize the integral equation we are left with two dense matrices. We consider expressing these matrices in terms of wavelet bases with compact support via a fast wavelet transform as in Beylkin, Coifman and Rokhlin. Upper bounds on the size of the wavelet transform elements are obtained. These bounds are then used to show that if the original matrices are of size N × N, the resulting transformed matrices are sparse, having only O(N log N) significant entries. Some numerical results will also be presented. Unlike Beylkin, Coifman and Rokhlin who use the fast wavelet transform as a numerical approximation to a continuous operator already expressed in a full wavelet basis of L^2(R), we think of the fast wavelet transform as a cha...
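To make the sparsification claim in this abstract concrete, the sketch below transforms a small logarithmic-kernel matrix into a wavelet basis and counts the entries that remain significant. It swaps in the Haar wavelet, the simplest compactly supported choice, which is an assumption and not necessarily the basis used in the paper. The collocation geometry, the relative threshold, and all function names are likewise hypothetical.

```python
import math


def haar_1d(v):
    """Full orthonormal Haar transform of a list whose length is a power of two."""
    out = list(v)
    n = len(out)
    s = math.sqrt(0.5)
    while n > 1:
        half = n // 2
        avg = [(out[2 * k] + out[2 * k + 1]) * s for k in range(half)]
        dif = [(out[2 * k] - out[2 * k + 1]) * s for k in range(half)]
        out[:n] = avg + dif   # averages first, details after; recurse on averages
        n = half
    return out


def haar_2d(a):
    """Transform every row, then every column, of a square matrix."""
    rows = [haar_1d(r) for r in a]
    cols = [haar_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def log_kernel_matrix(n):
    """Assumed test matrix: log|x_i - y_j| with x_i offset by half a cell."""
    return [[math.log(abs((i + 0.5) / n - j / n)) for j in range(n)]
            for i in range(n)]


def count_significant(a, rel_tol=1e-2):
    """Count entries larger than rel_tol times the largest entry in magnitude."""
    biggest = max(abs(x) for row in a for x in row)
    return sum(1 for row in a for x in row if abs(x) > rel_tol * biggest)
```

Calling `count_significant(haar_2d(log_kernel_matrix(n)))` for increasing powers of two `n` gives a feel for how the number of retained entries grows compared with the n^2 entries of the dense matrix. Because the transform is orthonormal, it leaves the Frobenius norm of the matrix unchanged.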
The First Annual Large Dense Linear System Survey
 Int. Rept. Univ. California, Berkeley CA
, 1991
Cited by 7 (2 self)
In the March 24, 1991 issue of NA Digest, I submitted a questionnaire asking who was solving large dense linear systems of equations. Based on the responses, nearly all large dense linear systems today arise from either the benchmarking of supercomputers or applications involving the influence of a two-dimensional boundary on three-dimensional space. Not surprisingly, the area of computational aerodynamics or aeroelectromechanics represents an important commercial application requiring the solution of such systems. The largest unstructured matrix that has been factored using Gaussian Elimination was a complex matrix of size 55,296. The largest dense matrix solved on a Sun using an iterative method was a real matrix of size 20,000. It is unclear at this time whether dense methods are truly needed at all for huge matrices. It is intended to survey users every year with the hope of including more applications as I am made aware of them. 1 Introduction The idea to poll solvers of large d...
Swimming due to transverse shape deformations
 J. Fluid Mech.
, 2009
Cited by 6 (2 self)
Balance laws are derived for the swimming of a deformable body due to prescribed shape changes and the effect of the wake vorticity. The underlying balances of momenta, though classical in nature, provide a unifying framework for the swimming of three-dimensional and planar bodies and they hold even in the presence of viscosity. The derived equations are consistent with Lighthill’s reactive force theory for the swimming of slender bodies and, when neglecting vorticity, reduce to the model developed in Kanso et al. (J. Nonlinear Sci., vol. 15, 2005, p. 255) for swimming in potential flow. The locomotion of a deformable body is examined through two sets of examples: the first set studies the effect of cyclic shape deformations, both flapping and undulatory, on the locomotion in potential flow while the second examines the effect of the wake vorticity on the net locomotion. In the latter, the vortex wake is modelled using pairs of point vortices shed periodically from the tail of the deformable body.
A Coupled Numerical Technique for Self-Consistent Analysis of Micro-Electro-Mechanical Systems
, 1996
Cited by 4 (3 self)
An efficient algorithm for self-consistent analysis of 3D micro-electro-mechanical systems (MEMS) is described. The algorithm employs a hybrid finite-element/boundary-element technique for coupled mechanical and electrical analysis. The coupled algorithm is shown to converge rapidly and is much faster than relaxation for tightly coupled problems.
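For context, the relaxation baseline that this abstract compares against can be sketched on a toy problem. The code below is not the paper's coupled algorithm or its finite-element/boundary-element solvers. It assumes a one-degree-of-freedom parallel-plate actuator on a linear spring, and alternates an "electrical solve" (force for the current gap) with a "mechanical solve" (displacement for that force) until the displacement stops changing. All names and parameter values are hypothetical.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def electrostatic_force(x, volts, area, gap):
    """Attractive force on a parallel plate displaced by x toward the electrode."""
    return EPS0 * area * volts ** 2 / (2.0 * (gap - x) ** 2)


def relax(volts, area=1e-8, gap=1e-6, stiffness=1.0, tol=1e-18, max_iter=200):
    """Alternate electrical and mechanical solves until the displacement settles.

    Assumed toy model: plate area in m^2, gap in m, spring stiffness in N/m.
    Returns the displacement and the number of iterations used.
    """
    x = 0.0
    for it in range(1, max_iter + 1):
        force = electrostatic_force(x, volts, area, gap)  # electrical solve
        x_new = force / stiffness                         # mechanical solve
        if x_new >= gap:
            raise RuntimeError("plate collapsed onto the electrode (pull-in)")
        if abs(x_new - x) < tol:
            return x_new, it
        x = x_new
    raise RuntimeError("relaxation did not converge")
```

In this toy model the iteration contracts more slowly as the applied voltage approaches the pull-in value, which is the tightly coupled regime where the abstract reports relaxation to be slow.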