An Extended Set of Fortran Basic Linear Algebra Subprograms
 ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE
, 1986
Cited by 447 (69 self)
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations which should provide for efficient and portable implementations of algorithms for high-performance computers.
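As an illustration of the matrix-vector level of operation this abstract refers to, the sketch below computes a general product of the form y ← αAx + βy in plain Python. The function name `gemv_sketch` and the list-of-rows matrix layout are assumptions made for this example; they are not the Fortran calling sequence defined in the paper.

```python
from typing import List, Sequence


def gemv_sketch(alpha: float,
                a: Sequence[Sequence[float]],
                x: Sequence[float],
                beta: float,
                y: Sequence[float]) -> List[float]:
    """Return alpha * A * x + beta * y for a dense matrix stored as a list of rows.

    Hypothetical helper for illustration only; it mirrors the mathematical
    operation, not any particular library interface.
    """
    if len(a) != len(y):
        raise ValueError("A must have as many rows as y has entries")
    if any(len(row) != len(x) for row in a):
        raise ValueError("every row of A must have as many entries as x")
    # One inner product per row of A, scaled by alpha, plus the scaled old y.
    return [
        alpha * sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + beta * y_i
        for row, y_i in zip(a, y)
    ]


a = [[1.0, 2.0],
     [3.0, 4.0]]
x = [1.0, 1.0]
y = [10.0, 20.0]
result = gemv_sketch(2.0, a, x, 0.5, y)
print(result)
```

Grouping the whole matrix-vector product into one call, rather than a loop of vector operations, is what gives an optimized implementation room to reuse data in vector registers or cache.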
Prospectus for the Development of a Linear Algebra Library for High-Performance Computers
 MATHEMATICS AND COMPUTER SCIENCE DIVISION REPORT ANL/MCS-TM-97, ARGONNE NATIONAL LABORATORY, ARGONNE, IL
, 1987
Multiprocessing Linear Algebra Algorithms on the CRAY X-MP-2: Experiences With Small Granularity
 Journal of Parallel and Distributed Computing
, 1984
Cited by 4 (2 self)
This paper gives a brief overview of the CRAY X-MP-2 general-purpose multiprocessor system and discusses how it can be used effectively to solve problems that have small granularity. An implementation is described for linear algebra algorithms that solve systems of linear equations when the matrix is general and when the matrix is symmetric and positive definite.
OVERVIEW OF THE SYSTEM
“Multiprocessor” is a term that has been used for years. Our definition follows those of [8], [9], and [10]. The CRAY X-MP is a follow-up to the CRAY-1S system offered by CRAY Research, Inc. The CRAY X-MP family is a general-purpose multiprocessor system. It inherits the basic vector functions of the CRAY-1S, with major architectural improvements for each individual processor. The interprocessor communication mechanism and the provision of a Solid-State Disk (SSD) device are new designs that create tremendous potential in the realm of high-speed computing. The CRAY X-MP-2 system is the first product of the CRAY X-MP family.
Are There Iterative BLAS?
, 1994
Cited by 3 (0 self)
A technique for optimizing software is proposed that involves the use of a standardized set of computational kernels that are common to many iterative methods for solving large sparse linear systems of equations. These kernels, referred to as "Iterative Basic Linear Algebra Subprograms" or "Iterative BLAS", are defined and techniques for their optimization on vector computers are presented. Several sparse matrix storage formats for different classes of matrix problems are proposed that allow the vectorization of fundamental operations in various iterative methods using these kernels.
1 Introduction
Many iterative methods perform operations that can be easily optimized on most vector computers, such as the dot product of two vectors and the updating of a vector using another vector. These operations are often used in linear algebra applications, and they have been denoted as Basic Linear Algebra Subprograms or BLAS [23]. In the BLAS library, the calling sequences of these primitive vec...
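The two vector operations named in this introduction, the dot product and the update of one vector using another, can be sketched as follows. This is a minimal plain-Python illustration; the function names `dot` and `axpy` and the in-place update convention are assumptions for the example, and the sketch does not reproduce the calling sequences the paper goes on to discuss.

```python
from typing import List, Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the inner product sum(x[i] * y[i])."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(x_i * y_i for x_i, y_i in zip(x, y))


def axpy(alpha: float, x: Sequence[float], y: List[float]) -> List[float]:
    """Update y in place so that y[i] becomes alpha * x[i] + y[i], then return y."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    for i, x_i in enumerate(x):
        y[i] += alpha * x_i
    return y


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(dot(x, y))   # prints 32.0
axpy(2.0, x, y)
print(y)           # prints [6.0, 9.0, 12.0]
```

Both loops touch each element once with no dependence between iterations of the update, which is why operations of this kind vectorize readily.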
Design And Implementation Of A Fortran Assistant Tool For Vector Compilers
 Intl. Journal of High Speed Computing
, 1996
Cited by 1 (0 self)
In this paper, we present the design and implementation of a source-to-source High Performance Fortran assistant Tool (HPFT) on DEC 3000 workstations. For a given sequential program written in Fortran 77, HPFT generates a vectorized, reuse-exploited, and/or parallelized version for vector computers. Several new compilation schemes for vectorization, reuse exploitation, and multithreading are designed in HPFT. A performance evaluator is developed for measuring system performance. The user interface is also designed so that the programmer can capture information related to the compilation and execution of a program. Experimental results based on the Convex C3840 vector computer show that HPFT enhances system performance and usually reduces program execution time.
Keywords: Data dependence, loop optimization, vector compilers, vector register reuse.
1. Introduction.
Vector computers such as Cray famil...
Principal Component Analysis on Vector Computers
, 1995
Cited by 1 (0 self)
Principal component analysis is a classical multivariate technique and a basic tool in the field of image processing. Because these algorithms are iterative and computationally expensive on conventional computers, they are good candidates for pipeline processing. In this work we analyse this code from the vectorization approach and present an efficient implementation on the Fujitsu VP2400/10.
1 Introduction
The notion of principal components of a sample was introduced by Pearson (1901) as a statistical tool for reducing multivariate data encountered in applied statistical research to a smaller dimensionality [7]. He defined a "plane of closest fit" as a subspace which minimizes the sum of squares of the distances from each point containing data. The term "principal components" was later applied for the purpose of analysing covariance and correlation structures. Since then, it has become increasingly popular in multivariate statistical theory and applications. ...
Applications of Parallel and Vector Algorithms in Nonlinear Structural Dynamics Using the Finite Element Method
, 1992
This research is directed toward the numerical analysis of large, three-dimensional, nonlinear dynamic problems in structural and solid mechanics. Such problems include those exhibiting large deformations, displacements, or rotations, those requiring finite strain plasticity material models that couple geometric and material nonlinearities, and those demanding detailed geometric modeling. A finite element code was developed, designed around the 3D isoparametric family of elements, and using a Total Lagrangian formulation and implicit integration of the global equations of motion. The research was conducted using the Alliant FX/8 and Convex C240 supercomputers. The research focuses on four main areas: Development of element computation algorithms that exploit the inherent opportunities for concurrency and vectorization present in the finite element method; Comparison of the preconditioned conjugate gradient method to a representative direct solver; Investigation of various nonlinear solution algorithms, such as modified Newton-Raphson, secant-Newton, and nonlinear preconditioned conjugate gradient; and,
A Proposal for an Extended Set of Fortran Basic Linear Algebra Subprograms
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions proposed are targeted at matrix-vector operations which should provide for more efficient and portable implementations of algorithms for high-performance computers. Part I: The Proposal I.
NAG Central Office, Mayfield House
This paper describes modifications to many of the standard algorithms used in computing eigenvalues and eigenvectors of matrices. These modifications can dramatically increase the performance of the underlying software on high-performance computers without resorting to assembler language, without significantly influencing the floating-point operation count, and without affecting the roundoff-error properties of the algorithms. The techniques are applied to a wide variety of algorithms and are beneficial in various architectural settings.