An Extended Set of Fortran Basic Linear Algebra Subprograms
 ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE
, 1986
Cited by 450 (71 self)
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations which should provide for efficient and portable implementations of algorithms for high-performance computers.
Prospectus for the Development of a Linear Algebra Library for High-Performance Computers
 MATHEMATICS AND COMPUTER SCIENCE DIVISION REPORT ANL/MCS-TM-97, ARGONNE NATIONAL LABORATORY, ARGONNE, IL
, 1987
Multiprocessing Linear Algebra Algorithms on the CRAY X-MP-2: Experiences With Small Granularity
 Journal of Parallel and Distributed Computing
, 1984
Cited by 4 (2 self)
This paper gives a brief overview of the CRAY X-MP-2 general-purpose multiprocessor system and discusses how it can be used effectively to solve problems that have small granularity. An implementation is described for linear algebra algorithms that solve systems of linear equations when the matrix is general and when the matrix is symmetric and positive definite. OVERVIEW OF THE SYSTEM “Multiprocessor” is a term that has been used for years. Our definition follows those of [8], [9], and [10]. The CRAY X-MP is a follow-up to the CRAY-1S system offered by CRAY Research, Inc. The CRAY X-MP family is a general-purpose multiprocessor system. It inherits the basic vector functions of CRAY-1S, with major architectural improvements for each individual processor. The interprocessor communication mechanism and the provision of Solid-State Disk device (SSD) are new designs that create tremendous potential in the realm of high-speed computing. The CRAY X-MP-2 system is the first product of the CRAY X-MP family.
An Extended Set of Fortran Basic Linear Algebra Subprograms
 ACM Transactions on Mathematical Software
, 1986
Cited by 3 (0 self)
This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations which should provide for efficient and portable implementations of algorithms for high-performance computers.
Are There Iterative BLAS?
, 1994
Cited by 3 (0 self)
A technique for optimizing software is proposed that involves the use of a standardized set of computational kernels that are common to many iterative methods for solving large sparse linear systems of equations. These kernels, referred to as "Iterative Basic Linear Algebra Subprograms" or "Iterative BLAS", are defined and techniques for their optimization on vector computers are presented. Several sparse matrix storage formats for different classes of matrix problems are proposed that allow the vectorization of fundamental operations in various iterative methods using these kernels. 1 Introduction Many iterative methods perform operations that can be easily optimized on most vector computers, such as the dot product of two vectors and the updating of a vector using another vector. These operations are often used in linear algebra applications, and they have been denoted as Basic Linear Algebra Subprograms or BLAS [23]. In the BLAS library, the calling sequences of these primitive vec...
Design And Implementation Of A Fortran Assistant Tool For Vector Compilers
 Intl. Journal of High Speed Computing
, 1996
Cited by 1 (0 self)
In this paper, we present the design and implementation of a source-to-source High Performance Fortran assistant Tool (HPFT) on DEC 3000 workstations. For a given sequential program written in Fortran 77, HPFT generates a vectorized, reuse-exploited, and/or parallelized version for vector computers. Several new compilation schemes in vectorization, reuse exploitation, and multithreading are designed in HPFT. A performance evaluator is developed for measuring the system performance. The user interface is also designed for the programmer to capture the information related to the compilation and execution of the program. Experimental results based on the Convex C3840 vector computer show that the developed HPFT enhances the system performance and usually reduces the program execution time. Keywords: Data dependence, loop optimization, vector compilers, vector register reuse. 1. Introduction. Vector computers such as Cray famil...
Principal Component Analysis on Vector Computers
, 1995
Cited by 1 (0 self)
Principal component analysis is a classical multivariate technique and a basic tool in the field of image processing. Due to the iterative character and the high computational cost of these algorithms on conventional computers, they are good candidates for pipelined processing. In this work we analyse this code from the viewpoint of vectorization and present an efficient implementation on the Fujitsu VP2400/10. 1 Introduction The notion of principal components of a sample was introduced by Pearson (1901) as a statistical tool for reducing multivariate data encountered in applied statistical research to a smaller dimensionality [7]. He defined a "plane of closest fit" as a subspace which minimizes the sum of squares of the distances from each point containing data. The term "principal components" was later applied for the purpose of analysing covariance and correlation structures. Since then, it has become increasingly popular in multivariate statistical theory and applications. ...
Prospectus for the Development of a Linear Algebra Library
 Mathematics and Computer Science Division Report ANL/MCS-TM-97, Argonne National Laboratory, Argonne, IL
, 1987
...underlying numerical algorithms will be the same for all machines. Prospectus for the Development of a Linear Algebra Library for High-Performance Computers. James Demmel, Computer Science Department, Courant Institute, 251 Mercer Street, New York, New York 10012. Jack J. Dongarra, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois 60439-4844. Jeremy Du Croz, Numerical Algorithms Group Ltd., NAG Central Office, Mayfield House, 256 Banbury Road, Oxford OX2 7DE, England. Anne Greenbaum, Computer Science Department, Courant Institute, 251 Mercer Street, New York, New York 10012. Sven Hammarling, Numerical Algorithms Group Ltd., NAG Central Office, Mayfield House, 256 Banbury Road, Oxford OX2 7DE, England. Danny Sorensen, Mathematics and Computer Science Di...
Principal Component Analysis on Vector Computers
, 1996
Principal component analysis is a classical multivariate technique used as a basic tool in the field of image processing. Due to the iterative character and the high computational cost of these algorithms over conventional computers, they are good candidates for pipelined processing. In this work we analyse these algorithms from the viewpoint of vectorization and present an efficient implementation on the Fujitsu VP2400/10. We systematically applied different code transformations to the algorithm making use of the vectorial capabilities of the system. In particular we have tested a number of vectorization techniques that optimize the reuse of the vector registers, exploit all levels of the memory hierarchy, and utilize the pipelined units in parallel (concurrency between them). We have considered images of 32 × 32 pixels and have divided the algorithm into three different stages. The speedups obtained for the native vectorizing compiler were 1.3, 1.3 and 7.9 for t...