Results 11 - 20 of 325
NetSolve: A Network-enabled Server for Solving Computational Science Problems
The International Journal of Supercomputer Applications and High Performance Computing, 2000
Cited by 64 (4 self)
This paper presents a new system, called NetSolve, that allows users to access computational resources, such as hardware and software, distributed across the network. The development of NetSolve was motivated by the need for an easy-to-use, efficient mechanism for using computational resources remotely. Ease of use is obtained through a choice of interfaces, some of which require no programming effort from the user. Good performance is ensured by a load-balancing policy that enables NetSolve to use the available computational resources as efficiently as possible. NetSolve offers the ability to look for computational resources on a network, choose the best one available, solve a problem (with retry for fault tolerance), and return the answer to the user.
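The request cycle described in this abstract (rank the available resources, try the best one, retry elsewhere on failure) can be sketched as follows. This is only an illustration under stated assumptions: `solve_with_retry`, `estimate`, and `submit` are hypothetical names, not part of the NetSolve interface, and a `ConnectionError` stands in for any server failure.

```python
def solve_with_retry(problem, servers, estimate, submit):
    """Try servers from best to worst estimated completion time.

    All names here are hypothetical stand-ins, not NetSolve calls:
      estimate(server, problem) -> predicted time (lower is better)
      submit(server, problem)   -> result, or raises ConnectionError
    """
    failures = []
    # Rank candidates by predicted completion time (the load-balancing step).
    for server in sorted(servers, key=lambda s: estimate(s, problem)):
        try:
            # The first server that answers wins.
            return server, submit(server, problem)
        except ConnectionError as exc:
            # Fault tolerance: remember the failure and fall through
            # to the next-best server.
            failures.append((server, exc))
    raise RuntimeError(f"all {len(failures)} candidate servers failed")
```

The caller gets back both the result and the server that produced it, which makes it easy to see when a retry took place.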
SLICOT - A Subroutine Library in Systems and Control Theory
Applied and Computational Control, Signals, and Circuits, 1997
Cited by 62 (48 self)
This article describes the subroutine library SLICOT, which provides Fortran 77 implementations of numerical algorithms for computations in systems and control theory. Around a nucleus of basic numerical linear algebra subroutines, this library builds methods for the design and analysis of linear control systems. A brief history of the library is given, together with a description of its current version and the ongoing activities to complete and improve it in several respects.
1 Introduction
Systems and control theory are disciplines widely used to describe, control, and optimize industrial and economic processes. A large body of theoretical results is now available, which has led to a variety of methods and algorithms used throughout industry and academia. Although based on theoretical results, these methods often fail when applied to real-life problems, which tend to be ill-posed or of high dimension. This failing is frequently due to the lack of...
Summa: Scalable universal matrix multiplication algorithm
1997
Cited by 58 (3 self)
In this paper, we give a straightforward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance, and require less work space. MPI implementations are given, as are performance results on the Intel Paragon system.
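As a rough illustration of the kind of algorithm this abstract refers to, the sketch below simulates, serially and in pure Python, a product C = A B built up as a sequence of panel (rank-k) updates on a logical process grid. It is a teaching sketch under assumptions, not the paper's MPI code: the function names, the `grid` and `panel` parameters, and the block distribution are all illustrative, and the comments mark where a real implementation would broadcast panels along process rows and columns.

```python
def _split(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal ranges."""
    base, extra = divmod(n, parts)
    out, lo = [], 0
    for p in range(parts):
        hi = lo + base + (1 if p < extra else 0)
        out.append(range(lo, hi))
        lo = hi
    return out


def summa_simulated(A, B, grid=(2, 2), panel=2):
    """Serially simulate a panel-by-panel product on a pr x pc process grid."""
    m, k, n = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must match"
    pr, pc = grid
    row_blocks = _split(m, pr)   # rows of C owned by each process row
    col_blocks = _split(n, pc)   # columns of C owned by each process column
    C = [[0.0] * n for _ in range(m)]

    for start in range(0, k, panel):
        stop = min(start + panel, k)
        # In a distributed code, the owners of A[:, start:stop] would
        # broadcast it within each process row, and the owners of
        # B[start:stop, :] within each process column.
        for rows in row_blocks:
            for cols in col_blocks:
                # Local update on one process: C_ij += A_panel_i * B_panel_j.
                for r in rows:
                    for c in cols:
                        C[r][c] += sum(A[r][t] * B[t][c]
                                       for t in range(start, stop))
    return C
```

Each process only ever touches its own block of C, and the only data exchanged per step is one panel of A and one panel of B, which is why the extra work space stays small.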
PUMMA: Parallel Universal Matrix Multiplication Algorithms on Distributed Memory Concurrent Computers
1993
Cited by 57 (11 self)
A Numerically Stable, Structure Preserving Method for Computing the Eigenvalues of Real Hamiltonian or Symplectic Pencils
Numer. Math., 1996
Cited by 53 (25 self)
A new method is presented for the numerical computation of the generalized eigenvalues of real Hamiltonian or symplectic pencils and matrices. The method is strongly backward stable, i.e., it is numerically backward stable and preserves the structure (i.e., Hamiltonian or symplectic). In the case of a Hamiltonian matrix the method is closely related to the square reduced method of Van Loan, but in contrast to that method, which may suffer from a loss of accuracy of order $\sqrt{\varepsilon}$, where $\varepsilon$ is the machine precision, the new method computes the eigenvalues to full possible accuracy.
Keywords. eigenvalue problem, Hamiltonian pencil (matrix), symplectic pencil (matrix), skew-Hamiltonian matrix
AMS subject classification. 65F15
1 Introduction
The eigenproblem for Hamiltonian and symplectic matrices has received a lot of attention in the last 25 years, since the landmark papers of Laub [13] and Paige/Van Loan [20]. The reason for this is the importance of this problem in many applications in c...
An Updated Set of Basic Linear Algebra Subprograms (BLAS)
ACM Transactions on Mathematical Software, 2001
Cited by 51 (7 self)
This paper summarizes the BLAS Technical Forum Standard, a specification of a set of kernel routines for linear algebra, historically called the Basic Linear Algebra Subprograms and commonly known as the BLAS. The complete standard can be found in [1] and on the BLAS Technical Forum webpage (http://www.netlib.org/blas/blast-forum/).
Parallel tiled QR factorization for multicore architectures
2007
Cited by 49 (26 self)
As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated, or new algorithms developed, in order to take advantage of the architectural features of these new processors. Fine-grain parallelism becomes a major requirement and introduces the necessity of loose synchronization in the parallel execution of an operation. This paper presents an algorithm for the QR factorization where the operations can be represented as a sequence of small tasks that operate on square blocks of data. These tasks can be dynamically scheduled for execution based on the dependencies among them and on the availability of computational resources. This may result in an out-of-order execution of the tasks, which will completely hide the presence of intrinsically sequential tasks in the factorization. Performance comparisons are presented with the LAPACK algorithm for QR factorization, where parallelism can only be exploited at the level of the BLAS operations.
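The task-based view in this abstract can be made concrete with a small dependency sketch. The code below enumerates the tasks of a tile QR factorization on an nt x nt grid of tiles, derives dependencies from which data items each task touches, and groups tasks into waves that could run concurrently. It performs no numerical work, and the task names (`factor_diag`, `apply_diag`, `factor_pair`, `apply_pair`) and the data-item labels are illustrative assumptions rather than the paper's kernel names.

```python
from collections import defaultdict


def tile_qr_tasks(nt):
    """Yield (name, reads, updates) for tile QR on an nt x nt tile grid.

    ("A", i, j) is tile (i, j); ("V", k) stands for the reflectors
    produced when diagonal tile k is factored. Labels are illustrative.
    """
    for k in range(nt):
        yield f"factor_diag({k})", set(), {("A", k, k), ("V", k)}
        for j in range(k + 1, nt):
            # Apply the diagonal tile's reflectors to the rest of row k.
            yield f"apply_diag({k},{j})", {("V", k)}, {("A", k, j)}
        for i in range(k + 1, nt):
            # Couple tile (i, k) with the triangular factor in tile (k, k).
            yield f"factor_pair({i},{k})", set(), {("A", k, k), ("A", i, k)}
            for j in range(k + 1, nt):
                yield (f"apply_pair({i},{k},{j})",
                       {("A", i, k)}, {("A", k, j), ("A", i, j)})


def build_dependencies(tasks):
    """Map each task to the tasks that last updated any item it touches.

    Items that are read here are never updated again afterwards, so
    tracking only the last writer is enough for this task set.
    """
    last_writer, deps = {}, {}
    for name, reads, updates in tasks:
        deps[name] = {last_writer[item]
                      for item in reads | updates if item in last_writer}
        for item in updates:
            last_writer[item] = name
    return deps


def schedule_waves(deps):
    """Group tasks by earliest possible step (longest path from a source)."""
    level = {}
    for name in deps:  # generation order is already a topological order
        level[name] = 1 + max((level[d] for d in deps[name]), default=-1)
    waves = defaultdict(list)
    for name, lv in level.items():
        waves[lv].append(name)
    return [waves[lv] for lv in sorted(waves)]
```

For a 3 x 3 tile grid, `schedule_waves(build_dependencies(tile_qr_tasks(3)))` places `factor_diag(1)` in the same wave as `apply_pair(2,0,2)`: the next panel factorization can start while updates from the previous step are still running, which is the out-of-order overlap the abstract describes.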
Solving Algebraic Riccati Equations on Parallel Computers Using Newton's Method with Exact Line Search
1999
Cited by 48 (5 self)
We investigate the numerical solution of continuous-time algebraic Riccati equations via Newton's method on serial and parallel computers with distributed memory. We apply and extend the available theory for Newton's method endowed with exact line search to accelerate convergence. We also discuss a new stopping criterion based on recent observations regarding condition and error estimates. In each iteration step of Newton's method a stable Lyapunov equation has to be solved. We propose to solve these Lyapunov equations using iterative schemes for computing the matrix sign function. This approach can be efficiently implemented on parallel computers using ScaLAPACK. Numerical experiments on an IBM SP2 multicomputer report the accuracy, scalability, and speed-up of the implemented algorithms.
Sparse Elimination and Applications in Kinematics
1994
Cited by 47 (10 self)
This thesis proposes efficient algorithmic solutions to problems in computational algebra and computational algebraic geometry. Moreover, it considers their application to different areas where algebraic systems describe kinematic and geometric constraints. Given an arbitrary system of nonlinear multivariate polynomial equations, its resultant serves in eliminating variables and reduces root finding to a linear eigenproblem. Our contribution is to describe the first efficient and general algorithms for computing the sparse resultant. The sparse resultant generalizes the classical homogeneous resultant and exploits the structure of the given polynomials. Its size depends only on the geometry of the input Newton polytopes. The first algorithm uses a subdivision of the Minkowski sum and produces matrix...
Modeling Parallel Computers as Memory Hierarchies
In Proc. Programming Models for Massively Parallel Computers, 1993
Cited by 41 (6 self)
A parameterized generic model that captures the features of diverse computer architectures would facilitate the development of portable programs. Specific models appropriate to particular computers are obtained by specifying parameters of the generic model. A generic model should be simple, and for each machine that it is intended to represent, it should have a reasonably accurate specific model. The Parallel Memory Hierarchy (PMH) model of computation uses a single mechanism to model the costs of both interprocessor communication and memory hierarchy traffic. A computer is modeled as a tree of memory modules with processors at the leaves. All data movement takes the form of block transfers between children and their parents. This paper assesses the strengths and weaknesses of the PMH model as a generic model.
1 Introduction
The raw computing power of multiprocessor computers is exploding. The challenge is to create software that can take advantage of this computing power. The diversit...
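The tree-of-memory-modules idea in the PMH abstract can be illustrated with a small cost calculator. This is a minimal sketch under assumptions: the `Module` class, its `block_bytes` and `block_time` fields, and the additive cost rule are hypothetical simplifications chosen for illustration, not the parameterization used in the paper. Data moves only between a module and its parent, so a transfer between two leaves climbs to their lowest common ancestor and descends again.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List, Optional


@dataclass(eq=False)  # compare modules by identity; the tree is cyclic via parent
class Module:
    name: str
    block_bytes: int    # assumed: size of one block exchanged with the parent
    block_time: float   # assumed: time to move one such block to or from the parent
    children: List["Module"] = field(default_factory=list)
    parent: Optional["Module"] = field(default=None, repr=False)

    def add(self, child: "Module") -> "Module":
        child.parent = self
        self.children.append(child)
        return child


def ancestors(module: Module) -> List[Module]:
    """Return the chain [module, parent, ..., root]."""
    chain = [module]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain


def transfer_cost(src: Module, dst: Module, nbytes: int) -> float:
    """Cost of moving nbytes from src to dst via child-parent block transfers."""
    up, down = ancestors(src), ancestors(dst)
    on_src_path = {id(m) for m in up}
    # Lowest common ancestor: first module on dst's path that is also on src's.
    lca = next(m for m in down if id(m) in on_src_path)
    hops = []
    for chain in (up, down):
        for m in chain:
            if m is lca:
                break
            hops.append(m)  # one child-parent link per module below the LCA
    return sum(ceil(nbytes / m.block_bytes) * m.block_time for m in hops)
```

As a usage example, take a root memory with two caches below it (64-byte blocks, 10 time units each) and one processor under each cache (8-byte blocks, 1 time unit each). Moving 128 bytes between the two processors costs 16 + 20 + 20 + 16 = 72 units, while moving them from a processor to its own cache costs 16. The same mechanism prices both memory-hierarchy traffic and interprocessor communication.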

