Results 1–10 of 31
Self-Adapting Software for Numerical Linear Algebra and LAPACK for Clusters
Parallel Computing, 2003
Abstract

Cited by 24 (12 self)
This article describes the context, design, and recent development of the LAPACK for Clusters (LFC) project. It has been developed in the framework of Self-Adapting Numerical Software (SANS), since we believe such an approach can deliver the convenience and ease of use of existing sequential environments bundled with the power and versatility of highly-tuned parallel codes that execute on clusters. Accomplishing this task is far from trivial, as we argue in the paper by presenting pertinent case studies and possible usage scenarios.
Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy
ACM Trans. Math. Softw.
Abstract

Cited by 20 (1 self)
By using a combination of 32-bit and 64-bit floating-point arithmetic, the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techniques and sparse iterative techniques such as Krylov subspace methods. The approach presented here can apply not only to conventional processors but also to exotic technologies such as …
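The refinement loop behind this abstract can be sketched in a few lines: factor and solve in 32-bit, then iterate on the 64-bit residual. This is a minimal dense sketch of the general mixed-precision idea, assuming NumPy's dense solver as a stand-in; it is not the paper's sparse multifrontal or supernodal implementation, and the function name is invented.

```python
import numpy as np

def mixed_precision_solve(A, b, tol=1e-12, max_iter=10):
    """Solve Ax = b by low-precision solves plus 64-bit refinement.

    Illustrative sketch only: a dense float32 solve stands in for the
    sparse low-precision factorization described in the abstract.
    """
    A32 = A.astype(np.float32)
    # Initial solve entirely in float32, promoted back to float64.
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(max_iter):
        r = b - A @ x  # residual computed in full 64-bit precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction solved cheaply in float32, applied in float64.
        d = np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
        x += d
    return x
```

On a well-conditioned system, a couple of refinement steps recover full 64-bit accuracy even though every factorization-sized operation ran in 32-bit.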
Computational quality of service for scientific CCA applications: Composition, substitution, and reconfiguration
Argonne National Laboratory, 2006
Abstract

Cited by 17 (7 self)
Abstract. Component-based design can help manage the complexity of high-performance scientific simulations, where it has become increasingly clear that no single research group can effectively develop, select, or tune all of the components in a given application, and that no single tool, solver, or solution strategy can seamlessly span the entire spectrum efficiently. Component approaches augment the benefits of object-oriented design with programming language interoperability, common interfaces, and dynamic composability. Our work addresses the challenge of how to compose, substitute, and reconfigure components dynamically during the execution of a scientific application. The goal is to make suitable compromises among performance, accuracy, mathematical consistency, and reliability when choosing among available component implementations and parameters. As motivated by high-performance simulations in combustion, quantum chemistry, and accelerator modeling, this paper discusses ideas on computational quality of service (CQoS): the automatic selection and configuration of components to suit a particular computational purpose. We discuss the synergy between component-based software design and CQoS, with emphasis on features of the Common Component Architecture that provide the foundation for this work. We introduce the design of our CQoS software, which consists of tools for measurement, analysis, and control infrastructure, and …
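The selection step at the heart of CQoS can be illustrated with a toy component registry: implementations declare when they apply and at what cost, and the runtime picks the cheapest applicable one. Everything below (the registry, predicates, cost numbers, and solver names) is a hypothetical sketch, not the Common Component Architecture's actual interfaces.

```python
# Toy CQoS-style selection: choose the cheapest registered component
# whose applicability predicate accepts the problem description.
REGISTRY = []

def register(predicate, cost):
    """Decorator recording (predicate, cost, component) triples."""
    def wrap(component):
        REGISTRY.append((predicate, cost, component))
        return component
    return wrap

def select(problem):
    """Return the lowest-cost component applicable to `problem`."""
    candidates = [(cost, comp) for pred, cost, comp in REGISTRY
                  if pred(problem)]
    if not candidates:
        raise LookupError("no applicable component")
    return min(candidates, key=lambda t: t[0])[1]

# Two illustrative solver components: CG only applies to symmetric
# problems but is cheaper; GMRES applies to everything.
@register(lambda p: p.get("symmetric", False), cost=1.0)
def cg_solver(problem):
    return "cg"

@register(lambda p: True, cost=3.0)
def gmres_solver(problem):
    return "gmres"
```

Dynamic substitution and reconfiguration then amount to re-running `select` as the problem's measured properties change during the simulation.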
An efficient block variant of GMRES
SIAM J. Sci. Comput.
Abstract

Cited by 10 (2 self)
Abstract. We present an alternative to the standard restarted GMRES algorithm for solving a single right-hand side linear system Ax = b, based on solving the block linear system AX = B. Additional starting vectors and right-hand sides are chosen to accelerate convergence. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors, or “multivectors,” thereby reducing the movement of A through memory. The efficient implementation of our method depends on a fast matrix-multivector multiply routine. We present numerical results that show that the time to solution of the new method is up to two and a half times faster than that of restarted GMRES on preconditioned problems. Furthermore, we demonstrate the impact of implementation choices on data movement and, as a result, algorithm performance.
Key words. GMRES, block GMRES, iterative methods, Krylov subspace techniques, restart, nonsymmetric linear systems, memory access costs
AMS subject classifications. 65F10
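The memory-traffic argument for the matrix-multivector product can be seen in miniature: applying A to s vectors one at a time streams A through memory s times, while a single block product reads each entry of A once and reuses it across all columns. This is only a NumPy illustration of the kernel the paper relies on, not its implementation; the function names are invented.

```python
import numpy as np

def matvec_loop(A, V):
    """s separate matrix-vector products: A is streamed through
    memory once per column of V (a sequence of level-2 operations)."""
    return np.stack([A @ V[:, j] for j in range(V.shape[1])], axis=1)

def matvec_block(A, V):
    """One matrix-multivector product: each entry of A is loaded once
    and reused across all s columns (a single level-3 operation)."""
    return A @ V
```

Both forms produce identical results; the block form is what makes recasting Ax = b as AX = B pay off once memory access, not arithmetic, dominates the cost.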
On improving linear solver performance: A block variant of GMRes
 SIAM Journal on Scientific Computing
Abstract

Cited by 8 (1 self)
Abstract. The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system Ax = b, based on solving the block linear system AX = B. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
Performance optimization and modeling of blocked sparse kernels
2005
Abstract

Cited by 8 (1 self)
We present a method for automatically selecting optimal implementations of sparse matrix-vector operations. Our software “AcCELS” (Accelerated Compress-storage Elements for Linear Solvers) involves a setup phase that probes machine characteristics, and a runtime phase where stored characteristics are combined with a measure of the actual sparse matrix to find the optimal kernel implementation. We present a performance model that is shown to be accurate over a large range of matrices.
Key words: optimization, sparse, matrix-vector product, blocking, self-adaptivity
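Register blocking of the kind such tuners search over can be sketched with a hand-rolled block-CSR (BCSR) multiply: the matrix is stored as small dense r-by-c blocks, so the inner loop works on contiguous data and reuses each loaded x segment across a whole block. The storage layout below is a generic illustration, assuming a simple list-of-blocks representation, not AcCELS's actual data structure.

```python
import numpy as np

def bcsr_spmv(indptr, indices, blocks, x, r, c):
    """y = A x for A stored in r-by-c block CSR (BCSR) format.

    blocks[k] is the dense r-by-c block sitting in block column
    indices[k]; indptr delimits the blocks of each block row.
    """
    nbrows = len(indptr) - 1
    y = np.zeros(nbrows * r)
    for i in range(nbrows):
        acc = np.zeros(r)
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            # Small dense block times a contiguous slice of x:
            # this is the kernel that blocking makes cache-friendly.
            acc += blocks[k] @ x[j * c:(j + 1) * c]
        y[i * r:(i + 1) * r] = acc
    return y
```

The tuning question the abstract addresses is choosing r and c: larger blocks improve register and cache reuse but store explicit zeros wherever the matrix's nonzero pattern does not fill a block.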
A proposed standard for numerical metadata
2003
Abstract

Cited by 5 (3 self)
We propose a standard for generating and storing metadata describing numerical problems, in particular properties of matrices and linear systems. The standard comprises a storage and a generation component. The storage consists of an XML file format and an internal data format with various access routines; the generation standard describes a format for software that produces metadata. We give the abstract description of the XML storage format, APIs (Application Programmer Interfaces) for generating and storing metadata, and a core set of categories of data to be stored, and software to generate them. The standard defines an open-ended format, allowing for other parties to define additional metadata categories to be generated and stored within this framework.

1 General discussion

Matrix storage formats, both file formats and data structures, traditionally limit themselves to specifying only the minimally necessary description of the data: the matrix size, and the matrix elements themselves, with a fairly explicit description of the nonzero structure for sparse matrices. However, we can associate with matrix data any number of derived properties, such as norms, spectral properties, or graph properties in the sparse case. There is no standard way of generating and storing such data, making interoperability hard between software …
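A generator in the spirit of this proposal might compute a few derived properties and emit them as XML. The element and attribute names below are invented for illustration; the proposed standard defines its own schema and category set.

```python
import xml.etree.ElementTree as ET
import numpy as np

def matrix_metadata_xml(A, name):
    """Emit hypothetical XML metadata for a dense matrix A.

    Tag names here are illustrative, not the standard's schema.
    """
    root = ET.Element("matrix-metadata", name=name)
    ET.SubElement(root, "size").text = "%d %d" % A.shape
    # Derived properties of the kind the paper proposes storing:
    ET.SubElement(root, "norm", type="frobenius").text = \
        repr(float(np.linalg.norm(A)))
    ET.SubElement(root, "symmetric").text = str(bool(np.allclose(A, A.T)))
    return ET.tostring(root, encoding="unicode")
```

Storing such properties alongside the matrix means downstream tools (solver selectors, preconditioner heuristics) need not recompute them.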
Visual tracking employing Maple code generation
In Maple Summer Workshop, 2004
Abstract

Cited by 2 (2 self)
Closed-loop, model-based visual target recognition and tracking in extreme lighting conditions is expensive to develop and computationally resource-intensive. To reduce the development cycle, improve software reliability, and reduce the computational requirements for such algorithms, we have adapted Maple code generation to the problem of automatically generating efficient implementations of families of Newton solvers, each of which estimates a set of related parameters in a target model. We describe the leading target model in detail, formulate target identification as an optimization problem, explain the challenges in solving this model and the resulting need for multiple solvers, and the main advantage provided by code generation in Maple. We also discuss the problem of partially saturated images in this scheme and our approach to solving it, and desirable features for a future version of Maple which would improve the applicability to similar application domains and simplify the implementation of code generators.
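The code-generation idea, transplanted from Maple into Python for the sake of a runnable sketch: each solver in a family is produced by splicing expression strings for the residual f and its derivative f' into a template and compiling the result. The template and function names are assumptions for illustration, not the authors' Maple generators.

```python
# A Python stand-in for Maple-style code generation: a specialized
# Newton solver is generated from the expressions `f` and `df`
# (both written in terms of the variable x), then compiled.
TEMPLATE = """\
def {name}(x, tol=1e-12, max_iter=50):
    for _ in range(max_iter):
        fx = {f}
        if abs(fx) <= tol:
            break
        x = x - fx / ({df})
    return x
"""

def generate_newton_solver(name, f, df):
    """Instantiate the template and return the compiled solver."""
    namespace = {}
    exec(TEMPLATE.format(name=name, f=f, df=df), namespace)
    return namespace[name]
```

For example, `generate_newton_solver("solve_sqrt2", "x*x - 2.0", "2.0*x")` yields a solver that converges to the square root of 2 from the starting guess 1.0; in the paper's setting, each member of the solver family is specialized to a different subset of target-model parameters.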
A Proposed Standard for Matrix Metadata
2003
Abstract

Cited by 2 (0 self)
We propose a standard for storing metadata describing numerical matrix data. The standard consists of an XML file format and an internal data format. We give the abstract description of the XML storage format, APIs (Application Programmer Interfaces) for access to the stored data inside a program, and a core set of categories of data to be stored. The standard defines an open-ended format, allowing for other parties to define additional metadata categories to be stored within this framework.
Case Studies in Model Manipulation for Scientific Computing
2008
Abstract

Cited by 2 (1 self)
The same methodology is used to develop three different applications. We begin by using a very expressive, appropriate Domain Specific Language to write down precise problem definitions, using their most natural formulation. Once defined, the problems form an implicit definition of a unique solution. From the problem statement, our model, we use mathematical transformations to make the problem simpler to solve computationally. We call this crucial step “model manipulation.” With the model rephrased in more computational terms, we can also derive various quantities directly from this model, which greatly simplify traditional numeric solutions, our eventual goal. From all this data, we then use standard code generation and code transformation techniques to generate lower-level code to perform the final numerical steps. This methodology is very flexible, generates faster code, and generates code that would have been all but impossible for a human programmer to get correct.