Results 1–10 of 13
Choosing the Forcing Terms in an Inexact Newton Method
SIAM J. Sci. Comput., 1994
"... An inexact Newton method is a generalization of Newton's method for solving F(x) = 0, F:/ /, in which, at the kth iteration, the step sk from the current approximate solution xk is required to satisfy a condition ]lF(x) + F'(x)s]l _< /]lF(xk)]l for a "forcing term" / [0,1). In typical applications, ..."
Abstract

Cited by 94 (2 self)
An inexact Newton method is a generalization of Newton's method for solving F(x) = 0, F: R^n -> R^n, in which, at the kth iteration, the step s_k from the current approximate solution x_k is required to satisfy a condition ||F(x_k) + F'(x_k) s_k|| <= eta_k ||F(x_k)|| for a "forcing term" eta_k in [0,1). In typical applications, the choice of the forcing terms is critical to the efficiency of the method and can affect robustness as well. Promising choices of the forcing terms are given, their local convergence properties are analyzed, and their practical performance is shown on a representative set of test problems.
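The forcing-term condition in this abstract is easy to illustrate. Below is a minimal scalar sketch (a hypothetical illustration, not the paper's algorithm): the inner loop solves the Newton equation only approximately, by damped Richardson iteration, stopping as soon as the linear residual drops below eta times |f(x_k)|.

```python
def inexact_newton(f, fprime, x, eta=0.5, tol=1e-10, max_iter=100):
    """Scalar inexact Newton iteration: the step s_k only has to satisfy
    |f(x_k) + f'(x_k) s_k| <= eta * |f(x_k)|   (the forcing condition)."""
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            break
        J = fprime(x)
        # Inexact inner solve of J*s = -fx by damped Richardson iteration;
        # stop as soon as the forcing condition holds. Here eta is fixed,
        # whereas the paper studies adaptive choices of eta_k.
        s, omega = 0.0, 0.5 / J
        while abs(fx + J * s) > eta * abs(fx):
            s -= omega * (J * s + fx)
        x += s
    return x

# Example: f(x) = x^2 - 2 still converges to sqrt(2) with crude inner solves.
root = inexact_newton(lambda t: t * t - 2.0, lambda t: 2.0 * t, 1.5)
```

Even with eta = 0.5, i.e. each linear solve done to only 50% relative accuracy, the outer iteration converges; a smaller (or adaptively chosen) eta_k trades more inner work for fewer outer iterations.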
Recent computational developments in Krylov subspace methods for linear systems
Numer. Linear Algebra Appl., 2007
"... Many advances in the development of Krylov subspace methods for the iterative solution of linear systems during the last decade and a half are reviewed. These new developments include different versions of restarted, augmented, deflated, flexible, nested, and inexact methods. Also reviewed are metho ..."
Abstract

Cited by 48 (12 self)
Many advances in the development of Krylov subspace methods for the iterative solution of linear systems during the last decade and a half are reviewed. These new developments include different versions of restarted, augmented, deflated, flexible, nested, and inexact methods. Also reviewed are methods specifically tailored to systems with special properties such as special forms of symmetry and those depending on one or more parameters.
Differences in the effects of rounding errors in Krylov solvers for symmetric indefinite linear systems
1999
"... The 3term Lanczos process leads, for a symmetric matrix, to bases for Krylov subspaces of increasing dimension. The Lanczos basis, together with the recurrence coefficients, can be used for the solution of symmetric indefinite linear systems, by solving the reduced system in one way or another. Thi ..."
Abstract

Cited by 15 (0 self)
The 3-term Lanczos process leads, for a symmetric matrix, to bases for Krylov subspaces of increasing dimension. The Lanczos basis, together with the recurrence coefficients, can be used for the solution of symmetric indefinite linear systems by solving the reduced system in one way or another. This leads to well-known methods: MINRES, GMRES, and SYMMLQ. We will discuss in what way and to what extent these approaches differ in their sensitivity to rounding errors. In our analysis we will assume that the Lanczos basis is generated in exactly the same way for the different methods, and we will not consider the errors in the Lanczos process itself. We will show that the method of solution may lead, under certain circumstances, to large additional errors that are not corrected by continuing the iteration process. Our findings are supported and illustrated by numerical examples.
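The 3-term recurrence this abstract refers to can be sketched as follows. This is a hypothetical illustration of the plain Lanczos process (dense matvec, no reorthogonalization), not code from the paper; in floating point the orthogonality of the basis gradually decays, which is the starting point of the paper's rounding-error analysis.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def lanczos(A, v1, m):
    """3-term Lanczos recurrence for a symmetric matrix A:
    beta_j v_{j+1} = A v_j - alpha_j v_j - beta_{j-1} v_{j-1}.
    Returns the basis V plus the tridiagonal coefficients (alphas, betas)."""
    n = len(v1)
    nrm = dot(v1, v1) ** 0.5
    V = [[x / nrm for x in v1]]          # normalized starting vector
    alphas, betas = [], []
    v_prev, beta_prev = [0.0] * n, 0.0
    for j in range(m):
        # w = A v_j - beta_{j-1} v_{j-1}
        w = [wi - beta_prev * vp for wi, vp in zip(matvec(A, V[j]), v_prev)]
        alpha = dot(w, V[j])
        w = [wi - alpha * vj for wi, vj in zip(w, V[j])]
        alphas.append(alpha)
        beta = dot(w, w) ** 0.5
        if j + 1 < m:
            betas.append(beta)
            v_prev, beta_prev = V[j], beta
            V.append([wi / beta for wi in w])
    return V, alphas, betas
```

MINRES, GMRES, and SYMMLQ compared in the paper all consume this same basis and tridiagonal matrix; they differ only in how they solve the reduced system, which is exactly where their rounding-error behaviour diverges.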
Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy
ACM Trans. Math. Softw.
"... By using a combination of 32bit and 64bit floating point arithmetic the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techni ..."
Abstract

Cited by 13 (1 self)
By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techniques and sparse iterative techniques such as Krylov subspace methods. The approach presented here can apply not only to conventional processors but also to exotic technologies such as ...
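The single/double combination described here is the classic mixed precision iterative refinement pattern: factorize and solve in 32-bit, compute residuals in 64-bit, correct, repeat. A minimal sketch, assuming a toy 2x2 Cramer-rule solver and simulating 32-bit arithmetic by rounding every intermediate through Python's struct module (a hypothetical illustration, not the paper's implementation):

```python
import struct

def f32(x):
    # Round a 64-bit float to 32-bit precision and back, to mimic
    # single precision arithmetic in pure Python.
    return struct.unpack('f', struct.pack('f', x))[0]

def solve2x2_f32(A, b):
    # Toy "low precision solver": Cramer's rule on a 2x2 system with
    # every intermediate result rounded to 32-bit.
    a11, a12 = f32(A[0][0]), f32(A[0][1])
    a21, a22 = f32(A[1][0]), f32(A[1][1])
    b1, b2 = f32(b[0]), f32(b[1])
    det = f32(f32(a11 * a22) - f32(a12 * a21))
    x1 = f32(f32(f32(b1 * a22) - f32(a12 * b2)) / det)
    x2 = f32(f32(f32(a11 * b2) - f32(b1 * a21)) / det)
    return [x1, x2]

def mixed_precision_solve(A, b, iters=5):
    # Iterative refinement: cheap 32-bit solve, then repeatedly compute
    # the residual in full 64-bit precision and correct with another
    # 32-bit solve. The result reaches 64-bit accuracy.
    x = solve2x2_f32(A, b)
    for _ in range(iters):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
        d = solve2x2_f32(A, r)
        x = [x[i] + d[i] for i in range(2)]
    return x
```

For a well-conditioned system each refinement step shrinks the error by roughly the single precision unit roundoff, so a handful of iterations suffices; the expensive work (the solves) stays in the fast 32-bit format.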
BiCGstab(ℓ) and Other Hybrid BiCG Methods
1994
"... . It is wellknown that BiCG can be adapted so that the operations with A T can be avoided, and hybrid methods can be constructed in which it is attempted to further improve the convergence behaviour. Examples of this are CGS, BiCGSTAB, and the more general BiCGstab(`) method. In this paper it i ..."
Abstract

Cited by 11 (1 self)
It is well-known that BiCG can be adapted so that the operations with A^T can be avoided, and hybrid methods can be constructed that attempt to further improve the convergence behaviour. Examples of this are CGS, BiCGSTAB, and the more general BiCGstab(ℓ) method. In this paper it is shown that BiCGstab(ℓ) can be implemented in different ways. Each of the suggested approaches has its own advantages and disadvantages. Our implementations allow for combinations of BiCG with arbitrary polynomial methods. The choice for a specific implementation can also be made for reasons of numerical stability. This aspect receives much attention. Various effects have been illustrated by numerical examples. Key words: Bi-conjugate gradients, nonsymmetric linear systems, CGS, BiCGSTAB, iterative solvers, ORTHODIR, Krylov subspace. AMS subject classification: 65F10.
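For reference, a transpose-free hybrid BiCG method of the kind discussed here can be sketched as follows. This is a minimal dense BiCGSTAB (the ℓ = 1 case of BiCGstab(ℓ)), written as a hypothetical illustration rather than the paper's implementation:

```python
def bicgstab(A, b, tol=1e-10, max_iter=100):
    """Minimal dense BiCGSTAB: BiCG combined with a local one-step
    minimal-residual smoothing, requiring no products with A^T."""
    n = len(b)
    mv = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    axpy = lambda a, u, v: [a * ui + vi for ui, vi in zip(u, v)]
    x = [0.0] * n
    r = b[:]                    # residual for the zero initial guess
    rhat = r[:]                 # fixed shadow residual from BiCG
    rho = alpha = omega = 1.0
    p = [0.0] * n
    v = [0.0] * n
    for _ in range(max_iter):
        rho_new = dot(rhat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = axpy(beta, [pi - omega * vi for pi, vi in zip(p, v)], r)
        v = mv(p)
        alpha = rho_new / dot(rhat, v)
        s = axpy(-alpha, v, r)              # intermediate BiCG residual
        if dot(s, s) ** 0.5 < tol:          # converged at the half step
            x = axpy(alpha, p, x)
            break
        t = mv(s)
        omega = dot(t, s) / dot(t, t)       # local minimal-residual step
        x = axpy(alpha, p, axpy(omega, s, x))
        r = axpy(-omega, t, s)
        rho = rho_new
        if dot(r, r) ** 0.5 < tol:
            break
    return x
```

The stabilization parameter omega plays the role of the degree-1 polynomial factor; BiCGstab(ℓ) generalizes this to a degree-ℓ minimal-residual polynomial per cycle, which is where the implementation choices analyzed in the paper arise.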
Pipelined mixed precision algorithms on FPGAs for fast and accurate PDE solvers from low precision components
In IEEE Proceedings on Field-Programmable Custom Computing Machines (FCCM), 2006
"... FPGAs are becoming more and more attractive for high precision scientific computations. One of the main problems in efficient resource utilization is the quadratically growing resource usage of multipliers depending on the operand size. Many research efforts have been devoted to the optimization of ..."
Abstract

Cited by 9 (2 self)
FPGAs are becoming more and more attractive for high precision scientific computations. One of the main problems in efficient resource utilization is the quadratically growing resource usage of multipliers depending on the operand size. Many research efforts have been devoted to the optimization of individual arithmetic and linear algebra operations. In this paper we take a higher level approach and seek to reduce the intermediate computational precision on the algorithmic level by optimizing the accuracy towards the final result of an algorithm. In our case this is the accurate solution of partial differential equations (PDEs). Using the Poisson problem as a typical PDE example, we show that most intermediate operations can be computed with floats or even smaller formats and only very few operations (e.g. 1%) must be performed in double precision to obtain the same accuracy as a full double precision solver. Thus the FPGA can be configured with many parallel float rather than few resource-hungry double operations. To achieve this, we adapt the general concept of mixed precision iterative refinement methods to FPGAs and develop a fully pipelined version of the Conjugate Gradient solver. We combine this solver with different iterative refinement schemes and precision combinations to obtain resource efficient mappings of the pipelined algorithm core onto the FPGA.
Accelerating Scientific Computations with Mixed Precision Algorithms
2008
"... On modern architectures, the performance of 32bit operations is often at least twice as fast as the performance of 64bit operations. By using a combination of 32bit and 64bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanc ..."
Abstract

Cited by 4 (0 self)
On modern architectures, the performance of 32-bit operations is often at least twice that of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.
Exploiting Mixed Precision Floating Point Hardware in Scientific Computations
2007
"... By using a combination of 32bit and 64bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64bit accuracy of the resulting solution. The approach presented here can apply not only to conventional proc ..."
Abstract

Cited by 2 (0 self)
By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also ...
Preconditioning Strategies for Linear Systems Arising in Tire Design
1999
"... In this paper, we consider linear systems arising in static tire equilibrium computation. The heterogeneous material properties, nonlinear constraints, and a 3D finite element formulation make the linear systems arising in tire design difficult to solve by iterative methods. An analysis of matrix ..."
Abstract

Cited by 1 (1 self)
In this paper, we consider linear systems arising in static tire equilibrium computation. The heterogeneous material properties, nonlinear constraints, and a 3D finite element formulation make the linear systems arising in tire design difficult to solve by iterative methods. An analysis of matrix characteristics attempts to explain this negative effect. This paper focuses on two preconditioning techniques, a variation of an incomplete LU factorization with threshold and a multilevel recursive solver, that are able to improve the convergence of a suitable iterative accelerator. In particular, we compare these techniques and assess their applicability when the linear system difficulty varies for the same class of problems. The effect of altering the values of parameters such as the number of fill-in elements, block size, and number of levels is considered.