Parallel Numerical Linear Algebra
, 1993
Cited by 575 (26 self)
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then, we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, and the singular value decomposition. We consider dense, band and sparse matrices.
A proven correctly rounded logarithm in double-precision
 In Real Numbers and Computers, Schloss Dagstuhl
, 2004
Cited by 24 (10 self)
This article is a case study in the implementation of a portable, proven and efficient correctly rounded elementary function in double-precision. We describe the methodology used to achieve these goals in the crlibm library. There are two novel aspects to this approach. The first is the proof framework, and in general the techniques used to balance performance and provability. The second is the introduction of processor-specific optimization to get performance equivalent to the best current mathematical libraries, while trying to minimize the proof work. The implementation of the natural logarithm is detailed to illustrate these questions. Mathematics Subject Classification. 26-04, 65D15, 65Y99.
Assisted verification of elementary functions using Gappa
 In Proceedings of the 2006 ACM symposium on Applied computing
, 2006
Cited by 22 (8 self)
The implementation of a correctly rounded or interval elementary function needs to be proven carefully, down to the last detail. The proof requires a tight bound on the overall error of the implementation with respect to the mathematical function. Such work is function-specific, concerns tens of lines of code for each function, and will usually be broken by the smallest change to the code (e.g. for maintenance or optimization purposes). Therefore, it is very tedious and error-prone if done by hand. This article discusses the use of the Gappa proof assistant in this context. Gappa has two main advantages over previous approaches: its input format is very close to the actual C code to validate, and it automates error evaluation and propagation using interval arithmetic. Besides, it can be used to incrementally prove complex mathematical properties pertaining to the C code. Yet it does not require any specific knowledge about automatic theorem proving, and thus is accessible to a wider community. Moreover, Gappa may generate a formal proof of the results that can be checked independently by a lower-level proof assistant like Coq, hence providing an even higher confidence in the certification of the numerical code.
Towards the post-ultimate libm
, 2005
Cited by 17 (10 self)
This article presents advances on the subject of correctly rounded elementary functions since the publication of the libultim mathematical library developed by Ziv at IBM. This library showed that the average performance and memory overhead of correct rounding could be made negligible. However, the worst-case overhead was still a factor of 1000 or more. It is shown here that, with current processor technology, this worst-case overhead can be kept within a factor of 2 to 10 of current best libms. This low overhead has very positive consequences on the techniques for implementing and proving correctly rounded functions, which are also studied. These results lift the last technical obstacles to a generalisation of (at least some) correctly rounded double-precision elementary functions.
Some functions computable with a fused-mac
 in Proceedings of the 17th Symposium on Computer Arithmetic, P. Montuschi and E. Schwarz, Eds., Cape Cod
, 2005
Cited by 16 (7 self)
The fused multiply accumulate instruction (fused-mac) that is available on some current processors such as the PowerPC or the Itanium eases some calculations. We give examples of some floating-point functions (such as ulp(x) or Nextafter(x, y)), or some useful tests, that are easily computable using a fused-mac. Then, we show that, with rounding to the nearest, the error of a fused-mac instruction is exactly representable as the sum of two floating-point numbers. We give an algorithm that computes that error.
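The representability claim in this abstract can be checked numerically. The following is a minimal sketch, not the paper's algorithm: it emulates a round-to-nearest fused-mac with exact rational arithmetic instead of a hardware instruction, and the name `fma_with_error` is hypothetical. It assumes no overflow or underflow occurs.

```python
from fractions import Fraction

def fma_with_error(a, x, y):
    """Emulate r1 = RN(a*x + y) and split the rounding error into r2 + r3.

    Illustration only: exact rationals stand in for a hardware fused-mac.
    Returns (r1, r2, r3, ok), where ok is True when r1 + r2 + r3 equals
    a*x + y exactly, i.e. the error is a sum of two doubles.
    """
    exact = Fraction(a) * Fraction(x) + Fraction(y)  # no rounding here
    r1 = float(exact)             # correctly rounded to nearest double
    err = exact - Fraction(r1)    # exact error of the fused-mac
    r2 = float(err)               # leading part of the error
    rest = err - Fraction(r2)     # what is left after removing r2
    r3 = float(rest)              # trailing part of the error
    return r1, r2, r3, rest == Fraction(r3)

if __name__ == "__main__":
    a = 1.0 + 2.0 ** -52
    print(fma_with_error(a, a, 2.0 ** -200))
```

In the printed case the exact value is 1 + 2^-51 + 2^-104 + 2^-200, so the error needs both terms: neither 2^-104 nor 2^-200 alone captures it, but their sum does.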
Trading off Parallelism and Numerical Stability
, 1992
Cited by 13 (5 self)
The fastest parallel algorithm for a problem may be significantly less stable numerically than the fastest serial algorithm. We illustrate this phenomenon by a series of examples drawn from numerical linear algebra. We also show how some of these instabilities may be mitigated by better floating point arithmetic.
Fast and Accurate Floating Point Summation with Application to Computational Geometry
 Numerical Algorithms
, 2002
Cited by 11 (0 self)
We present several simple algorithms for accurately computing the sum of n floating point numbers using a wider accumulator. Let f and F be the number of significant bits in the summands and the accumulator, respectively. Then assuming gradual underflow, no overflow, and round-to-nearest arithmetic, up to $\lfloor 2^{F-f}/(1-2^{-f}) \rfloor + 1$ numbers can be accurately added by just summing the terms in decreasing order of exponents, yielding a sum correct to within about 1.5 units in the last place. In particular, if the sum is zero, it is computed exactly. We apply this result to the floating point formats in the IEEE floating point standard, and investigate its performance. Our results show that in the absence of massive cancellation (the most common case) the cost of guaranteed accuracy is about 30-40% more than the straightforward summation. If massive cancellation does occur, the cost of computing the accurate sum is about a factor of ten. Finally we apply our algorithm in computing a robust geometric predicate (used in computational geometry), where our accurate summation algorithm improves the existing algorithm by a factor of two on a nearly coplanar set of points.
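The basic scheme in this abstract, summing in decreasing order of exponents into a wider accumulator, can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: summands are rounded to IEEE single precision (f = 24), the accumulator is a Python double (F = 53), the names `to_f32`, `max_terms` and `sum_by_decreasing_exponent` are hypothetical, and the term limit is read as floor(2^(F-f) / (1 - 2^(-f))) + 1.

```python
import math
import struct

def to_f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def max_terms(f, F):
    """Term limit floor(2^(F-f) / (1 - 2^(-f))) + 1, in integer arithmetic.

    Uses the identity 2^(F-f) / (1 - 2^(-f)) = 2^F / (2^f - 1).
    """
    return (1 << F) // ((1 << f) - 1) + 1

def sum_by_decreasing_exponent(values):
    """Sum single-precision summands in a double-precision accumulator."""
    summands = [to_f32(v) for v in values if v != 0.0]
    if len(summands) > max_terms(24, 53):
        raise ValueError("too many summands for the accuracy bound")
    # math.frexp returns (mantissa, exponent); largest exponents go first.
    ordered = sorted(summands, key=lambda v: math.frexp(v)[1], reverse=True)
    acc = 0.0  # the wider accumulator
    for v in ordered:
        acc += v
    return acc

if __name__ == "__main__":
    print(max_terms(24, 53))
    print(sum_by_decreasing_exponent([1e10, 1.0, -1e10, -1.0]))
```

Sorting on the exponent alone, rather than on magnitude, matches the ordering named in the abstract; ties keep their input order.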
Validated roundings of dot products by sticky accumulation
 IEEE Trans Comput
, 1997
Certifying the floating-point implementation of an elementary function using Gappa
 IEEE Transactions on Computers
, 2011
Cited by 8 (3 self)
High confidence in floating-point programs requires proving numerical properties of final and intermediate values. One may need to guarantee that a value stays within some range, or that the error relative to some ideal value is well bounded. This certification may require a time-consuming proof for each line of code, and it is usually broken by the smallest change to the code, e.g., for maintenance or optimization purposes. Certifying floating-point programs by hand is, therefore, very tedious and error-prone. The Gappa proof assistant is designed to make this task both easier and more secure, due to the following novel features: It automates the evaluation and propagation of rounding errors using interval arithmetic. Its input format is very close to the actual code to validate. It can be used incrementally to prove complex mathematical properties pertaining to the code. It generates a formal proof of the results, which can be checked independently by a lower-level proof assistant like Coq. Yet it does not require any specific knowledge about automatic theorem proving, and thus is accessible to a wide community. This paper demonstrates the practical use of this tool for a widely used class of floating-point programs: implementations of elementary functions in a mathematical library.
Floats & Ropes: a case study for formal numerical program verification
, 2009
Cited by 8 (5 self)
We present a case study of a formal verification of a numerical program that computes the discretization of a simple partial differential equation. Bounding the rounding error was tricky because the usual approach, bounding the absolute value of the error at each step, fails. Our idea is to find a precise analytical expression that cancels with itself at the next step, and to formally prove the correctness of this approach.