Results 1–10 of 37
Recent computational developments in Krylov subspace methods for linear systems
Numer. Linear Algebra Appl., 2007
"... Many advances in the development of Krylov subspace methods for the iterative solution of linear systems during the last decade and a half are reviewed. These new developments include different versions of restarted, augmented, deflated, flexible, nested, and inexact methods. Also reviewed are metho ..."
Abstract

Cited by 48 (12 self)
 Add to MetaCart
Many advances in the development of Krylov subspace methods for the iterative solution of linear systems during the last decade and a half are reviewed. These new developments include different versions of restarted, augmented, deflated, flexible, nested, and inexact methods. Also reviewed are methods specifically tailored to systems with special properties such as special forms of symmetry and those depending on one or more parameters.
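As a concrete illustration of one family the survey covers, here is a minimal restarted GMRES(m) sketch in NumPy (a toy implementation for illustration, not any particular method from the paper): the Arnoldi basis is discarded and rebuilt every `restart` steps, trading convergence speed for bounded memory.

```python
import numpy as np

def restarted_gmres(A, b, restart=20, tol=1e-8, max_restarts=50):
    """Minimal GMRES(m): rebuild the Krylov basis every `restart` steps."""
    n = b.size
    x = np.zeros(n)
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta <= tol * np.linalg.norm(b):
            break
        m = restart
        Q = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        Q[:, 0] = r / beta
        for j in range(m):              # Arnoldi process
            w = A @ Q[:, j]
            for i in range(j + 1):      # modified Gram-Schmidt
                H[i, j] = Q[:, i] @ w
                w -= H[i, j] * Q[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:     # lucky breakdown
                m = j + 1
                break
            Q[:, j + 1] = w / H[j + 1, j]
        # Minimize ||beta*e1 - H y|| and update the iterate.
        e1 = np.zeros(m + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H[: m + 1, : m], e1, rcond=None)
        x += Q[:, :m] @ y
    return x
```

The restart length is the memory/robustness knob: larger m converges in fewer products but stores more basis vectors, which is exactly the trade-off the restarted, augmented, and deflated variants in the survey try to improve.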
Automatic online tuning for fast Gaussian summation
"... Many machine learning algorithms require the summation of Gaussian kernel functions, an expensive operation if implemented straightforwardly. Several methods have been proposed to reduce the computational complexity of evaluating such sums, including tree and analysis based methods. These achieve va ..."
Abstract

Cited by 21 (10 self)
 Add to MetaCart
Many machine learning algorithms require the summation of Gaussian kernel functions, an expensive operation if implemented straightforwardly. Several methods have been proposed to reduce the computational complexity of evaluating such sums, including tree- and analysis-based methods. These achieve varying speedups depending on the bandwidth, dimension, and prescribed error, making the choice between methods difficult for machine learning tasks. We provide an algorithm that combines tree methods with the Improved Fast Gauss Transform (IFGT). As originally proposed, the IFGT suffers from two problems: (1) the Taylor series expansion does not perform well for very low bandwidths, and (2) parameter selection is not trivial and can drastically affect performance and ease of use. We address the first problem by employing a tree data structure, resulting in four evaluation methods whose performance varies based on the distribution of sources and targets and input parameters such as desired accuracy and bandwidth. To solve the second problem, we present an online tuning approach that results in a black-box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth. In addition, the new IFGT parameter selection approach allows for tighter error bounds. Our approach chooses the fastest method at negligible additional cost, and has superior performance in comparisons with previous approaches.
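For reference, the direct summation that these fast methods accelerate costs O(NM) for N sources and M targets; a naive NumPy baseline (function name and bandwidth convention are ours, not the paper's) makes the cost explicit.

```python
import numpy as np

def gauss_sum_direct(sources, targets, weights, h):
    """Naive O(N*M) evaluation of G(y_j) = sum_i w_i exp(-||y_j - x_i||^2 / h^2).
    Tree methods and the IFGT reduce this cost, with the fastest choice
    depending on the bandwidth h, the dimension, and the requested accuracy."""
    # Pairwise squared distances, shape (M, N), then a weighted kernel sum.
    d2 = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / h**2) @ weights
```

Any fast method is judged against this baseline: it must match its output to the prescribed error while beating its quadratic running time.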
On the occurrence of superlinear convergence of exact and inexact Krylov subspace methods
SIAM Rev., 2005
"... We present a general analytical model which describes the superlinear convergence of Krylov subspace methods. We take an invariant subspace approach, so that our results apply also to inexact methods, and to nondiagonalizable matrices. Thus, we provide a unified treatment of the superlinear conve ..."
Abstract

Cited by 20 (7 self)
 Add to MetaCart
We present a general analytical model that describes the superlinear convergence of Krylov subspace methods. We take an invariant subspace approach, so that our results also apply to inexact methods and to nondiagonalizable matrices. Thus, we provide a unified treatment of the superlinear convergence of GMRES, Conjugate Gradients, block versions of these, and inexact subspace methods. Numerical experiments illustrate the bounds obtained.
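The phenomenon itself is easy to reproduce (a toy illustration of superlinear convergence, not the paper's analytical model): run plain CG on a matrix whose spectrum is a tight cluster plus a few outlying eigenvalues; convergence is slow until the Krylov space has effectively captured the outliers, then accelerates.

```python
import numpy as np

def cg_residuals(A, b, tol=1e-10, maxiter=500):
    """Plain conjugate gradients, returning the residual-norm history."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    hist = [np.linalg.norm(r)]
    for _ in range(maxiter):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        hist.append(np.linalg.norm(r_new))
        if hist[-1] <= tol:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return x, hist

# Spectrum: a cluster near 1 plus a few large outliers. Early iterations
# deal with the outliers; once they are resolved, the effective condition
# number is that of the cluster and the residual plummets (superlinearity).
rng = np.random.default_rng(0)
evals = np.concatenate([np.linspace(0.9, 1.1, 95), [10.0, 20.0, 50.0, 100.0, 200.0]])
A = np.diag(evals)
b = rng.standard_normal(100)
x, hist = cg_residuals(A, b)
```

Plotting `hist` on a log scale shows the characteristic kink: a slow initial phase followed by a much steeper slope, far better than the single-rate bound based on the full condition number.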
Using mixed precision for sparse matrix computations to enhance the performance while achieving 64-bit accuracy
ACM Trans. Math. Softw.
"... By using a combination of 32bit and 64bit floating point arithmetic the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techni ..."
Abstract

Cited by 13 (1 self)
 Add to MetaCart
By using a combination of 32-bit and 64-bit floating-point arithmetic, the performance of many sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. These ideas can be applied to sparse multifrontal and supernodal direct techniques and to sparse iterative techniques such as Krylov subspace methods. The approach presented here can apply not only to conventional processors but also to exotic technologies such as …
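The core scheme is classical iterative refinement; a minimal dense sketch (a dense LU via SciPy stands in here for the sparse multifrontal/supernodal factorizations the paper targets): the expensive factorization runs in fast 32-bit arithmetic, and cheap 64-bit residual corrections recover full double-precision accuracy.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def mixed_precision_solve(A, b, refine_steps=5):
    """Factor once in float32 (the O(n^3) work), then refine in float64."""
    lu32 = lu_factor(A.astype(np.float32))               # low-precision factorization
    x = lu_solve(lu32, b.astype(np.float32)).astype(np.float64)
    for _ in range(refine_steps):
        r = b - A @ x                                    # residual in float64
        d = lu_solve(lu32, r.astype(np.float32))         # cheap float32 correction
        x += d.astype(np.float64)
    return x
```

For reasonably conditioned systems each sweep shrinks the error by roughly the float32 unit roundoff, so a handful of steps reaches 64-bit accuracy while the dominant cost ran at 32-bit speed (and half the memory traffic).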
Optical computing for fast light transport analysis
SIGGRAPH Asia, 2010
"... Figure 1: Our approach enables very efficient acquisition and analysis of light transport: to create the relighting results shown above, just forty low dynamic range photos were used to acquire 700Kpixel×100Kpixel transport matrices. Note the complex shadows cast by the hat (both sharp and soft), th ..."
Abstract

Cited by 12 (1 self)
 Add to MetaCart
Figure 1: Our approach enables very efficient acquisition and analysis of light transport: to create the relighting results shown above, just forty low dynamic range photos were used to acquire 700K-pixel × 100K-pixel transport matrices. Note the complex shadows cast by the hat (both sharp and soft), the complex highlights on the hair and the shadows it casts, and the many shadows, caustics, and indirect lighting effects in the glass scene. We used an optical implementation of the Arnoldi algorithm to do both photo acquisition and low-rank matrix approximation; the entire process (photo capture, matrix reconstruction, relighting) took four minutes on a standard PC for each scene. We present a general framework for analyzing the transport matrix of a real-world scene at full resolution, without capturing many photos. The key idea is to use projectors and cameras to directly acquire eigenvectors and the Krylov subspace of the unknown transport matrix. To do this, we implement Krylov subspace methods partially in optics, by treating the scene as a “black box subroutine” that enables optical computation of arbitrary matrix-vector products. We describe two methods: optical Arnoldi, to acquire a low-rank approximation of the transport matrix for relighting; and optical GMRES, to invert light transport. Our experiments suggest that good-quality relighting and transport inversion are possible from a few dozen low dynamic range photos, even for scenes with complex shadows, caustics, and other challenging lighting effects.
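The numerical core can be sketched independently of the optics (a NumPy toy, with a random symmetric low-rank matrix standing in for a real transport matrix): a k-step Arnoldi iteration that touches the matrix only through a matvec callback, just as each projector pattern and camera photo supplies one matrix-vector product in the paper's setup.

```python
import numpy as np

def arnoldi_lowrank(matvec, b, k):
    """k-step Arnoldi using only black-box matrix-vector products (the role
    played by projector/camera measurements in the optical setting).
    Returns Q, H satisfying A @ Q[:, :m] == Q @ H with m = H.shape[1],
    so Q @ H @ Q[:, :m].T is a rank-m approximation of the operator."""
    n = b.size
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    m = k
    for j in range(k):
        w = matvec(Q[:, j])                 # one "measurement"
        for i in range(j + 1):              # modified Gram-Schmidt
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-10:             # invariant subspace reached
            m = j + 1
            break
        Q[:, j + 1] = w / H[j + 1, j]
    return Q[:, : m + 1], H[: m + 1, : m]

# Toy "transport matrix": symmetric and exactly rank 5.
rng = np.random.default_rng(1)
U = rng.standard_normal((100, 5))
A = U @ U.T
b = rng.standard_normal(100)                # initial illumination vector
Q, H = arnoldi_lowrank(lambda v: A @ v, b, k=10)
A_hat = Q @ H @ Q[:, : H.shape[1]].T        # low-rank reconstruction
```

Because the matrix enters only through `matvec`, a few dozen products suffice to recover a low-rank operator, which is what makes the forty-photo acquisition budget plausible.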
Relaxation strategies for nested Krylov methods
Journal of Computational and Applied Mathematics, 2003
"... There are classes of linear problems for which the matrixvector product is a time consuming operation because an expensive approximation method is required to compute it to a given accuracy. In recent years di#erent authors have investigated the use of, what is called, relaxation strategies for ..."
Abstract

Cited by 11 (0 self)
 Add to MetaCart
There are classes of linear problems for which the matrix-vector product is a time-consuming operation because an expensive approximation method is required to compute it to a given accuracy. In recent years, different authors have investigated the use of so-called relaxation strategies for various Krylov subspace methods. These relaxation strategies aim to minimize the amount of work spent in the computation of the matrix-vector product without compromising the accuracy of the method or the convergence speed too much. In order to achieve this goal, the accuracy of the matrix-vector product is decreased as the iterative process comes closer to the solution. In this paper we show that a further significant reduction in computing time can be obtained by combining a relaxation strategy with the nesting of inexact Krylov methods. Flexible Krylov subspace methods allow variable preconditioning and can therefore be used in the outermost loop of our overall method. We analyze, for several flexible Krylov methods, strategies for controlling the accuracy of both the inexact matrix-vector products and the inner iterations. The results of our analysis are illustrated with an example that models global ocean circulation.
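The relaxation idea can be caricatured with a toy Richardson iteration (our illustration, not the paper's algorithm or its nested Krylov setting): the allowed error in each matrix-vector product is inversely proportional to the current residual norm, so products are accurate early and increasingly cheap near convergence, yet the true residual still lands near the target tolerance.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
A = 2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
b_norm = np.linalg.norm(b)

tol = 1e-10
x = np.zeros(n)
r = b.copy()
for _ in range(1000):
    r_norm = np.linalg.norm(r)
    if r_norm <= tol * b_norm:
        break
    # Relaxation rule: allowed relative matvec error ~ tol / ||r_k||.
    eta = 0.1 * tol * b_norm / r_norm
    p = r
    g = rng.standard_normal(n)
    g /= np.linalg.norm(g)
    Ap_inexact = A @ p + eta * np.linalg.norm(p) * g   # perturbed product
    x = x + 0.4 * p
    r = r - 0.4 * Ap_inexact                           # recursively updated residual

# Despite the deliberately inexact products, the true residual is near tol.
true_rel_res = np.linalg.norm(b - A @ x) / b_norm
```

The key observation, mirrored here, is that a matvec error of relative size tol/||r_k|| perturbs the recursive residual by only about tol per step, so total drift stays a modest multiple of the target tolerance.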
Fast radial basis function interpolation via preconditioned Krylov iteration
"... Abstract. We consider a preconditioned Krylov subspace iterative algorithm presented by Faul, Goodsell, and Powell (IMA J. Numer. Anal. 25 (2005), pp. 1–24) for computing the coefficients of a radial basis function interpolant over N data points. This preconditioned Krylov iteration has been demonst ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
We consider a preconditioned Krylov subspace iterative algorithm presented by Faul, Goodsell, and Powell (IMA J. Numer. Anal. 25 (2005), pp. 1–24) for computing the coefficients of a radial basis function interpolant over N data points. This preconditioned Krylov iteration has been demonstrated to be extremely robust to the distribution of the points, and the iteration converges rapidly. However, the iterative method has several steps whose computational and memory costs scale as O(N²), both in the preliminary computations that build the preconditioner and in the matrix-vector product involved in each step of the iteration. We accelerate the iterative method to achieve an overall cost of O(N log N). The matrix-vector product is accelerated via the fast multipole method. The preconditioner requires the computation of a set of closest points to each point; we develop an O(N log N) algorithm for this step as well. Results are presented for multiquadric interpolation in ℝ² and biharmonic interpolation in ℝ³. A novel FMM algorithm for the evaluation of sums involving multiquadric functions in ℝ² is presented as well.
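The closest-points step can be done in O(N log N) with a standard space-partitioning tree; a sketch using SciPy's k-d tree (our choice of data structure for illustration; the paper's own O(N log N) algorithm may differ):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
points = rng.random((1000, 2))        # N interpolation centers in R^2

tree = cKDTree(points)                # O(N log N) construction
# q nearest neighbors of every center; the first hit is the point itself.
q = 30
dist, idx = tree.query(points, k=q + 1)
neighbors = idx[:, 1:]                # drop self, keep the q closest points
```

Each query costs O(log N) on average, so building the per-point neighbor sets the preconditioner needs totals O(N log N) instead of the O(N²) of all-pairs distances.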
Faulttolerant iterative methods via selective reliability
2011
"... Current iterative methods for solving linear equations assume reliability of data (no “bit flips”) and arithmetic (correct up to rounding error). If faults occur, the solver usually either aborts, or computes the wrong answer without indication. System reliability guarantees consume energy or reduce ..."
Abstract

Cited by 7 (0 self)
 Add to MetaCart
Current iterative methods for solving linear equations assume reliability of data (no “bit flips”) and arithmetic (correct up to rounding error). If faults occur, the solver usually either aborts or computes the wrong answer without indication. System reliability guarantees consume energy or reduce performance. As processor counts continue to grow, these costs will become unbearable. Instead, we show that if the system lets applications apply reliability selectively, we can develop iterations that compute the right answer despite faults. These “fault-tolerant” methods either converge eventually, at a rate that degrades gracefully with increased fault rate, or return a clear failure indication in the rare case that they cannot converge. If faults are infrequent, these algorithms spend most of their time in unreliable mode. This can save energy, improve performance, and avoid restarting from checkpoints. We illustrate convergence for a sample algorithm, Fault-Tolerant GMRES, on representative test problems and fault rates.
Fast large-scale Gaussian process regression using approximate matrix-vector products
Presented at the Learning Workshop, 2007
"... Gaussian processes (GP) allow the treatment of nonlinear nonparametric regression problems in a Bayesian framework [6]. Unfortunately its nonparametric nature causes computational problems for large data sets, due to an unfavorable O(N 3) time and O(N 2) memory scaling for training. The key comput ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
Gaussian processes (GPs) allow the treatment of nonlinear nonparametric regression problems in a Bayesian framework [6]. Unfortunately, their nonparametric nature causes computational problems for large data sets, due to an unfavorable O(N³) time and O(N²) memory scaling for training. The key computational task involves inversion of an N × N covariance matrix K + σ²I, where [K]_ij = K(x_i, x_j), K is the covariance function of the GP, and σ² is the noise variance. Direct computation of the inverse requires O(N³) operations and O(N²) storage, which is impractical even for problems of moderate size (typically a few thousand points). An important subfield of work on GPs has attempted to bring this scaling down to O(m²N) by making sparse …
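Rather than inverting K + σ²I, the system (K + σ²I)α = y can be solved by conjugate gradients, which touches K only through matrix-vector products; those O(N²) products are exactly what fast approximate schemes then replace. A dense-NumPy sketch (kernel choice, bandwidth, and function names are ours):

```python
import numpy as np

def gauss_kernel(X, Z, h=1.0):
    """Squared-exponential kernel matrix (one common GP covariance choice)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * h * h))

def gp_weights_cg(X, y, sigma2, h=1.0, tol=1e-8, maxiter=2000):
    """Solve (K + sigma^2 I) alpha = y by CG. Each iteration needs one
    product with K -- O(N^2) dense here, but replaceable by a fast
    approximate matrix-vector product, which is the point of the paper."""
    K = gauss_kernel(X, X, h)
    matvec = lambda v: K @ v + sigma2 * v
    alpha = np.zeros_like(y)
    r = y - matvec(alpha)
    p = r.copy()
    y_norm = np.linalg.norm(y)
    for _ in range(maxiter):
        Ap = matvec(p)
        a = (r @ r) / (p @ Ap)
        alpha = alpha + a * p
        r_new = r - a * Ap
        if np.linalg.norm(r_new) <= tol * y_norm:
            break
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return alpha
```

Since K + σ²I is symmetric positive definite (eigenvalues bounded below by σ²), CG is the natural solver; swapping the dense product for an approximate transform drops the per-iteration cost from O(N²) toward O(N).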