Results 1–10 of 121
Efficient spectral-Galerkin methods III. Polar and cylindrical geometries
SIAM J. Sci. Comput., 1995
Cited by 98 (36 self)
Abstract. Efficient direct solvers based on the Chebyshev-Galerkin methods for second- and fourth-order equations are presented. They are based on appropriate basis functions for the Galerkin formulation which lead to discrete systems with specially structured matrices that can be efficiently inverted. Numerical results indicate that the direct solvers presented in this paper are significantly more accurate and efficient than those based on the Chebyshev-tau method.
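The key idea can be illustrated on a 1-D model problem: choose basis functions that satisfy the boundary conditions built in, e.g. phi_k = T_k - T_{k+2} for homogeneous Dirichlet data. The sketch below is a minimal NumPy illustration under that assumption, not the paper's optimized solver: it assembles the Galerkin matrices by Gauss-Chebyshev quadrature instead of exploiting their closed-form sparse structure.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Solve -u'' = f on [-1, 1] with u(-1) = u(1) = 0 in the Chebyshev-Galerkin
# basis phi_k = T_k - T_{k+2}, which vanishes at both endpoints. Matrices
# are assembled by Gauss-Chebyshev quadrature (exact for these polynomial
# integrands); the paper instead inverts the specially structured matrices
# directly.
N = 32                                    # highest Chebyshev degree used
M = 2 * N                                 # quadrature nodes
x = np.cos((2 * np.arange(M) + 1) * np.pi / (2 * M))   # Gauss-Chebyshev nodes
wq = np.pi / M                            # uniform Gauss-Chebyshev weight

def phi(k):                               # Chebyshev coefficients of T_k - T_{k+2}
    c = np.zeros(k + 3)
    c[k], c[k + 2] = 1.0, -1.0
    return c

nb = N - 1                                # basis functions phi_0 .. phi_{N-2}
P  = np.array([C.chebval(x, phi(k)) for k in range(nb)])               # phi_k(x_i)
D2 = np.array([C.chebval(x, C.chebder(phi(k), 2)) for k in range(nb)]) # phi_k''(x_i)

f = lambda t: np.pi**2 * np.sin(np.pi * t)   # manufactured: exact solution sin(pi t)
A = -wq * P @ D2.T                        # A[j, k] = -(phi_k'', phi_j) in the weighted inner product
b = wq * P @ f(x)                         # b[j] = (f, phi_j)
coef = np.linalg.solve(A, b)
u = lambda t: sum(ck * C.chebval(t, phi(k)) for k, ck in enumerate(coef))
```

The boundary conditions hold exactly by construction, and the error at interior points decays spectrally with N.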
An inverse free parallel spectral divide and conquer algorithm for nonsymmetric eigenproblems
1997
Cited by 69 (10 self)
We discuss an inverse-free, highly parallel, spectral divide and conquer algorithm. It can compute either an invariant subspace of a nonsymmetric matrix A, or a pair of left and right deflating subspaces of a regular matrix pencil A − λB. This algorithm is based on earlier ones of Bulgakov, Godunov and Malyshev, but improves on them in several ways. It uses only easily parallelizable linear algebra building blocks, namely matrix multiplication and QR decomposition, but not matrix inversion. Similar parallel algorithms for the nonsymmetric eigenproblem use the matrix sign function, which requires matrix inversion and is faster but can be less stable than the new algorithm.
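The core inverse-free squaring step can be sketched in a few lines; the example below is a hypothetical 2×2 diagonal pencil (B = I), chosen so the expected spectral projector is known, and it covers only the basic iteration, not the full divide-and-conquer algorithm.

```python
import numpy as np

def inverse_free_step(A, B):
    """One inverse-free squaring step: QR-factor [B; -A], then use blocks of
    the orthogonal factor so that afterwards B^{-1} A equals the square of the
    previous B^{-1} A. Only matmul and QR are used, no inversion."""
    n = A.shape[0]
    Q, _ = np.linalg.qr(np.vstack([B, -A]), mode='complete')
    Q12, Q22 = Q[:n, n:], Q[n:, n:]
    return Q12.conj().T @ A, Q22.conj().T @ B

# Hypothetical example: one eigenvalue inside the unit circle (0.5), one outside (2.0).
A = np.diag([0.5, 2.0])
B = np.eye(2)
for _ in range(6):                      # drives B^{-1} A toward (B0^{-1} A0)^(2^6)
    A, B = inverse_free_step(A, B)

# (A + B)^{-1} B approaches the spectral projector onto the eigenvalues inside
# the unit disk (here diag(1, 0)). The linear solve is for verification only;
# the algorithm itself never inverts a matrix.
P = np.linalg.solve(A + B, B)
```

Repeated squaring separates the spectrum across the unit circle, after which a rank-revealing factorization of the projector yields the desired subspace.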
Learning a spatially smooth subspace for face recognition
IEEE Conference on Computer Vision and Pattern Recognition (CVPR '07), 2007
Cited by 49 (3 self)
Subspace learning based face recognition methods have attracted considerable interest in recent years, including ...
Stable and efficient spectral methods in unbounded domains using Laguerre functions
SIAM Journal on Numerical Analysis, 2000
Cited by 34 (9 self)
Stable and efficient spectral methods using Laguerre functions are proposed and analyzed for model elliptic equations on regular unbounded domains. It is shown that spectral-Galerkin approximations based on Laguerre functions are stable and convergent with spectral accuracy in the usual (not weighted) Sobolev spaces. Efficient, accurate, and well-conditioned algorithms using Laguerre functions are developed and implemented. Numerical results indicating the spectral convergence rate and effectiveness of these algorithms are presented.
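The Laguerre functions in question, Lhat_n(x) = L_n(x) exp(-x/2), form an orthonormal set in L^2(0, ∞), which is what makes them better conditioned than plain Laguerre polynomials on unbounded domains. A small sketch (not the paper's solver) verifies orthonormality and computes an expansion by Gauss-Laguerre quadrature; both computations are exact here because the integrands reduce to polynomials against the weight exp(-x).

```python
import numpy as np
from numpy.polynomial import laguerre as Lag

# Laguerre functions Lhat_n(x) = L_n(x) * exp(-x/2). A product
# Lhat_m * Lhat_n equals L_m * L_n * exp(-x), i.e. a polynomial against the
# Gauss-Laguerre weight, so the quadrature below is exact.
nmax = 8
xq, wq = Lag.laggauss(2 * nmax + 2)       # nodes/weights for weight exp(-x)

def L(n, x):                              # Laguerre polynomial L_n evaluated at x
    c = np.zeros(n + 1)
    c[n] = 1.0
    return Lag.lagval(x, c)

# Gram matrix of the Laguerre functions: should be the identity.
G = np.array([[np.sum(wq * L(m, xq) * L(n, xq)) for n in range(nmax)]
              for m in range(nmax)])

# Expand f(x) = x * exp(-x/2). Since x = L_0(x) - L_1(x), the exact
# coefficients are (1, -1, 0, 0, ...): a finite, well-conditioned expansion.
coeffs = np.array([np.sum(wq * xq * L(n, xq)) for n in range(nmax)])
```

An orthonormal basis gives an identity mass matrix, one ingredient in the well-conditioned algorithms the abstract refers to.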
A new fast-multipole accelerated Poisson solver in two dimensions
SIAM J. Sci. Comput., 2001
Cited by 31 (1 self)
Abstract. We present an adaptive fast multipole method for solving the Poisson equation in two dimensions. The algorithm is direct, assumes that the source distribution is discretized using an adaptive quadtree, and allows for Dirichlet, Neumann, periodic, and free-space conditions to be imposed on the boundary of a square. The amount of work per grid point is comparable to that of classical fast solvers, even for highly nonuniform grids.
A Parallel Fast Direct Solver For Block Tridiagonal Systems With Separable Matrices Of Arbitrary Dimension
SIAM J. Sci. Comput., 1996
Cited by 30 (16 self)
A parallel fast direct solver based on the Divide & Conquer method for linear systems with separable block tridiagonal matrices is considered. Such systems appear, for example, when discretizing the Poisson equation in a rectangular domain using the five-point finite difference scheme or piecewise linear finite elements on a triangulated rectangular mesh. The Divide & Conquer method has arithmetic complexity O(N log N) and is closely related to cyclic reduction, but instead of matrix polynomial factorization it employs the so-called partial solution technique. The method is presented and analyzed in a general base-q framework, and based on this analysis the base-four variant is chosen for parallel implementation using the MPI standard. The generalization of the method to the case of arbitrary block dimension is described. The numerical experiments show the sequential efficiency and numerical stability of the considered method compared to the well-known ...
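Separability is the property such solvers exploit: for the five-point Laplacian on a square, the 1-D second-difference operator diagonalizes in a sine basis, so the 2-D solve reduces to 1-D transforms plus a pointwise division. The sketch below uses dense sine transforms for clarity; a fast solver would use DSTs (or, as in this paper, the partial solution technique), and the grid size and test function are assumptions of the example.

```python
import numpy as np

def poisson5_solve(F, h):
    """Direct solver for the 5-point discretization of -Laplace(u) = F on a
    square with zero Dirichlet boundary data, via the sine eigenbasis of the
    1-D second-difference operator. S is its own inverse with this scaling,
    so the same matrix transforms in and out."""
    n = F.shape[0]
    k = np.arange(1, n + 1)
    S = np.sqrt(2.0 / (n + 1)) * np.sin(np.outer(k, k) * np.pi / (n + 1))
    lam = (2.0 - 2.0 * np.cos(k * np.pi / (n + 1))) / h**2   # 1-D eigenvalues
    Fh = S @ F @ S                         # sine transform in both directions
    Uh = Fh / (lam[:, None] + lam[None, :])                  # pointwise solve
    return S @ Uh @ S                      # transform back

# Verify against the stencil applied to a manufactured grid function.
n, h = 31, 1.0 / 32
U = np.random.default_rng(0).standard_normal((n, n))
Up = np.pad(U, 1)                          # zero Dirichlet boundary values
F = (4 * Up[1:-1, 1:-1] - Up[:-2, 1:-1] - Up[2:, 1:-1]
     - Up[1:-1, :-2] - Up[1:-1, 2:]) / h**2
```

With FFT-based transforms the same three-step structure costs O(N log N), matching the complexity quoted in the abstract.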
Fast tridiagonal solvers on the GPU
In Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2010), 2010
Cited by 25 (2 self)
We study the performance of three parallel algorithms and their hybrid variants for solving tridiagonal linear systems on a GPU: cyclic reduction (CR), parallel cyclic reduction (PCR) and recursive doubling (RD). We develop an approach to measure, analyze, and optimize the performance of GPU programs in terms of memory access, computation, and control overhead. We find that CR enjoys linear algorithmic complexity but suffers from more algorithmic steps and bank conflicts, while PCR and RD have fewer algorithmic steps but do more work per step. To combine the benefits of the basic algorithms, we propose hybrid CR+PCR and CR+RD algorithms, which improve the performance of PCR, RD and CR by 21%, 31% and 61%, respectively. Our GPU solvers achieve up to a 28x speedup over a sequential LAPACK solver, and a 12x speedup over a multithreaded CPU solver.
A direct adaptive Poisson solver of arbitrary order accuracy
J. Comput. Phys., 1996
Cited by 23 (5 self)
We present a direct, adaptive solver for the Poisson equation which can achieve any prescribed order of accuracy. It is based on a domain decomposition approach using local spectral approximation, as well as potential theory and the fast multipole method. In two space dimensions, the algorithm requires O(NK) work, where N is the number of discretization points and K is the desired order of accuracy.
An Algorithm With Polylog Parallel Complexity for ...
Cited by 20 (7 self)
This paper describes an algorithm for the time-accurate solution of certain classes of parabolic partial differential equations that can be parallelized in both time and space. It has a serial complexity that is proportional to the serial complexities of the best known algorithms. The algorithm is a variant of the multigrid waveform relaxation method where the scalar ordinary differential equations that make up the kernel of computation are solved using a cyclic-reduction-type algorithm. Experimental results obtained on a massively parallel multiprocessor are presented.

Key words. parabolic partial differential equations, massively parallel computation, waveform relaxation, multigrid, cyclic reduction.

AMS subject classifications. primary 65M, 65W; secondary 65L05.

1. Introduction. For many numerical problems in scientific computation, the execution time grows without bound as a function of the problem size, independent of the number of processors and of the algorithm used [36], [38], [40]. In particular, for most linear partial differential equations (PDEs) arising in mathematical physics, the parallel complexity grows as Θ(log N), where N is a particular measure of the problem size. The proof is based on deriving upper and lower bounds on the execution time of optimal parallel algorithms for multiprocessors with an unlimited number of processors and no interprocessor communication costs, where both upper and lower bounds are proportional to log N. These optimal parallel algorithms can have very large serial complexities, and the tightness of the bounds on the parallel execution time for practical algorithms is not established by this analysis. In the analysis of standard numerical algorithms for linear PDEs, there is a strong dichotomy in the nature of the growth in the parallel execution time between algorith...
Schwarz methods over the course of time
Electronic Transactions on Numerical Analysis, 2008
Cited by 15 (2 self)
To the memory of Gene Golub, our leader and friend.

Abstract. Schwarz domain decomposition methods are the oldest domain decomposition methods. They were invented by Hermann Amandus Schwarz in 1869 as an analytical tool to rigorously prove results obtained by Riemann through a minimization principle. Renewed interest in these methods was sparked by the arrival of parallel computers, and variants of the method have been introduced and analyzed, both at the continuous and discrete level. It can be daunting to understand the similarities and subtle differences between all the variants, even for the specialist. This paper presents Schwarz methods as they were developed historically. From quotes by major contributors over time, we learn about the reasons for similarities and subtle differences between continuous and discrete variants. We also formally prove at the algebraic level equivalence and/or non-equivalence among the major variants for very general decompositions and many subdomains. We finally trace the motivations that led to the newest class, called optimized Schwarz methods, illustrate how they can greatly enhance the performance of the solver, and show why one has to be cautious when testing them numerically.

Key words. Alternating and parallel Schwarz methods; additive, multiplicative and restricted additive Schwarz methods; optimized Schwarz methods.

AMS subject classifications. 65F10, 65N22.
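Schwarz's original alternating method is easy to reproduce on a 1-D model problem. The sketch below makes several assumptions not in the abstract: -u'' = f on (0, 1) with homogeneous Dirichlet data, a 3-point finite-difference discretization, and two subdomains overlapping on roughly (0.4, 0.6). Each subdomain is solved in turn with boundary values taken from the latest iterate, and the iteration converges geometrically to the single-domain discrete solution.

```python
import numpy as np

def solve_interval(f, h, ul, ur):
    """3-point finite-difference solve of -u'' = f on one subdomain with
    Dirichlet values ul, ur imposed at its two endpoints."""
    n = len(f)
    T = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    rhs = f.copy()
    rhs[0] += ul / h**2
    rhs[-1] += ur / h**2
    return np.linalg.solve(T, rhs)

n, h = 99, 1.0 / 100
x = h * np.arange(1, n + 1)               # interior grid points of (0, 1)
f = np.pi**2 * np.sin(np.pi * x)          # exact solution sin(pi x)

k1 = np.searchsorted(x, 0.6, side='right')   # subdomain 1: nodes 0 .. k1-1
k2 = np.searchsorted(x, 0.4)                 # subdomain 2: nodes k2 .. n-1
u = np.zeros(n)
for _ in range(25):
    # subdomain 1: right boundary value taken from the current iterate
    u[:k1] = solve_interval(f[:k1], h, 0.0, u[k1])
    # subdomain 2: left boundary value taken from the just-updated iterate
    u[k2:] = solve_interval(f[k2:], h, u[k2 - 1], 0.0)

u_global = solve_interval(f, h, 0.0, 0.0)    # single-domain reference solve
```

This is the multiplicative (alternating) variant; solving both subdomains from the same old iterate instead would give the additive (parallel) variant, one of the distinctions the paper traces.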