Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries
 Modern Software Tools in Scientific Computing
, 1997
Cited by 49 (0 self)
Parallel numerical software based on the message-passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. In fact, the programming model used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability. 1 Introduction Currently the only general-purpose, efficient, scalable approach to programming distributed-memory parallel systems is the message-pass...
Globalized Newton–Krylov–Schwarz algorithms and software for parallel implicit CFD
 Int. J. High Perform. Comput. Appl
Cited by 44 (17 self)
Implicit solution methods are important in applications modeled by PDEs with disparate temporal and spatial scales. Because such applications require high resolution with reasonable turnaround, parallelization is essential. The pseudo-transient matrix-free Newton-Krylov-Schwarz (ΨNKS) algorithmic framework is presented as a widely applicable answer. This article shows that for the classical problem of three-dimensional transonic Euler flow about an M6 wing, ΨNKS can simultaneously deliver globalized, asymptotically rapid convergence through adaptive pseudo-transient continuation and Newton’s method; reasonable parallelizability for an implicit method through deferred synchronization and favorable communication-to-computation scaling in the Krylov linear solver; and high per-processor performance through attention to distributed memory and cache locality, especially through the Schwarz preconditioner. Two discouraging features of ΨNKS methods are their sensitivity to the coding of the underlying PDE discretization and the large number of parameters that must be selected to govern convergence. The authors therefore distill several recommendations from their experience and reading of the literature on various algorithmic components of ΨNKS, and they describe a freely available MPI-based portable parallel software implementation of the solver employed here.
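As a rough illustration of the pseudo-transient continuation idea named in this abstract, the sketch below applies it to a single scalar equation f(u) = 0. It is not the authors' solver: the function name `ptc_newton`, the starting step `dt0`, the test equation, and the residual-ratio step-growth rule (often called switched evolution relaxation) are all assumptions chosen for illustration, and the Krylov and Schwarz components are omitted because a scalar problem needs no linear solver.

```python
def ptc_newton(f, dfdu, u0, dt0=0.1, tol=1e-10, max_steps=200):
    """Pseudo-transient continuation for a scalar equation f(u) = 0.

    Hypothetical helper for illustration only. Assumes dfdu(u) > 0 so
    that the steady state of du/dt = -f(u) is stable.
    """
    u = u0
    dt = dt0
    res = abs(f(u))
    history = [res]
    for _ in range(max_steps):
        if res < tol:
            break
        # One linearized backward-Euler step in pseudo-time:
        #   (1/dt + f'(u)) * du = -f(u)
        # Small dt gives a damped step; large dt approaches Newton's method.
        du = -f(u) / (1.0 / dt + dfdu(u))
        u += du
        new_res = abs(f(u))
        # Assumed step-size rule: grow dt as the residual falls.
        if new_res > 0.0:
            dt *= res / new_res
        res = new_res
        history.append(res)
    return u, history


if __name__ == "__main__":
    root, residuals = ptc_newton(lambda u: u**3 + u - 2.0,
                                 lambda u: 3.0 * u * u + 1.0,
                                 u0=5.0)
    print(root, len(residuals))
```

Early steps are heavily damped by the 1/dt term, which is what makes the iteration robust far from the solution; as dt grows the update tends toward a plain Newton step and convergence becomes rapid.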
Parallel Implicit PDE Computations: Algorithms and Software
, 1997
Cited by 8 (3 self)
this paper have been obtained. 2. Algorithmic Framework
Parallel Simulation of Compressible Flow Using Automatic Differentiation and PETSc
Cited by 6 (1 self)
Many aerospace applications require parallel implicit solution strategies and software. We consider the use of two computational tools, PETSc and ADIFOR, to implement a Newton-Krylov-Schwarz method with pseudo-transient continuation for a particular application, namely, a steady-state, fully implicit, three-dimensional compressible Euler model of flow over an M6 wing. We describe how automatic differentiation (AD) can be used within the PETSc framework to compute the required derivatives. We present performance data demonstrating the suitability of AD and PETSc for this problem. We conclude with a synopsis of our results and a description of opportunities for future work. Key words: Compressible Euler, PETSc, Nonlinear PDEs, Automatic Differentiation 1 Introduction Parallel implicit solution strategies are important in aerodynamic applications modeled by PDEs with disparate temporal and spatial scales. Within this family of techniques, Newton-Krylov methods have been shown to be wi...
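For readers unfamiliar with automatic differentiation, the following minimal sketch shows forward-mode AD using dual numbers. It is an assumption-laden illustration, not the approach of the paper: ADIFOR works by transforming Fortran source, whereas this sketch uses operator overloading in Python, and the names `Dual` and `derivative` are hypothetical.

```python
class Dual:
    """Value paired with its derivative; supports +, -, and * only."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(x):
        # Treat plain numbers as constants (zero derivative).
        return x if isinstance(x, Dual) else Dual(x, 0.0)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (ab)' = a b' + a' b
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate df/dx at x by seeding the dual part with 1."""
    return f(Dual(x, 1.0)).der


if __name__ == "__main__":
    print(derivative(lambda x: x * x * x + 2 * x - 1, 2.0))
```

The derivative is exact to rounding error because each arithmetic operation propagates its own derivative rule, which is the property that makes AD attractive for building Jacobians in Newton-type methods.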
Early Applications in the Message-Passing Interface (MPI)
 The International Journal of Supercomputer Applications
, 1994
Cited by 2 (0 self)
We describe a number of early efforts to make use of the Message Passing Interface (MPI) standard in applications, based on an informal survey conducted in May-June 1994. Rather than a definitive statement of all MPI development work, this paper addresses initial successes, progress, and impressions that application developers have with MPI, according to the responses received. We summarize the important aspects of each survey response, and draw conclusions about the spread of MPI into applications. An understanding of message passing and access to the MPI standard are prerequisites for appreciating this paper. Some background material is provided to ease this requirement. Skjellum et al., Early MPI. 1 Introduction In this paper, we describe a number of early efforts to make use of the Message Passing Interface (MPI) standard in real applications (Forum 1994a; Forum 1994b). An informal survey of efforts is reported here, together with our commentary. We summarize the respon...
Accelerating CFD Applications by Improving Cached Data Reuse
, 1995
Cited by 2 (0 self)
As processors continue to experience relatively rapid clock speed increases, the gap widens between CPU and memory performance. Unlike other studies that collect memory traces and analyze them for compile-time optimization or propose the cache organization best suited for an application group, this paper tackles the problem at its roots, namely analyzing data access patterns and optimizing them before implementation. Optimization done by today's compilers is mostly at the loop level. Function-level optimization is limited to inlining code, which often leads to poor instruction cache utilization, affecting code performance adversely. In this study, an algorithm to solve the compressible Euler equations is studied with regard to temporal and spatial access of data. Data and instruction blocks that are used most often are isolated. The algorithm is then coded to utilize the characteristics of hierarchical memories, with as much as 45% improvement over conventional optimization techniques. 1 Introductio...
Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries
Cited by 1 (0 self)
Parallel numerical software based on the message-passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. In fact, the programming model used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.
Use of RANS Calculations in the Design of a Submarine Sail
Cited by 1 (1 self)
The application of a Reynolds-Averaged Navier-Stokes (RANS) code in the design of an “Advanced Sail” for a submarine is discussed. To validate the code on similar sail shapes, calculations are compared with experimentally obtained data at 1/35 scale from a wind tunnel and 1/17 scale from a water channel. This data comparison includes flow visualization, axial velocity, and surface pressures. The agreement demonstrates that RANS codes can be used to provide the significant hydrodynamics associated with these sail shapes. To improve the design, several modifications to a sail are evaluated using the RANS code. Based on the predicted secondary flow downstream of the sail as well as the drag, a new design is chosen without having to build and test the inferior shapes, reducing time and cost for the program. This improved sail was then built at 1/4 scale and demonstrated on the U.S. Navy’s Large Scale Vehicle.
and
A very large asymmetric composite payload fairing (PLF) was developed for launching payloads of unconventional geometry and size on the Atlas V Heavy Lift vehicle. Currently, no launch system exists that can accommodate these payloads without requiring a redesign of the launch vehicle and/or integration facility. The asymmetric design was tailored to accommodate very large payloads while maintaining structural requirements and control authority limits of the launch vehicle as it currently stands. An optimal design was achieved through the use of an innovative computational fluid dynamics (CFD)-based geometric optimization, composite structural tailoring, and novel manufacturing methods. The design was validated through correlation with subscale wind tunnel testing, and extremely close agreement between the analysis and test was achieved. The final design resulted in a composite sandwich structure that meets or exceeds strength, buckling, flutter, thermal, and acoustic requirements and does not require significant modifications to existing launch pad integration facilities. The geometry, methods, and processes demonstrated here have wider applicability to the whole range of launch vehicle sizes and can increase the payload capabilities of each by offering fairings which are tailored specifically to existing control