Results 1  10
of
59
Numerical solution of saddle point problems
 ACTA NUMERICA
, 2005
"... Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has b ..."
Abstract

Cited by 180 (30 self)
 Add to MetaCart
Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has been a surge of interest in saddle point problems, and numerous solution techniques have been proposed for solving this type of systems. The aim of this paper is to present and discuss a large selection of solution methods for linear systems in saddle point form, with an emphasis on iterative methods for large and sparse problems.
A comparison of eigensolvers for largescale 3D modal analysis using AMGpreconditioned iterative methods
 Int. J. Numer. Meth. Engng
, 2005
"... The goal of our paper is to compare a number of algorithms for computing a large number of eigenvectors of the generalized symmetric eigenvalue problem arising from a modal analysis of elastic structures. The shiftinvert Lanczos algorithm has emerged as the workhorse for the solution of this genera ..."
Abstract

Cited by 25 (1 self)
 Add to MetaCart
The goal of our paper is to compare a number of algorithms for computing a large number of eigenvectors of the generalized symmetric eigenvalue problem arising from a modal analysis of elastic structures. The shiftinvert Lanczos algorithm has emerged as the workhorse for the solution of this generalized eigenvalue problem; however a sparse direct factorization is required for the resulting set of linear equations. Instead, our paper considers the use of preconditioned iterative methods. We present a brief review of available preconditioned eigensolvers followed by a numerical comparison on three problems using a scalable algebraic multigrid (AMG) preconditioner. Copyright c ○ 2003 John Wiley & Sons, Ltd. key words: Eigenvalues, large sparse symmetric eigenvalue problems, modal analysis, algebraic
The Combinatorial BLAS: Design, Implementation, and Applications
, 2010
"... This paper presents a scalable highperformance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of highperformance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse mat ..."
Abstract

Cited by 21 (9 self)
 Add to MetaCart
This paper presents a scalable highperformance software library to be used for graph analysis and data mining. Large combinatorial graphs appear in many applications of highperformance computing, including computational biology, informatics, analytics, web search, dynamical systems, and sparse matrix methods. Graph computations are difficult to parallelize using traditional approaches due to their irregular nature and low operational intensity. Many graph computations, however, contain sufficient coarse grained parallelism for thousands of processors, which can be uncovered by using the right primitives. We describe the Parallel Combinatorial BLAS, which consists of a small but powerful set of linear algebra primitives specifically targeting graph and data mining applications. We provide an extendible library interface and some guiding principles for future development. The library is evaluated using two important graph algorithms, in terms of both performance and easeofuse. The scalability and raw performance of the example applications, using the combinatorial BLAS, are unprecedented on distributed memory clusters.
Language Virtualization for Heterogeneous Parallel Computing
"... As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low level programming models (e.g. OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomp ..."
Abstract

Cited by 18 (6 self)
 Add to MetaCart
As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low level programming models (e.g. OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomposition and machinespecific details. Most programmers are having a difficult time using these programming models effectively. To provide a programming model that addresses the productivity and performance requirements for the average programmer, we explore a domainspecific approach to heterogeneous parallel programming. We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain specific languages that are embedded in a common host language. We define criteria for language virtualization and present techniques to achieve them. We present two concrete case studies of domainspecific languages that are implemented using our virtualization approach.
Liszt: A Domain Specific Language for Building Portable Meshbased PDE Solvers
"... Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, ..."
Abstract

Cited by 14 (2 self)
 Add to MetaCart
Heterogeneous computers with processors and accelerators are becoming widespread in scientific computing. However, it is difficult to program hybrid architectures and there is no commonly accepted programming model. Ideally, applications should be written in a way that is portable to many platforms, but providing this portability for general programs is a hard problem. By restricting the class of programs considered, we can make this portability feasible. We present Liszt, a domainspecific language for constructing meshbased PDE solvers. We introduce language statements for interacting with an unstructured mesh, and storing data at its elements. Program analysis of these statements enables our compiler to expose the parallelism, locality, and synchronization of Liszt programs. Using this analysis, we generate applications for multiple platforms: a cluster, an SMP, and a GPU. This approach allows Liszt applications to perform within 12 % of handwritten C++, scale to large clusters, and experience orderofmagnitude speedups on GPUs.
DOLFIN: Automated finite element computing
, 2009
"... We describe here a library aimed at automating the solution of partial differential equations using the finite element method. By employing novel techniques for automated code generation, the library combines a high level of expressiveness with efficient computation. Finite element variational forms ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
We describe here a library aimed at automating the solution of partial differential equations using the finite element method. By employing novel techniques for automated code generation, the library combines a high level of expressiveness with efficient computation. Finite element variational forms may be expressed in near mathematical notation, from which lowlevel code is automatically generated, compiled and seamlessly integrated with efficient implementations of computational meshes and highperformance linear algebra. Easytouse objectoriented interfaces to the library are provided in the form of a C++ library and a Python module. This paper discusses the mathematical abstractions and methods used in the design of the library and its implementation. A number of examples are presented to demonstrate the use of the library in application code.
A Flexible OpenSource Toolbox for Scalable Complex Graph Analysis
, 2011
"... The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a highlevel language without grappling with the difficulties of writing parallel code, calling parallel libraries, or becoming a graph expert. KDT provides a flexible Py ..."
Abstract

Cited by 7 (2 self)
 Add to MetaCart
The Knowledge Discovery Toolbox (KDT) enables domain experts to perform complex analyses of huge datasets on supercomputers using a highlevel language without grappling with the difficulties of writing parallel code, calling parallel libraries, or becoming a graph expert. KDT provides a flexible Python interface to a small set of highlevel graph operations; composing a few of these operations is often sufficient for a specific analysis. Scalability and performance are delivered by linking to a stateoftheart backend compute engine that scales from laptops to large HPC clusters. KDT delivers very competitive performance from a generalpurpose, reusable library for graphs on the order of 10 billion edges and greater. We demonstrate speedup of 1 and 2 orders of magnitude over PBGL and Pegasus, respectively, on some tasks. Examples from simple use cases and key graphanalytic benchmarks illustrate the productivity and performance realized by KDT users. Semantic graph abstractions provide both flexibility and high performance for realworld use cases. Graphalgorithm researchers benefit from the ability to develop algorithms quickly using KDT’s graph and underlying matrix abstractions for distributed memory. KDT is available as opensource code to foster experimentation.
Highly Parallel Sparse MatrixMatrix Multiplication
, 2010
"... Generalized sparse matrixmatrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an unbounded number of processors. Our algorithms are based on ..."
Abstract

Cited by 6 (3 self)
 Add to MetaCart
Generalized sparse matrixmatrix multiplication is a key primitive for many high performance graph algorithms as well as some linear solvers such as multigrid. We present the first parallel algorithms that achieve increasing speedups for an unbounded number of processors. Our algorithms are based on twodimensional block distribution of sparse matrices where serial sections use a novel hypersparse kernel for scalability. We give a stateoftheart MPI implementation of one of our algorithms. Our experiments show scaling up to thousands of processors on a variety of test scenarios.