Results 1–10 of 19
Dynamic Partitioning of Non-Uniform Structured Workloads with Space-Filling Curves
 IEEE Transactions on Parallel and Distributed Systems
, 1995
"... We discuss Inverse Spacefilling Partitioning (ISP), a partitioning strategy for nonuniform scientific computations running on distributed memory MIMD parallel computers. We consider the case of a dynamic workload distributed on a uniform mesh, and compare ISP against Orthogonal Recursive Bisectio ..."
Abstract

Cited by 56 (2 self)
 Add to MetaCart
We discuss Inverse Space-filling Partitioning (ISP), a partitioning strategy for non-uniform scientific computations running on distributed-memory MIMD parallel computers. We consider the case of a dynamic workload distributed on a uniform mesh, and compare ISP against Orthogonal Recursive Bisection (ORB) and a Median-of-Medians variant of ORB, ORB-MM. We present two results. First, ISP and ORB-MM are superior to ORB in rendering balanced workloads because they are more fine-grained and incur communication overheads that are comparable to ORB. Second, ISP is more attractive than ORB-MM from a software engineering standpoint because it avoids elaborate bookkeeping. Whereas ISP partitionings can be described succinctly as logically contiguous segments of the line, ORB-MM's partitionings are inherently unstructured. We describe the general d-dimensional ISP algorithm and report empirical results with two- and three-dimensional, non-hierarchical particle methods. ...
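The core ISP idea — mapping mesh cells to a line with a space-filling curve and cutting that line into contiguous, weight-balanced segments — can be illustrated with a small sketch. This uses a Morton (Z-order) curve rather than whichever curve the paper employs, and all function names are ours, not the paper's:

```python
def morton_index(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of (x, y) to get the cell's position on a Z-order curve."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def isp_partition(cells, weights, nparts):
    """Sort cells along the curve, then cut the resulting 1-D sequence into
    nparts contiguous segments of roughly equal total weight."""
    order = sorted(range(len(cells)), key=lambda i: morton_index(*cells[i]))
    target = sum(weights) / nparts
    parts, current, acc = [], [], 0.0
    for idx in order:
        current.append(cells[idx])
        acc += weights[idx]
        # cut when the cumulative weight passes the next multiple of the target
        if acc >= target * (len(parts) + 1) and len(parts) < nparts - 1:
            parts.append(current)
            current = []
    parts.append(current)
    return parts
```

On a uniform 4×4 grid with unit weights and four parts, each segment of the Z-order curve happens to coincide with one 2×2 quadrant, which illustrates why the resulting partitions are spatially compact despite being defined purely on the line.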
Parallel Multigrid in an Adaptive PDE Solver Based on Hashing and Space-Filling Curves
, 1997
"... this paper is organized as follows: In section 2 we discuss data structures for adaptive PDE solvers. Here, we suggest to use hash tables instead of the usually employed tree type data structures. Then, in section 3 we discuss the main features of the sequential adaptive multilevel solver. Section 4 ..."
Abstract

Cited by 39 (3 self)
 Add to MetaCart
this paper is organized as follows: In section 2 we discuss data structures for adaptive PDE solvers. Here, we suggest using hash tables instead of the usually employed tree-type data structures. Then, in section 3 we discuss the main features of the sequential adaptive multilevel solver. Section 4 deals with the partitioning and distribution of adaptive grids with space-filling curves, and section 5 gives the main features of our new parallelized adaptive multilevel solver. In section 6 we present the results of numerical experiments on a parallel cluster computer with up to 64 processors. It is shown that our approach also works nicely for problems with severe singularities which result in locally refined meshes. Here, the work overhead for load balancing and data distribution remains only a small fraction of the overall work load.
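The hash-table alternative to tree storage that this abstract advocates can be sketched as follows: adaptive-grid cells live in a plain hash table keyed by (level, i, j), so parent and child lookups are O(1) dictionary probes instead of pointer chases. The interface below is illustrative, not the paper's actual data structure:

```python
def children(level, i, j):
    """Keys of the four children created when cell (level, i, j) is refined."""
    return [(level + 1, 2 * i + di, 2 * j + dj) for di in (0, 1) for dj in (0, 1)]

def parent(level, i, j):
    """Key of the parent cell; integer division inverts the refinement map."""
    return (level - 1, i // 2, j // 2)

def refine(grid, level, i, j, value=0.0):
    """Insert the four children of an existing cell into the hash table."""
    for k in children(level, i, j):
        grid[k] = value
```

Because the key encodes the position in the (virtual) refinement tree, no parent or sibling pointers need to be stored or updated when the grid is refined or coarsened.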
Vienna Fortran/HPF Extensions for Sparse and Irregular Problems and Their Compilation
 IEEE Transactions on Parallel and Distributed Systems
, 1997
"... Vienna Fortran, High Performance Fortran (HPF) and other data parallel languages have been introduced to allow the programming of massively parallel distributedmemory machines (DMMP) at a relatively high level of abstraction based on the SPMD paradigm. Their main features include directives to expr ..."
Abstract

Cited by 30 (10 self)
 Add to MetaCart
Vienna Fortran, High Performance Fortran (HPF) and other data-parallel languages have been introduced to allow the programming of massively parallel distributed-memory machines (DMMPs) at a relatively high level of abstraction, based on the SPMD paradigm. Their main features include directives to express the distribution of data and computations across the processors of a machine. In this paper, we use Vienna Fortran as a general framework for dealing with sparse data structures. We describe new methods for the representation and distribution of such data on DMMPs, and propose simple language features that permit the user to characterize a matrix as "sparse" and specify the associated representation. Together with the data distribution for the matrix, this enables the compiler and runtime system to translate sequential sparse code into explicitly parallel message-passing code. We develop new compilation and runtime techniques, which focus on achieving storage economy and reducing communi...
Partitioning with Space-Filling Curves
, 1994
"... Balanced partitioning of nonuniform data in a high dimensional space is much more difficult than partitioning the nonuniform data projected onto a line. For dynamic problems which require many such partitions, this partitioning time difference may be critical. For this reason, Recursive Coordinate ..."
Abstract

Cited by 20 (2 self)
 Add to MetaCart
Balanced partitioning of non-uniform data in a high-dimensional space is much more difficult than partitioning the non-uniform data projected onto a line. For dynamic problems which require many such partitions, this partitioning time difference may be critical. For this reason, Recursive Coordinate Bisection (RCB) has grown in popularity, since it repeatedly collapses d − 1 dimensions onto 1 dimension. As an alternative method, we introduce the Inverse Space-filling Partition (ISP), which maps the higher-dimensional space to a line in more fine-grained units. ISP is faster than RCB, and yields a more even load balance at the cost of slightly higher communication and irregularly partitioned regions. The general d-dimensional ISP algorithm is described, then analytical and empirical comparisons are made between ISP and RCB.
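For contrast with ISP, the RCB scheme the abstract compares against can be sketched minimally: recursively split the point set at the weighted median along alternating coordinate axes until the requested number of parts remains. Names and details are illustrative, not taken from the paper:

```python
def rcb(points, weights, nparts, axis=0):
    """Recursive Coordinate Bisection: split at the weighted median of the
    current axis, alternating axes, until nparts pieces remain."""
    if nparts == 1:
        return [points]
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    half = sum(weights) / 2
    acc, cut = 0.0, 0
    for pos, i in enumerate(order):
        acc += weights[i]
        cut = pos + 1
        if acc >= half:          # weighted median found
            break
    left = [points[i] for i in order[:cut]]
    right = [points[i] for i in order[cut:]]
    lw = [weights[i] for i in order[:cut]]
    rw = [weights[i] for i in order[cut:]]
    next_axis = (axis + 1) % len(points[0])
    return (rcb(left, lw, nparts // 2, next_axis)
            + rcb(right, rw, nparts - nparts // 2, next_axis))
```

Note the contrast with the ISP formulation: each RCB split requires a median computation over the remaining points, whereas a curve-based partition sorts once and then only moves cut positions along the line.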
A partitioning advisory system for networked data-parallel processing
 Concurrency: Practice and Experience
, 1995
"... With the increased performance capabilities of desktop computers, networked computing has become a popular vehicle for using parallelism to solve a variety of computationally intense problems. However, node heterogeneity and high communication costs may limit performance unless the problem space is ..."
Abstract

Cited by 20 (1 self)
 Add to MetaCart
With the increased performance capabilities of desktop computers, networked computing has become a popular vehicle for using parallelism to solve a variety of computationally intense problems. However, node heterogeneity and high communication costs may limit performance unless the problem space is carefully partitioned across the network in a way that considers both the capabilities of the machines and the high network communication costs. We describe an advisory system that is designed to help the programmer, compiler, or runtime environment choose the best decomposition strategy for partitioning specific data-parallel applications across a given collection of machines. The system includes provisions for assessing the capabilities of the participating machines and the network in light of the current workload. Given information about the problem space, the machine speeds, and the network, the system provides a ranking of three standard partitioning methods. We test the validity of our system by comparing the observed relative performance with the predicted relative performance of different data decompositions on a program with a variable number of floating-point operations and a 5-point stencil communication pattern.
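One simple way to account for node heterogeneity, as this abstract motivates, is to size each machine's contiguous block of the problem in proportion to its measured speed. The following is a hypothetical sketch of that idea, not the advisory system's actual method:

```python
def proportional_rows(nrows, speeds):
    """Assign contiguous row counts in proportion to each machine's relative
    speed, so faster nodes receive proportionally more work."""
    total = sum(speeds)
    counts = [int(nrows * s / total) for s in speeds]
    # integer truncation may leave a few rows unassigned;
    # hand the leftovers to the fastest machines
    leftover = nrows - sum(counts)
    for i in sorted(range(len(speeds)), key=lambda i: -speeds[i])[:leftover]:
        counts[i] += 1
    return counts
```

For a 5-point stencil, contiguous row blocks also keep each node's communication limited to its two neighbors, which is why a speed-proportional row decomposition is a natural candidate among the ranked strategies.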
Data-Parallel Language Features for Sparse Codes: A Survey and Contributions
, 1995
"... This paper proposes a new approach to improve dataparallel languages in the context of sparse and irregular computation. We analyze the capabilities of High Performance Fortran (HPF) and Vienna Fortran, and identify a set of problems leading to suboptimal parallel code generation for such computati ..."
Abstract

Cited by 19 (5 self)
 Add to MetaCart
This paper proposes a new approach to improve data-parallel languages in the context of sparse and irregular computation. We analyze the capabilities of High Performance Fortran (HPF) and Vienna Fortran, and identify a set of problems leading to suboptimal parallel code generation for such computations on distributed-memory machines. Finally, we propose extensions to the data distribution facilities in Vienna Fortran which address these issues and provide a powerful mechanism for efficiently expressing sparse algorithms.
Hash-Storage Techniques for Adaptive Multilevel Solvers and Their Domain Decomposition Parallelization
 In Domain Decomposition Methods 10: The 10th Int. Conf., Boulder, volume 218 of Contemp. Math.
, 1998
"... this article remain attractive even for such a code. ..."
Abstract

Cited by 18 (6 self)
 Add to MetaCart
... this article remain attractive even for such a code.
Partitioning an Array onto a Mesh of Processors
 In Proc. of the Workshop on Applied Parallel Computing in Industrial Problems
, 1996
"... . Achieving an even load balance with a low communication overhead is a fundamental task in parallel computing. In this paper we consider the problem of partitioning an array into a number of blocks such that the maximum amount of work in any block is as low as possible. We review different proposed ..."
Abstract

Cited by 16 (1 self)
 Add to MetaCart
Achieving an even load balance with a low communication overhead is a fundamental task in parallel computing. In this paper we consider the problem of partitioning an array into a number of blocks such that the maximum amount of work in any block is as low as possible. We review different proposed schemes for this problem and the complexity of their communication patterns. We present new approximation algorithms for computing a well-balanced generalized block distribution, as well as an algorithm for computing an optimal semi-generalized block distribution. The various algorithms are tested and compared on a number of different matrices.
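The "minimize the maximum work in any block" objective in this abstract can be illustrated in one dimension with a standard binary-search-plus-greedy-check technique. This is a textbook method for the 1-D case, not necessarily the algorithm the paper proposes for the two-dimensional generalized block distribution:

```python
def min_max_block(work, nblocks):
    """Smallest achievable maximum block weight when cutting `work`
    into at most `nblocks` contiguous blocks."""
    def feasible(cap):
        # greedily fill blocks up to `cap`; count how many blocks are needed
        blocks, acc = 1, 0
        for w in work:
            if acc + w > cap:
                blocks += 1
                acc = w
            else:
                acc += w
        return blocks <= nblocks

    lo, hi = max(work), sum(work)   # answer is bracketed by these bounds
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For example, cutting the workloads [7, 2, 5, 10, 8] into two contiguous blocks cannot do better than a maximum block weight of 18, achieved by the split [7, 2, 5] | [10, 8].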
Parallel Adaptive Mesh Refinement and Redistribution on Distributed Memory Computers
 Comput. Methods Appl. Mech. Engrg
, 1993
"... A procedure to support parallel refinement and redistribution of two dimensional unstructured finite element meshes on distributed memory computers is presented. The procedure uses the mesh topological entity hierarchy as the underlying data structures to easily support the required adjacency inform ..."
Abstract

Cited by 15 (1 self)
 Add to MetaCart
A procedure to support parallel refinement and redistribution of two-dimensional unstructured finite element meshes on distributed memory computers is presented. The procedure uses the mesh topological entity hierarchy as the underlying data structure to easily support the required adjacency information. Mesh refinement is done by employing links back to the geometric representation to place new nodes on the boundary of the domain directly on the curved geometry. The refined mesh is then redistributed by an iterative heuristic based on the Leiss/Reddy [9] load balancing criteria. A fast parallel tree edge-coloring algorithm is used to pair processors having adjacent partitions, forming a tree structure as a result of the Leiss/Reddy load request criteria. Excess elements are iteratively migrated from heavily loaded to less loaded processors until load balancing is achieved. The system is implemented on a massively parallel MasPar MP-2 system with a SIMD style of computation and uses me...
Parallel Adaptive Subspace Correction Schemes with Applications to Elasticity
 Comput. Methods Appl. Mech. Engrg
, 1999
"... : In this paper, we give a survey on the three main aspects of the efficient treatment of PDEs, i.e. adaptive discretization, multilevel solution and parallelization. We emphasize the abstract approach of subspace correction schemes and summarize its convergence theory. Then, we give the main featur ..."
Abstract

Cited by 8 (4 self)
 Add to MetaCart
In this paper, we give a survey of the three main aspects of the efficient treatment of PDEs, i.e. adaptive discretization, multilevel solution and parallelization. We emphasize the abstract approach of subspace correction schemes and summarize its convergence theory. Then, we give the main features of each of the three distinct topics and treat the historical background and modern developments. Furthermore, we demonstrate how all three ingredients can be put together to give an adaptive and parallel multilevel approach for the solution of elliptic PDEs, and especially of linear elasticity problems. We report on numerical experiments for the adaptive parallel multilevel solution of some test problems, namely the Poisson equation and Lamé's equation. Here, we emphasize the parallel efficiency of the adaptive code even for simple test problems with little work to distribute, which is achieved through hash-storage techniques and space-filling curves. Keywords: subspace correction, iter...