Results 1-10 of 29
Graph Partitioning Algorithms With Applications To Scientific Computing
 Parallel Numerical Algorithms
, 1997
Cited by 41 (0 self)
Identifying the parallelism in a problem by partitioning its data and tasks among the processors of a parallel computer is a fundamental issue in parallel computing. This problem can be modeled as a graph partitioning problem in which the vertices of a graph are divided into a specified number of subsets such that few edges join two vertices in different subsets. Several new graph partitioning algorithms have been developed in the past few years, and we survey some of this activity. We describe the terminology associated with graph partitioning, the complexity of computing good separators, and graphs that have good separators. We then discuss early algorithms for graph partitioning, followed by three new algorithms based on geometric, algebraic, and multilevel ideas. The algebraic algorithm relies on an eigenvector of a Laplacian matrix associated with the graph to compute the partition. The algebraic algorithm is justified by formulating graph partitioning as a quadratic assignment p...
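The abstract above says the algebraic algorithm uses an eigenvector of a Laplacian matrix to compute the partition. A minimal sketch of that idea follows, assuming the standard spectral bisection recipe (split at the median of the eigenvector for the second-smallest eigenvalue); the survey's own formulation may differ. The function names, the adjacency-list input, and the power-iteration solver are illustrative choices, not taken from the paper.

```python
import math


def fiedler_vector(adj, iters=2000):
    """Approximate the eigenvector of L = D - A for its second-smallest
    eigenvalue, using power iteration on (c*I - L).

    adj is an adjacency list of a connected, undirected graph (assumed).
    """
    n = len(adj)
    deg = [len(nbrs) for nbrs in adj]
    # c exceeds the largest eigenvalue of L (which is at most 2 * max degree),
    # so the smallest eigenvalues of L become the largest of c*I - L.
    c = 2 * max(deg) + 1
    # Deterministic, non-constant start vector with zero mean.
    x = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iters):
        # y = (c*I - L) x, computed row by row.
        y = [(c - deg[i]) * x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        # Remove the component along the constant vector (eigenvalue 0 of L).
        mean = sum(y) / n
        y = [v - mean for v in y]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def spectral_bisection(adj):
    """Split the vertices into two halves at the median eigenvector entry."""
    x = fiedler_vector(adj)
    order = sorted(range(len(adj)), key=lambda i: x[i])
    half = len(adj) // 2
    return set(order[:half]), set(order[half:])


def cut_size(adj, part):
    """Number of edges with exactly one endpoint in part."""
    return sum(1 for u in part for v in adj[u] if v not in part)


# Example: a path graph 0-1-2-3-4-5.
path = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
left, right = spectral_bisection(path)
```

Power iteration keeps the sketch dependency-free; for graphs of realistic size a sparse eigensolver would be the usual choice.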
Parallel Constrained Delaunay Meshing
 IN PROCEEDINGS OF THE JOINT ASME/ASCE/SES SUMMER MEETING SPECIAL SYMPOSIUM ON TRENDS IN UNSTRUCTURED MESH GENERATION
, 1997
Cited by 30 (10 self)
We present a parallel unstructured grid generation method based on the Constrained Delaunay Triangulation (CDT). The Parallel Constrained Meshing Algorithm uses certain edges of the initial mesh as constraints that, without compromising the grid quality, help in the minimization of communication overhead and in the elimination of the synchronization overhead. By combining the CDT and its data-centric task-parallel implementation we produce a meshing algorithm that requires almost no synchronization. Moreover, experiments show that the use of the CDT for meshing cuts communication time by a factor of about seven when compared to a similar meshing algorithm that does not use the CDT.
A Comparison of Parallel Graph Coloring Algorithms
Cited by 25 (0 self)
Dynamic irregular triangulated meshes are used in adaptive grid partial differential equation (PDE) solvers, and in simulations of random surface models of quantum gravity in physics and cell membranes in biology. Parallel algorithms for random surface simulations and adaptive grid PDE solvers require coloring of the triangulated mesh, so that neighboring vertices are not updated simultaneously. Graph coloring is also used in iterative parallel algorithms for solving large irregular sparse matrix equations. Here we introduce some parallel graph coloring algorithms based on well-known sequential heuristic algorithms, and compare them with some existing parallel algorithms. These algorithms are implemented on both SIMD and MIMD parallel architectures and tested for speed, efficiency, and quality (the average number of colors required) for coloring random triangulated meshes and graphs from sparse matrix problems.
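The abstract above refers to well-known sequential coloring heuristics without naming them. As an illustration only, here is a minimal sketch of one such heuristic, greedy coloring in largest-degree-first order; whether the paper uses this particular ordering is an assumption. The function names and the adjacency-list input are hypothetical.

```python
def greedy_coloring(adj):
    """Color vertices greedily, visiting high-degree vertices first.

    adj is an adjacency list of an undirected graph. Returns a dict
    mapping each vertex to the smallest color not used by its
    already-colored neighbors.
    """
    order = sorted(range(len(adj)), key=lambda v: (-len(adj[v]), v))
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def is_proper(adj, color):
    """True if no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in range(len(adj)) for v in adj[u])


# Example: two triangles (0,1,2) and (1,2,3) sharing the edge 1-2.
mesh = [[1, 2], [0, 2, 3], [0, 1, 3], [1, 2]]
coloring = greedy_coloring(mesh)
```

Vertices that share a color have no edge between them, so in the setting the abstract describes they could be updated in the same parallel step.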
PELLPACK: a problem-solving environment for PDE-based applications on multicomputer platforms
 ACM Transactions on Mathematical Software
, 1998
Cited by 22 (4 self)
This paper presents the software architecture and implementation of the problem solving
Heuristic Algorithms for Automatic Graph Partitioning
, 1995
Cited by 18 (6 self)
Practical implementations of the Finite Element method on distributed memory multicomputer systems necessitate the use of partitioning tools to subdivide the mesh into submeshes of roughly equal size. Graph partitioning algorithms are mandatory when implementing distributed sparse matrix methods or domain decomposition techniques for irregularly structured problems on parallel computers. We propose a class of algorithms which are based on level set expansions from a number of center nodes. A critical component of these methods is the location of these centers. We present a number of different strategies for finding centers which lead to good-quality partitionings. Work supported in part by ARPA under grant NIST 60NANB2D1272, in part by NSF grant CCR-9214116, and by the Minnesota Supercomputer Institute. Department of Computer Science, University of Minnesota, Minneapolis 55455. Department of Computer Science, and Minnesota Supercomputer Institute, University of Minnesota, Mi...
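The level set expansion described in the abstract above can be pictured as a breadth-first search started from all center nodes at once, with each vertex joining the region whose front reaches it first. The sketch below assumes that reading and takes the centers as given; the paper's strategies for choosing centers are not reproduced here, and the function name and adjacency-list input are hypothetical.

```python
from collections import deque


def level_set_partition(adj, centers):
    """Grow one region per center by simultaneous breadth-first expansion.

    adj is an adjacency list; centers is a list of distinct start vertices.
    Returns a list where entry v is the index of the center whose region
    claimed vertex v, or -1 if v is unreachable from every center.
    """
    part = [-1] * len(adj)
    queue = deque()
    for p, c in enumerate(centers):
        part[c] = p
        queue.append(c)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if part[v] == -1:  # unclaimed: joins the region of u
                part[v] = part[u]
                queue.append(v)
    return part


# Example: a path graph 0-1-2-3-4-5 with centers at both ends.
path = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
print(level_set_partition(path, [0, 5]))  # [0, 0, 0, 1, 1, 1]
```

The example shows why center placement matters: moving the second center from vertex 5 to vertex 1 would leave the first region with a single vertex.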
Mobile Object Layer: A Runtime Substrate for Parallel Adaptive and Irregular Computations
, 1999
Cited by 11 (6 self)
In this paper we present a parallel runtime substrate, the Mobile Object Layer (MOL), that supports data or object mobility and automatic message forwarding in order to ease the implementation of adaptive and irregular applications on distributed memory machines. The MOL uses a global logical name space for message passing and distributed directories to assist in the translation of logical to physical addresses. The latency of the MOL primitives is within 10% to 14% of the latency of the underlying communication substrate. The MOL is a lightweight, portable library designed to minimize maintenance costs for very large-scale parallel adaptive applications. Keywords: Parallel, message passing, load balancing, runtime software, adaptive mesh generation.
Simultaneous Mesh Generation and Partitioning for Delaunay Meshes
 IN PROCEEDINGS OF THE EIGHTH INTERNATIONAL MESHING ROUNDTABLE
, 1999
Cited by 9 (1 self)
In this paper, we present a new approach for the parallel generation and partitioning of unstructured 3D Delaunay meshes. The new approach couples the mesh generation and partitioning problems into a single optimization problem. Traditionally, these two problems are solved separately, first generating the mesh (usually sequentially) and then partitioning the mesh, either sequentially or in parallel. In the traditional approach, the overheads due to I/O and data movement exceed 50% of the total execution time. Even if parallel partitioning schemes are employed, data movement, synchronization, and data structure translation overheads are high; for applications which require frequent remeshing (e.g. crack growth simulations), these overheads are prohibitive. We present a method for solving the mesh partitioning and placement problem simultaneously with the mesh generation problem. By eliminating unnecessary and redundant cache, local, and remote memory accesses, we can...
Visualization of Distributed Data Structures for HPF-like Languages
Cited by 8 (4 self)
This paper motivates the usage of graphics and visualization for efficient utilization of HPF's data distribution facilities. It proposes a graphical toolkit consisting of exploratory tools and estimation tools which allow the programmer to navigate through complex distributions and to obtain graphical ratings with respect to load distribution and communication. The toolkit has been implemented in a mapping design and visualization tool which is coupled with a compilation system for the HPF predecessor Vienna Fortran. Since this language covers a superset of HPF's facilities, the tool may also be used for visualization of HPF data structures.
Static Load Balancing of Parallel PDE Solver for Distributed Computing Environment
 In PDCS'2000, 13th Int'l Conf. Parallel and Distributed Computing Systems
, 2000
Cited by 6 (1 self)
This paper describes a static load balancing scheme for partial differential equation solvers in a distributed computing environment. Though there has been much research on static load balancing for uniform processors, a distributed computing environment is a computationally more difficult target because it usually consists of a variety of processors. Our method considers both computing and communication time to minimize the total execution time with automatic data partitioning and processor allocation. This problem is formulated as a combinatorial optimization and solved by the branch-and-bound method for up to 2024 processors. This paper also presents approximation algorithms that give good allocation and partitioning in practical time. The quality of the approximation is quantitatively evaluated in comparison to the optimal solution or theoretical lower bounds. Our method is general and applicable to a wide variety of parallel processing applications.
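To make the heterogeneous-processor setting in the abstract above concrete, the sketch below splits a fixed number of grid rows among processors in proportion to their speeds. This is a deliberately simple stand-in, not the paper's branch-and-bound or approximation algorithms: it ignores communication time entirely, and the function names, the row-based work model, and the speed values are all assumptions made for illustration.

```python
def proportional_split(n_rows, speeds):
    """Assign n_rows of work to processors in proportion to their speeds.

    Uses the largest-remainder rule so the integer row counts sum to n_rows.
    Communication cost is NOT modeled (a simplifying assumption).
    """
    total = sum(speeds)
    shares = [n_rows * s / total for s in speeds]
    rows = [int(x) for x in shares]
    leftover = n_rows - sum(rows)
    # Give the remaining rows to the processors with the largest fractions.
    by_fraction = sorted(
        range(len(speeds)), key=lambda i: shares[i] - rows[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        rows[i] += 1
    return rows


def makespan(rows, speeds):
    """Compute time of the slowest processor, in rows per unit speed."""
    return max(r / s for r, s in zip(rows, speeds))


# Example: three processors, the third twice as fast as the others.
rows = proportional_split(100, [1, 1, 2])
```

An equal split of the same 100 rows would give the two slow processors 33 or 34 rows each, so the run would finish later than with the speed-weighted split; adding a communication term is what turns this into the harder optimization the abstract describes.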