Results 1 -
4 of
4
Multilevel Mesh Partitioning for Heterogeneous Communication Networks
- Future Generation Comput. Syst
, 2001
"... Multilevel algorithms are a successful class of optimisation techniques which address the mesh partitioning problem for distributing unstructured meshes onto parallel computers. They usually combine a graph contraction algorithm together with a local optimisation method which refines the partition a ..."
Abstract
-
Cited by 20 (9 self)
- Add to MetaCart
Multilevel algorithms are a successful class of optimisation techniques which address the mesh partitioning problem for distributing unstructured meshes onto parallel computers. They usually combine a graph contraction algorithm together with a local optimisation method which refines the partition at each graph level. To date these algorithms have been used almost exclusively to minimise the cut edge weight in the graph with the aim of minimising the parallel communication overhead, but recently there has been a perceived need to take into account the communications network of the parallel machine. For example the increasing use of SMP clusters (systems of multiprocessor compute nodes with very fast intra-node communications but relatively slow inter-node networks) suggest the use of hierarchical network models. Indeed this requirement is exacerbated in the early experiments with meta-computers (multiple supercomputers combined together, in extreme cases over inter-continental networks). In this paper therefore, we modify a multilevel algorithm in order to minimise a cost function based on a model of the communications network. Several network models and variants of the algorithm are tested and we establish that it is possible to successfully guide the optimisation to reflect the chosen architecture. 2001 Elsevier Science B.V. All rights reserved.
ParaPART: Parallel Mesh Partitioning Tool for Distributed Systems
- In Proc. IRREGULAR'99
, 1999
"... In this paper, we present ParaPART, a parallel version of a mesh partitioning tool, called PART, for distributed systems. PART takes into consideration the heterogeneities in processor performance, network performance and application computational complexities to achieve a balanced estimate of execu ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
In this paper, we present ParaPART, a parallel version of a mesh partitioning tool, called PART, for distributed systems. PART takes into consideration the heterogeneities in processor performance, network performance and application computational complexities to achieve a balanced estimate of execution time across the processors in the distributed system. Simulated annealing is used in PART to perform the backtracking search for desired partitions. ParaPART significantly improves performance of PART by using the asynchronous multiple Markov chain approach of parallel simulated annealing. ParaPART is used to partition six irregular meshes into 8, 16, and 100 subdomains using up to 64 client processors on an IBM SP2 machine. The results show superlinear speedup in most cases and nearly perfect speedup for the rest. Using the partitions from ParaPART, we ran an explicit, 2-D finite element code on two geographically distributed IBM SP machines. Results indicate that ParaPART produces r...
Coordinating Computation and I/O in Massively Parallel Sequence Search
"... With the explosive growth of genomic information, the searching of sequence databases has emerged as one of the most computation- and data-intensive scientific applications. Our previous studies suggested that parallel genomic sequence-search possesses highly irregular computation and I/O patterns. ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
With the explosive growth of genomic information, the searching of sequence databases has emerged as one of the most computation- and data-intensive scientific applications. Our previous studies suggested that parallel genomic sequence-search possesses highly irregular computation and I/O patterns. Effectively addressing these run-time irregularities is thus the key to designing scalable sequence-search tools on massively parallel computers. While the computation scheduling for irregular scientific applications and the optimization of noncontiguous file accesses have been wellstudied independently, littleattentionhasbeenpaidtotheinterplay between the two. In this paper, we systematically investigate the computation and I/O scheduling for data-intensive, irregular scientific applications within the context of genomic sequence search. Our study revealsthatthelackofcoordinationbetweencomputation scheduling and I/O optimization could result in severe performance issues. We then propose an integrated scheduling approach that effectively improves sequence-search throughput by gracefully coordinating the dynamic load-balancing of computation and highperformance noncontiguous I/O.
Approaches to architecture-aware parallel scientific computation
- Williams College Department of Computer Science
, 2005
"... Abstract. Modern large-scale scientific computation problems must execute in a parallel computational environment to achieve acceptable performance. Target parallel environments range from the largest tightly-coupled supercomputers to heterogeneous clusters of workstations. Grid technologies make In ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. Modern large-scale scientific computation problems must execute in a parallel computational environment to achieve acceptable performance. Target parallel environments range from the largest tightly-coupled supercomputers to heterogeneous clusters of workstations. Grid technologies make Internet execution more likely. Hierarchical and heterogeneous systems are increasingly common. Processing and communication capabilities can be nonuniform, non-dedicated, transient or unreliable. Even when targeting homogeneous computing environments, each environment may differ in the number of processors per node, the relative costs of computation, communication, and memory access, and the availability of programming paradigms and software tools. Architecture-aware computation requires knowledge of the computing environment and software performance characteristics, and tools to make use of this knowledge. These challenges may be addressed by compilers, low-level tools, dynamic load balancing or solution procedures, middleware layers, high-level software development techniques, and choice of programming languages and paradigms. Computation and communication may be reordered. Data or computation may be replicated or a load imbalance may be tolerated to avoid costly communication. This paper samples a variety of approaches to architecture-aware parallel computation.

