MPI: A Message-Passing Interface Standard
, 1994
Cited by 250 (0 self)
process naming to allow libraries to describe their communication in terms suitable to their own data structures and algorithms, • The ability to "adorn" a set of communicating processes with additional user-defined attributes, such as extra collective operations. This mechanism should provide a means for the user or library writer effectively to extend a message-passing notation. In addition, a unified mechanism or object is needed for conveniently denoting communication context, the group of communicating processes, to house abstract process naming, and to store adornments. 5.1.2 MPI's Support for Libraries: The corresponding concepts that MPI provides, specifically to support robust libraries, are as follows: • Contexts of communication, • Groups of processes, • Virtual topologies, • Attribute caching, • Commun...
Models of Machines and Computation for Mapping in Multicomputers
, 1993
Cited by 76 (1 self)
It is now more than a quarter of a century since researchers started publishing papers on mapping strategies for distributing computation across the computation resource of multiprocessor systems. There exists a large body of literature on the subject, but there is no commonly-accepted framework whereby results in the field can be compared. Nor is it always easy to assess the relevance of a new result to a particular problem. Furthermore, changes in parallel computing technology have made some of the earlier work of less relevance to current multiprocessor systems. Versions of the mapping problem are classified, and research in the field is considered in terms of its relevance to the problem of programming currently available hardware in the form of a distributed memory multiple instruction stream multiple data stream computer: a multicomputer.
Task Allocation onto a Hypercube by Recursive Mincut Bipartitioning
, 1989
Cited by 31 (0 self)
An efficient recursive task allocation scheme, based on the Kernighan-Lin mincut bisection heuristic, is proposed for the effective mapping of tasks of a parallel program onto a hypercube parallel computer. It is evaluated by comparison with an adaptive, scaled simulated annealing method. The recursive allocation scheme is shown to be effective on a number of large test task graphs -- its solution quality is nearly as good as that produced by simulated annealing, and its computation time is several orders of magnitude less.
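The recursive allocation idea in this abstract can be illustrated with a rough sketch. This is not the authors' algorithm: a plain pairwise-swap refinement stands in for the Kernighan-Lin heuristic, each bisection ignores edges to tasks placed at earlier levels, and all names (`cut_cost`, `bisect`, `allocate`) are hypothetical. Each level of recursion fixes one bit of the hypercube node address.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

Weights = Dict[FrozenSet[str], float]  # communication weight per task pair


def cut_cost(weights: Weights, left: List[str], right: List[str]) -> float:
    """Total weight of edges crossing between the two halves."""
    return sum(weights.get(frozenset((a, b)), 0.0)
               for a, b in product(left, right))


def bisect(tasks: List[str], weights: Weights) -> Tuple[List[str], List[str]]:
    """Split tasks into two near-equal halves, then greedily swap pairs
    across the cut while a swap strictly lowers the cut cost.
    (Assumption: a simplified stand-in for Kernighan-Lin.)"""
    half = len(tasks) // 2
    left, right = list(tasks[:half]), list(tasks[half:])
    improved = True
    while improved:
        improved = False
        best = cut_cost(weights, left, right)
        for i, j in product(range(len(left)), range(len(right))):
            left[i], right[j] = right[j], left[i]
            cost = cut_cost(weights, left, right)
            if cost < best:
                best, improved = cost, True
            else:
                left[i], right[j] = right[j], left[i]  # undo the swap
    return left, right


def allocate(tasks: List[str], weights: Weights, dim: int,
             prefix: int = 0) -> Dict[str, int]:
    """Map tasks to node addresses of a dim-dimensional hypercube.
    The left half keeps address bit (dim - 1) at 0, the right half sets it."""
    if dim == 0:
        return {t: prefix for t in tasks}
    left, right = bisect(tasks, weights)
    mapping = allocate(left, weights, dim - 1, prefix)
    mapping.update(allocate(right, weights, dim - 1, prefix | (1 << (dim - 1))))
    return mapping


# Hypothetical task graph: two heavily communicating pairs, lightly linked.
weights = {
    frozenset(("a", "b")): 10.0,
    frozenset(("c", "d")): 10.0,
    frozenset(("a", "c")): 1.0,
    frozenset(("b", "d")): 1.0,
}
mapping = allocate(["a", "c", "b", "d"], weights, dim=2)
```

In this toy input the first bisection separates the two heavy pairs from each other, so `a` and `b` end up in one one-dimensional subcube and `c` and `d` in the other.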
Document for a Standard Message-Passing Interface
, 1997
Cited by 13 (0 self)
Introduction. Current Status: No votes. Collective communication capabilities are here for MPI-2, covering these areas: • Extension of MPI collective operations to intercommunicators • Extension of MPI collective operations to in-place buffers • Two-phase collective communication of a limited form and a limited set of operations • A generalized all-to-all collective operation. 6.2 Two-phase Collective Communication. Current Status: no votes. In some applications, better performance can be achieved by separating the initiation and completion of a collective operation. For example, in some numerical applications, better performance can be achieved by overlapping other work (both computation and communication) with an MPI_Allreduce. At the same time, the full generality of non-blocking collectiv
Clustering and Intra-Processor Scheduling for Explicitly-Parallel Programs on Distributed-Memory Systems
- In International Parallel Processing Symposium
, 1994
Cited by 5 (2 self)
Programs for distributed-memory systems are explicitly-parallel and consist of a set of sequential tasks or processes that communicate via message-passing. The sequence of computation in each task, together with the intermediate send and receive communication steps, exhibits the temporal behavior of the program. In this paper, we show that the two common models of program representation, the precedence graph and the interaction graph models, are insufficient to capture this temporal behavior and hence are not ideal for solving the clustering and the scheduling problems. We use a new Temporal Communication Graph (TCG) model to represent such explicitly-parallel programs. This model captures communication dependency and overlap of communication with computation. This provides flexibility to get a better estimate of the program completion time. New measures are developed for quantifying critical communication and inter-task parallelism on this model. We analyze the importance of intra-processor...
Mapping tasks to processors at run-time
- Proc. ISCIS VII (Int. Symp. on Comp. & Inf. Sciences), Antalya, Turkey (Nov. 1992)
, 1992
Cited by 4 (2 self)
We consider the dynamic task allocation problem in multicomputer systems with multiprogramming. Programs are given as task interaction graphs that have to be mapped onto the processors at run-time. We propose a fast two-phase heuristic algorithm where phase 1 performs a hierarchic clustering of the tasks which is used by the second phase to map clusters of suitable size onto free partitions of the processor graph.
MPI: A Message-Passing Interface Standard
, 1994
process naming to allow libraries to describe their communication in terms suitable to their own data structures and algorithms, • The ability to "adorn" a set of communicating processes with additional user-defined attributes, such as extra collective operations. This mechanism should provide a means for the user or library writer effectively to extend a message-passing notation. In addition, a unified mechanism or object is needed for conveniently denoting communication context, the group of communicating processes, to house abstract process naming, and to store adornments. 5.1.2 MPI's Support for Libraries: The corresponding concepts that MPI provides, specifically to support robust libraries, are as follows: • Contexts of communication, • Groups of processes, • Virtual topolo...
Process Topologies
Introduction. This chapter discusses the MPI topology mechanism. A topology is an extra, optional attribute that one can give to an intra-communicator; topologies cannot be added to intercommunicators. A topology can provide a convenient naming mechanism for the processes of a group (within a communicator), and additionally, may assist the runtime system in mapping the processes onto hardware. As stated in chapter ??, a process group in MPI is a collection of n processes. Each process in the group is assigned a rank between 0 and n-1. In many parallel applications a linear ranking of processes does not adequately reflect the logical communication pattern of the processes (which is usually determined by the underlying problem geometry and the numerical algorithm used). Often the processes are arranged in topological patterns such as two- or three-dimensional grids. More generally, th
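To make the relation between linear ranks and a grid arrangement concrete, here is a minimal sketch in plain Python (no MPI library involved). It assumes row-major ordering, which is the numbering MPI uses for Cartesian topologies; the function names `rank_to_coords`, `coords_to_rank`, and `neighbor` are hypothetical helpers for illustration, not MPI calls.

```python
from typing import Optional, Sequence, Tuple


def rank_to_coords(rank: int, dims: Sequence[int]) -> Tuple[int, ...]:
    """Convert a linear rank (0 .. n-1) to grid coordinates.
    Row-major: the last dimension varies fastest."""
    coords = []
    for extent in reversed(dims):
        coords.append(rank % extent)
        rank //= extent
    return tuple(reversed(coords))


def coords_to_rank(coords: Sequence[int], dims: Sequence[int]) -> int:
    """Inverse of rank_to_coords for in-range coordinates."""
    rank = 0
    for c, extent in zip(coords, dims):
        rank = rank * extent + c
    return rank


def neighbor(rank: int, dims: Sequence[int], axis: int, step: int,
             periodic: bool = False) -> Optional[int]:
    """Rank of the process `step` positions away along `axis`.
    Returns None when the move leaves a non-periodic grid."""
    coords = list(rank_to_coords(rank, dims))
    moved = coords[axis] + step
    if periodic:
        moved %= dims[axis]          # wrap around the torus
    elif not 0 <= moved < dims[axis]:
        return None                  # fell off the edge of the mesh
    coords[axis] = moved
    return coords_to_rank(coords, dims)


# Example: 12 processes arranged as a 3 x 4 grid.
dims = (3, 4)
print(rank_to_coords(5, dims))            # (1, 1)
print(neighbor(5, dims, axis=1, step=1))  # 6
```

With such a mapping, a process can address its grid neighbors by coordinate instead of by raw rank, which is the naming convenience the abstract describes.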
Mapping Strategies for Switch-Based Cluster Systems of Irregular Topology
, 2001
Mapping virtual process topology to physical processor topology is one of the most important issues in parallel computing. The mapping problem for switch-based cluster systems of irregular topology is very complicated due to the connection irregularity and routing complexity. This paper proposes two mapping schemes for irregular cluster systems, which try to map the nearest neighbors in the process topology to physically adjacent processors. In addition, an application-oriented performance metric, weighted cardinality, is introduced to represent the quality of mapping. Simulation study shows that, for a virtual topology of a 16 × 16 mesh, the proposed mapping schemes result in better mapping quality and about 15-20% shorter communication latency compared to random mapping. The proposed algorithms should also be beneficial when they are applied to metacomputing and cluster of cluster systems, where the communication costs are an order of magnitude different depending on the relative position of the processor nodes.
MPIXternal: A Library for a Portable Adjustment of Parallel MPI Applications to Heterogeneous Environments
Nowadays, common systems in the area of high performance computing exhibit highly hierarchical architectures. As a result, achieving satisfactory application performance demands an adaptation of the respective parallel algorithm to such systems. This, in turn, requires knowledge about the actual hardware structure even at the application level. However, the prevalent Message Passing Interface (MPI) standard (at least in its current version 2.1) intentionally hides heterogeneity from the application programmer in order to assure portability. In this paper, we introduce the MPIXternal library which tries to circumvent this obvious semantic gap within the current MPI standard. For this purpose, the library offers the programmer additional features that should help to adapt applications to today’s hierarchical systems in a convenient and portable way.

