Send-receive considered harmful: Myths and realities of message passing
- ACM Transactions on Programming Languages and Systems
Cited by 28 (1 self)
During the software crisis of the 1960s, Dijkstra’s famous thesis “goto considered harmful” paved the way for structured programming. This short communication suggests that many current difficulties of parallel programming based on message passing are caused by poorly structured communication, which is a consequence of using low-level send-receive primitives. We argue that, like goto in sequential programs, send-receive should be avoided as far as possible and replaced by collective operations in the setting of message passing. We dispute some widely held opinions about the apparent superiority of pairwise communication over collective communication and present substantial theoretical and empirical evidence to the contrary in the context of MPI (Message Passing Interface).
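As a rough illustration of why a collective operation can beat hand-written pairwise communication, the sketch below counts communication rounds for a broadcast among p processes. It compares a loop in which the root sends to every other process in turn with a binomial tree, a common way to implement a broadcast collective. This is a plain-Python simulation rather than MPI code; the function names are hypothetical, and the round count ignores message size and network contention.

```python
def linear_broadcast_rounds(p):
    """Root sends to each of the other p-1 processes one after another."""
    return max(p - 1, 0)


def binomial_broadcast_rounds(p):
    """Simulate a binomial-tree broadcast from rank 0.

    In each round every rank that already holds the data forwards it to one
    rank that does not, so the number of holders at most doubles per round.
    """
    have = {0}
    rounds = 0
    while len(have) < p:
        step = 1 << rounds  # distance to the partner in this round
        new = {r + step for r in have if r + step < p}
        have |= new
        rounds += 1
    return rounds


for p in (2, 8, 64):
    print(p, linear_broadcast_rounds(p), binomial_broadcast_rounds(p))
```

The linear scheme grows with p, while the tree grows with the logarithm of p. A library that owns the whole collective is free to pick such a schedule, which is hard to do when the pattern is scattered across individual send-receive calls.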
The Performance of Processor Co-Allocation in Multicluster Systems
, 2003
Cited by 28 (7 self)
In systems consisting of multiple clusters of processors which are interconnected by relatively slow communication links and which employ space sharing for scheduling jobs, such as our Distributed ASCI Supercomputer (DAS), co-allocation, i.e., the simultaneous allocation of processors to single jobs in different clusters, may be required. We study the performance of co-allocation by means of simulations for the mean response time of jobs depending on the structure and sizes of jobs, the scheduling policy, and the communication speed ratio. Our main conclusion is that for current communication speed ratios in multiclusters, co-allocation is a viable option.
Wide-Area Parallel Computing in Java
- In ACM SIGPLAN Java Grande Conference
, 1999
Cited by 26 (8 self)
Java's support for parallel and distributed processing makes the language attractive for metacomputing applications, such as parallel applications that run on geographically distributed (wide-area) systems. To obtain actual experience with a Java-centric approach to metacomputing, we have built and used a high-performance wide-area Java system, called Manta. Manta implements the Java RMI model using different communication protocols (active messages and TCP/IP) for different networks. The paper shows how wide-area parallel applications can be expressed and optimized using Java RMI. Also, it presents performance results of several applications on a wide-area system consisting of four Myrinet-based clusters connected by ATM WANs. 1 Introduction Metacomputing is an interesting research area that tries to integrate geographically distributed computing resources into a single powerful system. Many applications can benefit from such an integration [11, 22]. Metacomputing systems support such...
Ibis: an efficient Java-based Grid programming environment
- in Joint ACM Java Grande - ISCOPE 2002 Conference
, 2002
Bandwidth-efficient Collective Communication for Clustered Wide Area Systems
- In Proc. International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun
, 1999
Cited by 24 (3 self)
Metacomputing infrastructures couple multiple clusters (or MPPs) via wide-area networks and thus allow parallel programs to run on geographically distributed resources. A major problem in programming such wide-area parallel applications is the difference in communication costs inside and between clusters. Latency and bandwidth of WANs often are orders of magnitude worse than those of local networks. Our MagPIe library eases wide-area parallel programming by providing an efficient implementation of MPI's collective communication operations. MagPIe exploits the hierarchical structure of clustered wide-area systems and minimizes the communication overhead over the WAN links. In this paper, we present improved algorithms for collective communication that achieve shorter completion times by simultaneously using the aggregate bandwidth of the available wide-area links. Our new algorithms split messages into multiple segments that are sent in parallel over different WAN links, thus resulting ...
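To make the cost difference concrete, the following sketch counts how many messages cross a wide-area link during a broadcast. It compares a topology-unaware binomial tree with a cluster-aware scheme in which the root sends once to a coordinator in each remote cluster, which then forwards locally. This illustrates the general hierarchical idea only; it is not MagPIe's algorithm, and the rank-to-cluster mapping and function names are assumptions made for the example.

```python
def binomial_edges(p):
    """Return (sender, receiver) pairs of a binomial-tree broadcast from rank 0."""
    edges = []
    step = 1
    while step < p:
        edges.extend((r, r + step) for r in range(step) if r + step < p)
        step *= 2
    return edges


def wan_messages_flat(cluster_of):
    """Count tree edges whose endpoints lie in different clusters."""
    edges = binomial_edges(len(cluster_of))
    return sum(1 for s, d in edges if cluster_of[s] != cluster_of[d])


def wan_messages_hierarchical(cluster_of):
    """One WAN message per remote cluster; the rest stays on local links."""
    return max(len(set(cluster_of)) - 1, 0)


# Hypothetical layout: ranks 0-3 in cluster 0, ranks 4-7 in cluster 1.
cluster_of = [0, 0, 0, 0, 1, 1, 1, 1]
flat = wan_messages_flat(cluster_of)
hier = wan_messages_hierarchical(cluster_of)
print(flat, hier)
```

In this layout the flat tree crosses the WAN four times and the cluster-aware scheme once. The flat count depends heavily on how ranks happen to map onto clusters, which is one reason to let the library handle the topology.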
Optimizing Threaded MPI Execution on SMP Clusters
- In Proc. of 15th ACM International Conference on Supercomputing
, 2001
Cited by 23 (1 self)
Our previous work has shown that using threads to execute MPI programs can yield great performance gain on multiprogrammed shared-memory machines. This paper investigates the design and implementation of a thread-based MPI system on SMP clusters. Our study indicates that with a proper design for threaded MPI execution, both point-to-point and collective communication performance can be improved substantially, compared to a process-based MPI implementation in a cluster environment. Our contribution includes a hierarchy-aware and adaptive communication scheme for threaded MPI execution and a thread-safe network device abstraction that uses event-driven synchronization and provides separated collective and point-to-point communication channels. This paper describes the implementation of our design and illustrates its performance advantage on a Linux SMP cluster.
The Component Architecture of Open MPI: Enabling Third-Party Collective Algorithms
- In Proceedings, 18th ACM International Conference on Supercomputing, Workshop on Component Models and Systems for Grid Applications
, 2004
Cited by 22 (9 self)
As large-scale clusters become more distributed and heterogeneous, significant research interest has emerged in optimizing MPI collective operations because of the performance gains that can be realized. However, researchers wishing to develop new algorithms for MPI collective operations are typically faced with significant design, implementation, and logistical challenges. To address a number of needs in the MPI research community, Open MPI has been developed, a new MPI-2 implementation centered around a lightweight component architecture that provides a set of component frameworks for realizing collective algorithms, point-to-point communication, and other aspects of MPI implementations. In this paper, we focus on the collective algorithm component framework. The “coll” framework provides tools for researchers to easily design, implement, and experiment with new collective algorithms in the context of a production-quality MPI. Performance results with basic collective operations demonstrate that the component architecture of Open MPI does not introduce any performance penalty.
TOPOMON: A monitoring tool for grid network topology
- In International Conference on Computational Science (2
, 2002
Cited by 21 (4 self)
In Grid environments, high-performance applications have to take into account the available network performance between the individual sites. Existing monitoring tools like the Network Weather Service (NWS) measure bandwidth and latency of end-to-end network paths. This information is necessary but not sufficient. With more than two participating sites, simultaneous transmissions may collide with each other on shared links of the wide-area network. If this occurs, applications may obtain lower network performance than predicted by NWS. In this paper, we describe TopoMon, a monitoring tool for Grid networks that augments NWS with additional sensors for the routes between the sites of a Grid environment. Our tool conforms to the Grid Monitoring Architecture (GMA) defined by the Global Grid Forum. It unites NWS performance and topology discovery in a single monitoring architecture. Our topology consumer process collects route information between the sites of a Grid environment and derives the overall topology for utilization by application programs and communication libraries. The topology can also be visualized for Grid application developers.
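The collision problem can be pictured with a small sketch: given the hop-by-hop route between each pair of sites, find the links that more than one site pair traverses. The route data, site names, and function name are invented for illustration and do not reflect the actual data formats or interfaces of TopoMon or NWS.

```python
from collections import defaultdict


def shared_links(routes):
    """Map each link used by two or more site pairs to those pairs.

    routes: dict mapping (src_site, dst_site) to the list of hops on the path,
    including both endpoints. Links are treated as undirected.
    """
    users = defaultdict(set)
    for pair, hops in routes.items():
        for a, b in zip(hops, hops[1:]):
            users[frozenset((a, b))].add(pair)
    return {link: pairs for link, pairs in users.items() if len(pairs) > 1}


# Hypothetical three-site Grid: both paths to site C run over r1-r2 and r2-C,
# while the A-B path uses a separate router r3.
routes = {
    ("A", "C"): ["A", "r1", "r2", "C"],
    ("B", "C"): ["B", "r1", "r2", "C"],
    ("A", "B"): ["A", "r3", "B"],
}
contended = shared_links(routes)
for link, pairs in contended.items():
    print(sorted(link), sorted(pairs))
```

End-to-end measurements taken one pair at a time would not reveal that transfers from A to C and from B to C compete for the same two links; route information of this kind is what makes that visible.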
Improving the performance of collective operations in MPICH
- Recent Advances in Parallel Virtual Machine and Message Passing Interface. Number 2840 in LNCS, Springer Verlag (2003) 257–267 10th European PVM/MPI User’s Group Meeting
, 2003
Grid programming models: Current tools, issues and directions
- In Grid Computing: Making The Global Infrastructure a Reality
, 2003
Cited by 20 (2 self)
Grid programming must manage computing environments that are inherently parallel, distributed, heterogeneous and dynamic, both in terms of the resources involved and their performance. Furthermore, grid applications will want to dynamically and flexibly compose resources and services across those dynamic environments. While it may be possible to build grid applications using established programming tools, they are not particularly well-suited to effectively manage flexible composition or deal with heterogeneous hierarchies of machines, data and networks with heterogeneous performance. This chapter discusses issues, properties and capabilities of grid programming models and tools to support efficient grid programs and their effective development. The main issues are outlined and then current programming paradigms and tools are surveyed, examining their suitability for grid programming. Clearly no one tool will address all requirements in all situations. However, paradigms and tools that can incorporate and provide the widest possible support for grid programming will come to dominate. Advanced programming support techniques are analyzed, discussing possibilities for their effective implementation on grid environments.

