A high-performance, portable implementation of the MPI message passing interface standard
Parallel Computing, 1996
Cited by 651 (37 self)
MPI (Message Passing Interface) is a specification for a standard library for message passing that was defined by the MPI Forum, a broadly based group of parallel computer vendors, library writers, and applications specialists. Multiple implementations of MPI have been developed. In this paper, we describe MPICH, unique among existing implementations in its design goal of combining portability with high performance. We document its portability and performance and describe the architecture by which these features are simultaneously achieved. We also discuss the set of tools that accompany the free distribution of MPICH, which constitute the beginnings of a portable parallel programming environment. A project of this scope inevitably imparts lessons about parallel computing, the specification being followed, the current hardware and software environment for parallel computing, and project management; we describe those we have learned. Finally, we discuss future developments for MPICH, including those necessary to accommodate extensions to the MPI Standard now being contemplated by the MPI Forum.
The Globus Project: A Status Report
1998
Cited by 267 (18 self)
The Globus project is a multi-institutional research effort that seeks to enable the construction of computational grids providing pervasive, dependable, and consistent access to high-performance computational resources, despite geographical distribution of both resources and users. Computational grid technology is being viewed as a critical element of future high-performance computing environments that will enable entirely new classes of computation-oriented applications, much as the World Wide Web fostered the development of new classes of information-oriented applications. In this paper, we report on the status of the Globus project as of early 1998. We describe the progress that has been achieved to date in the development of the Globus toolkit, a set of core services for constructing grid tools and applications. We also discuss the Globus Ubiquitous Supercomputing Testbed (GUSTO) that we have constructed to enable large-scale evaluation of Globus technologies, and review early exp...
Remote I/O: Fast Access to Distant Storage
In Proceedings of the Fifth Workshop on Input/Output in Parallel and Distributed Systems, 1997
Cited by 53 (7 self)
As high-speed networks make it easier to use distributed resources, it becomes increasingly common that applications and their data are not colocated. Users have traditionally addressed this problem by manually staging data to and from remote computers. We argue instead for a new remote I/O paradigm in which programs use familiar parallel I/O interfaces to access remote filesystems. In addition to simplifying remote execution, remote I/O can improve performance relative to staging by overlapping computation and data transfer or by reducing communication requirements. However, remote I/O also introduces new technical challenges in the areas of portability, performance, and integration with distributed computing systems. We propose techniques designed to address these challenges and describe a remote I/O library called RIO that we have developed to evaluate the effectiveness of these techniques. RIO addresses issues of portability by adopting the quasi-standard MPI-IO interface and by de...
Server-Directed Collective I/O in Panda
In Proceedings of Supercomputing '95, 1995
Cited by 36 (2 self)
We present the architecture and implementation results for Panda 2.0, a library for input and output of multidimensional arrays on parallel and sequential platforms. Panda achieves remarkable performance levels on the IBM SP2, showing excellent scalability as data size increases and as the number of nodes increases, and provides throughputs close to the full capacity of the AIX file system on the SP2 we used. We argue that this good performance can be traced to Panda's use of server-directed i/o (a logical-level version of disk-directed i/o [Kotz94b]) to perform array i/o using sequential disk reads and writes, a very high level interface for collective i/o requests, and built-in facilities for arbitrary rearrangements of arrays during i/o. Other advantages of Panda's approach are ease of use, easy application portability, and a reliance on commodity system software.
1 Introduction: In the past few years, researchers in the HPCC community have suggested many approaches to improve i/o p...
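The server-directed idea in the Panda abstract above can be illustrated with a toy, in-memory sketch. It assumes a 2D integer array split by column blocks across hypothetical clients; the server walks the file in row-major order and pulls each row segment from the client that owns it, so every write is a sequential append. The function names (`split_columns`, `server_directed_write`, `read_back`) and the data layout are illustrative assumptions, not Panda's actual interface or protocol.

```python
import io
import struct

def split_columns(array, n_clients):
    """Give each hypothetical client a contiguous block of columns."""
    n_cols = len(array[0])
    bounds = [n_cols * c // n_clients for c in range(n_clients + 1)]
    return [
        {"col_start": bounds[c],
         "rows": [row[bounds[c]:bounds[c + 1]] for row in array]}
        for c in range(n_clients)
    ]

def server_directed_write(clients, n_rows, out):
    """The server decides the order: it visits rows in file order and asks
    each client (sorted by starting column) for its segment of that row,
    so the output stream is only ever appended to."""
    ordered = sorted(clients, key=lambda c: c["col_start"])
    for r in range(n_rows):
        for client in ordered:
            segment = client["rows"][r]
            # Little-endian 32-bit integers; an assumed on-disk format.
            out.write(struct.pack(f"<{len(segment)}i", *segment))

def read_back(buf, n_rows, n_cols):
    """Decode a row-major buffer of 32-bit integers into a list of rows."""
    flat = struct.unpack(f"<{n_rows * n_cols}i", buf)
    return [list(flat[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]

if __name__ == "__main__":
    array = [[r * 6 + c for c in range(6)] for r in range(3)]
    out = io.BytesIO()
    server_directed_write(split_columns(array, 4), len(array), out)
    print(read_back(out.getvalue(), 3, 6) == array)
```

A real system would move the segments over a network and write to a parallel or commodity file system; the sketch only shows how letting the server choose the request order yields sequential writes regardless of how the array is distributed.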
MPI-2: Extending the Message-Passing Interface
1996
Cited by 28 (15 self)
This paper describes current activities of the MPI-2 Forum. The MPI-2 Forum is a group of parallel computer vendors, library writers, and application specialists working together to define a set of extensions to MPI (Message Passing Interface). MPI was defined by the same process and now has many implementations, both vendor-proprietary and publicly available, for a wide variety of parallel computing environments. In this paper we present the salient aspects of the evolving MPI-2 document as it now stands. We discuss proposed extensions and enhancements to MPI in the areas of dynamic process management, one-sided operations, collective operations, new language binding, real-time computing, external interfaces, and miscellaneous topics.
1 Introduction: During 1993 and 1994, a group of parallel computer vendors, library writers, and application scientists met regularly to define a standard interface for message-passing libraries. The result of this effort was MPI (Message-Passing Interfa...
Optimizing Collective I/O Performance on Parallel Computers: A Multisystem Study
In Proceedings of the 11th ACM International Conference on Supercomputing, 1997
Cited by 12 (8 self)
While individual parallel I/O systems can incorporate sophisticated techniques and achieve impressive performance in particular situations, researchers as yet have only limited understanding of the impact of various design decisions or of the techniques required for performance robustness. One remedy is to perform detailed comparative studies of different I/O libraries. In this paper, we describe such a study for the Disk Resident Array and Panda libraries, both designed to support high-performance I/O for arrays. While the two systems have many similarities, their designs and implementations are based on different assumptions and target different applications. We base our study on two I/O structures commonly encountered in scientific applications: the collective read/write of an entire array and the collective read/write of an arbitrary array section. Experiments are performed on two parallel file systems (IBM PIOFS and Intel PFS) and one commodity Unix file system (AIX JFS). Our resu...
ChemIO: High-Performance Parallel I/O for Computational Chemistry Applications
Intl. J. Supercomp. Apps. High Perf. Comp. 12, 1998
Cited by 12 (3 self)
Recent developments in I/O systems on scalable parallel computers have sparked renewed interest in out-of-core methods for computational chemistry. These methods can improve execution time significantly relative to "direct" methods, which perform many redundant computations. However, the widespread use of such out-of-core methods requires efficient and portable implementations of often complex I/O patterns. The ChemIO project has addressed this problem by defining an I/O interface that captures the I/O patterns found in important computational chemistry applications and by providing high-performance implementations of this interface on multiple platforms. This development not only broadens the user community for parallel I/O techniques but also provides new insights into the functionality required in general-purpose scalable I/O libraries and the techniques required to achieve high-performance I/O on scalable parallel computers.
1 Introduction: Computational chemistry refers t...
Flexibility and performance of parallel file systems
ACM Operating Systems Review, 1996
Cited by 7 (2 self)
Many scientific applications for high-performance multiprocessors have tremendous I/O requirements. As a result, the I/O system is often the limiting factor of application performance. Several new parallel file systems have been developed in recent years, each promising better performance for some class of parallel applications. As we gain experience with parallel computing, and parallel file systems in particular, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk management strategy. Furthermore, the proliferation of file-system interfaces and abstractions makes application portability a significant problem. We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs). We think of this approach as the "RISC" of parallel file-system design. We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.
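To make the "strided read" feature mentioned in the abstract above concrete, here is a minimal sketch of what such an interface might look like on top of an ordinary seekable file. The function name `strided_read` and its parameters (`offset`, `record_size`, `stride`, `count`) are assumptions chosen for illustration; the paper's actual interface is not shown in this listing.

```python
import io

def strided_read(f, offset, record_size, stride, count):
    """Read up to `count` records of `record_size` bytes each, where the
    start of consecutive records is `stride` bytes apart, beginning at
    byte `offset`. Stops early if a record would run past end of file."""
    if stride < record_size:
        raise ValueError("stride must be at least record_size")
    records = []
    for i in range(count):
        f.seek(offset + i * stride)
        data = f.read(record_size)
        if len(data) < record_size:
            break  # incomplete record: ran past the end of the file
        records.append(data)
    return records

if __name__ == "__main__":
    f = io.BytesIO(bytes(range(20)))
    # Records start at bytes 2, 7, 12, and 17; each is 3 bytes long.
    for record in strided_read(f, offset=2, record_size=3, stride=5, count=4):
        print(list(record))
```

A single strided request like this lets an application describe a regular access pattern (for example, one column of a row-major matrix) in one call, which is the kind of information a file system can use to schedule disk accesses instead of servicing many small independent reads.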
Extensible Message Passing Application Development and Debugging with Python
Proceedings 11th International Parallel Processing Symposium, 1997
Cited by 5 (2 self)
We describe how we have parallelized Python, an interpreted object-oriented scripting language, and used it to build an extensible message-passing molecular dynamics application for the CM-5, Cray T3D, and Sun multiprocessor servers running MPI. This allows us to interact with large-scale message-passing applications, rapidly prototype new features, and perform application-specific debugging. It is even possible to write message-passing programs in Python itself. We describe some of the tools we have developed to extend Python and results of this approach.
POM: a Virtual Parallel Machine Featuring Observation Mechanisms
PI 902, IRISA, 1995
Cited by 4 (2 self)
We describe in this paper a Parallel Observable virtual Machine (POM), which provides a homogeneous interface on top of the communication kernels of parallel architectures. POM was designed so as to be ported easily and efficiently to numerous parallel platforms. It provides sophisticated features for observing distributed executions.
Key-words: Distributed memory parallel computers, virtual machine, communication library, observation, traces
guidec@irisa.fr, maheo@irisa.fr. Unité de recherche INRIA Rennes, IRISA, Campus universitaire de Beaulieu, 35042 RENNES Cedex (France). Telephone: (33) 99 84 71 00. Fax: (33) 99 84 71
French title: POM: a virtual parallel machine incorporating observation mechanisms
Résumé: We describe in this paper an observable virtual parallel machine, POM. It offers a homogeneous interface on top of the communication systems of parallel architectures. It was designed for easy and ...

