On Implementing MPI-IO Portably and with High Performance
In Proceedings of the 6th Workshop on I/O in Parallel and Distributed Systems, 1999
Cited by 137 (21 self)
We discuss the issues involved in implementing MPI-IO portably on multiple machines and file systems and also achieving high performance. One way to implement MPI-IO portably is to implement it on top of the basic Unix I/O functions (open, lseek, read, write, and close), which are themselves portable. We argue that this approach has limitations in both functionality and performance. We instead advocate an implementation approach that combines a large portion of portable code and a small portion of code that is optimized separately for different machines and file systems. We have used such an approach to develop a high-performance, portable MPI-IO implementation, called ROMIO. In addition to basic I/O functionality, we consider the issues of supporting other MPI-IO features, such as 64-bit file sizes, noncontiguous accesses, collective I/O, asynchronous I/O, consistency and atomicity semantics, user-supplied hints, shared file pointers, portable data representation, and file preallocati...
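To make the Unix I/O baseline concrete, here is a minimal Python sketch (not ROMIO code) of a noncontiguous read done with only open, lseek, read, and close. The helper names `strided_read` and `make_demo_file` and the 16-byte demo file are assumptions made for illustration. Each contiguous piece costs its own seek and read, which is one plausible way the performance limitation mentioned in the abstract can show up.

```python
import os
import tempfile


def strided_read(path, offset, block_size, stride, count):
    """Read `count` blocks of `block_size` bytes whose starts are `stride` bytes apart.

    Hypothetical helper: it uses only the basic Unix-style calls
    (open, lseek, read, close) exposed by Python's os module.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        pieces = []
        for i in range(count):
            # One seek plus one read per contiguous piece of the access pattern.
            os.lseek(fd, offset + i * stride, os.SEEK_SET)
            pieces.append(os.read(fd, block_size))
        return b"".join(pieces)
    finally:
        os.close(fd)


def make_demo_file():
    """Create a small temporary file holding the bytes 0..15 (demo data only)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(range(16)))
    finally:
        os.close(fd)
    return path


demo_path = make_demo_file()
# Pick 2 bytes out of every 4: offsets 0-1, 4-5, 8-9, 12-13.
result = strided_read(demo_path, offset=0, block_size=2, stride=4, count=4)
os.remove(demo_path)
print(list(result))
```

An interface that accepts the whole strided pattern in a single call gives the implementation room to merge these small requests; the loop above cannot do that.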
Applications of parallel I/O
1996
Cited by 11 (2 self)
Scientific applications are increasingly being implemented on massively parallel supercomputers. Many of these applications have intense I/O demands, as well as massive computational requirements. This paper is essentially an annotated bibliography of papers and other sources of information about scientific applications using parallel I/O. It will be updated periodically.
Effective Communication and File-I/O Bandwidth Benchmarks
2001
Cited by 9 (6 self)
We describe the design and MPI implementation of two benchmarks created to characterize the balanced system performance of high-performance clusters and supercomputers: b_eff, the communication-specific benchmark, which examines the parallel message passing performance of a system, and b_eff_io, which characterizes the effective I/O bandwidth. Both benchmarks have two goals: a) to get a detailed insight into the performance strengths and weaknesses of different parallel communication and I/O patterns, and based on this, b) to obtain a single bandwidth number that characterizes the average performance of the system, namely communication and I/O bandwidth. Both benchmarks use a time-driven approach and loop over a variety of communication and access patterns to characterize a system in an automated fashion. Results of the two benchmarks are given for several systems including IBM SPs, Cray T3E, NEC SX-5, and Hitachi SR 8000. After a redesign of b_eff_io, I/O bandwidth results for several compute partition sizes are achieved in an appropriate time for rapid benchmarking.
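The time-driven approach described above can be sketched roughly as follows. This is an illustration only, not the b_eff or b_eff_io code: the function name `time_driven_bandwidth`, the in-memory copy used as a stand-in transfer, the two pattern sizes, and the 0.05-second budget are all assumptions. The sketch repeats one pattern until a fixed time budget is spent and reports bytes per second, so the run time is bounded regardless of how fast the system is.

```python
import time


def time_driven_bandwidth(transfer, chunk_bytes, budget_seconds, clock=time.perf_counter):
    """Repeat `transfer(chunk_bytes)` until the time budget is used up.

    Returns the achieved bandwidth in bytes per second. Hypothetical helper;
    `clock` is injectable so the loop can be checked deterministically.
    """
    if budget_seconds <= 0:
        raise ValueError("budget_seconds must be positive")
    start = clock()
    moved = 0
    while True:
        transfer(chunk_bytes)
        moved += chunk_bytes
        elapsed = clock() - start
        # The time budget, not a fixed repetition count, ends the loop.
        if elapsed >= budget_seconds:
            return moved / elapsed


# Stand-in "transfer": copy n bytes out of an in-memory buffer (assumption,
# a real benchmark would send a message or access a file here).
buffer = bytearray(1 << 20)


def copy_chunk(n):
    return bytes(buffer[:n])


# Loop over a few access patterns, giving each the same time budget.
patterns = {"small chunks": 1 << 10, "large chunks": 1 << 20}
results = {
    name: time_driven_bandwidth(copy_chunk, size, budget_seconds=0.05)
    for name, size in patterns.items()
}
for name, bandwidth in results.items():
    print(f"{name}: {bandwidth / 1e6:.1f} MB/s")
```

How the per-pattern numbers are combined into the single bandwidth figure is not specified in the abstract, so the sketch stops at per-pattern results.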
An Evaluation of Java's I/O Capabilities for High-Performance Computing
In Proceedings of the ACM 2000 Java Grande Conference, 2000
Cited by 6 (1 self)
Java is quickly becoming the preferred language for writing distributed applications because of its inherent support for programming on distributed platforms. In particular, Java provides compile-time and run-time security, automatic garbage collection, inherent support for multithreading, support for persistent objects and object migration, and portability. Given these significant advantages of Java, there is a growing interest in using Java for high-performance computing applications. To be successful in the high-performance computing domain, however, Java must have the capability to efficiently handle the significant I/O requirements commonly found in high-performance computing applications. While there has been significant research in high-performance I/O using languages such as C, C++, and Fortran, there has been relatively little research into the I/O capabilities of Java. In this paper, we evaluate the I/O capabilities of Java for high-performance computing. We examine several...
Research Trends in High Performance Parallel Input/Output for Cluster Environments
Proceedings of the 4th International Scientific and Practical Conference on Programming UkrPROG2004, Pages 274-281, National Academy of Sciences of, 2004
Cited by 2 (1 self)
Parallel input/output in high performance computing is a field of increasing importance. In particular, with compute clusters we see the concept of replicated resources being transferred to I/O issues. Consequently, we find research questions such as how to map data structures to files, which resources to actually use, and how to deal with failures in the environment. The paper will introduce the problem of massive I/O from the user's point of view and illustrate available programming interfaces. After a short description of some available parallel file systems, we will concentrate on the research directions in that field. Besides other questions, efficiency is the main issue. It depends on an appropriate mapping of data structures onto file segments, which in turn are spread over physical disks. Our own work concentrates on measuring the performance of individual mappings and on changing them dynamically to increase performance and control the sharing of resources. Keywords: Parallel programming, parallel I/O, cluster computing.
CSAR: Cluster Storage with Adaptive Redundancy
Striped file systems such as the Parallel Virtual File System (PVFS) deliver high-bandwidth I/O to applications running on clusters. An open problem in the design of striped file systems is how to reduce their vulnerability to disk failures with the minimum performance penalty. In this paper we describe a novel data redundancy scheme designed specifically to address the performance issue. We demonstrate the new scheme within CSAR, a proof-of-concept implementation based on PVFS. By dynamically switching between RAID1 and RAID5 redundancy based on write size, CSAR consistently achieves the best of both worlds: RAID1 performance on small writes and RAID5 efficiency on large writes. Using the popular parallel I/O benchmark BTIO, our scheme achieves 82% of the write bandwidth of the unmodified PVFS. We describe the issues in implementing our new scheme in a popular striped file system such as PVFS on a Linux cluster.
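The idea of choosing the redundancy scheme by write size can be illustrated with a small Python sketch. It is not the CSAR implementation: the names `choose_scheme`, `plan_write`, and `xor_parity`, and the rule "a write that fills a whole stripe uses RAID5, anything smaller is mirrored", are assumptions chosen for illustration, since the abstract only says the switch is based on write size.

```python
from functools import reduce


def xor_parity(blocks):
    """Byte-wise XOR of equally sized blocks (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def choose_scheme(write_size, stripe_size):
    """Assumed policy: full-stripe writes use RAID5, smaller writes use RAID1."""
    return "RAID5" if write_size >= stripe_size else "RAID1"


def plan_write(data, block_size, data_servers):
    """Return (scheme, pieces to store) for a write of at most one stripe."""
    stripe_size = block_size * data_servers
    if len(data) > stripe_size:
        raise ValueError("this sketch handles at most one stripe per write")
    scheme = choose_scheme(len(data), stripe_size)
    if scheme == "RAID1":
        # Mirror: store the data twice, with no parity computation and
        # no need to read old data first.
        return scheme, [data, data]
    # Full stripe: split into one block per data server and add one parity block.
    blocks = [data[i:i + block_size] for i in range(0, stripe_size, block_size)]
    return scheme, blocks + [xor_parity(blocks)]


small = plan_write(b"ab", block_size=2, data_servers=2)
large = plan_write(b"abcd", block_size=2, data_servers=2)
print(small)
print(large)

# With parity, a lost data block is the XOR of the surviving blocks.
_, (block0, block1, parity) = large
recovered_block0 = xor_parity([block1, parity])
```

In this sketch mirroring stores every byte twice, while a full stripe stores one extra block per `data_servers` blocks, which is the space trade-off the two schemes are usually chosen for.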

