Results 1 - 10 of 12
Disk-directed I/O for MIMD Multiprocessors
, 1994
Cited by 217 (18 self)
Many scientific applications that run on today’s multiprocessors, such as weather forecasting and seismic analysis, are bottlenecked by their file-I/O needs. Even if the multiprocessor is configured with sufficient I/O hardware, the file-system software often fails to provide the available bandwidth to the application. Although libraries and enhanced file-system interfaces can make a significant improvement, we believe that fundamental changes are needed in the file-server software. We propose a new technique, disk-directed I/O, to allow the disk servers to determine the flow of data for maximum performance. Our simulations show that tremendous performance gains are possible. Indeed, disk-directed I/O provided consistent high performance that was largely independent of data distribution, obtained up to 93% of peak disk bandwidth, and was as much as 16 times faster than traditional parallel file systems.
Server-Directed Collective I/O in Panda
- In Proceedings of Supercomputing '95
, 1995
Cited by 36 (2 self)
We present the architecture and implementation results for Panda 2.0, a library for input and output of multidimensional arrays on parallel and sequential platforms. Panda achieves remarkable performance levels on the IBM SP2, showing excellent scalability as data size increases and as the number of nodes increases, and provides throughputs close to the full capacity of the AIX file system on the SP2 we used. We argue that this good performance can be traced to Panda's use of server-directed i/o (a logical-level version of disk-directed i/o [Kotz94b]) to perform array i/o using sequential disk reads and writes, a very high level interface for collective i/o requests, and built-in facilities for arbitrary rearrangements of arrays during i/o. Other advantages of Panda's approach are ease of use, easy application portability, and a reliance on commodity system software. 1 Introduction In the past few years, researchers in the HPCC community have suggested many approaches to improve i/o p...
ViPIOS: The Vienna Parallel Input/Output System
- In Euro-Par'98
, 1998
Cited by 9 (4 self)
In this paper we present the Vienna Parallel Input Output System (ViPIOS), a novel approach to enhance the I/O performance of high performance applications. It is a client-server based tool combining capabilities found in parallel I/O runtime libraries and parallel file systems. 1 Introduction In the last few years the applications in high performance computing (Grand Challenges [1]) shifted from being CPU-bound to being I/O-bound. Performance cannot be scaled up by increasing the number of CPUs any more, but by increasing the bandwidth of the I/O subsystem. This situation is commonly known as the I/O bottleneck in high performance computing ([5]). In reaction, all leading hardware vendors of multiprocessor systems provided powerful concurrent I/O subsystems. Accordingly, researchers focused on the design of appropriate programming tools to take advantage of the available hardware resources. 1.1 The ViPIOS approach Conventionally two different directions in developing programming supp...
Discretionary Caching for I/O on Clusters
- In Proceedings of the Third IEEE/ACM International Symposium on Cluster Computing and the Grid
, 2003
A Software Architecture for Massively Parallel Input-Output
, 1996
Cited by 8 (2 self)
For an increasing number of data intensive scientific applications, parallel I/O concepts are a major performance issue. Tackling this issue, we provide an outline of an input/output system designed for highly efficient, scalable and conveniently usable parallel I/O on distributed memory systems. The main focus of this paper is the parallel I/O runtime system support provided for software-generated programs produced by High Performance FORTRAN compilers. Specifically, our design is presented in the context of the Vienna Fortran Compilation System. Contents: I Introduction, II Data Mapping Model, III Design of the Parallel I/O Runtime Support, IV Related Work, V Conclusions. I. Introduction The main issue for I/O subsystems in supercomputing environments is to feed arrays of computing processors with huge amounts of raw data (nowadays typically in the Tbyte range) in such a way that neither the The work described in this paper is being carried out as part of the research project...
Interfaces for Disk-Directed I/O
, 1995
Cited by 8 (3 self)
In other papers I propose the idea of disk-directed I/O for multiprocessor file systems. Those papers focus on the performance advantages and capabilities of disk-directed I/O, but say little about the application-programmer's interface or about the interface between the compute processors and I/O processors. In this short note I discuss the requirements for these interfaces, and look at many existing interfaces for parallel file systems. I conclude that many of the existing interfaces could be adapted for use in a disk-directed I/O system. 1 Introduction In other papers I propose the idea of disk-directed I/O for multiprocessor file systems [Kot94, Kot95a, Kot95b]. Those papers show that disk-directed I/O can be used to substantially improve performance (higher throughput, lower execution time, or less network traffic) when reading input data, writing results, or executing an out-of-core computation. They show that the concept of disk-directed I/O can be extended to include data-depe...
Runtime Support for In-Core and Out-of-Core Data-Parallel Programs
, 1995
Cited by 6 (2 self)
Distributed memory parallel computers or distributed computer systems are widely recognized as the only cost-effective means of achieving teraflops performance in the near future. However, the fact remains that they are difficult to program and advances in software for these machines have not kept pace with advances in hardware. This thesis addresses several issues in providing runtime support for in-core as well as out-of-core programs on distributed memory parallel computers. This runtime support can be directly used in application programs for greater efficiency, portability and ease of programming. It can also be used together with a compiler to translate programs written in a high-level data-parallel language like High Performance Fortran (HPF) to node programs for distributed memory machines. In distributed memory programs, it is often necessary to change the distribution of arrays during program execution. This thesis presents efficient and portable algorithms for runtime array ...
Compiler Optimizations for I/O-Intensive Computations
Cited by 4 (0 self)
This paper describes transformation techniques for out-of-core programs (i.e., those that deal with very large quantities of data) based on exploiting locality using a combination of loop and data transformations. Writing an efficient out-of-core program is an arduous task. As a result, compiler optimizations directed at improving I/O performance are becoming increasingly important. We describe how a compiler can improve the performance of the code by determining appropriate file layouts for out-of-core arrays and finding suitable loop transformations. In addition to optimizing a single loop nest, our solution can handle a sequence of loop nests. We also show how to generate code when the file layouts are optimized. Experimental results obtained on an Intel Paragon distributed-memory message-passing multiprocessor demonstrate marked improvements in performance due to the optimizations described in this paper.
Compiler-Directed I/O Optimization
- In Proceedings of the 16th International Symposium on Parallel and Distributed Processing
, 2002
Cited by 2 (1 self)
Despite continued innovations in the design of I/O systems, I/O performance has not kept pace with the progress in processor and communication technology. This paper addresses this I/O problem from a compiler’s perspective, and presents an I/O optimization strategy based on access pattern and storage form (file layout) detection. The objective of our optimization strategy is to determine storage forms for array-based data sets taking into account future use of data (future access patterns). To tackle this problem, we present a three-step strategy: (i) determining all I/O access patterns to the array, and among them, selecting the most dominant (i.e., the most beneficial) access pattern; (ii) determining the most suitable storage form for the array taking into account the most dominant access pattern detected in the previous step; and (iii) optimizing the non-dominant access patterns using collective I/O, an optimization that allows each processor to do I/O on behalf of others if doing so improves overall performance.
Mass Storage Support for a Parallelizing Compilation System
Cited by 1 (1 self)
This paper focuses on mass storage support for the Vienna Fortran Compilation System (VFCS) to enable efficient execution of parallel I/O operations and operations on out-of-core (OOC) structures. The use of OOC structures implies I/O operations: due to main memory constraints some parts of these data structures (e.g., large arrays) must be swapped to disk. The approach outlined in this paper is based on two main concepts: (i) Vienna Fortran language extensions and compilation techniques.

