Disk-directed I/O for MIMD Multiprocessors
1994
Cited by 217 (18 self)
Many scientific applications that run on today’s multiprocessors, such as weather forecasting and seismic analysis, are bottlenecked by their file-I/O needs. Even if the multiprocessor is configured with sufficient I/O hardware, the file-system software often fails to provide the available bandwidth to the application. Although libraries and enhanced file-system interfaces can make a significant improvement, we believe that fundamental changes are needed in the file-server software. We propose a new technique, disk-directed I/O, to allow the disk servers to determine the flow of data for maximum performance. Our simulations show that tremendous performance gains are possible. Indeed, disk-directed I/O provided consistent high performance that was largely independent of data distribution, obtained up to 93% of peak disk bandwidth, and was as much as 16 times faster than traditional parallel file systems.
Input/Output Characteristics of Scalable Parallel Applications
In Proceedings of Supercomputing ’95, 1995
Cited by 100 (2 self)
Rapid increases in computing and communication performance are exacerbating the long-standing problem of performance-limited input/output. Indeed, for many otherwise scalable parallel applications, input/output is emerging as a major performance bottleneck. The design of scalable input/output systems depends critically on the input/output requirements and access patterns for this emerging class of large-scale parallel applications. However, hard data on the behavior of such applications is only now becoming available. In this paper, we describe the input/output requirements of three scalable parallel applications (electron scattering, terrain rendering, and quantum chemistry) on the Intel Paragon XP/S. As part of an ongoing parallel input/output characterization effort, we used instrumented versions of the application codes to capture
I/O Requirements of Scientific Applications: An Evolutionary View
In Proceedings of the Fifth IEEE International Symposium on High Performance Distributed Computing, 1996
Cited by 46 (10 self)
The modest I/O configurations and file system limitations of many current high-performance systems preclude solution of problems with large I/O needs. I/O hardware and file system parallelism is the key to achieve high performance. We analyze the I/O behavior of several versions of two scientific applications on the Intel Paragon XP/S. The versions involve incremental application code enhancements across multiple releases of the operating system. Studying the evolution of I/O access patterns underscores the interplay between application access patterns and file system features. Our results show that both small and large request sizes are common, that at present application developers must manually aggregate small requests to obtain high disk transfer rates, that concurrent file accesses are frequent, and that appropriate matching of the application access pattern and the file system access mode can significantly increase application I/O performance. Based on these results, we describe a set of file system design principles.
Workload Characterization of Input/Output Intensive Parallel Applications
In Proceedings of the 9th International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, 1997
Cited by 40 (9 self)
The broadening disparity in the performance of input/output (I/O) devices and the performance of processors and communication links on parallel systems is a major obstacle to achieving high performance for a wide range of parallel applications. I/O hardware and file system parallelism are the keys to bridging this performance gap. A prerequisite to the development of efficient parallel file systems is detailed characterization of the I/O demands of parallel applications. In this paper, we present a comparative study of the I/O access patterns commonly found in I/O intensive parallel applications. Using the Pablo performance analysis environment and its I/O extensions we captured application I/O access patterns and analyzed their interactions with current parallel I/O systems. This analysis has proven instrumental in guiding the development of new application programming interfaces (APIs) for parallel file systems and in developing effective file system policies that can adaptively re...
File System Workload Analysis for Large Scale Scientific Computing Applications
In Proceedings of the 21st IEEE / 12th NASA Goddard Conference on Mass Storage Systems and Technologies, 2004
Cited by 38 (12 self)
Parallel scientific applications require high-performance I/O support from underlying file systems. A comprehensive understanding of the expected workload is therefore essential for the design of high-performance parallel file systems. We re-examine the workload characteristics in parallel computing environments in the light of recent technology advances and new applications.
Server-Directed Collective I/O in Panda
In Proceedings of Supercomputing '95, 1995
Cited by 36 (2 self)
We present the architecture and implementation results for Panda 2.0, a library for input and output of multidimensional arrays on parallel and sequential platforms. Panda achieves remarkable performance levels on the IBM SP2, showing excellent scalability as data size increases and as the number of nodes increases, and provides throughputs close to the full capacity of the AIX file system on the SP2 we used. We argue that this good performance can be traced to Panda's use of server-directed i/o (a logical-level version of disk-directed i/o [Kotz94b]) to perform array i/o using sequential disk reads and writes, a very high level interface for collective i/o requests, and built-in facilities for arbitrary rearrangements of arrays during i/o. Other advantages of Panda's approach are ease of use, easy application portability, and a reliance on commodity system software. 1 Introduction In the past few years, researchers in the HPCC community have suggested many approaches to improve i/o p...
Lessons from Characterizing Input/Output Behavior of Parallel Scientific Applications
INTERNATIONAL JOURNAL, 1998
Cited by 26 (4 self)
Because both processor and interprocessor communication hardware is evolving rapidly with only moderate improvements to file system performance in parallel systems, it is becoming increasingly difficult to provide sufficient input/output (I/O) performance to parallel applications. I/O hardware and file system parallelism are the key to bridging this performance gap. Prerequisite to the development of efficient parallel file system is detailed characterization of the I/O demands of parallel applications. In the paper, we present a comparative study of parallel I/O access patterns, commonly found in I/O intensive scientific applications. The Pablo performance analysis tool and its I/O extensions is a valuable resource in capturing and analyzing the I/O access attributes and their interactions with extant parallel I/O systems. This analysis is instrumental in guiding the development of new application programming interfaces (APIs) for parallel file systems and effective file system polici...
I/O Characterization of a Portable Astrophysics Application on the IBM SP and Intel Paragon
1995
Cited by 14 (1 self)
Many large-scale applications on parallel machines are bottlenecked by the I/O performance rather than the CPU or communication performance of the system. To improve the I/O performance, it is first necessary for system designers to understand the I/O requirements of various applications. This paper presents the results of a study of the I/O characteristics and performance of a real, I/O-intensive, portable, parallel application in astrophysics, on two different parallel machines: the IBM SP and the Intel Paragon. We instrumented the source code to record all I/O activity, and analyzed the resulting trace files. Our results show that, for this application, the I/O consists of fairly large writes, and writing data to files is faster on the Paragon, whereas opening and closing files are faster on the SP. We also discuss how the I/O performance of this application could be improved; particularly, we believe that this application would benefit from using collective I/O.
A Comparison of Logical and Physical Parallel I/O Patterns
International Journal of High Performance Computing Applications, 1998
Cited by 13 (2 self)
Although there are several extant studies of parallel scientific application request patterns, there is little experimental data on the correlation of physical input/output patterns with application input/output stimuli. To understand these correlations, we have instrumented the SCSI device drivers of the Intel Paragon OSF/1 operating system to record key physical input/output activities and have correlated this data with the input/output patterns of scientific applications captured via the Pablo analysis toolkit. Our analysis shows that disk hardware features profoundly affect the distribution of request delays and that current parallel file systems respond to parallel application input/output patterns in non-scalable ways. 1 Introduction Input/output for scalable parallel systems continues to be the major performance bottleneck for many large-scale scientific applications [2, 14]. Market forces are increasing the disparity between processor and disk system performance, exacerbating ...
Automatic Classification Of Input/Output Access Patterns
1997
Cited by 9 (2 self)
Despite continued innovations in disk design, input/output performance has not kept pace with concurrent increases in processor speeds. Much research has focused on developing algorithms to avoid input/output or hide input/output latency in an attempt to redress this widening gap. Many studies have shown that with advance knowledge of access patterns, file systems can improve input/output performance by selecting policies appropriate for the resource demands. Unfortunately, access patterns may be complex or data dependent, and therefore unknown a priori. Our thesis is that the file system can automatically detect qualitative file access patterns both locally (per parallel program thread) and globally (per parallel program) and use this information to dynamically choose appropriate file system policies. We propose two complementary methods for automatic classification, based on neural networks and hidden Markov models, respectively. Global classifications are created from a combination...

