Dynamic File-Access Characteristics of a Production Parallel Scientific Workload
, 1994
Cited by 76 (12 self)
Multiprocessors have permitted astounding increases in computational performance, but many cannot meet the intense I/O requirements of some scientific applications. An important component of any solution to this I/O bottleneck is a parallel file system that can provide high-bandwidth access to tremendous amounts of data in parallel to hundreds or thousands of processors. Most successful systems are based on a solid understanding of the expected workload, but thus far there have been no comprehensive workload characterizations of multiprocessor file systems. This paper presents the results of a three-week tracing study in which all file-related activity on a massively parallel computer was recorded. Our instrumentation differs from previous efforts in that it collects information about every I/O request and about the mix of jobs running in a production environment. We also present the results of a trace-driven caching simulation and recommendations for designers of multiprocessor file systems.
Characterizing parallel file-access patterns on a large-scale multiprocessor
- In Proceedings of the Ninth International Parallel Processing Symposium
, 1995
Cited by 41 (4 self)
Rapid increases in the computational speeds of multiprocessors have not been matched by corresponding performance enhancements in the I/O subsystem. To satisfy the large and growing I/O requirements of some parallel scientific applications, we need parallel file systems that can provide high-bandwidth and high-volume data transfer between the I/O subsystem and thousands of processors. Design of such high-performance parallel file systems depends on a thorough grasp of the expected workload. So far there have been no comprehensive usage studies of multiprocessor file systems. Our CHARISMA project intends to fill this void. The first results from our study involve an iPSC/860 at NASA Ames. This paper presents results from a different platform, the CM-5 at the National Center for Supercomputing Applications. The CHARISMA studies are unique because we collect information about every individual read and write request and about the entire mix of applications running on the machines. The results of our trace analysis lead to recommendations for parallel file system design. First, the file system should support efficient concurrent access to many files, and I/O requests from many jobs
File System Workload Analysis for Large Scale Scientific Computing Applications
- In Proceedings of the 21st IEEE / 12th NASA Goddard Conference on Mass Storage Systems and Technologies
, 2004
Cited by 38 (12 self)
Parallel scientific applications require high-performance I/O support from underlying file systems. A comprehensive understanding of the expected workload is therefore essential for the design of high-performance parallel file systems. We re-examine the workload characteristics in parallel computing environments in the light of recent technology advances and new applications.
Requirements of I/O Systems for Parallel Machines: An Application-driven Study
, 1997
Cited by 31 (6 self)
I/O-intensive parallel programs have emerged as one of the leading consumers of cycles on parallel machines. This change has been driven by two trends. First, parallel scientific applications are being used to process larger datasets that do not fit in memory. Second, a large number of parallel machines are being used for non-scientific applications. Efficient execution of these applications requires high-performance I/O systems which have been designed to meet their I/O requirements. In this paper, we examine the I/O requirements for data-intensive parallel applications and the implications of these requirements for the design of I/O systems for parallel machines. We attempt to answer the following questions. First, what is the steady-state as well as peak I/O rate required? Second, what spatial patterns, if any, occur in the sequence of I/O requests for individual applications? Third, what is the degree of intra-processor and inter-processor locality in I/O accesses? Fourth, does the application structure allow programmers to disclose future I/O requests to the I/O system? Fifth, what patterns, if any, exist in the sequence of inter-arrival times of I/O requests? To address these questions, we have analyzed I/O request traces for a diverse set of I/O-intensive parallel applications. This set includes seven scientific applications and four non-scientific applications.
Parallel I/O workload characteristics using Vesta
- in Proceedings of the IPPS ’95 Workshop on Input/Output in Parallel and Distributed Systems, IEEE Computer Society
, 1995
Cited by 30 (0 self)
In recent years, the design and performance evaluation of parallel processors has focused on the processor, memory and communication subsystems. As a result, these subsystems have better performance potential than the I/O subsystem. In fact, the I/O subsystem is the bottleneck in many machines. However, there are a number of studies currently underway to improve the design of parallel I/O subsystems. To develop optimal parallel I/O subsystem designs, one must have a thorough understanding of the workload characteristics of parallel I/O and its exploitation of the associated parallel file system. Presented are the results of a study conducted to analyze the parallel I/O workloads of several applications on a parallel processor using the Vesta parallel file system. Traces of the applications are obtained to collect system events, communication events, and parallel I/O events. The traces are then analyzed to determine workload characteristics. The results show I/O request rates on the order of hundreds of requests per second; a large majority of requests for small amounts of data (less than 1500 bytes); a few requests for large amounts of data (on the order of megabytes); significant file sharing among processes within a job; and strong temporal, traditional spatial, and interprocess spatial locality.
Caching and writeback policies in parallel file systems
- Journal of Parallel and Distributed Computing
, 1993
Cited by 30 (7 self)
Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. Such parallel disk systems require parallel file system software to avoid performance-limiting bottlenecks. We discuss cache management techniques that can be used in a parallel file system implementation. We examine several writeback policies, and give results of experiments that test their performance.
I/O Characterization of a Portable Astrophysics Application on the IBM SP and Intel Paragon
, 1995
Cited by 14 (1 self)
Many large-scale applications on parallel machines are bottlenecked by the I/O performance rather than the CPU or communication performance of the system. To improve the I/O performance, it is first necessary for system designers to understand the I/O requirements of various applications. This paper presents the results of a study of the I/O characteristics and performance of a real, I/O-intensive, portable, parallel application in astrophysics, on two different parallel machines: the IBM SP and the Intel Paragon. We instrumented the source code to record all I/O activity, and analyzed the resulting trace files. Our results show that, for this application, the I/O consists of fairly large writes, and writing data to files is faster on the Paragon, whereas opening and closing files are faster on the SP. We also discuss how the I/O performance of this application could be improved; particularly, we believe that this application would benefit from using collective I/O.
Automatic Classification Of Input/Output Access Patterns
, 1997
Cited by 9 (2 self)
Despite continued innovations in disk design, input/output performance has not kept pace with concurrent increases in processor speeds. Much research has focused on developing algorithms to avoid input/output or hide input/output latency in an attempt to redress this widening gap. Many studies have shown that with advance knowledge of access patterns, file systems can improve input/output performance by selecting policies appropriate for the resource demands. Unfortunately, access patterns may be complex or data dependent, and therefore unknown a priori. Our thesis is that the file system can automatically detect qualitative file access patterns both locally (per parallel program thread) and globally (per parallel program) and use this information to dynamically choose appropriate file system policies. We propose two complementary methods for automatic classification, based on neural networks and hidden Markov models, respectively. Global classifications are created from a combination...
Profile-Guided I/O Partitioning
- In Proceedings of the International Conference on Supercomputing
, 2003
Cited by 6 (3 self)
In the field of high performance computing there is a growing need to process large, complex datasets. Many of these applications are file-intensive workloads, performing a large number of reads from and writes to a small number of files. When executing these workloads on cluster-based systems, performance cannot scale by simply increasing the number of compute nodes. To effectively exploit parallel resources we need to parallelize file I/O. The potential impact of exploiting parallel I/O grows as the gap between CPU and disk speeds continues to increase.
The Architectural Implications of Pipeline and Batch Sharing in Scientific Workloads
, 2002
Cited by 5 (2 self)
We present a study of six batch-pipelined scientific workloads. Whereas other studies focus on the behavior of a single application, we characterize an emerging type of workload which consists of pipelines of sequential processes that use file storage for communication and also share significant data across a batch. This study includes measurements of the memory, CPU, and I/O requirements of individual components, analyses of I/O sharing within complete batches, and a discussion of the architectural ramifications of these new types of workloads.

