PLFS: A Checkpoint Filesystem for Parallel Applications
, 2009
Cited by 23 (6 self)
Parallel applications running across thousands of processors must protect themselves from inevitable system failures. Many applications insulate themselves from failures by checkpointing. For many applications, checkpointing into a shared single file is most convenient. With such an approach, writes are often small and not aligned with file system boundaries. Unfortunately for these applications, this preferred data layout results in pathologically poor performance from the underlying file system, which is optimized for large, aligned writes to non-shared files. To address this fundamental mismatch, we have developed a virtual parallel log-structured file system, PLFS. PLFS remaps an application’s preferred data layout into one which is optimized for the underlying file system. Through testing on PanFS, Lustre, and GPFS, we have seen that this layer of indirection and reorganization can reduce checkpoint time by an order of magnitude for several important benchmarks and real applications without any application modification.
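The remapping idea in this abstract can be illustrated with a small in-memory sketch. It is not the PLFS implementation: the abstract says only that a log-structured layer remaps the shared-file layout, so the per-writer log, the index tuples, and the names `WriterLog` and `LogStructuredFile` are all assumptions made for illustration. Each writer appends sequentially to its own log regardless of the logical offset, and a read replays the index to rebuild the shared file.

```python
from dataclasses import dataclass, field


@dataclass
class WriterLog:
    """Hypothetical per-writer append-only log plus an index of its writes."""
    data: bytearray = field(default_factory=bytearray)
    # Each entry: (sequence number, logical offset, length, physical offset)
    index: list = field(default_factory=list)


class LogStructuredFile:
    """Illustrative stand-in for one shared logical file (assumed design)."""

    def __init__(self):
        self.logs = {}  # rank -> WriterLog
        self.seq = 0    # global write order; assumed stand-in for timestamps

    def write(self, rank, logical_offset, payload):
        log = self.logs.setdefault(rank, WriterLog())
        # Record where the bytes land in this writer's log.
        log.index.append((self.seq, logical_offset, len(payload), len(log.data)))
        # The physical write is always a sequential append.
        log.data += payload
        self.seq += 1

    def read_all(self):
        # Gather every index entry, then replay in write order so that
        # later writes overwrite earlier ones at the same logical offset.
        entries = [
            (seq, off, length, phys, rank)
            for rank, log in self.logs.items()
            for (seq, off, length, phys) in log.index
        ]
        size = max((off + length for _, off, length, _, _ in entries), default=0)
        out = bytearray(size)
        for _, off, length, phys, rank in sorted(entries):
            out[off:off + length] = self.logs[rank].data[phys:phys + length]
        return bytes(out)
```

In this sketch, two ranks issuing small interleaved writes each produce one contiguous log, and the strided logical layout is only reconstructed on read.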
RFS: Efficient and Flexible Remote File Access for MPI-IO
- In Proceedings of the IEEE International Conference on Cluster Computing
, 2004
Cited by 15 (9 self)
Scientific applications often need to access remote file systems. Because of slow networks and large data size, however, remote I/O can become an even more serious performance bottleneck than local I/O. In this work, we present RFS, a high-performance remote I/O facility for ROMIO, which is a well-known MPI-IO implementation. Our simple, portable, and flexible design eliminates the shortcomings of previous remote I/O efforts. In particular, RFS improves remote I/O performance by adopting active buffering with threads (ABT), which hides I/O cost by aggressively buffering the output data using available memory and performing background I/O using threads while computation is taking place. Our experimental results show that RFS with ABT can significantly reduce the visible cost of remote I/O, achieving up to 92% of the theoretical peak throughput. The computation slowdown caused by concurrent I/O activities was 0.2–6.2%, which is dwarfed by the overall performance improvement in application turnaround time.
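The buffering-plus-background-thread idea described in this abstract can be sketched as follows. This is a simplified illustration, not the RFS or ROMIO code: the class name `ActiveBuffer`, the `sink` callable, and the fixed `max_buffers` limit are assumptions. The caller hands off a copy of its output and returns to computation, while a worker thread performs the slow writes; the caller blocks only when all buffers are full.

```python
import queue
import threading


class ActiveBuffer:
    """Hypothetical sketch: buffer output in memory, write it in the background."""

    def __init__(self, sink, max_buffers=8):
        self.sink = sink  # callable that performs the slow (e.g. remote) write
        self.q = queue.Queue(maxsize=max_buffers)
        self.worker = threading.Thread(target=self._drain, daemon=True)
        self.worker.start()

    def write(self, data):
        # Copy the data so the caller can reuse its array immediately.
        # put() blocks only when every buffer slot is occupied.
        self.q.put(bytes(data))

    def _drain(self):
        # Background thread: write buffers in FIFO order until told to stop.
        while True:
            item = self.q.get()
            if item is None:
                break
            self.sink(item)

    def close(self):
        # Flush everything still buffered, then stop the worker.
        self.q.put(None)
        self.worker.join()
```

A caller would construct `ActiveBuffer(sink)` once, call `write` at each output step, and call `close` before exit so that no buffered data is lost.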
Exploiting Lustre File Joining for Effective Collective IO
Cited by 7 (0 self)
Lustre is a parallel file system that delivers high aggregate IO bandwidth by striping file extents across many storage devices. However, our experiments indicate that excessively wide striping can cause performance degradation. Lustre supports an innovative file joining feature that joins files in place. To mitigate striping overhead and benefit collective IO, we propose two techniques: split writing and hierarchical striping. In split writing, a file is created as separate subfiles, each of which is striped to only a few storage devices. They are joined into a single file at file close time. Hierarchical striping builds on top of split writing and orchestrates the span of subfiles in a hierarchical manner to avoid overlapping and achieve the appropriate coverage of storage devices. Together, these techniques can avoid the overhead associated with large stripe width, while still being able to combine bandwidth available from many storage devices. We have prototyped these techniques in the ROMIO implementation of MPI-IO. Experimental results indicate that split writing and hierarchical striping can significantly improve the performance of Lustre collective IO in terms of both data transfer and management operations. On a Lustre file system configured with 46 object storage targets, our implementation improves collective write performance of a 16-process job by as much as 220%.
High-level buffering for hiding periodic output cost in scientific simulations
- IEEE Transactions on Parallel and Distributed Systems
, 2006
Cited by 7 (1 self)
Scientific applications often need to write out large arrays and associated metadata periodically for visualization or restart purposes. In this paper, we present active buffering, a high-level transparent buffering scheme for collective I/O, in which processors actively organize their idle memory into a hierarchy of buffers for periodic output data. It utilizes idle memory on the processors, yet makes no assumption regarding runtime memory availability. Active buffering can perform background I/O while the computation is going on, is extensible to remote I/O for more efficient data migration, and can be implemented in a portable style in today’s parallel I/O libraries. It can also mask performance problems of scientific data formats used by many scientists. Performance experiments with both synthetic benchmarks and real simulation codes on multiple platforms show that active buffering can greatly reduce the visible I/O cost from the application’s point of view. Index Terms: Parallel I/O library design, performance optimization, experimentation.
MPI-IO/L: Efficient Remote I/O for MPI-IO via Logistical Networking
- In IPDPS
, 2006
Cited by 5 (3 self)
Scientific applications often need to access remotely located files, but many remote I/O systems lack standard APIs that allow efficient and direct access from application codes. This work presents MPI-IO/L, a remote I/O facility for MPI-IO using Logistical Networking. This combination not only provides high-performance and direct remote I/O using the standard parallel I/O interface but also offers convenient management and sharing of remote files. We show the performance trade-offs with various remote I/O approaches implemented in the system, which can help scientists identify preferable I/O options for their own applications. We also discuss how Logistical Networking could be improved to work better with parallel I/O systems such as ROMIO.
Scalable I/O Forwarding Framework for High-Performance Computing Systems
Cited by 5 (0 self)
Current leadership-class machines suffer from a significant imbalance between their computational power and their I/O bandwidth. While Moore’s law ensures that the computational power of high-performance computing systems increases with every generation, the same is not true for their I/O subsystems. The scalability challenges faced by existing parallel file systems with respect to the increasing number of clients, coupled with the minimalistic compute node kernels running on these machines, call for a new I/O paradigm to meet the requirements of data-intensive scientific applications. I/O forwarding is a technique that attempts to bridge the increasing performance and scalability gap between the compute and I/O components of leadership-class machines by shipping I/O calls from compute nodes to dedicated I/O nodes. The I/O nodes perform operations on behalf of the compute nodes and can reduce file system traffic by aggregating, rescheduling, and caching I/O requests. This paper presents an open, scalable I/O forwarding framework for high-performance computing systems. We describe an I/O protocol and API for shipping function calls from compute nodes to I/O nodes, and we present a quantitative analysis of the overhead associated with I/O forwarding. Keywords: I/O forwarding; parallel file systems; leadership-class machines
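The aggregation step mentioned in this abstract can be sketched as a function that an I/O node might apply to the write requests it receives before touching the file system. This is an illustration only: the function name `aggregate_writes` and the `(offset, data)` request format are assumptions, not the framework's protocol, and the sketch assumes the requests do not overlap.

```python
def aggregate_writes(requests):
    """Merge contiguous (offset, data) write requests into fewer, larger ones.

    Assumes requests do not overlap; overlapping writes would need an
    ordering rule that this sketch does not define.
    """
    merged = []
    for offset, data in sorted(requests, key=lambda r: r[0]):
        if merged and merged[-1][0] + len(merged[-1][1]) == offset:
            # This request starts exactly where the previous one ends.
            prev_offset, prev_data = merged[-1]
            merged[-1] = (prev_offset, prev_data + data)
        else:
            merged.append((offset, data))
    return merged
```

For example, requests at offsets 0 (4 bytes) and 4 (2 bytes) from different compute nodes become a single 6-byte write, while a request at offset 10 stays separate.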
Fusing data management services with file systems
- In Proceedings of the 4th ACM/IEEE Petascale Data Storage Workshop (PDSW ’09)
, 2009
Cited by 5 (0 self)
File systems are the backbone of large-scale data processing for scientific applications. Motivated by the need to provide an extensible and flexible framework beyond the abstractions provided by API libraries for files to manage and analyze large-scale data, we are developing Damasc, an enhanced file system where rich data management services for scientific computing are provided as a native part of the file system. This paper presents our vision for Damasc, a performant file system that would allow scientists or even casual users to pose declarative queries and updates over views of underlying files that are stored in their native bytestream format. In Damasc, a configurable layer is added on top of the file system to expose the contents of files in a logical data model through which views can be defined and used for queries and updates. The logical data model and views are leveraged to optimize access to files through caching and self-organizing indexing. In addition, provenance capture and analysis of file access are also built into Damasc. We describe the salient features of our proposal and discuss how it can benefit the development of scientific code.
Evaluating I/O characteristics and methods for storing structured scientific data
- In Proceedings of the International Parallel & Distributed Processing Symposium
, 2006
Cited by 3 (2 self)
Many large-scale scientific simulations generate large, structured multi-dimensional datasets. Data is stored at various intervals on high performance I/O storage systems for checkpointing, post-processing, and visualization. Data storage is very I/O intensive and can dominate the overall running time of an application, depending on the characteristics of the I/O access pattern. Our NCIO benchmark determines how I/O characteristics greatly affect performance (up to 2 orders of magnitude) and provides scientific application developers with guidelines for improvement. In this paper, we examine the impact of various I/O parameters and methods when using the MPI-IO interface to store structured scientific data in an optimized parallel file system.
Noncontiguous Locking Techniques for Parallel File Systems
Cited by 1 (0 self)
Many parallel scientific applications use high-level I/O APIs that offer atomic I/O capabilities. Atomic I/O in current parallel file systems is often slow when multiple processes simultaneously access interleaved, shared files. Current atomic I/O solutions are not optimized for handling noncontiguous access patterns because current locking systems have a fixed file system block-based granularity and do not leverage high-level access pattern information. In this paper we present a hybrid lock protocol that takes advantage of new list and datatype byte-range lock description techniques to enable high performance atomic I/O operations for these challenging access patterns. We implement our scalable distributed lock manager (DLM) in the PVFS parallel file system and show that these techniques improve locking throughput over a naive noncontiguous locking approach by several orders of magnitude in an array of lock-only tests. Additionally, in two scientific I/O benchmarks, we show the benefits of avoiding false sharing with our byte-range granular DLM when compared against a block-based lock system implementation.
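The false sharing that this abstract attributes to block-based locks can be shown with a small comparison. The sketch below is not the paper's lock protocol; the function names and the `(offset, length)` access format are assumptions. It only contrasts the two granularities: two accesses that touch disjoint bytes still conflict under block-granular locking when they fall in the same block.

```python
def block_span(offset, length, block_size):
    """Return the (first, last) block indices covered by a byte access."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return first, last


def conflicts_block(a, b, block_size):
    """True if two (offset, length) accesses need a lock on a common block."""
    a_first, a_last = block_span(a[0], a[1], block_size)
    b_first, b_last = block_span(b[0], b[1], block_size)
    return a_first <= b_last and b_first <= a_last


def conflicts_bytes(a, b):
    """True only if two (offset, length) accesses overlap in actual bytes."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]
```

With a 4096-byte block, the accesses `(0, 100)` and `(100, 100)` conflict under `conflicts_block` but not under `conflicts_bytes`, which is the serialization a byte-range lock manager avoids.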
Parallel Access to netCDF Files in High Performance Applications from High-Level Frameworks
The analysis of climate variability requires operations on netCDF files [1], Empirical Orthogonal Function (EOF) analysis, and Singular Value Decomposition (SVD) of coupled data sets. For example, PyClimate [2] is a Python package designed to accomplish these tasks sequentially. However, the huge data volume in this kind of application requires high-performance routines that can be executed on distributed-memory platforms. High-performance linear algebra operations can be performed with the high-level framework PyACTS [3]. Also, the huge volume of the netCDF files requires a parallel tool for Python. To this end, we present PyPnetCDF, a Python package that implements parallel access to netCDF files using the PnetCDF library [4]. The results show a scalable tool with much lower execution times than the

