Coupling Prefix Caching and Collective Downloads for Remote Dataset Access
In Proceedings of the 16th ACM International Conference on Supercomputing, 2006
Cited by 8 (7 self)
Scientific datasets are typically archived at mass storage systems or data centers close to supercomputers/instruments. End users of these datasets, however, usually perform parts of their workflows at their local computers. In such cases, client-side caching can offer significant gains by reducing the cost of wide-area data movement. Scientific data caches, however, traditionally cache entire datasets, which may not be necessary. In this paper, we propose a novel combination of prefix caching and collective download. Prefix caching allows the bootstrapping of dataset downloads by caching only a prefix of the dataset, while collective download facilitates efficient parallel patching of the missing suffix from an external data source. To estimate the optimal prefix size, we further present an analytical model that considers both the initial download overhead and the downloading speed. We implemented our proposed approach in the FreeLoader distributed cache prototype. Experimental results (using multiple scientific data repositories and data transfer tools, as well as a real-world scientific dataset access trace) demonstrate that prefix caching and collective download can be implemented efficiently, that our model can select an appropriate prefix size, and that the cache hit rate can be improved significantly without hurting the local access rate of cached datasets.
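The abstract does not give the analytical model itself, so the following is only an illustrative sketch of how the initial download overhead and the download speed could bound a prefix size. It assumes a deliberately simple setting that is not taken from the paper: the client consumes the dataset sequentially at a constant local rate, the suffix arrives at a constant remote rate after a fixed startup delay, and the prefix must be large enough that the client never stalls. The function name and all parameters are hypothetical.

```python
def min_prefix_bytes(size, local_rate, remote_rate, startup):
    """Smallest cached prefix (bytes) that avoids stalls in a toy streaming model.

    Assumptions (illustrative only, not the paper's model):
      - the client reads bytes 0..size sequentially at local_rate bytes/s;
      - the suffix download begins at time 0, delivers its first byte after
        `startup` seconds, and then proceeds at remote_rate bytes/s.
    """
    if local_rate <= 0 or remote_rate <= 0:
        raise ValueError("rates must be positive")
    if size <= 0:
        return 0.0
    # The first suffix byte must arrive before the client finishes the prefix.
    start_bound = local_rate * startup
    # The last suffix byte must arrive before the client reaches the end.
    end_bound = size * (1.0 - remote_rate / local_rate) + remote_rate * startup
    # Arrival and consumption times are both linear in the offset, so checking
    # the two endpoints is sufficient. Clamp the result to [0, size].
    return min(float(size), max(0.0, start_bound, end_bound))
```

For example, under these assumptions a 1000-byte dataset read locally at 100 bytes/s, with a remote source delivering 50 bytes/s after a 2-second startup, needs a prefix of at least 600 bytes: the remaining 400 bytes take 2 + 8 = 10 seconds to arrive, exactly the time the client needs to read the whole dataset.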
Constructing collaborative desktop storage caches for large scientific datasets
ACM Transactions on Storage (TOS), 2006
Cited by 8 (5 self)
High-level buffering for hiding periodic output cost in scientific simulations
IEEE Transactions on Parallel and Distributed Systems, 2006
Cited by 7 (1 self)
Scientific applications often need to write out large arrays and associated metadata periodically for visualization or restart purposes. In this paper, we present active buffering, a high-level transparent buffering scheme for collective I/O, in which processors actively organize their idle memory into a hierarchy of buffers for periodic output data. It utilizes idle memory on the processors, yet makes no assumption regarding runtime memory availability. Active buffering can perform background I/O while the computation is going on, is extensible to remote I/O for more efficient data migration, and can be implemented in a portable style in today’s parallel I/O libraries. It can also mask performance problems of scientific data formats used by many scientists. Performance experiments with both synthetic benchmarks and real simulation codes on multiple platforms show that active buffering can greatly reduce the visible I/O cost from the application’s point of view.
Index Terms: Parallel I/O library design, performance optimization, experimentation.
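As a rough illustration of the general idea of hiding output cost behind computation, the sketch below buffers output chunks in a bounded in-memory queue and writes them from a background thread, falling back to a blocking write when no buffer space is free. It is a single-process toy under stated assumptions, not the paper's collective I/O scheme or its buffer hierarchy; the class name `ActiveBuffer` and the `max_chunks` parameter are hypothetical.

```python
import queue
import threading


class ActiveBuffer:
    """Toy background writer: buffer when space allows, otherwise write directly."""

    def __init__(self, sink, max_chunks=8):
        self._sink = sink                      # any seekable, writable binary file object
        self._lock = threading.Lock()          # serializes all access to the sink
        self._queue = queue.Queue(maxsize=max_chunks)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _write_at(self, offset, data):
        # Every chunk carries its own offset, so write order does not matter.
        with self._lock:
            self._sink.seek(offset)
            self._sink.write(data)

    def _drain(self):
        # Background thread: flush buffered chunks until the sentinel arrives.
        while True:
            item = self._queue.get()
            if item is None:
                break
            self._write_at(*item)

    def write(self, offset, data):
        """Return quickly if buffer space is free; otherwise block on a direct write."""
        try:
            self._queue.put_nowait((offset, bytes(data)))
        except queue.Full:
            self._write_at(offset, data)

    def close(self):
        """Flush all buffered chunks and stop the background thread."""
        self._queue.put(None)
        self._worker.join()
```

A caller would invoke `write(offset, data)` at each output step and `close()` once at the end; the compute loop only pays the full write cost when the buffer is exhausted.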
MPI-IO/L: Efficient Remote I/O for MPI-IO via Logistical Networking
In IPDPS, 2006
Cited by 5 (3 self)
Scientific applications often need to access remotely located files, but many remote I/O systems lack standard APIs that allow efficient and direct access from application codes. This work presents MPI-IO/L, a remote I/O facility for MPI-IO using Logistical Networking. This combination not only provides high-performance and direct remote I/O using the standard parallel I/O interface but also offers convenient management and sharing of remote files. We show the performance trade-offs with various remote I/O approaches implemented in the system, which can help scientists identify preferable I/O options for their own applications. We also discuss how Logistical Networking could be improved to work better with parallel I/O systems such as ROMIO.
PreDatA - Preparatory Data Analytics on Peta-Scale Machines
Cited by 3 (1 self)
Peta-scale scientific applications running on High End Computing (HEC) platforms can generate large volumes of data. For high-performance storage, and in order to be useful to science end users, such data must be organized in its layout, indexed, sorted, and otherwise manipulated for subsequent data presentation, visualization, and detailed analysis. In addition, scientists desire to gain insights into selected data characteristics ‘hidden’ or ‘latent’ in the massive datasets while data is being produced by simulations. PreDatA, short for Preparatory Data Analytics, is an approach for preparing and characterizing data while it is being produced by the large-scale simulations running on peta-scale machines. By dedicating additional compute nodes on the peta-scale machine as staging nodes and staging the simulation’s output data through these nodes, PreDatA can exploit their computational power to perform selected data manipulations with lower latency than attainable by first moving data into file systems and storage. Such in-transit manipulations are supported by the PreDatA middleware through RDMA-based data movement to reduce write latency, application-specific operations on streaming data that are able to discover latent data characteristics, and appropriate data reorganization and metadata annotation to speed up subsequent data access. As a result, PreDatA enhances the scalability and flexibility of the current I/O stack on HEC platforms and is useful for data pre-processing, runtime data analysis and inspection, as well as for data exchange between concurrently running simulation models. Performance evaluations with several production peta-scale applications on Oak Ridge National Laboratory’s Leadership Computing Facility demonstrate the feasibility and advantages of the PreDatA approach.
Towards a High Performance Implementation of MPI-IO on the Lustre File System
Cited by 3 (0 self)
Lustre is becoming an increasingly important file system for large-scale computing clusters. The problem is that many data-intensive applications use MPI-IO for their I/O requirements, and it has been well documented that MPI-IO performs poorly in a Lustre file system environment. However, the reasons for such poor performance are not currently well understood. We believe that the primary reason for poor performance is that the assumptions underpinning most of the parallel I/O optimizations implemented in MPI-IO do not hold in a Lustre environment. Perhaps the most important assumption that appears to be incorrect is that optimal performance is obtained by performing large, contiguous I/O operations. Our research suggests that this is often the worst approach to take in a Lustre file system. In fact, we found that the best performance is sometimes achieved when each process performs a series of smaller, non-contiguous I/O requests. In this paper, we provide experimental results showing that such assumptions do not apply in Lustre, and explore new approaches that appear to provide significantly better performance.
Semantics-based Distributed I/O with the ParaMEDIC Framework
Cited by 2 (2 self)
Many large-scale applications simultaneously rely on multiple resources for efficient execution. For example, such applications may require both large compute and storage resources; however, very few supercomputing centers can provide large quantities of both. Thus, data generated at the compute site oftentimes has to be moved to a remote storage site for either storage or visualization and analysis. Clearly, this is not an efficient model, especially when the two sites are distributed over a wide-area network. Thus, we present a framework called “ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing” which uses application-specific semantic information to convert the generated data to orders-of-magnitude smaller metadata at the compute site, transfer the metadata to the storage site, and re-process the metadata at the storage site to regenerate the output. Specifically, ParaMEDIC trades a small amount of additional computation (in the form of data post-processing) for a potentially significant reduction in data that needs to be transferred in distributed environments.
Distributed I/O with ParaMEDIC: Experiences with a Worldwide Supercomputer
Cited by 1 (0 self)
Achieving high performance for distributed I/O on a wide-area network continues to be an elusive holy grail. Despite enhancements in network hardware as well as software stacks, achieving high performance remains a challenge. In this paper, our worldwide team took a completely new and non-traditional approach to distributed I/O, called ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing, by utilizing application-specific transformation of data to orders-of-magnitude smaller metadata before performing the actual I/O. Specifically, this paper details our experiences in deploying a large-scale system to facilitate the discovery of missing genes and constructing a genome similarity tree by encapsulating the mpiBLAST sequence-search algorithm into ParaMEDIC. The overall project involved nine different computational sites spread across the U.S., generating more than a petabyte of data that was “teleported” to a large-scale facility in Tokyo for storage.
Keywords: distributed I/O, bioinformatics, BLAST, grid computing, cluster computing.
Feasibility Study of Effective Remote I/O Using a Parallel NetCDF Interface in a Long-Latency Network
NetCDF provides a portable and self-describing I/O data format for array-oriented data in scientific computation domains. Its parallel I/O interface, named parallel netCDF (hereafter PnetCDF), provides parallel I/O operations with the help of an MPI interface. To realize such operations through a PnetCDF interface among computers that have different MPI libraries, a Stampi library was introduced as the underlying MPI library. Decomposing multi-dimensional data leads to complex I/O operations with a non-contiguous parallel I/O pattern, expressed with the help of a derived data type. However, the time for data communication between computers is independent of this complexity because the transferred data are contiguous in the memory buffer. Although this approach achieved effective remote I/O operations in a LAN environment, throughput degrades because of an unoptimized TCP socket configuration when the interconnection among computers has long latency, as in a WAN. We addressed this degradation and observed that choosing an appropriate socket buffer size effectively minimized I/O times.
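The abstract does not say how the appropriate socket buffer size was chosen. A common rule of thumb for long-latency links, used here purely as an assumption, is to size the buffer to at least the bandwidth-delay product (link bandwidth times round-trip time). The sketch below computes that value and applies it with the standard `SO_SNDBUF` and `SO_RCVBUF` socket options; the helper names are hypothetical, and the operating system may clamp or adjust the requested size.

```python
import socket


def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded up.

    bandwidth_bps is the link bandwidth in bits per second and rtt_ms is the
    round-trip time in milliseconds (both illustrative inputs).
    """
    if bandwidth_bps < 0 or rtt_ms < 0:
        raise ValueError("bandwidth and RTT must be non-negative")
    # bits/s * ms = millibits; divide by 8000 to get bytes, rounding up.
    return (bandwidth_bps * rtt_ms + 7999) // 8000


def tune_socket(sock: socket.socket, size: int) -> tuple:
    """Request send/receive buffers of `size` bytes and report what the OS granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_send, granted_recv
```

For a 1 Gbit/s path with a 100 ms round-trip time, `bdp_bytes(1_000_000_000, 100)` gives 12,500,000 bytes, far above typical default buffer sizes, which is consistent with the throughput degradation the abstract attributes to an unoptimized socket configuration.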
Improving Data Availability for Better Access Performance: A Study on Caching Scientific Data on Distributed Desktop Workstations
Client-side data caching serves as an excellent mechanism to store and analyze the rapidly growing scientific data, motivating distributed, client-side caches built from unreliable desktop storage contributions to store and access large scientific data. They offer several desirable properties, such as performance impedance matching, improved space utilization, and high parallel I/O bandwidth. In this context, we are faced with two key challenges: (1) the finite amount of contributed cache space is stretched ...
This work builds on several of our previously published studies. While we include an extended introduction of the FreeLoader framework [57, 58] as well as our combination of prefix caching and collective download techniques [33], this paper proposes the novel RDPR technique as well as a new striping-enabled cache management algorithm. We also discuss methods to combine RDPR with prefix caching and collective download techniques. Overall, more than 60% of the manuscript’s content has not been previously published.

