An Experimental Evaluation of the Parallel I/O Systems of the IBM SP and Intel Paragon Using a Production Application
, 1996
Cited by 24 (12 self)
We present the results of an experimental evaluation of the parallel I/O systems of the IBM SP and Intel Paragon using a real three-dimensional parallel application code. This application, developed by scientists at the University of Chicago, simulates the gravitational collapse of self-gravitating gaseous clouds. It performs parallel I/O by using library routines that we developed and optimized separately for the SP and Paragon. The I/O routines perform two-phase I/O and use the parallel file systems PIOFS on the SP and PFS on the Paragon. We studied the I/O performance for two different sizes of the application. In the small case, we found that I/O was much faster on the SP. In the large case, open, close, and read operations were only slightly faster, and seeks were significantly faster, on the SP, whereas writes were slightly faster on the Paragon. The communication required within our I/O routines was faster on the Paragon in both cases. The highest read bandwidth ...
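As an illustration of the two-phase I/O technique named in this abstract, the following is a minimal single-process Python simulation, not the authors' library routines. It assumes a one-dimensional file and a cyclic target distribution (element i belongs to process i mod P); both choices, and the name `two_phase_read`, are hypothetical and picked only to keep the sketch small. In phase one each simulated process reads one large contiguous block; in phase two the data is redistributed in memory to its final owners.

```python
def two_phase_read(file_data, num_procs):
    """Simulate a two-phase read of a 1-D file into a cyclic distribution.

    Illustrative only: the cyclic target layout is an assumption,
    and the "processes" are just list indices in one Python process.
    """
    n = len(file_data)
    chunk = -(-n // num_procs)  # ceiling division: block size per process

    # Phase 1: every process reads one contiguous block of the file,
    # which is the access pattern that is cheap for the file system.
    blocks = [file_data[p * chunk:(p + 1) * chunk] for p in range(num_procs)]

    # Phase 2: redistribute in memory (stands in for interprocess
    # communication) so element i ends up on process i % num_procs.
    result = [[] for _ in range(num_procs)]
    for p, block in enumerate(blocks):
        for offset, value in enumerate(block):
            global_index = p * chunk + offset
            result[global_index % num_procs].append(value)
    return result
```

For any input, the result for process q equals the strided slice `file_data[q::num_procs]`, which is what each process would have had to fetch with many small reads if it accessed the file directly.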
Applications of parallel I/O
, 1996
Cited by 11 (2 self)
Scientific applications are increasingly being implemented on massively parallel supercomputers. Many of these applications have intense I/O demands, as well as massive computational requirements. This paper is essentially an annotated bibliography of papers and other sources of information about scientific applications using parallel I/O. It will be updated periodically.
Automatic Classification Of Input/Output Access Patterns
, 1997
Cited by 9 (2 self)
Despite continued innovations in disk design, input/output performance has not kept pace with concurrent increases in processor speeds. Much research has focused on developing algorithms to avoid input/output or hide input/output latency in an attempt to redress this widening gap. Many studies have shown that with advance knowledge of access patterns, file systems can improve input/output performance by selecting policies appropriate for the resource demands. Unfortunately, access patterns may be complex or data dependent, and therefore unknown a priori. Our thesis is that the file system can automatically detect qualitative file access patterns both locally (per parallel program thread) and globally (per parallel program) and use this information to dynamically choose appropriate file system policies. We propose two complementary methods for automatic classification, based on neural networks and hidden Markov models, respectively. Global classifications are created from a combination...
Performance Modeling for the Panda Array I/O Library
- In Proceedings of Supercomputing '96
, 1996
Cited by 8 (3 self)
We present an analytical performance model for Panda, a library for synchronized i/o of large multidimensional arrays on parallel and sequential platforms, and show how the Panda developers use this model to evaluate Panda's parallel i/o performance and guide future Panda development. The model validation shows that system developers can simplify performance analysis, identify potential performance bottlenecks, and study the design trade-offs for Panda on massively parallel platforms more easily than by conducting empirical experiments. More importantly, we show that the outputs of the performance model can be used to help make optimal plans for handling application i/o requests, the first step toward our long-term goal of automatically optimizing i/o request handling in Panda. This research was supported by an ARPA Fellowship in High Performance Computing administered by the Institute for Advanced Computer Studies, University of Maryland, by NSF under PYI grant IRI 89 58582, and by N...
PI/OT, Parallel I/O Templates
, 1997
Cited by 4 (2 self)
This paper presents a novel, top-down, high-level approach to parallelizing file I/O. Each parallel file descriptor is annotated with a high-level specification, or template, of the expected parallel behaviour. The annotations are external to and independent of the source code. At run-time, all I/O using a parallel file descriptor adheres to the semantics of the selected template. By separating the parallel I/O specifications from the code, a user can quickly change the I/O behaviour without rewriting code. Templates can be composed hierarchically to construct more complex access patterns. Two sample parallel programs using these templates are compared against versions implemented in an existing parallel I/O system (PIOUS). The sample programs show that the use of parallel I/O templates is beneficial from both the performance and software engineering points of view. 1. Introduction The development of parallel applications has focused on computational parallelism. However, the corres...
Performance Prediction and Analysis of Parallel Out-of-Core Matrix Factorization
- In Proceedings of the 7th International Conference on High Performance Computing (HiPC'00)
, 2000
Cited by 3 (2 self)
In this paper, we present an analytical performance model of the parallel left-right looking out-of-core LU factorization algorithm. We show the accuracy of the performance prediction for a prototype implementation in the ScaLAPACK library. We show that with a correct distribution of the matrix and with an overlap of IO by computation, we obtain performance similar to that of the in-core algorithm. To achieve such performance, the size of the physical main memory need only be proportional to the matrix order (not the matrix size) multiplied by the ratio of the IO bandwidth and the computation rate: there is no need for a large main memory to factor a huge matrix!
A Distributed Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing
- In Proceedings of the Ninth IEEE International Symposium on High Performance Distributed Computing (HPDC’00)
, 2000
Cited by 3 (0 self)
I/O intensive applications have posed great challenges to computational scientists. A major problem of these applications is that users have to sacrifice performance requirements in order to satisfy storage capacity requirements in a conventional computing environment. Further performance improvement is impeded by the physical nature of these storage media even when state-of-the-art I/O optimizations are employed. In this paper, we present a distributed multi-storage resource architecture, which can satisfy both performance and capacity requirements by employing multiple storage resources. Compared to a traditional single storage resource architecture, our architecture provides a more flexible and reliable computing environment. This architecture can bring new opportunities for high performance computing as well as inherit state-of-the-art I/O optimization approaches that have already been developed. It provides application users with high-performance storage access even when they do not have a single large local storage archive at their disposal. We also develop an Application Programming Interface (API) that provides transparent management and access to various storage resources in our computing environment. Since I/O usually dominates the performance in I/O intensive applications, we establish an I/O performance prediction mechanism which consists of a performance database and a prediction algorithm to help users better evaluate and schedule their applications. A tool is also developed to help users automatically generate performance data stored in databases. The experiments show that our multi-storage resource architecture is a promising platform for high performance distributed computing.
Keywords: multi-storage resource architecture, I/O performance prediction, data intensive computing
Virtual Memory Management in Data Parallel Applications
- In HPCN Europe
, 1999
Cited by 3 (1 self)
The PaLaDiN (PArallel LArge Data set In Network of workstations) project is concerned with parallel out-of-core applications running on clusters of workstations or PCs. In such architectures, each node has a virtual memory manager, and a first idea is to use this feature to run a "parallel out-of-core" application as a parallel in-core one. The out-of-core part of the problem, i.e. the schedule of data fetch and data write-back, is relegated to the operating system. In this paper we show that the usual virtual memory manager is not well suited for parallel out-of-core applications. Then, we propose an extension to modern operating systems which allows the definition of application-specific virtual memory managers. This extension is made up of one kernel module (MMUSSEL) and one library (MMUM) and runs on Linux. We present a new pagination strategy for the LU decomposition program.
Dealing with Massive Data: From Parallel I/O to Grid I/O
, 2003
Cited by 1 (0 self)
Acknowledgements Many people have helped us find our way during the development of this thesis. Erich Schikuta, our supervisor, provided a motivating, enthusiastic, and critical atmosphere during our discussions. It was a great pleasure for us to conduct this thesis under his supervision. We also acknowledge Heinz and Kurt Stockinger who provided constructive comments. We would also like to thank everybody for providing us with feedback.
MS-I/O: A Distributed Multi-Storage I/O System
Cited by 1 (0 self)
More and more parallel applications are running in a distributed environment to take advantage of easily available and inexpensive commodity resources. For data intensive applications, employing multiple distributed storage resources has many advantages. In this paper, we present a Multi-Storage I/O System (MS-I/O) that can not only effectively manage various distributed storage resources in the system, but also provide novel high performance storage access schemes. MS-I/O employs many state-of-the-art I/O optimizations such as collective I/O, asynchronous I/O etc. and a number of new techniques such as data location, data replication, subfile, superfile and data access history. In addition, many MS-I/O optimization schemes can work simultaneously within a single data access session, greatly improving the performance.

