The Galley parallel file system
- Parallel Computing
, 1996
Cited by 127 (8 self)
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Many multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic scientific multiprocessor workloads. We discuss Galley's file structure and application interface, as well as the performance advantages offered by that interface.
Object-relational Queries into Multidimensional Databases with the Active Data Repository
, 1999
Cited by 22 (7 self)
As computational power and storage capacity increase, processing and analyzing large volumes of multi-dimensional datasets play an increasingly important role in many domains of scientific research. Scientific applications that make use of very large scientific datasets have several important characteristics: datasets consist of complex data and are usually multi-dimensional; applications usually retrieve a subset of all the data available in the dataset; various application-specific operations are performed on the data items retrieved. Such applications can be supported by object-relational database management systems (OR-DBMSs). In addition to providing functionality to define new complex datatypes and user-defined functions, an OR-DBMS for scientific datasets should contain runtime support that will provide optimized storage for very large datasets and an execution environment for user-defined functions involving expensive operations. In this paper we describe an infrastructure, the ...
Early experiences in evaluating the Parallel Disk Model with the ViC* implementation
, 1996
Cited by 19 (6 self)
Although several algorithms have been developed for the Parallel Disk Model (PDM), few have been implemented. Consequently, little has been known about the accuracy of the PDM in measuring I/O time and total running time to perform an out-of-core computation. This paper analyzes timing results on multiple-disk platforms for two PDM algorithms, out-of-core radix sort and BMMC permutations, to determine the strengths and weaknesses of the PDM. The results indicate the following. First, good PDM algorithms are usually not I/O bound. Second, of the four PDM parameters, one (problem size) is a good indicator of I/O time and running time, one (memory size) is a good indicator of I/O time but not necessarily running time, and the other two (block size and number of disks) do not necessarily indicate either I/O or running time. Third, because PDM algorithms tend not to be I/O bound, using asynchronous I/O can reduce I/O wait times significantly. The software interface to the PDM is part of the ViC* run-time library. The interface is a set of wrappers that are designed to be both efficient and portable across several underlying file systems and target machines.
Performance of the Galley Parallel File System
, 1996
Cited by 10 (4 self)
As the I/O needs of parallel scientific applications increase, file systems for multiprocessors are being designed to provide applications with parallel access to multiple disks. Many parallel file systems present applications with a conventional Unix-like interface that allows the application to access multiple disks transparently. This interface conceals the parallelism within the file system, which increases the ease of programmability, but makes it difficult or impossible for sophisticated programmers and libraries to use knowledge about their I/O needs to exploit that parallelism. Furthermore, most current parallel file systems are optimized for a different workload than they are being asked to support. We introduce Galley, a new parallel file system that is intended to efficiently support realistic parallel workloads. Initial experiments, reported in this paper, indicate that Galley is capable of providing high-performance I/O to applications that access data in patterns that have been observed to be common.
Galley: A New Parallel File System For Scientific Workloads
, 1996
Cited by 9 (2 self)
Most current multiprocessor file systems are designed to use multiple disks in parallel, using the high aggregate bandwidth to meet the growing I/O requirements of parallel scientific applications. Most multiprocessor file systems provide applications with a conventional Unix-like interface, allowing the application to access those multiple disks transparently. This interface conceals the parallelism within the file system, increasing the ease of programmability, but making it difficult or impossible for sophisticated application and library programmers to use knowledge about their I/O to exploit that parallelism. In addition to providing an insufficient interface, most current multiprocessor file systems are optimized for a different workload than they are being asked to support. In this work we examine current multiprocessor file systems, as well as how those file systems are used by scientific applications. Contrary to the expectations of the designers of current parallel file systems, the workloads on those systems are dominated by requests to read and write small pieces of data. Furthermore, rather than being accessed sequentially and contiguously, as in uniprocessor and supercomputer workloads, files in multiprocessor file systems are accessed in regular, structured, but non-contiguous patterns. Based on our observations of multiprocessor workloads, we have designed Galley, a new parallel
Flexibility and performance of parallel file systems
- ACM Operating Systems Review
, 1996
Cited by 7 (2 self)
Many scientific applications for high-performance multiprocessors have tremendous I/O requirements. As a result, the I/O system is often the limiting factor of application performance. Several new parallel file systems have been developed in recent years, each promising better performance for some class of parallel applications. As we gain experience with parallel computing, and parallel file systems in particular, it becomes increasingly clear that a single solution does not suit all applications. For example, it appears to be impossible to find a single appropriate interface, caching policy, file structure, or disk management strategy. Furthermore, the proliferation of file-system interfaces and abstractions make application portability a significant problem. We propose that the traditional functionality of parallel file systems be separated into two components: a fixed core that is standard on all platforms, encapsulating only primitive abstractions and interfaces, and a set of high-level libraries to provide a variety of abstractions and application-programmer interfaces (APIs). We think of this approach as the "RISC" of parallel file-system design. We present our current and next-generation file systems as examples of this structure. Their features, such as a three-dimensional file structure, strided read and write interfaces, and I/O-node programs, are specifically designed with the flexibility and performance necessary to support a wide range of applications.
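To make the idea of a strided read concrete, here is a minimal Python sketch. It is not Galley's interface: the function name `strided_read` and its parameters are hypothetical, and the sketch simply issues one seek and one read per record on an ordinary byte stream, whereas a real parallel file system would receive the whole pattern in a single request.

```python
import io


def strided_read(f, offset, record_size, stride, count):
    """Read `count` records of `record_size` bytes whose starting
    positions are `stride` bytes apart, beginning at `offset`.

    Hypothetical helper for illustration only; not a real file-system API.
    """
    chunks = []
    for i in range(count):
        f.seek(offset + i * stride)
        chunk = f.read(record_size)
        if len(chunk) < record_size:
            break  # ran past the end of the file; drop the partial record
        chunks.append(chunk)
    return b"".join(chunks)


# A 4x4 matrix of one-byte elements stored row by row (bytes 0..15).
data = bytes(range(16))

# Column 1 of that matrix: a regular, structured, non-contiguous pattern.
column = strided_read(io.BytesIO(data), offset=1, record_size=1,
                      stride=4, count=4)
```

Here `column` holds the four bytes of the second matrix column. Describing the pattern with three numbers (record size, stride, count) rather than four separate requests is what lets a file system see, and potentially optimize, the whole access at once.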
Introduction to multiprocessor I/O architecture
- Input/Output in Parallel and Distributed Computer Systems, chapter 4
, 1996
Adaptive Prefetching for Device-Independent File I/O
- Proceedings of the SPIE
, 1998
Cited by 2 (0 self)
Device independent I/O has been a holy grail to operating system designers since the early days of UNIX. Unfortunately, existing operating systems fall short of this goal for multimedia applications. Techniques such as caching and sequential read-ahead can help mask I/O latency in some cases, but in others they increase latency and add substantial jitter. Multimedia applications, such as video players, are sensitive to vagaries in performance since I/O latency and jitter affect the quality of presentation. Our solution uses adaptive prefetching to reduce both latency and jitter. Applications submit file access plans to the prefetcher, which then generates I/O requests to the operating system and manages the buffer cache to isolate the application from variations in device performance. Our experiments show device independence can be achieved: an MPEG video player sees the same latency when reading from a local disk or an NFS server. Moreover, our approach reduces jitter substantially. ...
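The plan-driven part of this design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `PlanPrefetcher`, its `depth` parameter, and the `(offset, length)` plan format are all hypothetical, the reads are synchronous, and the adaptive behavior described in the abstract (adjusting to device performance) is not modeled.

```python
import io
from collections import deque


class PlanPrefetcher:
    """Hypothetical plan-driven prefetcher: reads ahead along an access
    plan that the application declares up front."""

    def __init__(self, f, plan, depth=2):
        self.f = f
        self.pending = deque(plan)  # (offset, length) pairs not yet fetched
        self.buffered = deque()     # blocks fetched ahead of the application
        self.depth = depth          # assumed fixed read-ahead depth
        self._fill()

    def _fill(self):
        # Issue reads until `depth` blocks are buffered or the plan runs out.
        while self.pending and len(self.buffered) < self.depth:
            offset, length = self.pending.popleft()
            self.f.seek(offset)
            self.buffered.append(self.f.read(length))

    def next_block(self):
        # Hand over the oldest buffered block, then top the buffer back up.
        if not self.buffered:
            return None
        block = self.buffered.popleft()
        self._fill()
        return block


source = io.BytesIO(b"abcdefghij")
prefetcher = PlanPrefetcher(source, plan=[(0, 2), (6, 2), (2, 2)])
blocks = [prefetcher.next_block() for _ in range(3)]
```

Because the plan is known in advance, the prefetcher can fetch non-sequential blocks in the order the application will need them, which simple sequential read-ahead cannot do.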

