Integrated Parallel Prefetching and Caching
, 1995
Cited by 63 (5 self)
Recently there has been a great deal of interest in prefetching from parallel disks, as a technique for enabling serial applications to improve I/O performance. Studies have also shown that for optimal performance, it is important to properly integrate prefetching and caching. In this paper, we study integrated prefetching and caching strategies for multiple disks. We present two algorithms, regular aggressive and reverse aggressive, and show that reverse aggressive is close to optimal. Using trace-driven simulation on a collection of file access traces, we evaluated these algorithms under a variety of data placement alternatives. Our results show that both algorithms can achieve near linear speedup when the load is distributed evenly on the disks, and reverse aggressive performs well even when the placement of blocks on disks distributes the load unevenly. Our simulations also show that, for file system traces, replicating data, even across all of the disks, offers little performance ...
Simple Randomized Mergesort on Parallel Disks
- PARALLEL COMPUTING
, 1996
Cited by 59 (11 self)
We consider the problem of sorting a file of N records on the D-disk model of parallel I/O [VS94] in which there are two sources of parallelism. Records are transferred to and from disk concurrently in blocks of B contiguous records. In each I/O operation, up to one block can be transferred to or from each of the D disks in parallel. We propose a simple, efficient, randomized mergesort algorithm called SRM that uses a forecast-and-flush approach to overcome the inherent difficulties of simple merging on parallel disks. SRM exhibits a limited use of randomization and also has a useful deterministic version. Generalizing the technique of forecasting [Knu73], our algorithm is able to read in, at any time, the "right" block from any disk, and using the technique of flushing, our algorithm evicts, without any I/O overhead, just the "right" blocks from memory to make space for new ones to be read in. The disk layout of SRM is such that it enjoys perfect write parallelism, avoiding fundamenta...
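The D-disk model described above charges one I/O for each step in which at most one block per disk is transferred. As a rough illustration (this is not SRM itself), the sketch below counts the fewest parallel I/Os needed to read a set of blocks when buffer space and read order impose no constraint. The function names and the cyclic striping layout are assumptions made for this example.

```python
from collections import Counter


def striped_layout(num_blocks, num_disks, start_disk=0):
    """Assign block i to disk (start_disk + i) mod num_disks (cyclic striping)."""
    return [(start_disk + i) % num_disks for i in range(num_blocks)]


def min_parallel_ios(disk_of_block):
    """Fewest parallel I/Os to read every block once.

    Each I/O moves at most one block per disk, so the most heavily
    loaded disk determines the count (assuming ample buffer space and
    no ordering constraints; both are simplifying assumptions).
    """
    load = Counter(disk_of_block)
    return max(load.values(), default=0)


# 100 blocks striped over 4 disks versus the same blocks on a single disk.
balanced = min_parallel_ios(striped_layout(100, 4))
skewed = min_parallel_ios([0] * 100)
```

With an even layout the count drops by a factor of D; with all blocks on one disk the parallelism is lost entirely.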
Tight Bounds for Prefetching and Buffer Management Algorithms for Parallel I/O Systems
- In Foundations of Software Technology and Theoretical Computer Science
, 1996
Cited by 22 (11 self)
The growing importance of multiple-disk parallel I/O systems requires the development of appropriate prefetching and buffer management algorithms. We answer several fundamental questions on prefetching and buffer management for such parallel I/O systems. First, we find and prove the optimality of an algorithm, P-MIN, that minimizes the number of parallel I/Os. Second, we analyze P-CON, an algorithm that always matches its replacement decisions with those of the well-known demand-paged MIN algorithm. We show that P-CON can become fully sequential in the worst case. Finally, we define and analyze P-LRU, a semi-on-line version of the traditional LRU buffer management algorithm. Unexpectedly, we find that the performance of P-LRU is independent of the number of disks. 1 Introduction The increasing imbalance between the speeds of processors and I/O devices has resulted in the I/O subsystem becoming a bottleneck in many applications. The use of multiple disks to build...
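For orientation, the demand-paged MIN algorithm mentioned in this abstract evicts the cached block whose next reference lies farthest in the future. The following is a minimal single-cache sketch of that classical policy only; it is not P-MIN, P-CON, or P-LRU, and the function name and the sample reference string are illustrative assumptions.

```python
def min_misses(refs, capacity):
    """Count misses of the demand-paged MIN policy (assumes capacity >= 1)."""
    infinity = float("inf")

    # next_use[i] is the index of the next reference to refs[i] after
    # position i, or infinity if the block is never referenced again.
    next_use = [infinity] * len(refs)
    last_seen = {}
    for i in range(len(refs) - 1, -1, -1):
        next_use[i] = last_seen.get(refs[i], infinity)
        last_seen[refs[i]] = i

    cache = {}  # block -> index of its next reference
    misses = 0
    for i, block in enumerate(refs):
        if block not in cache:
            misses += 1
            if len(cache) >= capacity:
                # Evict the block referenced farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[block] = next_use[i]
    return misses


sample = [1, 2, 3, 1, 2, 4, 1, 2]
misses_with_3 = min_misses(sample, 3)
misses_with_2 = min_misses(sample, 2)
```

MIN minimizes the number of misses on one cache; the abstract's point is that matching these replacement decisions in a multiple-disk system does not by itself minimize the number of parallel I/Os.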
Competitive Parallel Disk Prefetching and Buffer Management
, 1997
Cited by 21 (11 self)
We provide a competitive analysis framework for online prefetching and buffer management algorithms in parallel I/O systems, using a read-once model of block references. This has widespread applicability to key I/O-bound applications such as external merging and concurrent playback of multiple video streams. Two realistic lookahead models, global lookahead and local lookahead, are defined. Algorithms NOM and GREED based on these two forms of lookahead are analyzed for shared buffer and distributed buffer configurations, both of which occur frequently in existing systems. An important aspect of our work is that we show how to implement both the models of lookahead in practice using the simple techniques of forecasting and flushing.
Optimal Read-Once Parallel Disk Scheduling
- in IOPADS
, 1999
Cited by 18 (8 self)
An optimal prefetching and I/O scheduling algorithm L-OPT, for parallel I/O systems, using a read-once model of block references is presented. The algorithm uses knowledge of the next L references, L-block lookahead, to create a minimal-length I/O schedule. We show that the competitive ratio of L-OPT is $\Theta(\sqrt{MD/L})$, $L \ge M$, which matches the lower bound of any prefetching algorithm with L-block lookahead. Tight bounds for the remaining ranges of lookahead are also presented. In addition we show that L-OPT is the optimal offline algorithm: when the lookahead consists of the entire reference string, it performs the absolute minimum possible number of I/Os. Finally, we show that L-OPT is comparable to the best on-line algorithm with the same amount of lookahead; the ratio of the length of its schedule to the length of the optimal schedule is always within a constant factor of the best possible. Supported in part by the National Science Foundation under grant CCR-9704562 an...
Improving Parallel-Disk Buffer Management using Randomized Writeback
- Proc. Int’l Conf. Parallel Processing
, 1998
Cited by 8 (3 self)
We address the problems of I/O scheduling and buffer management for general reference strings in a parallel I/O system. Using the standard parallel disk model with D disks and a shared I/O buffer of size M, we study the performance of on-line algorithms that use bounded global M-block lookahead. We introduce the concept of write-back whereby blocks are dynamically relocated between disks during the course of the computation. Write-back allows the layout to be altered to suit different access patterns in different parts of the reference string. We show that any bounded-lookahead on-line algorithm that uses purely deterministic policies must have a competitive ratio of $\Omega(D)$. We show how to improve the performance by using randomization, and present a novel algorithm, RAND-WB, using a randomized write-back scheme. RAND-WB has a competitive ratio of $\Theta(\sqrt{D})$, which is the best achievable by any on-line algorithm with only global M-block lookahead. If the initial layout of data on the disks is uniformly random, RAND-WB has a competitive ratio of $\Theta(\log D)$.
PC-OPT: Optimal Offline Prefetching and Caching for Parallel I/O Systems
- IEEE TRANSACTIONS ON COMPUTERS
, 2002
Cited by 8 (0 self)
We address the problem of prefetching and caching in a parallel I/O system and present a new algorithm for parallel disk scheduling. Traditional buffer management algorithms that minimize the number of block misses are substantially suboptimal in a parallel I/O system where multiple I/Os can proceed simultaneously. We show that in the offline case, where a priori knowledge of all the requests is available, PC-OPT performs the minimum number of I/Os to service the given I/O requests. This is the first parallel I/O scheduling algorithm that is provably offline optimal in the parallel disk model. In the online case, we study the context of global L-block lookahead, which gives the buffer management algorithm a lookahead consisting of L distinct requests. We show that the competitive ratio of PC-OPT, with global L-block lookahead, is $\Theta(M - L + D)$, when $L \le M$, and $\Theta(MD/L)$, when $L > M$, where the number of disks is D and buffer size is M.
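To see how the stated bound behaves, the sketch below evaluates its growth term, reading the competitive ratio as $\Theta(M - L + D)$ for $L \le M$ and $\Theta(MD/L)$ for $L > M$. The function name and the sample values of M, D, and L are assumptions for illustration, and the result ignores the constant factors that $\Theta$ hides, so it indicates a trend rather than an actual ratio.

```python
def pc_opt_ratio_order(buffer_size, num_disks, lookahead):
    """Growth term of the competitive-ratio bound, up to constant factors.

    buffer_size is M, num_disks is D, lookahead is L (all positive).
    """
    if lookahead <= buffer_size:
        return buffer_size - lookahead + num_disks
    return buffer_size * num_disks / lookahead


# Illustrative values only: M = 64 blocks, D = 16 disks.
short_lookahead = pc_opt_ratio_order(64, 16, 32)   # L < M
equal_lookahead = pc_opt_ratio_order(64, 16, 64)   # L = M
long_lookahead = pc_opt_ratio_order(64, 16, 256)   # L > M
```

The two branches agree at L = M, where both give D, and the term shrinks as the lookahead grows beyond the buffer size.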
A Simple and Efficient Parallel Disk Mergesort
, 2002
Cited by 7 (0 self)
External sorting—the process of sorting a file that is too large to fit into the computer’s internal memory and must be stored externally on disks—is a fundamental subroutine in database systems [G], [IBM]. Of prime importance are techniques that use multiple disks in parallel in order to speed up the performance of external sorting. The simple randomized merging (SRM) mergesort algorithm proposed by Barve et al. [BGV] is the first parallel disk sorting algorithm that requires a provably optimal number of passes and that is fast in practice. Knuth [K, Section 5.4.9] recently identified SRM (which he calls “randomized striping”) as the method of choice for sorting with parallel disks. In this paper we present an efficient implementation of SRM, based upon novel and elegant data structures. We give a new implementation for SRM’s lookahead forecasting technique for parallel prefetching and its forecast and flush technique for buffer management. Our techniques amount to a significant improvement in the way SRM carries out the parallel, independent disk accesses necessary to read blocks of input runs efficiently during external merging. Our implementation is
Red-Black Prefetching: An Approximation Algorithm for Parallel Disk Scheduling
- In Foundations of Software Technology and Theoretical Computer Science, number 1530 in LNCS
, 1998
Cited by 4 (3 self)
We address the problem of I/O scheduling of read-once reference strings in a multiple-disk parallel I/O system. We present a novel online algorithm, Red-Black Prefetching (RBP), for parallel I/O scheduling. In order to perform accurate prefetching, RBP uses L-block lookahead. The performance of RBP is analyzed in the standard parallel disk model with D independent disks and a shared I/O buffer of size M. We show that the number of parallel I/Os performed by RBP is within a factor $\Theta(\max\{\sqrt{MD/L}, D^{1/3}\})$ of the number of I/Os done by the optimal offline algorithm. This ratio is within a constant factor of the best possible when $L = O(MD^{1/3})$. 1 Introduction Continuing advances in processor architecture and technology have resulted in the I/O subsystem becoming the bottleneck in many applications. The problem is exacerbated by the advent of multiprocessing systems that can harness the power of hundreds of processors in speeding up computation. Improvements in I/O tech...
ASP: Adaptive Online Parallel Disk Scheduling
- PROC. OF DIMACS WKSHP. ON EXT. MEMORY ALGORITHMS AND VISUALIZATION, DIMACS
, 1998
Cited by 1 (1 self)
In this work we address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. We use the standard parallel disk model with D disks and a shared I/O buffer of size M. We design an on-line algorithm ASP (Adaptive Segmented Prefetching) with ML-block lookahead, $L \ge 1$, and compare its performance to the best on-line algorithm with the same lookahead. We show that for any reference string the number of I/Os done by ASP is within a factor $\Theta(C)$, C = min{√ ...

