Fast Concurrent Access to Parallel Disks
In 11th ACM-SIAM Symposium on Discrete Algorithms, 1999
Cited by 44 (11 self)
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for adapting single-disk external memory algorithms to multiple disks. This paper shows that this problem can be solved efficiently using a combination of randomized placement, redundancy and an optimal scheduling algorithm. A buffer of O(D) blocks suffices to support efficient writing of arbitrary blocks if blocks are distributed uniformly at random to the disks (e.g., by hashing). If two randomly allocated copies of each block exist, N arbitrary blocks can be read within ⌈N/D⌉ + 1 I/O steps with high probability. In addition, the redundancy can be reduced from 2 to 1 + 1/r for any integer r. These results can be used to emulate the simple and powerful "single-disk multi-head" model of external computing [1] on the physically more realistic independent disk model [33] with small constant overhead. This is faster than a lower bound for deterministic emulation [3].
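The read-scheduling claim in this abstract can be explored with a small experiment. The sketch below is not the paper's algorithm. It assumes each block has two copies on two distinct random disks, and it finds the smallest number of parallel I/O steps with a plain augmenting-path assignment. The names `min_io_steps` and `random_duplicate_allocation` are hypothetical and chosen only for illustration.

```python
import random


def min_io_steps(copies, num_disks):
    """Return the fewest parallel I/O steps needed to read every block.

    copies[b] lists the disks that hold a copy of block b. In one step each
    disk delivers at most one block, so the answer is the smallest per-disk
    load for which every block can be assigned to one of its disks.
    """
    if any(len(disks) == 0 for disks in copies):
        raise ValueError("every block needs at least one copy")

    def feasible(limit):
        on_disk = [[] for _ in range(num_disks)]  # blocks assigned to each disk

        def try_place(block, seen):
            for d in copies[block]:
                if d in seen:
                    continue
                seen.add(d)
                if len(on_disk[d]) < limit:
                    on_disk[d].append(block)
                    return True
                # Disk d is full: try to move one of its blocks elsewhere.
                for i, other in enumerate(on_disk[d]):
                    if try_place(other, seen):
                        on_disk[d][i] = block
                        return True
            return False

        return all(try_place(b, set()) for b in range(len(copies)))

    for limit in range(1, len(copies) + 1):
        if feasible(limit):
            return limit
    return 0  # no blocks requested


def random_duplicate_allocation(num_blocks, num_disks, rng):
    """Assumption: two copies per block, on two distinct random disks."""
    return [rng.sample(range(num_disks), 2) for _ in range(num_blocks)]
```

Calling `min_io_steps(random_duplicate_allocation(N, D, random.Random(seed)), D)` for several seeds and comparing the result with N/D rounded up, plus one, gives an informal feel for the bound. It is an experiment, not a proof.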
Asynchronous scheduling of redundant disk arrays
In 12th ACM Symposium on Parallel Algorithms and Architectures, 2000
Cited by 12 (4 self)
Allocation of data to parallel disks using redundant storage and random placement of blocks can be exploited to achieve low access delays. New algorithms are proposed which improve the previously known shortest queue algorithm by systematically exploiting that scheduling decisions can be deferred until a block access is actually started on a disk. These algorithms are also generalized for coding schemes with low redundancy. Using extensive simulations, practically important quantities are measured which have so far eluded an analytical treatment: the delay distribution when a stream of requests approaches the limit of the system capacity, the system efficiency for parallel disk applications with bounded prefetching buffers, and the combination of both for mixed traffic. A further step toward practice is taken by outlining the system design for an automatically load-balanced parallel hard-disk array. Additional algorithmic measures are proposed that allow variable-sized blocks, seek time reduction, fault tolerance, inhomogeneous systems, and flexible prioritization schemes. Index Terms: Parallel disks, lazy scheduling, asynchronous, random redundant storage, duplicate allocation, soft real time, bipartite matching, queuing theory.
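For orientation, the shortest queue baseline that this abstract refers to can be sketched as follows. This is an assumed, simplified form: each request names the two disks holding its block and is committed at once to the disk with the shorter queue. The lazy algorithms proposed in the paper defer that decision and are not shown. The function name `shortest_queue` is hypothetical.

```python
def shortest_queue(requests, num_disks):
    """Eagerly assign each request to the shorter of its two disk queues.

    requests[i] is a pair (a, b) of disks holding copies of block i.
    Ties go to the first disk of the pair. Returns one queue per disk.
    """
    queues = [[] for _ in range(num_disks)]
    for block, (a, b) in enumerate(requests):
        target = a if len(queues[a]) <= len(queues[b]) else b
        queues[target].append(block)
    return queues
```

Because the choice is fixed on arrival, a later request cannot cause an earlier one to switch disks. The abstract describes deferring the choice until a disk actually starts the access as the way to improve on this.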
Striping for Interactive Video: Is it Worth it?
1998
"... We study the design of interactive video servers consisting of disk arrays. In order to avoid the hot spot problem in video servers it is conventional wisdom to stripe the videos over the disk array using Fine Grained Striping or Coarse Grained Striping techniques. Striping, however, increases th ..."
Cited by 5 (0 self)
We study the design of interactive video servers consisting of disk arrays. In order to avoid the hot spot problem in video servers it is conventional wisdom to stripe the videos over the disk array using Fine Grained Striping or Coarse Grained Striping techniques. Striping, however, increases the seek and rotational overhead, thereby reducing the throughput of the disk array. Our results indicate that the decrease in throughput is substantial when interactive delays are constrained to be less than 1 second. We show that both a high degree of interactivity and high throughput are achieved by using a narrow striping width and replicating the videos according to the user's request pattern. Specifically, we find that striping over two disks gives the highest throughput when a tight 1-second constraint on interactive delays is imposed. We also demonstrate that localized placement (i.e., no striping at all) performs nearly as well when a good estimate of the user request pattern is available.
Algorithms for Scalable Storage Servers
In SOFSEM 2004: Theory and Practice of Computer Science, 2004
"... We survey a set of algorithmic techniques that make it possible to build a high performance storage server from a network of cheap components. Such a storage server oers a very simple programming model. To the clients it looks like a single very large disk that can handle many requests in parall ..."
Cited by 5 (1 self)
We survey a set of algorithmic techniques that make it possible to build a high performance storage server from a network of cheap components. Such a storage server offers a very simple programming model. To the clients it looks like a single very large disk that can handle many requests in parallel with minimal interference between the requests.
Random Duplicate Storage Strategies for Load Balancing in Multimedia Servers
2000
"... this paper we use randomization and data redundancy to enable good load balancing. We focus on duplicate storage strategies, i.e., each data block is stored twice. This means that a request for a block can be serviced by two disks. A consequence of such a storage strategy is that we have to decide f ..."
Cited by 4 (2 self)
In this paper we use randomization and data redundancy to enable good load balancing. We focus on duplicate storage strategies, i.e., each data block is stored twice. This means that a request for a block can be serviced by two disks. A consequence of such a storage strategy is that we have to decide for each block which disk to use for its retrieval. This results in a so-called retrieval selection problem. We describe a graph model for duplicate storage strategies and derive polynomial time optimization algorithms for the retrieval selection problems of several storage strategies. Our model unifies and generalizes chained declustering and random duplicate assignment strategies. Simulation results and a probabilistic analysis complete this paper.
A Fault Tolerant Video Server Using Combined Raid 5 and Mirroring
1997
"... Video servers must use large disk arrays to provide the huge amount of storage capacity and bandwidth needed. As the number of disk drives increases, the probability of a video server failure increases too. We propose a redundancy scheme that uses both RAID 5 techniques and mirroring to make a vid ..."
Cited by 4 (0 self)
Video servers must use large disk arrays to provide the huge amount of storage capacity and bandwidth needed. As the number of disk drives increases, the probability of a video server failure increases too. We propose a redundancy scheme that uses both RAID 5 techniques and mirroring to make a video server tolerant against all single disk failures.
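As background for the RAID 5 half of this scheme, the sketch below shows generic XOR parity: a parity block is the byte-wise XOR of the data blocks in a stripe, and any single lost block is the XOR of the surviving ones. It illustrates the general technique only, not the combined RAID 5 and mirroring layout the paper proposes. The helper name `xor_blocks` is hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Illustrative stripe of three data blocks plus one parity block.
data = [b"\x01\x02", b"\x0f\x00", b"\xaa\x55"]
parity = xor_blocks(data)

# If the disk holding data[1] fails, rebuild it from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Mirroring, by contrast, recovers a lost block by reading its second copy, which costs more storage but avoids reading the whole stripe.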
Analyzing cache performance for video servers
In Proceedings of the 1998 ICPP Workshop on Architectural and OS Support for Multimedia Applications Flexible Communication Systems, 1998
"... Video-on-demand (VOD) is expected to become one of the most important and successful services to be offered on emerging technologies. There is not just an interest in the delivery of digital video for home entertainment purposes, but there are also several educational and commercial benefits from th ..."
Cited by 4 (0 self)
Video-on-demand (VOD) is expected to become one of the most important and successful services to be offered on emerging technologies. There is not just an interest in the delivery of digital video for home entertainment purposes, but there are also several educational and commercial benefits from this service. However, for this service to become viable, it is important that the video servers meet the demanding data rates and real-time performance requirements imposed by video. Interval caching has been proposed as a cost-effective way of improving the throughput of a server, but all previous studies on interval caching have used simulation. With numerous parameters and their complex interplay affecting the performance of interval caching, it is infeasible to consider a full-factorial experiment with simulation. This paper presents an analytical model for interval caching on video servers. The model has been extensively validated over a range of client requests, video data and server parameters. In addition, the model has been generalized to accommodate variable lengths of video data stored at the server. Using this model, the impact of different parameters on the performance of the interval caching scheme has been studied.
Load Balancing for Redundant Storage Strategies - Multiprocessor scheduling with machine eligibility
"... An important cost issue in multimedia servers is disk load balancing, such that the available hard disks are used as efficiently as possible. Disk load balancing is often done on a block basis, but can also be done on a time basis, by taking into account the actual transfer times of the blocks. I ..."
Cited by 1 (1 self)
An important cost issue in multimedia servers is disk load balancing, such that the available hard disks are used as efficiently as possible. Disk load balancing is often done on a block basis, but can also be done on a time basis, by taking into account the actual transfer times of the blocks. In the latter approach we can also embed the disk switch times. In this paper we revisit block-based load balancing and introduce time-based load balancing. For each approach we present a mathematical model and analyze the complexity of the corresponding retrieval problem. We give algorithms with a performance bound for the NP-hard time-based retrieval problem and use simulation to compare the results of these algorithms with a maximum flow algorithm for the block-based retrieval problem.

