Results 1 – 10 of 95
Algorithms for Parallel Memory I: Two-Level Memories
, 1992
Abstract

Cited by 240 (27 self)
We provide the first optimal algorithms in terms of the number of input/outputs (I/Os) required between internal memory and multiple secondary storage devices for the problems of sorting, FFT, matrix transposition, standard matrix multiplication, and related problems. Our two-level memory model is new and gives a realistic treatment of parallel block transfer, in which during a single I/O each of the P secondary storage devices can simultaneously transfer a contiguous block of B records. The model pertains to a large-scale uniprocessor system or parallel multiprocessor system with P disks. In addition, the sorting, FFT, permutation network, and standard matrix multiplication algorithms are typically optimal in terms of the amount of internal processing time. The difficulty in developing optimal algorithms is to cope with the partitioning of memory into P separate physical devices. Our algorithms' performance can be significantly better than that obtained by the well-known but nonopti...
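The optimal sorting bound in this two-level model with parallel block transfer is Θ((N/PB) · log_{M/B}(N/B)) I/Os. A minimal sketch that evaluates this bound for concrete parameters (the variable names N, M, B, P follow the abstract; the helper function itself is illustrative, not from the paper):

```python
import math

def sort_io_bound(N, M, B, P):
    """Optimal I/O count for sorting N records in the parallel-disk
    two-level model: Theta((N / (P * B)) * log_{M/B}(N / B)) I/Os,
    with internal memory of M records and block size B."""
    passes = math.log(N / B) / math.log(M / B)  # log base M/B of N/B
    return math.ceil((N / (P * B)) * passes)

# Example: 1M records, 64K-record memory, 1K-record blocks, 4 disks.
print(sort_io_bound(2**20, 2**16, 2**10, 4))  # -> 427
```

With a single disk (P = 1) the same formula reduces to the classical Aggarwal–Vitter sorting bound.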
EVENODD: An efficient scheme for tolerating double disk failures in RAID architectures
 IEEE Transactions on Computers
, 1995
Abstract

Cited by 182 (26 self)
We present a novel method, which we call EVENODD, for tolerating up to two disk failures in RAID architectures. EVENODD employs the addition of only two redundant disks and consists of simple exclusive-OR computations. This redundant storage is optimal, in the sense that two failed disks cannot be retrieved with fewer than two redundant disks. A major advantage of EVENODD is that it only requires parity hardware, which is typically present in standard RAID-5 controllers. Hence, EVENODD can be implemented on standard RAID-5 controllers without any hardware changes. The most commonly used scheme that employs optimal redundant storage (i.e., two extra disks) is based on Reed-Solomon (RS) error-correcting codes. This scheme requires computation over finite fields and results in a more complex implementation. For example, we show that the complexity of implementing EVENODD in a disk array with 15 disks is about 50% of that required when using the RS scheme. The new scheme is not limited to RAID architectures: it can be used in any system requiring large symbols and relatively short codes, for instance, in multi-track magnetic recording. To this end, we also present a decoding algorithm for one column (track) in error. Index Terms: RAID architectures, erasure-correcting codes, Reed-Solomon codes, disk arrays.
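As a concrete illustration of the exclusive-OR structure described above, here is a sketch of the EVENODD encoding for a (p−1)×p data array: one column of row parity plus one column of diagonal parity with the adjuster bit S, computed as if a phantom all-zero row p−1 existed. Single-column recovery via row parity is shown for illustration; the paper's full two-erasure decoder is more involved, and the helper names are ours:

```python
def evenodd_encode(data, p):
    """EVENODD encoding sketch: data is a (p-1) x p array of bits,
    p prime. Returns (row_parity, diag_parity), each p-1 bits."""
    assert len(data) == p - 1 and all(len(row) == p for row in data)

    def d(i, t):  # phantom all-zero row i == p-1
        return 0 if i == p - 1 else data[i][t]

    row_parity = [0] * (p - 1)
    for l in range(p - 1):
        for t in range(p):
            row_parity[l] ^= data[l][t]

    # Adjuster S: XOR along the special diagonal of the array.
    S = 0
    for t in range(1, p):
        S ^= d(p - 1 - t, t)

    diag_parity = [0] * (p - 1)
    for l in range(p - 1):
        acc = S
        for t in range(p):
            acc ^= d((l - t) % p, t)  # diagonal l, wrapping mod p
        diag_parity[l] = acc
    return row_parity, diag_parity

def recover_column(data_with_hole, row_parity, j, p):
    """Recover one erased data column j using row parity alone."""
    return [row_parity[l] ^
            __import__("functools").reduce(
                lambda a, t: a ^ data_with_hole[l][t] if t != j else a,
                range(p), 0)
            for l in range(p - 1)] if False else [
        row_parity[l] ^ xor_row_except(data_with_hole[l], j)
        for l in range(p - 1)]

def xor_row_except(row, j):
    acc = 0
    for t, bit in enumerate(row):
        if t != j:
            acc ^= bit
    return acc
```

Recovering two erased columns additionally uses the diagonal parity and the adjuster S, as described in the paper.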
External-Memory Computational Geometry
, 1993
Abstract

Cited by 121 (14 self)
In this paper, we give new techniques for designing efficient algorithms for computational geometry problems that are too large to be solved in internal memory, and we use these techniques to develop optimal and practical algorithms for a number of important large-scale problems. We discuss our algorithms primarily in the context of single processor/single disk machines, a domain in which they are not only the first known optimal results but also of tremendous practical value. Our methods also produce the first known optimal algorithms for a wide range of two-level and hierarchical multi-level memory models, including parallel models. The algorithms are optimal both in terms of I/O cost and internal computation.
EVENODD: An Optimal Scheme for Tolerating Double Disk Failures
 in RAID Architectures: IBM Research Report, RJ 9506
, 1993
Abstract

Cited by 82 (5 self)
EVENODD: An optimal scheme for tolerating double disk failures in RAID architectures
MDS Array Codes with Independent Parity Symbols
 IEEE Transactions on Information Theory
, 1996
Abstract

Cited by 76 (19 self)
A new family of MDS array codes is presented. The code arrays contain p information columns and r independent parity columns, each column consisting of p − 1 bits, where p is a prime. We extend a previously known construction for the case r = 2 to three and more parity columns. It is shown that when r = 3 such an extension is possible for any prime p. For larger values of r, we give necessary and sufficient conditions for our codes to be MDS, and then prove that if p belongs to a certain class of primes these conditions are satisfied up to r ≤ 8. One of the advantages of the new codes is that encoding and decoding may be accomplished using simple cyclic shifts and XOR operations on the columns of the code array. We develop efficient decoding procedures for the case of two- and three-column errors. This again extends the previously known results for the case of a single-column error. Another primary advantage of our codes is related to the problem of efficient information updates. We present upper and lower bounds on the average number of parity bits which have to be updated in an MDS code over GF(2^m), following an update in a single information bit. This average number is of importance in many storage applications which require frequent updates of information. We show that the upper bound obtained from our codes is close to the lower bound and, most importantly, does not depend on the size of the code symbols.
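The shift-and-XOR encoding can be shown in miniature by modeling each column as a p-bit word in the ring GF(2)[x]/(x^p − 1): parity column r is the XOR of the data columns, each cyclically shifted by r·j positions. Flipping one information bit then flips exactly one bit in each parity column, which is the update-efficiency property discussed above. This is a simplified model (the paper's columns have p − 1 bits and extra structure), and all names here are ours:

```python
def rot(x, s, p):
    """Cyclic left shift of a p-bit word x by s positions."""
    s %= p
    mask = (1 << p) - 1
    return ((x << s) | (x >> (p - s))) & mask

def encode_parities(cols, num_parity, p):
    """parity_r = XOR over data columns j of cols[j] shifted by r*j.
    Shift-and-XOR encoding sketch; r = 0 gives plain XOR parity."""
    out = []
    for r in range(num_parity):
        acc = 0
        for j, c in enumerate(cols):
            acc ^= rot(c, (r * j) % p, p)
        out.append(acc)
    return out
```

Because a cyclic shift maps bit b of column j to bit (b + r·j) mod p of parity column r, a single-bit data update touches exactly one bit per parity column, independent of the symbol size.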
Disk scrubbing in large archival storage systems
 In Proceedings of the 12th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS ’04)
, 2004
"... Large archival storage systems experience long periods of idleness broken up by rare data accesses. In such systems, disks may remain powered off for long periods of time. These systems can lose data for a variety of reasons, including failures at both the device level and the block level. To deal w ..."
Abstract

Cited by 67 (16 self)
Large archival storage systems experience long periods of idleness broken up by rare data accesses. In such systems, disks may remain powered off for long periods of time. These systems can lose data for a variety of reasons, including failures at both the device level and the block level. To deal with these failures, we must detect them early enough to be able to use the redundancy built into the storage system. We propose a process called “disk scrubbing” in a system in which drives are periodically accessed to detect drive failure. By scrubbing all of the data stored on all of the disks, we can detect block failures and compensate for them by rebuilding the affected blocks. Our research shows how the scheduling of disk scrubbing affects overall system reliability, and that “opportunistic” scrubbing, in which the system scrubs disks only when they are powered on for other reasons, performs very well without the need to power on disks solely to check them.
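The opportunistic policy in the abstract can be sketched as a scheduling rule: scrub a drive only when it is already powered on, and only if enough time has passed since its last scrub. A minimal illustration (the function name and the minimum-interval parameter are our own, not from the paper):

```python
def schedule_scrubs(power_on_times, min_interval):
    """Opportunistic scrubbing sketch: given the times a drive is
    powered on for other reasons, scrub at those times only, and only
    if at least min_interval has elapsed since the previous scrub."""
    scrubbed_at = []
    last = float("-inf")
    for t in power_on_times:
        if t - last >= min_interval:
            scrubbed_at.append(t)
            last = t
    return scrubbed_at

# Power-ons at t = 0, 5, 12, 14, 30; scrub at most every 10 time units.
print(schedule_scrubs([0, 5, 12, 14, 30], 10))  # -> [0, 12, 30]
```

The trade-off studied in the paper is between this interval (which bounds how long a latent block failure can go undetected) and the wear and energy cost of extra disk activity.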
Fast Concurrent Access to Parallel Disks
"... High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is ..."
Abstract

Cited by 60 (12 self)
High-performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for the efficient adaptation of single-disk external memory algorithms to multiple disks. We solve this problem for arbitrary access patterns by randomly mapping blocks of a logical address space to the disks. We show that a shared buffer of O(D) blocks suffices to support efficient writing. The analysis uses the properties of negative association to handle dependencies between the random variables involved. This approach might be of independent interest for probabilistic analysis in general. If two randomly allocated copies of each block exist, N arbitrary blocks can be read within ⌈N/D⌉ + 1 I/O steps with high probability. The redundancy can be further reduced from 2 to 1 + 1/r for any integer r without a big impact on reading efficiency. From the point of view of external memory models, these results rehabilitate Aggarwal and Vitter's "single-disk multi-head" model [1] that allows access to D arbitrary blocks in each I/O step. This powerful model can be emulated on the physically more realistic independent disk model [2] with small constant overhead factors. Parallel disk external memory algorithms can therefore be developed in the multi-head model first. The emulation result can then be applied directly or further refinements can be added.
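The two-copy reading idea can be illustrated with a toy scheduler: give each logical block two randomly chosen disks, then greedily send each read to the currently less-loaded of its two copies. This is a simplified stand-in for the paper's scheduling analysis (the greedy rule and all names are ours); it demonstrates the setup, not the ⌈N/D⌉ + 1 bound itself:

```python
import random

def place_blocks(num_blocks, D, rng):
    """Two randomly allocated copies per logical block, on distinct disks."""
    return {b: tuple(rng.sample(range(D), 2)) for b in range(num_blocks)}

def plan_reads(requests, placement, D):
    """Greedy routing: send each read to the less-loaded of the block's
    two disks; the disk with the most reads determines the I/O steps."""
    load = [0] * D
    assign = {}
    for b in requests:
        d1, d2 = placement[b]
        d = d1 if load[d1] <= load[d2] else d2
        load[d] += 1
        assign[b] = d
    return assign, load

rng = random.Random(42)
D, N = 8, 64
placement = place_blocks(N, D, rng)
assign, load = plan_reads(list(range(N)), placement, D)
print(max(load))  # number of parallel I/O steps for this batch
```

With only one copy per block, the most-loaded disk typically receives noticeably more than N/D requests; the second copy gives the scheduler a choice that flattens the load.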
LH*RS: a high-availability scalable distributed data structure
Abstract

Cited by 59 (11 self)
(SDDS). An LH*RS file is hash partitioned over the distributed RAM of a multicomputer, e.g., a network of PCs, and supports the unavailability of any of its k ≥ 1 server nodes. The value of k transparently grows with the file to offset the reliability decline. Only the number of the storage nodes potentially limits the file growth. The high-availability management uses a novel parity calculus that we have developed, based on Reed-Solomon erasure-correcting coding. The resulting parity storage overhead is about the minimum possible. The parity encoding and decoding are faster than for any other candidate coding we are aware of. We present our scheme and its performance analysis, including experiments with a prototype implementation on Wintel PCs. The capabilities of LH*RS offer new perspectives to data-intensive applications, including the emerging ones of grids and of P2P computing.
Tolerating Multiple Failures in RAID Architectures with Optimal Storage and Uniform Declustering
 In Proceedings of the 24th International Symposium on Computer Architecture
, 1996
Abstract

Cited by 51 (6 self)
We present Datum, a novel method for tolerating multiple disk failures in disk arrays. Datum is the first known method that can mask any given number of failures, requires an optimal amount of redundant storage space, and spreads reconstruction accesses uniformly over disks in the presence of failures without needing large layout tables in controller memory. Our approach is based on information dispersal, a coding technique that admits an efficient hardware implementation. As the method does not restrict the configuration parameters of the disk array, many existing RAID organizations are particular cases of Datum. A detailed performance comparison with two other approaches shows that Datum's response times are similar to those of the best competitor when two or fewer disks fail, and that the performance degrades gracefully when more than two disks fail.
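Information dispersal encodes data into n pieces such that any m of them suffice to reconstruct it. A minimal sketch over the prime field GF(257) (one byte per symbol), using systematic polynomial evaluation and Lagrange interpolation; this illustrates the coding idea only, not Datum's actual layout or its hardware-oriented implementation:

```python
P = 257  # prime field, large enough to hold one byte per symbol

def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial through `points`, mod P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for k, (xk, _) in enumerate(points):
            if k != j:
                num = num * ((x - xk) % P) % P
                den = den * ((xj - xk) % P) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P
    return total

def disperse(data, n):
    """Systematic dispersal: data[i] = f(i+1) for the unique
    degree-(m-1) polynomial f; extra pieces are f at further points.
    Any m of the n pieces determine f, hence the data."""
    m = len(data)
    pieces = [(i + 1, data[i]) for i in range(m)]
    for x in range(m + 1, n + 1):
        pieces.append((x, lagrange_eval(pieces[:m], x)))
    return pieces

def reconstruct(any_m_pieces, m):
    """Recover the data values f(1), ..., f(m) from any m pieces."""
    return [lagrange_eval(any_m_pieces, i + 1) for i in range(m)]

data = [10, 20, 30]
pieces = disperse(data, 5)          # survives loss of any 2 pieces
print(reconstruct(pieces[2:], 3))   # -> [10, 20, 30]
```

The storage overhead is n/m, which is optimal for tolerating n − m erasures; this is the sense in which dispersal-based schemes like Datum use optimal redundant space.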
AIDA-based Real-Time Fault-Tolerant Broadcast Disks
 In Proceedings of RTAS'96: The 1996 IEEE Real-Time Technology and Applications Symposium
, 1996
Abstract

Cited by 41 (15 self)
The proliferation of mobile computers and wireless networks requires the design of future distributed real-time applications to recognize and deal with the significant asymmetry between downstream and upstream communication capacities, and the significant disparity between server and client storage capacities.