FAWN: A Fast Array of Wimpy Nodes
2008
Cited by 68 (19 self)
This paper introduces the FAWN (Fast Array of Wimpy Nodes) cluster architecture for providing fast, scalable, and power-efficient key-value storage. A FAWN links together a large number of tiny nodes built using embedded processors and small amounts (2–16GB) of flash memory into an ensemble capable of handling 700 queries per second per node, while consuming fewer than 6 watts of power per node. We have designed and implemented a clustered key-value storage system, FAWN-DHT, that runs atop these nodes. Nodes in FAWN-DHT use a specialized log-like back-end hash-based database to ensure that the system can absorb the large write workload imposed by frequent node arrivals and departures. FAWN uses a two-level cache hierarchy to ensure that imbalanced workloads cannot create hot-spots on one or a few wimpy nodes that impair the system’s ability to service queries at its guaranteed rate. Our evaluation of a small-scale FAWN cluster and several candidate FAWN node systems suggests that FAWN can be a practical approach to building large-scale storage for seek-intensive workloads. Our further analysis indicates that a FAWN cluster is cost-competitive with other approaches (e.g., DRAM, multitudes of magnetic disks, solid-state disk) to providing high query rates, while consuming 3-10x less power.
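The "log-like back-end hash-based database" in the abstract above can be illustrated with a minimal sketch: an append-only log plus an in-memory hash index that maps each key to the offset of its newest record. The class `LogStore` and its methods are hypothetical names chosen for this illustration; they are not taken from the FAWN paper, and the real system's on-flash layout and index are not described in this listing.

```python
import io
import struct


class LogStore:
    """Append-only log with an in-memory hash index (illustrative sketch only)."""

    _HEADER = struct.Struct("<II")  # key length, value length

    def __init__(self, log=None):
        # Any seekable binary file works; BytesIO keeps the sketch self-contained.
        self.log = log if log is not None else io.BytesIO()
        self.index = {}  # key -> offset of the newest record for that key

    def put(self, key: bytes, value: bytes) -> None:
        # Every write is a sequential append; older versions stay in the log.
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(self._HEADER.pack(len(key), len(value)))
        self.log.write(key)
        self.log.write(value)
        self.index[key] = offset

    def get(self, key: bytes):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.log.seek(offset)
        klen, vlen = self._HEADER.unpack(self.log.read(self._HEADER.size))
        self.log.seek(klen, io.SEEK_CUR)  # skip over the stored key
        return self.log.read(vlen)
```

Because updates never overwrite in place, a store like this turns random writes into sequential ones, which is the property the abstract relies on to absorb heavy write workloads. Reclaiming the space held by stale records would need a separate compaction pass, which is omitted here.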
Generating Realistic Impressions for File-System Benchmarking
In Proceedings of the 7th Conference on File and Storage Technologies (FAST ’09), 2009
SRCMap: Energy Proportional Storage using Dynamic Consolidation
Cited by 9 (2 self)
We investigate the problem of creating an energy proportional storage system through power-aware dynamic storage consolidation. Our proposal, Sample-Replicate-Consolidate Mapping (SRCMap), is a storage virtualization layer optimization that enables energy proportionality for dynamic I/O workloads by consolidating the cumulative workload on a subset of physical volumes proportional to the I/O workload intensity. Instead of migrating data across physical volumes dynamically or replicating entire volumes, both of which are prohibitively expensive, SRCMap samples a subset of blocks from each data volume that constitutes its working set and replicates these on other physical volumes. During a given consolidation interval, SRCMap activates a minimal set of physical volumes to serve the workload and spins down the remaining volumes, redirecting their workload to replicas on active volumes. We present both theoretical and experimental evidence to establish the effectiveness of SRCMap in minimizing the power consumption of enterprise storage systems.
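The consolidation step described above, activating a minimal set of volumes for the current load, can be sketched under strong simplifying assumptions. The sketch below assumes each volume has a single throughput capacity and that any active volume can serve any redirected request; it ignores where replicas actually live, which the real system must respect. The function name `choose_active_volumes` and its parameters are hypothetical and do not come from the SRCMap paper.

```python
def choose_active_volumes(capacity_iops, load_iops):
    """Pick the fewest volumes whose combined capacity covers the load.

    capacity_iops: dict mapping volume name -> sustainable I/O rate (assumed known).
    load_iops: predicted aggregate I/O rate for the next consolidation interval.
    Returns the names of volumes to keep spinning; all others may be spun down.
    """
    # Taking the largest volumes first minimizes the count needed to reach the
    # target, because the top-k capacities give the largest possible sum for any k.
    ranked = sorted(capacity_iops.items(), key=lambda item: (-item[1], item[0]))
    active, covered = [], 0.0
    for name, capacity in ranked:
        if covered >= load_iops:
            break
        active.append(name)
        covered += capacity
    return active
```

For example, with capacities of 100, 300, and 200 I/Os per second and a predicted load of 350, the sketch keeps the two largest volumes active. If the load exceeds the total capacity, every volume stays active.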
A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries For Storage
Cited by 8 (1 self)
Over the past five years, large-scale storage installations have required fault-protection beyond RAID-5, leading to a flurry of research on and development of erasure codes for multiple disk failures. Numerous open-source implementations of various coding techniques are available to the general public. In this paper, we perform a head-to-head comparison of these implementations in encoding and decoding scenarios. Our goals are to compare codes and implementations, to discern whether theory matches practice, and to demonstrate how parameter selection, especially as it concerns memory, has a significant impact on a code’s performance. Additional benefits are to give storage system designers an idea of what to expect in terms of coding performance when designing their storage systems, and to identify the places where further erasure coding research can have the most impact.
Energy-efficient cluster computing with FAWN: Workloads and implications
In Proc. e-Energy 2010, 2010
Cited by 7 (2 self)
This paper presents the architecture and motivation for a cluster-based, many-core computing architecture for energy-efficient, data-intensive computing. FAWN, a Fast Array of Wimpy Nodes, consists of a large number of slower but efficient nodes coupled with low-power storage. We present the computing trends that motivate a FAWN-like approach, for CPU, memory, and storage. We follow with a set of microbenchmarks to explore under what workloads these “wimpy nodes” perform well (or perform poorly). We conclude with an outline of the longer-term implications of FAWN that lead us to select a tightly integrated stacked chip-and-memory architecture for future FAWN development.
A new minimum density RAID-6 code with a word size of eight
In NCA-08: 7th IEEE International Symposium on Network Computing Applications, 2008
Cited by 6 (3 self)
RAID-6 storage systems protect k disks of data with two parity disks so that the system of k + 2 disks may tolerate the failure of any two disks. Coding techniques for RAID-6 systems are varied, but an important class consists of those with minimum density, featuring an optimal combination of encoding, decoding and modification complexity. The word size of a code impacts both how the code is laid out on each disk’s sectors and how large k can be. Word sizes that are powers of two are especially important, since they fit precisely into file system blocks. Minimum density codes exist for many word sizes with the notable exception of eight. This paper fills that gap by describing new codes for this important word size. The description includes performance properties as well as details of the discovery process.
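To make the "k data disks plus two parity disks" property concrete, here is a minimal sketch of a RAID-6 encoding. It is not the minimum density code from the paper above. It uses the widely known Reed-Solomon-style P+Q scheme over GF(2^8), with polynomial 0x11D and generator 2, swapped in because it is short to state. All function names are hypothetical. Only the case of two lost data disks is shown; losing a parity disk is handled by simpler variants that are omitted.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    # For nonzero a in GF(2^8), a^254 is the multiplicative inverse.
    return gf_pow(a, 254)


def encode(data):
    """data: list of k equal-length bytes objects. Returns parity blocks (P, Q)."""
    size = len(data[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data):
        coeff = gf_pow(2, i)  # disk i is weighted by 2^i in the Q parity
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two_data(data, x, y, p, q):
    """Rebuild data disks x and y (entries set to None) from survivors, P and Q."""
    size = len(p)
    survivors = [b if b is not None else bytes(size) for b in data]
    pxy, qxy = encode(survivors)  # parity of the surviving data only
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for j in range(size):
        pd = p[j] ^ pxy[j]  # equals Dx ^ Dy
        qd = q[j] ^ qxy[j]  # equals gx*Dx ^ gy*Dy
        dx[j] = gf_mul(denom_inv, qd ^ gf_mul(gy, pd))
        dy[j] = pd ^ dx[j]
    return bytes(dx), bytes(dy)
```

A short usage example: encode four data blocks, discard two of them, and rebuild both from the survivors and the two parity blocks.

```python
data = [b"abcd", b"efgh", b"ijkl", b"mnop"]
p, q = encode(data)
damaged = [data[0], None, data[2], None]
restored = recover_two_data(damaged, 1, 3, p, q)  # rebuilt blocks for disks 1 and 3
```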
Protecting against rare event failures in archival systems
Proc. 17th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS '09), 2009
Cited by 5 (3 self)
Digital archives are growing rapidly, necessitating stronger reliability measures than RAID to avoid data loss from device failure. Mirroring, a popular solution, is too expensive over time. We present a compromise solution that uses multi-level redundancy coding to reduce the probability of data loss from multiple simultaneous device failures. This approach handles small-scale failures of one or two devices efficiently while still allowing the system to survive rare-event, larger-scale failures of four or more devices. In our approach, each disk is split into a set of fixed-size disklets which are used to construct reliability stripes. To protect against rare event failures, reliability stripes are grouped into larger “über-groups,” each of which has a corresponding “über-parity”; über-parity is only used to recover data when disk failures overwhelm the redundancy in a single reliability stripe. Über-parity can be stored on a variety of devices such as NV-RAM and always-on disks to offset write bottlenecks while still keeping the number of active devices low. Our calculations of failure probabilities found that the addition of über-groups allowed the system to absorb many more disk failures without data loss. Through discrete event simulation, we found that adding über-groups only negatively impacts performance when these groups need to be used for a rebuild. Since rebuilds using über-parity occur very rarely, they minimally impact system performance over time. Finally, we showed that robustness against rare events can be achieved for under 5% of total system cost.
A Spin-Up Saved is Energy Earned: Achieving Power-Efficient, Erasure-Coded Storage
Cited by 5 (0 self)
Storage accounts for a significant amount of a data center’s ever-increasing power budget. As a consequence, energy consumption has joined performance and reliability as a dominant metric in storage system design. In this paper, we show that the structure of an erasure code, which is generally used to provide data reliability, can be exploited to save power in a storage system. We define a novel technique in power-aware systems called power-aware coding and present generic techniques for reading, writing and activating devices in a power-aware, erasure-coded storage system. While our techniques have an effect on energy consumption, fault tolerance and performance, we focus on a few examples that illustrate the tradeoff between power efficiency and fault tolerance. Finally, we discuss open problems in the space of power-aware coding.
Tiered Fault Tolerance for Long-Term Integrity
Cited by 4 (2 self)
Fault-tolerant services typically make assumptions about the type and maximum number of faults that they can tolerate while providing their correctness guarantees; when such a fault threshold is violated, correctness is lost. We revisit the notion of fault thresholds in the context of long-term archival storage. We observe that fault thresholds are inevitably violated in long-term services, making traditional fault tolerance inapplicable to the long-term. In this work, we undertake a “reallocation of the fault-tolerance budget” of a long-term service. We split the service into service pieces, each of which can tolerate a different number of faults without failing (and without causing the whole service to fail): each piece can be either in a critical trusted fault tier, which must never fail, or an untrusted fault tier, which can fail massively and often, or other fault tiers in between. By carefully engineering the split of a long-term service into pieces that must obey distinct fault thresholds, we can prolong its inevitable demise. We demonstrate this approach with Bonafide, a long-term key-value store that, unlike all similar systems proposed in the literature, maintains integrity in the face of Byzantine faults without requiring self-certified data. We describe the notion of tiered fault tolerance, the design, implementation, and experimental evaluation of Bonafide, and argue that our approach is a practical yet significant improvement over the state of the art for long-term services.
Semantic data placement for power management in archival storage
In PDSW 2010, 2010
Cited by 4 (3 self)
Power is the greatest lifetime cost in an archival system, and, as decreasing costs make disks more attractive than tapes, spinning disks account for the majority of power drawn. To reduce this cost, we propose reducing the number of times disks have to spin up by grouping together files such that a typical spin-up handles several file accesses. For a typical system, we show that if only 30% of total accesses occur while disks are still spinning, we can conserve 12% of the power cost. We classify files according to directory structure and see access hit rates of up to 66% for a power savings of up to 52% of the power cost of spinning up for every read in easily-separable workloads.
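The "access hit rate" in the abstract above, the fraction of accesses that arrive while the target disk is still spinning, can be illustrated with a small trace replay. This is a sketch under stated assumptions, not the paper's method: files are grouped by their top-level directory, each group is assumed to live on one disk, and a disk is assumed to spin down after a fixed idle timeout. The function name `spin_hit_rate` and its parameters are hypothetical.

```python
def spin_hit_rate(trace, idle_timeout):
    """Fraction of accesses served by a disk that is already spinning.

    trace: list of (timestamp_seconds, file_path) tuples sorted by time.
    idle_timeout: seconds of inactivity after which a disk spins down (assumed).
    """
    last_access = {}  # group name -> time of the most recent access
    hits = 0
    for timestamp, path in trace:
        # Assumption: the top-level directory decides which disk holds the file.
        group = path.strip("/").split("/")[0]
        previous = last_access.get(group)
        if previous is not None and timestamp - previous <= idle_timeout:
            hits += 1  # the disk has not yet spun down, so no spin-up is needed
        last_access[group] = timestamp
    return hits / len(trace) if trace else 0.0
```

Replaying the same trace with different grouping rules gives a rough way to compare placements: a grouping that clusters related accesses on the same disk raises the hit rate and lowers the number of spin-ups.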

