Versatility and unix semantics in namespace unification
- ACM Transactions on Storage (TOS), 2006
Cited by 22 (4 self)
Administrators often prefer to keep related sets of files in different locations or media, as it is easier to maintain them separately. Users, however, prefer to see all files in one location for convenience. One solution that accommodates both needs is virtual namespace unification: providing a merged view of several directories without physically merging them. For example, namespace unification can merge the contents of several CD-ROM images without unpacking them, merge binary directories from different packages, merge views from several file servers, and more. Namespace unification can also enable snapshotting, by marking some data sources read-only and then utilizing copy-on-write for the read-only sources. For example, an OS image may be contained on a read-only CD-ROM image, while the user’s configuration, data, and programs are stored in a separate read-write directory. With copy-on-write unification, the user need not be concerned about the two disparate file systems. It is difficult to maintain Unix semantics while offering a versatile namespace unification system. Past efforts to provide such unification often compromised on the set of features provided or on Unix compatibility, resulting in an incomplete solution that users could not use. We designed and implemented a versatile namespace-unification system called Unionfs. Unionfs ...
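The copy-on-write unification described in this abstract can be illustrated with a minimal sketch. It is not Unionfs code: the class `UnionView`, its methods, and the use of plain dictionaries as stand-ins for directories are all assumptions made for illustration. Reads search the writable branch first and then the read-only branches; any modification is written only to the writable branch, so the read-only sources stay untouched.

```python
class UnionView:
    """Hypothetical merged view over one writable and several read-only branches."""

    def __init__(self, writable, read_only_branches):
        self.writable = writable             # dict: path -> contents
        self.read_only = read_only_branches  # list of dicts, never modified

    def read(self, path):
        # The first branch that contains the path wins (writable has priority).
        for branch in [self.writable, *self.read_only]:
            if path in branch:
                return branch[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # All writes land in the writable branch.
        self.writable[path] = data

    def append(self, path, data):
        # Copy-on-write: take the current contents (possibly from a read-only
        # branch), then store the modified copy in the writable branch.
        self.write(path, self.read(path) + data)

    def listdir(self):
        # The merged namespace is the union of all branch namespaces.
        names = set(self.writable)
        for branch in self.read_only:
            names |= set(branch)
        return sorted(names)


# Illustrative data: a read-only "CD-ROM" image plus an empty user directory.
cdrom = {"/etc/motd": "welcome\n", "/bin/sh": "<binary>"}
home = {}
view = UnionView(home, [cdrom])
view.append("/etc/motd", "edited\n")
```

After the append, the merged view returns the edited file while the `cdrom` dictionary still holds the original contents, which is the property that makes the read-only source usable as a snapshot.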
Safestore: A durable and practical storage system
- In USENIX Annual Technical Conference, 2007
Cited by 21 (4 self)
This paper presents SafeStore, a distributed storage system designed to maintain long-term data durability despite conventional hardware and software faults, environmental disruptions, and administrative failures caused by human error or malice. The architecture of SafeStore is based on fault isolation, which SafeStore applies aggressively along administrative, physical, and temporal dimensions by spreading data across autonomous storage service providers (SSPs). However, current storage interfaces provided by SSPs are not designed for high end-to-end durability. In this paper, we propose a new storage system architecture that (1) spreads data efficiently across autonomous SSPs using informed hierarchical erasure coding that, for a given replication cost, provides several additional 9’s of durability over what can be achieved with existing black-box SSP interfaces, (2) performs an efficient end-to-end audit of SSPs to detect data loss that, for a 20% cost increase, improves data durability by two 9’s by reducing MTTR, and (3) offers durable storage with cost, performance, and availability competitive with traditional storage systems. We instantiate and evaluate these ideas by building a SafeStore-based file system with an NFS-like interface.
TRAP-Array: A Disk Array Architecture Providing Timely Recovery to Any Point-in-time
- In Proc. of the 33rd Int’l Symposium on Computer Architecture (ISCA ’06), 2006
Cited by 12 (3 self)
RAID architectures have been used for more than two decades to recover data upon disk failures. Disk failure is just one of the many causes of damaged data. Data can be damaged by virus attacks, user errors, defective software/firmware, hardware faults, and site failures. With today’s mature disk technology and networked information services, the risk of these types of data damage is far greater than that of disk failure. It has therefore become increasingly important for today’s disk arrays to be able to recover data to any point in time when such a failure occurs. This paper presents a new disk array architecture that provides Timely Recovery to Any Point-in-time, referred to as TRAP-Array. TRAP-Array stores not only the data stripe upon a write to the array, but also the time-stamped Exclusive-ORs of successive
Accurate and efficient replaying of file system traces
- In Proc. USENIX Conference on File and Storage Technologies (FAST ’05), 2005
Cited by 12 (5 self)
Replaying traces is a time-honored method for benchmarking, stress-testing, and debugging systems, and more recently, for forensic analysis. One benefit of replaying traces is the reproducibility of the exact set of operations that were captured during a specific workload. Existing trace capture and replay systems operate at different levels: network packets, disk device drivers, network file systems, or system calls. System call replayers miss memory-mapped operations and cannot replay I/O-intensive workloads at original speeds. Traces captured at other levels miss vital information that is available only at the file system level. We designed and implemented Replayfs, the first system for replaying file system traces at the VFS level. The VFS is the most appropriate level for replaying file system traces because all operations are reproduced in a manner that is most relevant to file-system developers. Thanks to the uniform VFS API, traces can be replayed transparently onto any existing file system, even one different from the one originally traced, without modifying existing file systems. Replayfs’s user-level compiler prepares a trace to be replayed efficiently in the kernel, where multiple kernel threads prefetch and schedule the replay of file system operations precisely and efficiently. These techniques allow us to replay I/O-intensive traces at different speeds, and even accelerate them on the same hardware on which the trace was originally captured.
Design and Implementation of Verifiable Audit Trails for a Versioning File System
- 2007
Cited by 11 (0 self)
We present constructs that create, manage, and verify digital audit trails for versioning file systems. Based upon a small amount of data published to a third party, a file system commits to a version history. At a later date, an auditor uses the published data to verify the contents of the file system at any point in time. Digital audit trails create an analog of the paper audit process for file data, helping to meet the requirements of electronic records legislation. Our techniques address the I/O and computational efficiency of generating and verifying audit trails, the aggregation of audit information in directory hierarchies, and independence from file system architectures.
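The idea of committing to a version history by publishing a small amount of data can be illustrated with a generic hash chain. The abstract does not specify the paper's actual constructs, so this sketch substitutes a plain SHA-256 chain; the function names `version_digest`, `commit_history`, and `audit` are hypothetical. Each digest covers the previous digest and the new version's contents, so the final digest, once published, fixes the entire history.

```python
import hashlib


def version_digest(prev_digest: bytes, contents: bytes) -> bytes:
    # Chain step: bind this version's contents to everything before it.
    content_hash = hashlib.sha256(contents).digest()
    return hashlib.sha256(prev_digest + content_hash).digest()


def commit_history(versions) -> bytes:
    # Fold the chain over all versions, oldest first, from a fixed start value.
    digest = b"\x00" * 32
    for contents in versions:
        digest = version_digest(digest, contents)
    return digest


def audit(versions, published: bytes) -> bool:
    # The auditor recomputes the chain and compares it with the published value.
    return commit_history(versions) == published


# Illustrative history of one file; only `published` goes to the third party.
history = [b"v1", b"v2", b"v3"]
published = commit_history(history)
tampered = [b"v1", b"tampered", b"v3"]
```

With these values, `audit(history, published)` succeeds while `audit(tampered, published)` fails, because changing any past version changes every later digest in the chain.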
DoublePlay: Parallelizing Sequential Logging and Replay
Cited by 11 (2 self)
Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently because of the need to reproduce the order or values read by shared-memory operations performed by multiple threads. In this paper, we present DoublePlay, a new way to efficiently guarantee replay on commodity multiprocessors. Our key insight is that one can use the simpler and faster mechanisms of single-processor record and replay, yet still achieve the scalability offered by multiple cores, by using an additional execution to parallelize the record and replay of an application. DoublePlay timeslices multiple threads on a single processor, then runs multiple time intervals (epochs) of the program concurrently on separate processors. This strategy, which we call uniparallelism, makes logging much easier because each epoch runs on a single processor (so threads in an epoch never simultaneously access the same memory) and different epochs operate on different copies of the memory. Thus, rather than logging the order of shared-memory accesses, we need only log the order in which threads in an epoch are timesliced on the processor. DoublePlay runs an additional execution of the program on multiple processors to generate checkpoints so that epochs run in parallel. We evaluate DoublePlay on a variety of client, server, and scientific parallel benchmarks; with spare cores, DoublePlay reduces logging overhead to an average of 15% with two worker threads and 28% with four threads.
Causality-Based Versioning
- In Proceedings of the 7th USENIX Conference on File and Storage Technologies, 2009
Cited by 10 (2 self)
Versioning file systems provide the ability to recover from a variety of failures, including file corruption, virus and worm infestations, and user mistakes. However, using versions to recover from data-corrupting events requires a human to determine precisely which files and versions to restore. We can create more meaningful versions and enhance the value of those versions by capturing the causal connections among files, facilitating selection and recovery of precisely the right versions after data-corrupting events. We determine when to create new versions of files automatically using the causal relationships among files. The literature on versioning file systems usually examines two extremes of possible version-creation algorithms: open-to-close versioning and versioning on every write. We evaluate causal versions of these two algorithms and introduce two additional causality-based algorithms: Cycle-Avoidance and Graph-Finesse. We show that capturing and maintaining causal relationships imposes less than 7% overhead on a versioning system, providing benefit at low cost. We then show that Cycle-Avoidance provides more meaningful versions of files created during concurrent program execution, with overhead comparable to open-to-close versioning. Graph-Finesse provides even greater control, frequently at comparable overhead, but sometimes at unacceptable overhead. Versioning on every write is an interesting extreme case, but is far too costly to be useful in practice.
TFS: A Transparent File System for Contributory Storage
- FAST '07, 2007
Cited by 9 (1 self)
Contributory applications allow users to donate unused resources on their personal computers to a shared pool. Applications such as
Application-Level Isolation and Recovery with Solitude
Cited by 5 (0 self)
When computer systems are compromised by an attack, it is difficult to determine the precise extent of the damage because the state changes made by an attacker and those made by regular users can be closely intertwined. This problem occurs due to implicit sharing in operating systems, and it can be especially severe for persistent state. In particular, the file system provides a single namespace that, when compromised, can have cascading effects on the entire system, making intrusion analysis and recovery a time-consuming and error-prone process. In this paper, we present Solitude, an application-level isolation and recovery system that is designed to both limit the effects of attacks and simplify the post-intrusion recovery process. Solitude uses a copy-on-write file system to provide a transparent, restricted-privilege isolation environment for running untrusted applications, and it uses an explicit file sharing mechanism across the isolation environments that limits attack propagation without compromising functionality. Solitude provides two modes of recovery. If a sandboxed application proves to be untrustworthy, a coarse-grained recovery method makes it easy to remove the footprint of the software. However, if a user mistakenly moves malicious files to the trusted environment via explicit file sharing, then Solitude uses data dependency tracking to allow fine-grained recovery.
Implementation and Performance Evaluation of Two Snapshot Methods on iSCSI Target Storages
- Proc. 14th NASA Goddard/23rd IEEE Conf. Mass Storage Systems and Technologies (MSST ’06), 2006
Cited by 5 (4 self)
While snapshots have been commonly used in data storage for backup and data protection, little is known in the open literature about how such snapshots affect application performance. This paper presents an implementation and performance evaluation of two snapshot techniques: copy-on-write snapshot and redirect-on-write snapshot. Our implementation is carried out at block level on a standard iSCSI target. We carry out quantitative performance evaluations and comparisons of the two snapshot implementations using TPC-C, TPC-W, IoMeter, and PostMark benchmarks. Our measurements reveal many interesting observations regarding the performance characteristics of the two snapshot techniques. Depending on the applications and different I/O workloads, the two snapshot techniques perform quite differently. In general, copy-on-write performs well on read-intensive applications while redirect-on-write performs well on write-intensive applications.
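The write-cost difference between the two snapshot techniques can be illustrated with a toy block-level model. This is a sketch under simplifying assumptions, not the paper's iSCSI implementation: the classes `CowVolume` and `RowVolume` and the one-unit-per-block I/O counter are hypothetical. In this model, the first copy-on-write update of a block costs three I/Os (read the old block, save it to the snapshot area, write the new data), while redirect-on-write costs one.

```python
class CowVolume:
    """Copy-on-write: preserve the old block before overwriting it in place."""

    def __init__(self, blocks):
        self.live = list(blocks)  # current data, kept in its original layout
        self.saved = {}           # snapshot area: block number -> original data
        self.io = 0               # simplified I/O counter (assumed unit cost)

    def write(self, n, data):
        if n not in self.saved:
            self.saved[n] = self.live[n]  # read old block + write it aside
            self.io += 2
        self.live[n] = data               # overwrite in place
        self.io += 1

    def read(self, n):
        self.io += 1
        return self.live[n]

    def read_snapshot(self, n):
        self.io += 1
        return self.saved.get(n, self.live[n])


class RowVolume:
    """Redirect-on-write: leave the old block alone, send new data elsewhere."""

    def __init__(self, blocks):
        self.base = list(blocks)  # frozen at snapshot time
        self.redirected = {}      # block number -> newest data
        self.io = 0

    def write(self, n, data):
        self.redirected[n] = data  # a single write to a new location
        self.io += 1

    def read(self, n):
        self.io += 1               # needs a map lookup to find the newest copy
        return self.redirected.get(n, self.base[n])

    def read_snapshot(self, n):
        self.io += 1
        return self.base[n]


# Same workload on both volumes: two updates to block 1 after a snapshot.
cow = CowVolume(["a", "b", "c"])
row = RowVolume(["a", "b", "c"])
for vol in (cow, row):
    vol.write(1, "B")
    vol.write(1, "B2")
```

The counter captures only the extra copy work on writes. It does not model the read-side effects of redirect-on-write, such as the indirection lookup or loss of sequential layout, which are plausible reasons for the read-intensive results the abstract reports.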

