Ext3cow: A time-shifting file system for regulatory compliance
ACM Transactions on Storage, 2005
Cited by 43 (2 self)
The ext3cow file system, built on the popular ext3 file system, provides an open-source file versioning and snapshot platform for compliance with the versioning and auditability requirements of recent electronic record retention legislation. Ext3cow provides a time-shifting interface that permits a real-time and continuous view of data in the past. Time-shifting neither pollutes the file system namespace nor requires snapshots to be mounted as a separate file system. Further, ext3cow is implemented entirely in the file system space and, therefore, does not modify kernel interfaces or change the operation of other file systems. Ext3cow takes advantage of the fine-grained control of on-disk and in-memory data available only to a file system, resulting in minimal degradation of performance and functionality. Experimental results confirm this hypothesis; ext3cow performs comparably to ext3 on many benchmarks and on trace-driven experiments.
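To make the idea of a "view of data in the past" concrete, here is a minimal in-memory sketch of time-indexed reads. It is not ext3cow's on-disk design or its actual interface; the class `TimeShiftedFile`, its methods, and the integer timestamps are all hypothetical names chosen for illustration.

```python
import bisect


class TimeShiftedFile:
    """Toy model (hypothetical): each write keeps older contents reachable by time."""

    def __init__(self):
        self._times = []     # write timestamps, ascending
        self._versions = []  # contents written at the matching timestamp

    def write(self, timestamp, data):
        # Assumption: writes arrive in non-decreasing timestamp order.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("writes must be time-ordered")
        self._times.append(timestamp)
        self._versions.append(data)

    def read(self, as_of=None):
        """Return the contents visible at time `as_of` (the latest if None)."""
        if not self._times:
            raise FileNotFoundError("no versions yet")
        if as_of is None:
            return self._versions[-1]
        # Index of the first write strictly after `as_of`.
        i = bisect.bisect_right(self._times, as_of)
        if i == 0:
            raise FileNotFoundError("file did not exist at that time")
        return self._versions[i - 1]


f = TimeShiftedFile()
f.write(100, b"draft")
f.write(200, b"final")
current = f.read()           # newest contents
past = f.read(as_of=150)     # contents as they were at time 150
```

In this sketch the past view needs no separate name or mount: the same object answers both present and past reads, which is the property the abstract emphasizes.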
Increasing Application Performance in Virtual Environments through Run-time Inference and Adaptation
In Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing (HPDC), 2005
Cited by 33 (13 self)
Virtual machine distributed computing greatly simplifies the use of widespread computing resources by lowering the level of abstraction, benefiting both resource providers and users. Toward that end, our Virtuoso middleware closely emulates the existing process of buying, configuring, and using physical machines. Virtuoso's VNET component is a simple and efficient layer two virtual network tool that makes these virtual machines (VMs) appear to be physically connected to the home network of the user while simultaneously supporting arbitrary topologies and routing among them. Virtuoso's VTTIF component continually infers the communication behavior of the application running in a collection of VMs. The combination of overlays like VNET and inference frameworks like VTTIF has great potential to increase the performance of existing, unmodified applications, with no user or developer involvement, by adapting their virtual environments to the underlying computing infrastructure to best suit the applications. We show here how to use the continually inferred application topology and traffic to dynamically control three mechanisms of adaptation (VM migration, overlay topology, and forwarding) to significantly increase the performance of two classes of applications: bulk synchronous parallel applications and transactional web e-commerce applications.
Virtuoso: A System for Virtual Machine Marketplaces
2004
Cited by 23 (12 self)
This report describes the interface and implementation of the Virtuoso system. It is also a user manual for those who wish to try Virtuoso.
Flight Data Recorder: Monitoring persistent-state interactions to improve systems management
In 7th USENIX OSDI, 2006
Cited by 21 (2 self)
Mismanagement of the persistent state of a system (all the executable files, configuration settings, and other data that govern how a system functions) causes reliability problems and security vulnerabilities and drives up operating costs. Recent research traces persistent state interactions (how state is read, modified, etc.) to help troubleshooting, change management, and malware mitigation, but has been limited by the difficulty of collecting, storing, and analyzing the 10s to 100s of millions of daily events that occur on a single machine, much less the 1000s or more machines in many computing environments. We present the Flight Data Recorder (FDR), which enables always-on tracing, storage, and analysis of persistent state interactions. FDR uses a domain-specific log format, tailored to observed file system workloads and common systems management queries. Our lossless log format compresses logs to only 0.5-0.9 bytes per interaction. In this log format, 1000 machine-days of logs (over 25 billion events) can be analyzed in less than 30 minutes. We report on our deployment of FDR to 207 production machines at MSN, and show that a single centralized collection machine can potentially scale to collecting and analyzing the complete records of persistent state interactions from 4000+ machines. Furthermore, our tracing technology is shipping as part of the Windows Vista OS.
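The figures quoted in this abstract can be combined into a quick back-of-the-envelope check. The sketch below uses only numbers stated above; treating "over 25 billion" as exactly 25 billion and "less than 30 minutes" as exactly 30 minutes is an assumption, so the results are rough bounds rather than measurements.

```python
# Figures quoted in the abstract (bounds treated as exact values: an assumption).
events = 25_000_000_000        # "over 25 billion events"
machine_days = 1000            # "1000 machine-days of logs"
bytes_per_event = (0.5, 0.9)   # compressed log size per interaction
analysis_seconds = 30 * 60     # "less than 30 minutes"

# Average number of events one machine generates per day.
events_per_machine_day = events // machine_days

# Total compressed log size in gigabytes (1 GB = 1e9 bytes), low and high.
log_gb = tuple(events * b / 1e9 for b in bytes_per_event)

# Compressed log size per machine per day in megabytes (1 MB = 1e6 bytes).
mb_per_machine_day = tuple(events_per_machine_day * b / 1e6 for b in bytes_per_event)

# Lower bound on analysis throughput implied by the 30-minute figure.
events_per_second = events / analysis_seconds

print(f"{events_per_machine_day:,} events per machine-day")
print(f"{log_gb[0]:.1f} to {log_gb[1]:.1f} GB of compressed logs in total")
print(f"{mb_per_machine_day[0]:.1f} to {mb_per_machine_day[1]:.1f} MB per machine-day")
print(f"at least {events_per_second / 1e6:.1f} million events analyzed per second")
```

Under these assumptions each machine averages about 25 million events per day, which falls inside the "10s to 100s of millions" range the abstract cites, and the whole 1000 machine-day log occupies roughly 12.5 to 22.5 GB.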
Issues in automatic provenance collection
In Proc. IPAW’06, volume 4145 of LNCS, 2006
Cited by 14 (2 self)
Automatic provenance collection describes systems that observe processes and data transformations, inferring, collecting, and maintaining provenance about them. Automatic collection is a powerful tool for analysis of objects and processes, providing a level of transparency and pervasiveness not found in more conventional provenance systems. Unfortunately, automatic collection is also difficult. We discuss the challenges we encountered and the issues we exposed as we developed an automatic provenance collector that runs at the operating system level.
Causality-Based Versioning
In Proceedings of the 7th USENIX Conference on File and Storage Technologies, 2009
Cited by 10 (2 self)
Versioning file systems provide the ability to recover from a variety of failures, including file corruption, virus and worm infestations, and user mistakes. However, using versions to recover from data-corrupting events requires a human to determine precisely which files and versions to restore. We can create more meaningful versions and enhance the value of those versions by capturing the causal connections among files, facilitating selection and recovery of precisely the right versions after data-corrupting events. We determine when to create new versions of files automatically using the causal relationships among files. The literature on versioning file systems usually examines two extremes of possible version-creation algorithms: open-to-close versioning and versioning on every write. We evaluate causal versions of these two algorithms and introduce two additional causality-based algorithms: Cycle-Avoidance and Graph-Finesse. We show that capturing and maintaining causal relationships imposes less than 7% overhead on a versioning system, providing benefit at low cost. We then show that Cycle-Avoidance provides more meaningful versions of files created during concurrent program execution, with overhead comparable to open-to-close versioning. Graph-Finesse provides even greater control, frequently at comparable overhead, but sometimes at unacceptable overhead. Versioning on every write is an interesting extreme case, but is far too costly to be useful in practice.
Generalized File System Dependencies
Proc. of ACM Symposium on Operating Systems Principles, 2007
Cited by 9 (3 self)
Reliable storage systems depend in part on “write-before” relationships where some changes to stable storage are delayed until other changes commit. A journaled file system, for example, must commit a journal transaction before applying that transaction’s changes, and soft updates [9] and other consistency enforcement mechanisms have similar constraints, implemented in each case in system-dependent ways. We present a general abstraction, the patch, that makes write-before relationships explicit and file system agnostic. A patch-based file system implementation expresses dependencies among writes, leaving lower system layers to determine write orders that satisfy those dependencies. Storage system modules can examine and modify the dependency structure, and generalized file system dependencies are naturally exportable to user level. Our patch-based storage system, Featherstitch, includes several important optimizations that reduce patch overheads by orders of magnitude. Our ext2 prototype runs in the Linux kernel and supports asynchronous writes, soft updates-like dependencies, and journaling. It outperforms similarly reliable ext2 and ext3 configurations on some, but not all, benchmarks. It also supports unusual configurations, such as correct dependency enforcement within a loopback file system, and lets applications define consistency requirements without micromanaging how those requirements are satisfied.
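The split described here, where the file system states dependencies and a lower layer picks the order, can be sketched as a topological sort over a dependency graph. This is only an illustration of the general idea using Python's standard `graphlib` module (Python 3.9+), not Featherstitch's patch API; the write names and the helpers `write_order` and `satisfies` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pending writes. Each key maps to the set of writes that must
# reach stable storage before it (its "write-before" predecessors).
write_before = {
    "journal_commit": {"journal_data"},
    "inode_update": {"journal_commit"},
    "bitmap_update": {"journal_commit"},
}


def write_order(deps):
    """Return one order in which every write follows all of its predecessors."""
    # static_order() yields predecessors first and raises graphlib.CycleError
    # if the dependencies are circular.
    return list(TopologicalSorter(deps).static_order())


def satisfies(order, deps):
    """Check that `order` respects every write-before relationship in `deps`."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[pre] < position[node]
        for node, pres in deps.items()
        for pre in pres
    )


order = write_order(write_before)
print(order)
print(satisfies(order, write_before))
```

The relative order of `inode_update` and `bitmap_update` is left unconstrained here, which mirrors the point that the lower layer is free to choose any order that satisfies the stated dependencies.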
Enabling transactional file access via lightweight kernel extensions
In Proc. 7th USENIX Conference on File and Storage Technologies (FAST ’09), 2009
Cited by 9 (1 self)
Transactions offer a powerful data-access method used in many databases today through a specialized query API. User applications, however, use a different file-access API (POSIX), which does not offer transactional guarantees. Applications using transactions can become simpler, smaller, easier to develop and maintain, more reliable, and more secure. We explored several techniques for providing transactional file access with minimal impact on existing programs. Our first prototype was a standalone kernel component within the Linux kernel, but it complicated the kernel considerably and duplicated some of Linux’s existing facilities. Our second prototype was implemented entirely at user level, and while it was easier to develop, it suffered from high overheads. In this paper we describe our latest prototype and the evolution that led to it. We implemented a transactional file API inside the Linux kernel which integrates easily and seamlessly with existing kernel facilities. This design is easier to maintain, simpler to integrate into existing OSs, and efficient. We evaluated our prototype and other systems under a variety of workloads. We demonstrate that our prototype’s performance is better than comparable systems and comes close to the theoretical lower bound for a log-based transaction manager.
Application-Level Isolation and Recovery with Solitude
Cited by 5 (0 self)
When computer systems are compromised by an attack, it is difficult to determine the precise extent of the damage caused by the attack because the state changes made by an attacker and those made by regular users can be closely intertwined. This problem occurs due to implicit sharing in operating systems, and it can be especially severe for persistent state. In particular, the file system provides a single namespace that, when compromised, can have cascading effects on the entire system, making intrusion analysis and recovery a time-consuming and error-prone process. In this paper, we present Solitude, an application-level isolation and recovery system that is designed to both limit the effects of attacks and simplify the post-intrusion recovery process. Solitude uses a copy-on-write filesystem to provide a transparent, restricted privilege isolation environment for running untrusted applications, and it uses an explicit file sharing mechanism across the isolation environments that limits attack propagation without compromising functionality. Solitude provides two modes of recovery. If a sandboxed application proves to be untrustworthy, a coarse-grained recovery method allows easily removing the footprint of the software. However, if a user mistakenly moves malicious files to the trusted environment via explicit file sharing, then Solitude uses data dependency tracking to allow fine-grained recovery.
Improving recoverability in multi-tier storage systems
Cited by 4 (0 self)
Enterprise storage systems typically contain multiple storage tiers, each having its own performance, reliability, and recoverability. The primary motivation for this multi-tier organization is cost, as storage tier costs vary considerably. In this paper, we describe a file system called TierFS that stores files at multiple storage tiers while providing high recoverability at all tiers. To achieve this goal, TierFS uses several novel techniques that leverage coupling between multiple tiers to reduce data loss, take consistent snapshots across tiers, provide continuous data protection, and improve recovery time. We evaluate TierFS with analytical models, showing that TierFS can provide better recoverability than a conventional design of similar cost.

