Results 1 - 10
of
46
Semantically-Smart Disk Systems
, 2003
"... We propose and evaluate the concept of a semantically-smart disk system (SDS). As opposed to a traditional "smart" disk, an SDS has detailed knowledge of how the file system above is using the disk system, including information about the on-disk data structures of the file system. An SDS exploits th ..."
Abstract
-
Cited by 64 (14 self)
- Add to MetaCart
We propose and evaluate the concept of a semantically-smart disk system (SDS). As opposed to a traditional "smart" disk, an SDS has detailed knowledge of how the file system above is using the disk system, including information about the on-disk data structures of the file system. An SDS exploits this knowledge to transparently improve performance or enhance functionality beneath a standard block read/write interface. To automatically acquire this knowledge, we introduce a tool (EOF) that can discover file-system structure for certain types of file systems, and then show how an SDS can exploit this knowledge on-line to understand file-system behavior. We quantify the space and time overheads that are common in an SDS, showing that they are not excessive. We then study the issues surrounding SDS construction by designing and implementing a number of prototypes as case studies; each case study exploits knowledge of some aspect of the file system to implement powerful functionality beneath the standard SCSI interface. Overall, we find that a surprising amount of functionality can be embedded within an SDS, hinting at a future where disk manufacturers can compete on enhanced functionality and not simply cost-per-byte and performance.
Bridging the Information Gap in Storage Protocol Stacks
- In Proceedings of the USENIX Annual Technical Conference (USENIX ’02
, 2002
"... The functionality and performance innovations in file systems and storage systems have proceeded largely independently from each other over the past years. The result is an information gap: neither has information about how the other is designed or implemented, which can result in a high cost of mai ..."
Abstract
-
Cited by 34 (6 self)
- Add to MetaCart
The functionality and performance innovations in file systems and storage systems have proceeded largely independently from each other over the past years. The result is an information gap: neither has information about how the other is designed or implemented, which can result in a high cost of maintenance, poor performance, duplication of features, and limitations on functionality. To bridge this gap, we introduce and evaluate a new division of labor between the storage system and the file system. We develop an enhanced storage layer known as Exposed RAID (ERAID), which reveals information to file systems built above; specifically, ERAID exports the parallelism and failure-isolation boundaries of the storage layer, and tracks performance and failure characteristics on a fine-grained basis. To take advantage of the information made available by ERAID, we develop an Informed Log-Structured File System (ILFS). ILFS is an extension of the standard logstructured file system (LFS) that has been altered to take advantage of the performance and failure information exposed by ERAID. Experiments reveal that our prototype implementation yields benefits in the management, flexibility, reliability, and performance of the storage system, with only a small increase in file system complexity. For example, ILFS/ERAID can incorporate new disks into the system on-the-fly, dynamically balance workloads across the disks of the system, allow for user control of file replication, and delay replication of files for increased performance. Much of this functionality would be difficult or impossible to implement with the traditional division of labor between file systems and storage.
Rethink the sync
- In Proc. OSDI
, 2006
"... We introduce external synchrony, a new model for local file I/O that provides the reliability and simplicity of synchronous I/O, yet also closely approximates the performance of asynchronous I/O. An external observer cannot distinguish the output of a computer with an externally synchronous file sys ..."
Abstract
-
Cited by 32 (6 self)
- Add to MetaCart
We introduce external synchrony, a new model for local file I/O that provides the reliability and simplicity of synchronous I/O, yet also closely approximates the performance of asynchronous I/O. An external observer cannot distinguish the output of a computer with an externally synchronous file system from the output of a computer with a synchronous file system. No application modification is required to use an externally synchronous file system: in fact, application developers can program to the simpler synchronous I/O abstraction and still receive excellent performance. We have implemented an externally synchronous file system for Linux, called xsyncfs. Xsyncfs provides the same durability and ordering guarantees as those provided by a synchronously mounted ext3 file system. Yet, even for I/O-intensive benchmarks, xsyncfs performance is within 7 % of ext3 mounted asynchronously. Compared to ext3 mounted synchronously, xsyncfs is up to two orders of magnitude faster. 1
NFS Tricks and Benchmarking Traps
, 2003
"... We describe two modifications to the FreeBSD 4.6 NFS server to increase read throughput by improving the read-ahead heuristic to deal with reordered requests and stride access patterns. We show that for some stride access patterns, our new heuristics improve end-to-end NFS throughput by nearly a fac ..."
Abstract
-
Cited by 23 (2 self)
- Add to MetaCart
We describe two modifications to the FreeBSD 4.6 NFS server to increase read throughput by improving the read-ahead heuristic to deal with reordered requests and stride access patterns. We show that for some stride access patterns, our new heuristics improve end-to-end NFS throughput by nearly a factor of two. We also show that benchmarking and experimenting with changes to an NFS server can be a subtle and challenging task, and that it is often difficult to distinguish the impact of a new algorithm or heuristic from the quirks of the underlying software and hardware with which they interact. We discuss these quirks and their potential effects.
TBBT: Scalable and Accurate Trace Replay for File Server Evaluation
- FAST '05
, 2005
"... This paper describes the design, implementation, and evaluation of TBBT, the first comprehensive NFS trace replay tool. Given an NFS trace, TBBT automatically detects and repairs missing operations in the trace, derives a file system image required to successfully replay the trace, ages the file sys ..."
Abstract
-
Cited by 23 (5 self)
- Add to MetaCart
This paper describes the design, implementation, and evaluation of TBBT, the first comprehensive NFS trace replay tool. Given an NFS trace, TBBT automatically detects and repairs missing operations in the trace, derives a file system image required to successfully replay the trace, ages the file system image appropriately, initializes the file server under test with that image, and finally drives the file server with a workload that is derived from replaying the trace according to user-specified parameters. TBBT can scale a trace temporally or spatially to meet the need of a simulation run without violating dependencies among file system operations in the trace.
A nine year study of file system and storage benchmarking
- ACM Transactions on Storage
, 2008
"... Benchmarking is critical when evaluating performance, but is especially difficult for file and storage systems. Complex interactions between I/O devices, caches, kernel daemons, and other OS components result in behavior that is rather difficult to analyze. Moreover, systems have different features ..."
Abstract
-
Cited by 20 (4 self)
- Add to MetaCart
Benchmarking is critical when evaluating performance, but is especially difficult for file and storage systems. Complex interactions between I/O devices, caches, kernel daemons, and other OS components result in behavior that is rather difficult to analyze. Moreover, systems have different features and optimizations, so no single benchmark is always suitable. The large variety of workloads that these systems experience in the real world also adds to this difficulty. In this article we survey 415 file system and storage benchmarks from 106 recent papers. We found that most popular benchmarks are flawed and many research papers do not provide a clear indication of true performance. We provide guidelines that we hope will improve future performance evaluations. To show how some widely used benchmarks can conceal or overemphasize overheads, we conducted a set of experiments. As a specific example, slowing down read operations on ext2 by a factor of 32 resulted in only a 2–5 % wall-clock slowdown in a popular compile benchmark. Finally, we discuss future work to improve file system and storage benchmarking.
Recent Filesystem Optimisations in FreeBSD
- In Proceedings of the USENIX Annual Technical Conference (FREENIX Track
, 2002
"... In this paper we summarise four recent optimisations to the FFS implementation in FreeBSD: soft updates, dirpref, vmiodir and dirhash. We then give a detailed exposition of dirhash's implementation. Finally we study these optimisations under a variety of benchmarks and look at their interactions. Un ..."
Abstract
-
Cited by 14 (0 self)
- Add to MetaCart
In this paper we summarise four recent optimisations to the FFS implementation in FreeBSD: soft updates, dirpref, vmiodir and dirhash. We then give a detailed exposition of dirhash's implementation. Finally we study these optimisations under a variety of benchmarks and look at their interactions. Under micro-benchmarks, combinations of these optimisations can offer improvements of over two orders of magnitude. Even real-world workloads see improvements by a factor of 2--10.
Extending ACID Semantics to the File System
- Trans. Storage
, 2006
"... An organization’s data is often its most valuable asset, but today’s file systems provide few facilities to ensure its safety. Databases, on the other hand, have long provided transactions. Transactions are useful because they provide atomicity, consistency, isolation, and durability (ACID). Many ap ..."
Abstract
-
Cited by 12 (3 self)
- Add to MetaCart
An organization’s data is often its most valuable asset, but today’s file systems provide few facilities to ensure its safety. Databases, on the other hand, have long provided transactions. Transactions are useful because they provide atomicity, consistency, isolation, and durability (ACID). Many applications could make use of these semantics, but databases have a wide variety of non-standard interfaces. For example, applications like mail servers currently perform elaborate error handling to ensure atomicity and consistency, because it is easier than using a DBMS. A transaction-oriented programming model eliminates complex error-handling code, because failed operations can simply be aborted without side effects. We have designed a file system that exports ACID transactions to user-level applications, while preserving the ubiquitous and convenient POSIX interface. In our prototype ACID file system, called Amino, updated applications can protect arbitrary sequences of system calls within a transaction. Unmodified applications operate without any changes, but each system call is transaction protected. We also built a recoverable memory library with support for nested transactions to allow applications to keep their in-memory data structures consistent with the file system. Our performance evaluation shows that ACID semantics can be added to applications with acceptable overheads. When Amino adds atomicity, consistency, and isolation functionality to an application, it performs close to Ext3. Amino achieves durability up to 46% faster than Ext3, thanks to improved locality.
The Utility of File Names
, 2003
"... For typical workloads and file naming conventions, the size, lifespan, read/write ratio, and access pattern of nearly all files in a file system are accurately predicted by the name given to the file when it is created. We discuss some name-related properties observed in three contemporary NFS workl ..."
Abstract
-
Cited by 11 (4 self)
- Add to MetaCart
For typical workloads and file naming conventions, the size, lifespan, read/write ratio, and access pattern of nearly all files in a file system are accurately predicted by the name given to the file when it is created. We discuss some name-related properties observed in three contemporary NFS workloads, and present a method for automatically creating name-based models to predict interesting file properties of new files, and analyze the accuracy of these models for our workloads. Finally, we show how these predictions can be used as hints to optimize the strategies used by the file system to manage new files when they are created.

