The Design and Implementation of a Log-Structured File System
- ACM Transactions on Computer Systems, 1992
Cited by 808 (6 self)
This paper presents a new technique for disk storage management called a log-structured file system. A log-structured file system writes all modifications to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery. The log is the only structure on disk; it contains indexing information so that files can be read back from the log efficiently. In order to maintain large free areas on disk for fast writing, we divide the log into segments and use a segment cleaner to compress the live information from heavily fragmented segments. We present a series of simulations that demonstrate the efficiency of a simple cleaning policy based on cost and benefit. We have implemented a prototype log-structured file system called Sprite LFS; it outperforms current Unix file systems by an order of magnitude for small-file writes while matching or exceeding Unix performance for reads and large writes. Even when the overhead for cleaning is included, Sprite LFS can use 70% of the disk bandwidth for writing, whereas Unix file systems typically can use only 5-10%.
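As an illustration of the cost-benefit cleaning policy mentioned in this abstract, the sketch below ranks segments by the ratio (1 - u) * age / (1 + u) described in the paper, where u is the fraction of a segment that is still live and age is the time since its data was last modified. The `Segment` type, its field names, and the sample values are hypothetical; this is not Sprite LFS code.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    utilization: float  # fraction of the segment that is still live, 0.0 to 1.0
    age: float          # time since the most recent modification (arbitrary units)


def benefit_to_cost(seg: Segment) -> float:
    # Cost: read the whole segment (1) and write back its live data (u).
    # Benefit: the space freed (1 - u), weighted by how long it is likely to stay free.
    return (1.0 - seg.utilization) * seg.age / (1.0 + seg.utilization)


def pick_segments_to_clean(segments, count):
    # Clean the segments with the highest benefit-to-cost ratio first.
    return sorted(segments, key=benefit_to_cost, reverse=True)[:count]


# Hypothetical sample data for illustration only.
segments = [
    Segment("hot", 0.30, 1.0),
    Segment("cold", 0.80, 20.0),
    Segment("full", 1.00, 50.0),
]
chosen = [s.name for s in pick_segments_to_clean(segments, 2)]
```

With these sample values the older, fuller "cold" segment is ranked ahead of the emptier but recently written "hot" segment, and a completely live segment is never worth cleaning because it frees no space.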
A Study of Integrated Prefetching and Caching Strategies
- In Proceedings of the ACM SIGMETRICS, 1995
Cited by 168 (9 self)
Prefetching and caching are effective techniques for improving the performance of file systems, but they have not been studied in an integrated fashion. This paper proposes four properties that optimal integrated strategies for prefetching and caching must satisfy, and then presents and studies two such integrated strategies, called aggressive and conservative. We prove that the performance of the conservative approach is within a factor of two of optimal and that the performance of the aggressive strategy is a factor significantly less than twice that of the optimal case. We have evaluated these two approaches by trace-driven simulation with a collection of file access traces. Our results show that the two integrated prefetching and caching strategies are indeed close to optimal and that these strategies can reduce the running time of applications by up to 50%.
An Implementation of a Log-Structured File System for UNIX
- 1993
Cited by 163 (13 self)
Research results [ROSE91] demonstrate that a log-structured file system (LFS) offers the potential for dramatically improved write performance, faster recovery time, and faster file creation and deletion than traditional UNIX file systems. This paper presents a redesign and implementation of the Sprite [ROSE91] log-structured file system that is more robust and integrated into the vnode interface [KLEI86]. Measurements show its performance to be superior to the 4BSD Fast File System (FFS) in a variety of benchmarks and not significantly less than FFS in any test. Unfortunately, an enhanced version of FFS (with read and write clustering) [MCVO91] provides comparable and sometimes superior performance to our LFS. However, LFS can be extended to provide additional functionality such as embedded transactions and versioning, not easily implemented in traditional file systems. 1. Introduction Early UNIX file systems used a small, fixed block size and made no attempt to optimize block place...
Deciding when to forget in the Elephant file system
- 17th ACM Symposium on Operating Systems Principles (SOSP ’99), published as Operating Systems Review, 34(5):110–123, Dec. 1999
Cited by 160 (5 self)
Modern file systems associate the deletion of a file with the immediate release of storage, and file writes with the irrevocable change of file contents. We argue that this behavior is a relic of the past, when disk storage was a scarce resource. Today, large cheap disks make it possible for the file system to protect valuable data from accidental delete or overwrite. This paper describes the design, implementation, and performance of the Elephant file system, which automatically retains all important versions of user files. Users name previous file versions by combining a traditional pathname with a time when the desired version of a file or directory existed. Storage in Elephant is managed by the system using file-grain user-specified retention policies. This approach contrasts with checkpointing file systems such as Plan-9, AFS, and WAFL that periodically generate efficient checkpoints of entire file systems and thus restrict retention to be guided by a single policy for all files within that file system. Elephant is implemented as a new Virtual File System in the FreeBSD kernel.
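The idea of naming a past version by pathname plus time can be sketched as a lookup for the latest version written at or before the requested time. The `VersionedFile` class, its method names, and the integer timestamps below are hypothetical; the sketch does not reflect Elephant's actual naming syntax, on-disk layout, or retention policies.

```python
import bisect


class VersionedFile:
    """Hypothetical in-memory stand-in for a file whose versions are all retained."""

    def __init__(self):
        self._times = []     # write timestamps, in ascending order
        self._contents = []  # _contents[i] is the data written at _times[i]

    def write(self, timestamp, data):
        # Assumption: writes arrive in non-decreasing timestamp order.
        self._times.append(timestamp)
        self._contents.append(data)

    def read_as_of(self, timestamp):
        # Find the latest version written at or before `timestamp`.
        i = bisect.bisect_right(self._times, timestamp)
        if i == 0:
            raise FileNotFoundError("no version existed at that time")
        return self._contents[i - 1]


f = VersionedFile()
f.write(10, b"draft")
f.write(20, b"final")
earlier = f.read_as_of(15)  # the version that existed at time 15
```

A retention policy would then decide which entries in such a history may be discarded; that part is left out here.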
Idleness is Not Sloth
- 1995
Cited by 141 (8 self)
Many people have observed that computer systems spend much of their time idle, and various schemes have been proposed to use this idle time productively. The commonest approach is to off-load activity from busy periods to less-busy ones in order to improve system responsiveness. In addition, speculative work can be performed in idle periods in the hopes that it will be needed later at times of higher utilization, or non-renewable resources like battery power can be conserved by disabling unused resources. We found opportunities to exploit idle time in our work on storage systems, and after a few attempts to tackle specific instances of it in ad hoc ways, began to investigate general mechanisms that could be applied to this problem. Our results include a taxonomy of idle-time detection algorithms, metrics for evaluating them, and an evaluation of a number of idleness predictors that we generated from our taxonomy. 1. Introduction Resource usage is often bursty: periods of high utilizat...
Implementation and Performance of Application-Controlled File Caching
- In Proceedings of the First Symposium on Operating Systems Design and Implementation, 1994
Cited by 114 (4 self)
Traditional file system implementations do not allow applications to control file caching replacement decisions. We have implemented two-level replacement, a scheme that allows applications to control their own cache replacement, while letting the kernel control the allocation of cache space among processes. We designed an interface to let applications exert control on replacement via a set of directives to the kernel. This is effective and requires low overhead. We demonstrate that for applications that do not perform well under traditional caching policies, the combination of good application-chosen replacement strategies, and our kernel allocation policy LRU-SP, can reduce the number of block I/Os by up to 80%, and can reduce the elapsed time by up to 45%. We also show that LRU-SP is crucial to the performance improvement for multiple concurrent applications: LRU-SP fairly distributes cache blocks and offers protection against foolish applications.
File System Logging Versus Clustering: A Performance Comparison
- 1995
Cited by 106 (11 self)
The Log-structured File System (LFS), introduced in 1991 [8], has received much attention for its potential order-of-magnitude improvement in file system performance. Early research results [9] showed that small file performance could scale with processor speed and that cleaning costs could be kept low, allowing LFS to write at an effective bandwidth of 62 to 83% of the maximum. Later work showed that the presence of synchronous disk operations could degrade performance by as much as 62% and that cleaning overhead could become prohibitive in transaction processing workloads, reducing performance by as much as 40% [10]. The same work showed that the addition of clustered reads and writes in the Berkeley Fast File System [6] (FFS) made it competitive with LFS in large-file handling and software development environments as approximated by the Andrew benchmark [4]. These seemingly inconsistent results have caused confusion in the file system research community. This paper presents a detail...
The Logical Disk: A New Approach to Improving File Systems
Cited by 106 (1 self)
The Logical Disk (LD) defines a new interface to disk storage that separates file management and disk management by using logical block numbers and block lists. The LD interface is designed to support multiple file systems and to allow multiple implementations, both of which are important given the increasing use of kernels that support multiple operating system personalities. A log-structured implementation of LD (LLD) demonstrates that LD can be implemented efficiently. LLD adds about 5% to 10% to the purchase cost of a disk for the main memory it requires. Combining LLD with an existing file system results in a log-structured file system that exhibits the same performance characteristics as the Sprite log-structured file system.
Implementation and Performance of Integrated Application-Controlled Caching, Prefetching and Disk Scheduling
- 1996
Cited by 100 (8 self)
Although file caching and prefetching are known techniques to improve the performance of file systems, little work has been done on integrating caching and prefetching. Optimal prefetching is nontrivial because prefetching may require early cache block replacements. Moreover, the tradeoff between the latency-hiding benefits of prefetching and the increase in the number of fetches required must be considered. This paper presents the design and implementation of a file system that integrates application-controlled caching, prefetching and disk scheduling. We use a two-level cache management strategy. The kernel uses the LRU-SP policy [CFL94a] to allocate blocks to processes, and each process uses the controlled-aggressive policy, an algorithm previously shown in a theoretical sense to be near-optimal, for managing its cache. Each process then improves its disk access latency by submitting its prefetches in batches and schedules the requests in each batch to optimize disk access performa...
Embedded Inodes and Explicit Grouping: Exploiting Disk Bandwidth for Small Files
- In Proceedings of the 1997 USENIX Technical Conference, 1997
Cited by 92 (14 self)
Small file performance in most file systems is limited by slowly improving disk access times, even though current file systems improve on-disk locality by allocating related data objects in the same general region. The key insight for why current file systems perform poorly is that locality is insufficient: exploiting disk bandwidth for small data objects requires that they be placed adjacently. We describe C-FFS (Co-locating Fast File System), which introduces two techniques, embedded inodes and explicit grouping, for exploiting what disks do well (bulk data movement) to avoid what they do poorly (reposition to new locations). With embedded inodes, the inodes for most files are stored in the directory with the corresponding name, removing a physical level of indirection without sacrificing the logical level of indirection. With explicit grouping, the data blocks of multiple small files named by a given directory are allocated adjacently and moved to and from the disk as a unit in ...

