Passive NFS Tracing of Email and Research Workloads
, 2003
Cited by 72 (8 self)
We present an analysis of a pair of NFS traces of contemporary email and research workloads. We show that although the research workload resembles previously studied workloads, the email workload is quite different. We also perform several new analyses that demonstrate the periodic nature of file system activity, the effect of out-of-order NFS calls, and the strong relationship between the name of a file and its size, lifetime, and access pattern.
Progress-Based Regulation of Low-Importance Processes
- In Proceedings of the Seventeenth ACM Symposium on Operating Systems Principles
, 1999
Cited by 46 (1 self)
MS Manners is a mechanism that employs progress-based regulation to prevent resource contention with low-importance processes from degrading the performance of high-importance processes. The mechanism assumes that resource contention that degrades the performance of a high-importance process will also retard the progress of the low-importance process. MS Manners detects this contention by monitoring the progress of the low-importance process and inferring resource contention from a drop in the progress rate. This technique recognizes contention over any system resource, as long as the performance impact on contending processes is roughly symmetric. MS Manners employs statistical mechanisms to deal with stochastic progress measurements; it automatically calibrates a target progress rate, so no manual tuning is required; it supports multiple progress metrics from applications that perform several distinct tasks; and it orchestrates multiple low-importance processes to prevent measurement i...
Ext3cow: A time-shifting file system for regulatory compliance
- ACM Transactions on Storage
, 2005
Cited by 43 (2 self)
The ext3cow file system, built on the popular ext3 file system, provides an open-source file versioning and snapshot platform for compliance with the versioning and auditability requirements of recent electronic record retention legislation. Ext3cow provides a time-shifting interface that permits a real-time and continuous view of data in the past. Time-shifting neither pollutes the file system namespace nor requires snapshots to be mounted as a separate file system. Further, ext3cow is implemented entirely in the file system space and, therefore, does not modify kernel interfaces or change the operation of other file systems. Ext3cow takes advantage of the fine-grained control of on-disk and in-memory data available only to a file system, resulting in minimal degradation of performance and functionality. Experimental results confirm this hypothesis; ext3cow performs comparably to ext3 on many benchmarks and on trace-driven experiments.
NFS Tricks and Benchmarking Traps
, 2003
Cited by 23 (2 self)
We describe two modifications to the FreeBSD 4.6 NFS server to increase read throughput by improving the read-ahead heuristic to deal with reordered requests and stride access patterns. We show that for some stride access patterns, our new heuristics improve end-to-end NFS throughput by nearly a factor of two. We also show that benchmarking and experimenting with changes to an NFS server can be a subtle and challenging task, and that it is often difficult to distinguish the impact of a new algorithm or heuristic from the quirks of the underlying software and hardware with which they interact. We discuss these quirks and their potential effects.
TBBT: Scalable and Accurate Trace Replay for File Server Evaluation
- FAST '05
, 2005
Cited by 23 (5 self)
This paper describes the design, implementation, and evaluation of TBBT, the first comprehensive NFS trace replay tool. Given an NFS trace, TBBT automatically detects and repairs missing operations in the trace, derives a file system image required to successfully replay the trace, ages the file system image appropriately, initializes the file server under test with that image, and finally drives the file server with a workload that is derived from replaying the trace according to user-specified parameters. TBBT can scale a trace temporally or spatially to meet the needs of a simulation run without violating dependencies among file system operations in the trace.
Cache-Oblivious String B-trees
- IN: PROC. OF PRINCIPLES OF DATABASE SYSTEMS
, 2006
Cited by 23 (5 self)
B-trees are the data structure of choice for maintaining searchable data on disk. However, B-trees perform suboptimally
• when keys are long or of variable length,
• when keys are compressed, even when using front compression, the standard B-tree compression scheme,
• for range queries, and
• with respect to memory effects such as disk prefetching.
This paper presents a cache-oblivious string B-tree (COSB-tree) data structure that is efficient in all these ways:
• The COSB-tree searches asymptotically optimally and inserts and deletes nearly optimally.
• It maintains an index whose size is proportional to the front-compressed size of the dictionary. Furthermore, unlike standard front-compressed strings, keys can be decompressed in a memory-efficient manner.
• It performs range queries with no extra disk seeks; in contrast, B-trees incur disk seeks when skipping from leaf block to leaf block.
• It utilizes all levels of a memory hierarchy efficiently and makes good use of disk locality by using cache-oblivious layout strategies.
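Front compression, which the abstract above calls the standard B-tree compression scheme, can be illustrated with a minimal sketch. The code below is not taken from the paper and is not the COSB-tree; it only shows plain front compression of a sorted key list, where each key is stored as the length of the prefix it shares with the previous key plus the remaining suffix. The function names `front_compress` and `front_decompress` are hypothetical, chosen for this illustration.

```python
from os.path import commonprefix  # character-wise longest common prefix of strings


def front_compress(sorted_keys):
    """Encode each key as (shared_prefix_length, suffix) relative to the previous key.

    Assumes sorted_keys is already in sorted order, so neighbours share long prefixes.
    """
    encoded = []
    previous = ""
    for key in sorted_keys:
        shared = len(commonprefix([previous, key]))
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def front_decompress(encoded):
    """Rebuild the original keys by replaying the entries from the first one onward."""
    keys = []
    previous = ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


if __name__ == "__main__":
    keys = ["car", "card", "care", "cat"]
    packed = front_compress(keys)
    print(packed)  # [(0, 'car'), (3, 'd'), (3, 'e'), (2, 't')]
    print(front_decompress(packed) == keys)  # True
```

In this simple scheme, recovering one key requires decoding every entry before it, since each entry depends on its predecessor. That sequential dependency is one plausible reading of the decompression cost that the abstract contrasts with the COSB-tree.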
A nine year study of file system and storage benchmarking
- ACM Transactions on Storage
, 2008
Cited by 20 (4 self)
Benchmarking is critical when evaluating performance, but is especially difficult for file and storage systems. Complex interactions between I/O devices, caches, kernel daemons, and other OS components result in behavior that is rather difficult to analyze. Moreover, systems have different features and optimizations, so no single benchmark is always suitable. The large variety of workloads that these systems experience in the real world also adds to this difficulty. In this article we survey 415 file system and storage benchmarks from 106 recent papers. We found that most popular benchmarks are flawed and many research papers do not provide a clear indication of true performance. We provide guidelines that we hope will improve future performance evaluations. To show how some widely used benchmarks can conceal or overemphasize overheads, we conducted a set of experiments. As a specific example, slowing down read operations on ext2 by a factor of 32 resulted in only a 2–5% wall-clock slowdown in a popular compile benchmark. Finally, we discuss future work to improve file system and storage benchmarking.
Ext3cow: The Design, Implementation, and Analysis of Metadata for a Time-Shifting File System
, 2003
Cited by 15 (0 self)
The ext3cow file system, built on Linux's popular ext3 file system, brings snapshot functionality and file versioning to the open-source community. Our implementation of ext3cow has several desirable properties: ext3cow is implemented entirely in the file system and, therefore, does not modify kernel interfaces or change the operation of other file systems; ext3cow provides a time-shifting interface that permits access to data in the past without polluting the file system namespace; and ext3cow creates versions of files on disk without copying data in memory. Experimental results show that the time-shifting functions of ext3cow do not degrade file system performance. Ext3cow performs comparably to ext3 on many file system benchmarks and trace-driven experiments.
Accurate and efficient replaying of file system traces
- In Proc. USENIX Conference on File and Storage Technologies (FAST '05)
, 2005
Cited by 12 (5 self)
Replaying traces is a time-honored method for benchmarking, stress-testing, and debugging systems and, more recently, for forensic analysis. One benefit of replaying traces is the reproducibility of the exact set of operations that were captured during a specific workload. Existing trace capture and replay systems operate at different levels: network packets, disk device drivers, network file systems, or system calls. System call replayers miss memory-mapped operations and cannot replay I/O-intensive workloads at original speeds. Traces captured at other levels miss vital information that is available only at the file system level. We designed and implemented Replayfs, the first system for replaying file system traces at the VFS level. The VFS is the most appropriate level for replaying file system traces because all operations are reproduced in a manner that is most relevant to file-system developers. Thanks to the uniform VFS API, traces can be replayed transparently onto any existing file system, even one different from the file system originally traced, without modifying existing file systems. Replayfs's user-level compiler prepares a trace to be replayed efficiently in the kernel, where multiple kernel threads prefetch and schedule the replay of file system operations precisely and efficiently. These techniques allow us to replay I/O-intensive traces at different speeds, and even accelerate them on the same hardware on which the trace was originally captured.
Long-term File Activity and Inter-Reference Patterns
, 1998
Cited by 7 (1 self)
This paper is organized into nine sections. We begin by reviewing previous disk activity studies in Section 2. In Section 3, we briefly discuss our data collection and analysis tools, which differ significantly from those used in earlier studies. We describe the different types of computing environments from which we collected data in Section 4. The software written for this paper analyzes the collected data and generates statistics. The simplest analysis mode provides information about daily activity; this is shown in Section 5. Analysis of long-term trends is shown in Section 6. An interesting product of this research is a comparison of the same file system's activity as seen from the file name view and from the operating system's underlying numeric index. This comparison is done in Section 7. We summarize our findings in Section 8 and briefly discuss our future research in Section 9.

