A Survey of Distributed File Systems
- Annual Review of Computer Science, 1989
Cited by 45 (2 self)
This paper is a survey of the current state of the art in the design and implementation of distributed file systems. It consists of four major parts: an overview of background material, case studies of a number of contemporary file systems, identification of key design techniques, and an examination of current research issues. The systems surveyed are Sun NFS, Apollo Domain, Andrew, IBM AIX DS, AT&T RFS, and Sprite. The coverage of background material includes a taxonomy of file system issues, a brief history of distributed file systems, and a summary of empirical research on file properties. A comprehensive bibliography forms an important part of the paper.
Copyright (C) 1988, 1989 M. Satyanarayanan. The author was supported in the writing of this paper by the National Science Foundation (Contract No. CCR-8657907), the Defense Advanced Research Projects Agency (Order No. 4976, Contract F33615-84-K-1520), and the IBM Corporation (Faculty Development Award). The views and conclusions in t...
An Empirical Study of a Wide-Area Distributed File System
- 1994
Cited by 45 (0 self)
The evolution of the Andrew File System (AFS) into a wide-area distributed file system has encouraged collaboration and information dissemination on a much broader scale than ever before. In this paper, we examine AFS as a provider of wide-area file services to over a hundred organizations around the world. We discuss usage characteristics of AFS derived from empirical measurements of the system. Our observations indicate that AFS provides robust and efficient data access in its current configuration, thus confirming its viability as a design point for wide-area distributed file systems.
Explicit Control in a Batch-Aware Distributed File System
Cited by 44 (3 self)
We present the design, implementation, and evaluation of the Batch-Aware Distributed File System (BAD-FS), a system designed to orchestrate large, I/O-intensive batch workloads on remote computing clusters distributed across the wide area. BAD-FS consists of two novel components: a storage layer that exposes control of traditionally fixed policies such as caching, consistency, and replication; and a scheduler that exploits this control as needed for different users and workloads. By extracting these controls from the storage layer and placing them in an external scheduler, BAD-FS manages both storage and computation in a coordinated way while gracefully dealing with cache consistency, fault tolerance, and space management issues in an application-specific manner. Using both microbenchmarks and real applications, we demonstrate the performance benefits of explicit control, delivering excellent end-to-end performance across the wide area.
Service Interface and Replica Management Algorithm for Mobile File System Clients
- In Proceedings of the First International Conference on Parallel and Distributed Information Systems, 1991
Cited by 39 (3 self)
Portable computers are now common, a fact that raises the possibility that file service clients might move on a regular basis. This new development requires rethinking some features of distributed file system design. We argue that existing approaches to file replica management would not cope well with the likely behavior of mobile clients, and we present our solution: a lazy "server-based" update operation. This operation facilitates fast, scalable, and highly fault-tolerant implementations of both read and write operations in the usual case. To cope with the weak semantics of the update operation, we propose a new file system service interface that allows applications to opt for "UNIX semantics" by use of a slower, less fault-tolerant read operation.
1 Introduction
This work investigates how to maintain replicas in a distributed file system, especially one supporting mobile clients. While the topic of replica management within file systems has received so much attention that one mig...
File System Aging -- Increasing the Relevance of File System Benchmarks
- Proceedings of the ACM SIGMETRICS, 1997
Cited by 38 (4 self)
Benchmarks are important because they provide a means for users and researchers to characterize how their workloads will perform on different systems and different system architectures. The field of file system design is no different from other areas of research in this regard, and a variety of file system benchmarks are in use, representing a wide range of the different user workloads that may be run on a file system. A realistic benchmark, however, is only one of the tools that is required in order to understand how a file system design will perform in the real world. The benchmark must also be executed on a realistic file system. While the simplest approach may be to measure the performance of an empty file system, this represents a state that is seldom encountered by real users. In order to study file systems in more representative conditions, we present a methodology for aging a test file system by replaying a workload similar to that experienced by a real file system over a period of many months, or even years. Our aging tools allow the same aging workload to be applied to multiple versions of the same file system, allowing scientific evaluation of the relative merits of competing file system designs. In addition to describing our aging tools, we demonstrate their use by applying them to evaluate two enhancements to the file layout policies of the UNIX fast file system.
A five-year study of file-system metadata
- In Proceedings of the 5th USENIX Conference on File and Storage Technologies. USENIX Association, 2007
Cited by 37 (4 self)
For five years, we collected annual snapshots of file-system metadata from over 60,000 Windows PC file systems in a large corporation. In this article, we use these snapshots to study temporal changes in file size, file age, file-type frequency, directory size, namespace structure, file-system population, storage capacity and consumption, and degree of file modification. We present a generative model that explains the namespace structure and the distribution of directory sizes. We find significant temporal trends relating to the popularity of certain file types, the origin of file content, the way the namespace is used, and the degree of variation among file systems, as well as more pedestrian changes in size and capacities. We give examples of consequent lessons for designers of file systems and related software.
Pipeline and Batch Sharing in Grid Workloads
- In Proceedings of High-Performance Distributed Computing (HPDC-12), 2003
Cited by 33 (11 self)
We present a study of six batch-pipelined scientific workloads that are candidates for execution on computational grids. Whereas other studies focus on the behavior of single applications, this study characterizes workloads composed of pipelines of sequential processes that use file storage for communication and also share significant data across a batch. This study includes measurements of the memory, CPU, and I/O requirements of individual components as well as analyses of I/O sharing within complete batches. We conclude with a discussion of the ramifications of these workloads for end-to-end scalability and overall system design.
An Empirical Study of a Highly Available File System
- 1994
Cited by 30 (7 self)
In this paper we present results from a six-month empirical study of the high availability aspects of the Coda File System. We report on the service failures experienced by Coda clients, and show that such failures are masked successfully. We also explore the effectiveness and resource costs of key aspects of server replication and disconnected operation, the two high availability mechanisms of Coda. Wherever possible, we compare our measurements to simulation-based predictions from earlier papers and to anecdotal evidence from users. Finally, we explore how users take advantage of the support provided by Coda for mobile computing.
Operation-based Update Propagation in a Mobile File System
- In Proceedings of the USENIX Annual Technical Conference, 1999
Cited by 28 (6 self)
In this paper we describe a technique called operation-based update propagation for efficiently transmitting updates to large files that have been modified on a weakly connected client of a distributed file system. In this technique, modifications are captured above the file-system layer at the client, shipped to a surrogate client that is strongly connected to a server, re-executed at the surrogate, and the resulting files transmitted from the surrogate to the server. If re-execution fails to produce a file identical to the original, the system falls back to shipping the file from the client over the slow network. We have implemented a prototype of this mechanism in the Coda File System on Linux, and demonstrated performance improvements ranging from 40 percent to nearly three orders of magnitude in reduced network traffic and elapsed time. We also found a novel use of forward error correction in this context.
Measurement and analysis of large-scale network file system workloads
- In Proceedings of the 2008 USENIX Annual Technical Conference, 2008
Cited by 26 (8 self)
In this paper we present the analysis of two large-scale network file system workloads. We measured CIFS traffic for two enterprise-class file servers deployed in the NetApp data center for a three-month period. One file server was used by marketing, sales, and finance departments and the other by the engineering department. Together these systems represent over 22 TB of storage used by over 1500 employees, making this the first ever large-scale study of the CIFS protocol. We analyzed how our network file system workloads compared to those of previous file system trace studies and took an in-depth look at access, usage, and sharing patterns. We found that our workloads were quite different from those previously studied; for example, our analysis found increased read-write file access patterns, decreased read-write ratios, more random file access, and longer file lifetimes. In addition, we found a number of interesting properties regarding file sharing, file re-use, and the access patterns of file types and users, showing that modern file system workloads have changed in the past 5–10 years. This change in workload characteristics has implications for the future design of network file systems, which we describe in the paper.

