Results 1 - 10
of
155
Serverless Network File Systems
- ACM TRANSACTIONS ON COMPUTER SYSTEMS
, 1995
"... In this paper, we propose a new paradigm for network file system design, serverless network file systems. While traditional network file systems rely on a central server machine, a serverless system utilizes workstations cooperating as peers to provide all file system services. Any machine in the sy ..."
Abstract
-
Cited by 403 (26 self)
- Add to MetaCart
In this paper, we propose a new paradigm for network file system design, serverless network file systems. While traditional network file systems rely on a central server machine, a serverless system utilizes workstations cooperating as peers to provide all file system services. Any machine in the system can store, cache, or control any block of data. Our approach uses this location independence, in combination with fast local area networks, to provide better performance and scalability than traditional file systems. Further, because any machine in the system can assume the responsibilities of a failed component, our serverless design also provides high availability via redundant data storage. To demonstrate our approach, we have implemented a prototype serverless network file system called xFS. Preliminary performance measurements suggest that our architecture achieves its goal of scalability. For instance, in a 32-node xFS system with 32 active clients, each client receives nearly as much read or write throughput as it would see if it were the only active client.
Hippodrome: Running Circles around Storage Administration
- In Proceedings of the Conference on File and Storage Technologies
, 2002
"... Enterprise-scale computer storage systems are extremely difficult to manage due to their size and complexity. It is difficult to generate a good storage system design for a given workload and to correctly implement the selected design. Traditionally, initial system configuration is performed by admi ..."
Abstract
-
Cited by 118 (9 self)
- Add to MetaCart
Enterprise-scale computer storage systems are extremely difficult to manage due to their size and complexity. It is difficult to generate a good storage system design for a given workload and to correctly implement the selected design. Traditionally, initial system configuration is performed by administrators who are guided by rules of thumb. Unfortunately, this process involves trial and error, and as a result is tedious and error-prone. In this paper, we introduce Hippodrome, an approach to automating initial system configuration. Hippodrome is an iterative loop that analyzes an existing system to determine its requirements, creates a new storage system design to better meet these requirements, and migrates the existing system to the new design. In this paper, we show how Hippodrome automates initial system configuration. 1
Ceph: A scalable, high-performance distributed file system
- In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI
, 2006
"... We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous an ..."
Abstract
-
Cited by 112 (21 self)
- Add to MetaCart
We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clusters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replication, failure detection and recovery to semi-autonomous OSDs running a specialized local object file system. A dynamic distributed metadata cluster provides extremely efficient metadata management and seamlessly adapts to a wide range of general purpose and scientific computing file system workloads. Performance measurements under a variety of workloads show that Ceph has excellent I/O performance and scalable metadata management, supporting more than 250,000 metadata operations per second. 1
Manageability, availability and performance in Porcupine: a highly scalable, cluster-based mail service
- In Proceedings of the 17th ACM Symposium on Operating Systems Principles
, 1999
"... This paper describes the motivation, design, and performance of Porcupine, a scalable mail server. The goal of Porcupine is to provide a highly available and scalable electronic mail service using a large cluster of commodity PCs. We designed Porcupine to be easy to manage by emphasizing dynamic loa ..."
Abstract
-
Cited by 108 (6 self)
- Add to MetaCart
This paper describes the motivation, design, and performance of Porcupine, a scalable mail server. The goal of Porcupine is to provide a highly available and scalable electronic mail service using a large cluster of commodity PCs. We designed Porcupine to be easy to manage by emphasizing dynamic load balancing, automatic configuration, and graceful degradation in the presence of failures. Key to the system’s manageability, availability, and performance is that sessions, data, and underlying services are distributed homogeneously and dynamically across nodes in a cluster.
My cache or yours? Making storage more exclusive
- In Proceedings of the 2002 USENIX Annual Technical Conference
, 2002
"... Modern high-end disk arrays often have several gigabytes of cache RAM. Unfortunately, most array caches use management policies which duplicate the same data blocks at both the client and array levels of the cache hierarchy: they are inclusive. Thus, the aggregate cache behaves as if it was only as ..."
Abstract
-
Cited by 88 (0 self)
- Add to MetaCart
Modern high-end disk arrays often have several gigabytes of cache RAM. Unfortunately, most array caches use management policies which duplicate the same data blocks at both the client and array levels of the cache hierarchy: they are inclusive. Thus, the aggregate cache behaves as if it was only as big as the larger of the client and array caches, instead of as large as the sum of the two. Inclusiveness is wasteful: cache RAM is expensive. We explore the benefits of a simple scheme to achieve exclusive caching, in which a data block is cached at either a client or the disk array, but not both. Exclusiveness helps to create the effect of a single, large unified cache. We introduce a DEMOTE operation to transfer data ejected from the client to the array, and explore its effectiveness with simulation studies. We quantify the benefits and overheads of demotions across both synthetic and real-life workloads. The results show that we can obtain useful -- sometimes substantial -- speedups. During our investigations, we also developed some new cache-insertion algorithms that show promise for multi-client systems, and report on some of their properties.
Towards higher disk head utilization: extracting free bandwidth from busy disk drives
- Symposium on Operating Systems Design and Implementation
, 2000
"... Abstract Freeblock scheduling is a new approach to utilizing more of a disk's potential media bandwidth. By filling rotational latency periods with useful media transfers, 20-50 % of a never-idle disk's bandwidth can often be provided to background applications with no effect on foreground response ..."
Abstract
-
Cited by 79 (18 self)
- Add to MetaCart
Abstract Freeblock scheduling is a new approach to utilizing more of a disk's potential media bandwidth. By filling rotational latency periods with useful media transfers, 20-50 % of a never-idle disk's bandwidth can often be provided to background applications with no effect on foreground response times. This paper describes freeblock scheduling and demonstrates its value with simulation studies of two concrete applications: segment cleaning and data mining. Free segment cleaning often allows an LFS file system to maintain its ideal write performance when cleaning overheads would otherwise reduce performance by up to a factor of three. Free data mining can achieve over 47 full disk scans per day on an active transaction processing system, with no effect on its disk performance.
Facade: Virtual Storage Devices With Performance Guarantees
, 2003
"... High-end storage systems, such as those in large data centers, must service multiple independent workloa&. Workloa& often require predictable quality of service, deslite the fact that they have to compete with other rapidly-changing workloads for access to common storage resources. We present a nove ..."
Abstract
-
Cited by 76 (9 self)
- Add to MetaCart
High-end storage systems, such as those in large data centers, must service multiple independent workloa&. Workloa& often require predictable quality of service, deslite the fact that they have to compete with other rapidly-changing workloads for access to common storage resources. We present a novel approach to providing performance guarantees in this highly-volatile scenario, in an efficient and cost-effective way. Facade, a virtual store controller, sits between hosts and storage devices in the network, and throttles individual I/0 requests from multiple clients so that devices do not saturate. We implemented a prototype, and evaluated it using real workloads on an enterprise storage system. We also instantiated it to the particular case of emulating commercial disk arrays. Our results show that Facade satisfies performance objectives while making efficient use of the storage resources---even in the presence of of failures and bursty workloads with stringent performance requirements.
Track-aligned Extents: Matching Access Patterns to Disk Drive Characteristics
- IN PROCEEDINGS OF THE 1ST USENIX SYMPOSIUM ON FILE AND STORAGE TECHNOLOGIES(FAST '02
, 2002
"... Track-aligned extents (traxtents) utilize disk-specific knowledge to match access patterns to the strengths of modern disks. By allocating and accessing related data on disk track boundaries, a system can avoid most rotational latency and track crossing overheads. Avoiding these overheads can incre ..."
Abstract
-
Cited by 72 (19 self)
- Add to MetaCart
Track-aligned extents (traxtents) utilize disk-specific knowledge to match access patterns to the strengths of modern disks. By allocating and accessing related data on disk track boundaries, a system can avoid most rotational latency and track crossing overheads. Avoiding these overheads can increase disk access efficiency by up to 50 % for mid-sized requests (100-500 KB). This paper describes traxtents, algorithms for detecting track boundaries, and some uses of traxtents in file systems and video servers. For large-file workloads, a version of FreeBSD's FFS implementation that exploits traxtents reduces application run times by up to 20 % compared to the original version. A video server using traxtent-based requests can support 56 % more concurrent streams at the same startup latency and buffer space. For LFS, 44 % lower overall write cost for track-sized segments can be achieved.
Journaling versus Soft Updates: Asynchronous Meta-data Protection in File Systems
- In USENIX Annual Technical Conference
, 2000
"... The UNIX Fast File System (FFS) is probably the most widely-used file system for performance comparisons. However, such comparisons frequently overlook many of the performance enhancements that have been added over the past decade. In this paper, we explore the two most commonly used approaches for ..."
Abstract
-
Cited by 59 (4 self)
- Add to MetaCart
The UNIX Fast File System (FFS) is probably the most widely-used file system for performance comparisons. However, such comparisons frequently overlook many of the performance enhancements that have been added over the past decade. In this paper, we explore the two most commonly used approaches for improving the performance of meta-data operations and recovery: journaling and Soft Updates. Journaling systems use an auxiliary log to record meta-data operations and Soft Updates uses ordered writes to ensure meta-data consistency. The commercial sector has moved en masse to journaling file systems, as evidenced by their presence on nearly every server platform available today: Solaris, AIX, Digital UNIX, HP-UX, Irix, and Windows NT. On all but Solaris, the default file system uses journaling. In the meantime, Soft Updates holds the promise of providing stronger reliability guarantees than journaling, with faster recovery and superior performance in certain boundary cases. In this paper, we explore the benefits of Soft Updates and journaling, comparing their behavior on both microbenchmarks and workload-based macrobenchmarks. We find that journaling alone is not sufficient to “solve ” the meta-data update problem. If synchronous semantics are required (i.e., meta-data operations are durable once the system call returns), then the journaling systems cannot realize their full potential. Only when this synchronicity requirement is relaxed can journaling systems approach the performance of systems like Soft Updates (which also relaxes this requirement). Our asynchronous journaling and Soft Updates systems perform comparably in most cases. While Soft Updates excels in some meta-data intensive microbenchmarks, the macrobenchmark results are more ambiguous. In three cases Soft Updates and journaling are comparable. In a file intensive news workload, journaling prevails, and in a small ISP workload, Soft Updates prevails. 2
Designing Computer Systems with MEMS-based Storage
- In Proceedings of the 9th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS
, 2000
"... For decades the RAM-to-disk memory hierarchy gap has plagued computer architects. An exciting new storage technology based on microelectromechanical systems (MEMS) is poised to fill a large portion of this performance gap, significantly reduce system power consumption, and enable many new applicatio ..."
Abstract
-
Cited by 58 (11 self)
- Add to MetaCart
For decades the RAM-to-disk memory hierarchy gap has plagued computer architects. An exciting new storage technology based on microelectromechanical systems (MEMS) is poised to fill a large portion of this performance gap, significantly reduce system power consumption, and enable many new applications. This paper explores the system-level implications of integrating MEMS-based storage into the memory hierarchy. Results show that standalone MEMS-based storage reduces I/O stall times by 4-74X over disks and improves overall application runtimes by 1.9-4.4X. When used as on-board caches for disks, MEMS-based storage improves I/O response time by up to 3.5X. Further, the energy consumption of MEMS-based storage is 10-54X less than that of state-of-the-art low-power disk drives. The combination of the high-level physical characteristics of MEMS-based storage (small footprints, high shock tolerance) and the ability to directly integrate MEMS-based storage with processing leads to such new ap...

