Transforming Policies into Mechanisms with Infokernel
- In Proceedings of the nineteenth ACM symposium on Operating systems principles
, 2003
Cited by 39 (9 self)
We describe an evolutionary path that allows operating systems to be used in a more flexible and appropriate manner by higher-level services. An infokernel exposes key pieces of information about its algorithms and internal state; thus, its default policies become mechanisms, which can be controlled from user-level. We have implemented two prototype infokernels based on the Linux 2.4 and NetBSD 1.5 kernels, called infoLinux and infoBSD, respectively. The infokernels export key abstractions as well as basic information primitives. Using infoLinux, we have implemented four case studies showing that policies within Linux can be manipulated outside of the kernel. Specifically, we show that the default file cache replacement algorithm, file layout policy, disk scheduling algorithm, and TCP congestion control algorithm can each be turned into base mechanisms. For each case study, we have found that infokernel abstractions can be implemented with little code and that the overhead and accuracy of synthesizing policies at user-level is acceptable. Categories and Subject Descriptors: D.4.7 [Operating Systems]: Organization and Design. General Terms: Design, Experimentation, Performance. Keywords: Policy, Mechanism, Information.
File Access Prediction with Adjustable Accuracy
, 2002
Cited by 35 (8 self)
We describe a novel on-line file access predictor, Recent Popularity, capable of rapid adaptation to workload changes while simultaneously predicting more events with greater accuracy than prior efforts. We distinguish the goal of predicting the most events accurately from the goal of offering the most accurate predictions (when declining to offer a prediction is acceptable). For this purpose we present two distinct measures of accuracy, general and specific accuracy, corresponding to these goals. We describe how our new predictor and an earlier effort, Noah, can trade the number of events predicted for prediction accuracy by modifying simple parameters. When prediction accuracy is strictly more important than the number of predictions offered, trace-based evaluation demonstrates error rates as low as 2%, while offering predictions for more than 60% of all file access events.
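The distinction between the two accuracy goals can be illustrated with a small sketch. The definitions below are an assumption based on the wording of the abstract, not the paper's exact formulas: general accuracy is taken as correct predictions over all events, and specific accuracy as correct predictions over only those events for which a prediction was offered. The function name and the use of `None` for a declined prediction are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def accuracy_measures(
    predictions: Sequence[Optional[str]],
    actual: Sequence[str],
) -> Tuple[float, float, float]:
    """Return (general, specific, coverage) for a sequence of predictions.

    A prediction of None means the predictor declined to predict.
    Assumed definitions (not taken from the paper):
      general  = correct / all events
      specific = correct / predictions offered
      coverage = predictions offered / all events
    """
    if len(predictions) != len(actual):
        raise ValueError("predictions and actual must have the same length")

    total = len(actual)
    offered = sum(1 for p in predictions if p is not None)
    correct = sum(
        1 for p, a in zip(predictions, actual) if p is not None and p == a
    )

    general = correct / total if total else 0.0
    specific = correct / offered if offered else 0.0
    coverage = offered / total if total else 0.0
    return general, specific, coverage


# Five file-access events; the predictor declines twice and is wrong once.
general, specific, coverage = accuracy_measures(
    ["a", None, "b", "c", None],
    ["a", "x", "b", "d", "e"],
)
```

Under these assumed definitions, a predictor that declines more often can raise its specific accuracy while lowering its coverage, which is the trade-off the abstract describes.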
File Classification in Self-* Storage Systems
- In Proceedings of the First International Conference on Autonomic Computing (ICAC-04)
, 2004
Cited by 29 (3 self)
To tune and manage themselves, file and storage systems must understand key properties (e.g., access pattern, lifetime, size) of their various files. This paper describes how systems can automatically learn to classify the properties of files (e.g., read-only access pattern, short-lived, small in size) and predict the properties of new files, as they are created, by exploiting the strong associations between a file's properties and the names and attributes assigned to it. These associations exist, strongly but differently, in each of four real NFS environments studied. Decision tree classifiers can automatically identify and model such associations, providing prediction accuracies that often exceed 90%. Such predictions can be used to select storage policies (e.g., disk allocation schemes and replication factors) for individual files. Further, changes in associations can expose information about applications, helping autonomic system components distinguish growth from fundamental change.
HFS: A flexible file system for shared-memory multiprocessors
, 1994
Cited by 21 (3 self)
The HURRICANE File System (HFS) is designed for large-scale, shared-memory multiprocessors. Its architecture is based on the principle that a file system must support a wide variety of file structures, file system policies and I/O interfaces to maximize performance for a wide variety of applications. HFS uses a novel, object-oriented building-block approach to provide the flexibility needed to support this variety of file structures, policies, and I/O interfaces. File structures can be defined in HFS that optimize for sequential or random access, read-only, write-only or read/write access, sparse or dense data, large or small file sizes, and different degrees of application concurrency. Policies that can be defined on a per-file or per-open instance basis include locking policies, prefetching policies, compression/decompression policies and file cache management policies. In contrast, most existing file systems have been designed to support a single file structure and a small set of po...
Using Data Clustering to Improve Cleaning Performance for Flash Memory
- Software—Practice and Experience
, 1999
Cited by 17 (0 self)
In this paper, we tried to find an effective data redistribution method to reduce the number of erase operations induced by the flash memory cleaner. The simplest way is to copy valid data to another free segment in the same order as they appear in the original segment. But this does nothing to reduce the number of erase operations. If data are migrated in such a way that hot data (the most frequently updated data) are clustered in the same segments, then flash segments will be either full of hot data or full of non-hot data. Because hot data are likely to be updated soon, turning the original copy into garbage, segments containing most of the hot data would soon contain the most garbage.
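A minimal sketch of the clustering idea follows. It is not the paper's algorithm: it assumes that each valid block carries an update count as its hotness measure, and it simply packs blocks into new segments in descending hotness order so that hot blocks end up together. The function name, the block identifiers, and the segment size are all hypothetical.

```python
from typing import Dict, List


def redistribute(valid_blocks: Dict[str, int], segment_size: int) -> List[List[str]]:
    """Pack valid blocks into new segments, hottest blocks first.

    valid_blocks maps a block id to its update count (an assumed
    hotness measure). Sorting by hotness before packing means each
    new segment holds blocks of similar hotness, instead of keeping
    the order the blocks had in the segment being cleaned.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")

    # Hottest first; ties broken by block id so the result is deterministic.
    ordered = sorted(valid_blocks, key=lambda b: (-valid_blocks[b], b))

    return [
        ordered[i:i + segment_size]
        for i in range(0, len(ordered), segment_size)
    ]


# Six valid blocks with their update counts, packed three per segment.
segments = redistribute(
    {"a": 9, "b": 1, "c": 8, "d": 0, "e": 7, "f": 2},
    segment_size=3,
)
# segments == [["a", "c", "e"], ["f", "b", "d"]]
```

In this example the first segment holds only frequently updated blocks, so it is the one expected to fill with garbage soonest and become the cheapest to clean.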
HFS: A Flexible File System for large-scale Multiprocessors
, 1993
Cited by 13 (2 self)
The Hurricane File System (HFS) is a new file system being developed for large-scale shared memory multiprocessors with distributed disks. The main goal of this file system is scalability; that is, the file system is designed to handle demands that are expected to grow linearly with the number of processors in the system. To achieve this goal, HFS is designed using a new structuring technique called Hierarchical Clustering. HFS is also designed to be flexible in supporting a variety of policies for managing file data and for managing file system state. This flexibility is necessary to support in a scalable fashion the diverse workloads we expect for a multiprocessor file system.
1 Introduction
The Hurricane File System (HFS) is a new file system being developed for large-scale shared memory multiprocessors. In this paper the goals and basic architecture of this file system are introduced. The main goal of this file system is scalability; we expect the file system load to grow linearly...
BORG: Block-reORGanization and Self-optimization in Storage Systems
, 2007
Cited by 13 (4 self)
This paper presents the design, implementation, and evaluation of BORG, a self-optimizing storage system that performs automatic block reorganization based on the observed I/O workload. BORG is motivated by three characteristics of I/O workloads: non-uniform access frequency distribution, temporal locality, and partial determinism in non-sequential accesses. To achieve its objective, BORG manages a small, dedicated partition on the disk drive, with the goal of servicing a majority of the I/O requests from within this partition with significantly reduced seek and rotational delays. BORG is transparent to the rest of the storage stack, including applications, file system(s), and I/O schedulers, thereby requiring no or minimal modification to storage stack implementations. We evaluated a Linux implementation of BORG using several real-world workloads, including individual user desktop environments, a web server, a virtual machine monitor, and an SVN server. These experiments comprehensively demonstrate BORG’s effectiveness in improving I/O performance and its incurred resource overhead.
Workload-Specific File System Benchmarks
, 2001
Cited by 9 (0 self)
A fundamental problem with the current generation of file system benchmarks is that they fail to take into account the fact that a file system’s performance can vary depending on the workload running on it. Many benchmarks attempt to reduce file system performance to a single number, producing a simplistic one-dimensional ordering of the systems being tested. Although this may be useful for marketing literature, the performance of file systems in the real world is more complicated. Different workloads place different demands on the file system, and can result in different behavior from the underlying system. A file system that provides superior performance for a web server may have inferior performance when running a software development workload. In this dissertation I demonstrate that the “one size fits all” approach of current file system benchmarks does not accurately predict the performance of different workloads on different file systems. I then present a new benchmarking methodology
High Performance File System Design
, 1991
Cited by 7 (0 self)
File systems and I/O subsystems should be smart; they can analyze how they are being used and tune themselves dynamically to improve their performance. File systems should select caching and disk placement strategies on a per-file basis, and they should use system-wide disk reorganization strategies. For example, systems should be able to reorganize the data on disk automatically during idle periods so that system performance is improved during future periods of peak load. This dissertation presents the design and analysis of iPcress, a prototype of a next-generation file system. iPcress is a smart, high-performance, reliable file system. It uses statistical information collected on a per-file basis to tune itself. iPcress has a framework in which various optimizations can be performed by the file system automatically. It is extensible; other optimization techniques can be incorporated easily, so that the system may evolve. In addition, iPcress can incorporate a variety of file access...
File Grouping for Scientific Data Management: Lessons from Experimenting with Real Traces
Cited by 3 (0 self)
The analysis of data usage in a large set of real traces from a high-energy physics collaboration revealed the existence of an emergent grouping of files that we coined “filecules”. This paper presents the benefits of using this file grouping for prestaging data and compares it with previously proposed file grouping techniques along a range of performance metrics. Our experiments with real workloads demonstrate that filecule grouping is a reliable and useful abstraction for data management in science Grids; that preserving time locality for data prestaging is highly recommended; that job reordering with respect to data availability has significant impact on throughput; and finally, that a relatively short history of traces is a good predictor for filecule grouping. Our experimental results provide lessons for workload modeling and suggest design guidelines for data management in data-intensive resource-sharing environments.

