Improving the reliability of commodity operating systems
2003
Cited by 192 (14 self)
drivers remain a significant cause of system failures. In Windows XP, for example, drivers account for 85% of recently reported failures. This article describes Nooks, a reliability subsystem that seeks to greatly enhance operating system (OS) reliability by isolating the OS from driver failures. The Nooks approach is practical: rather than guaranteeing complete fault tolerance through a new (and incompatible) OS or driver architecture, our goal is to prevent the vast majority of driver-caused crashes with little or no change to the existing driver and system code. Nooks isolates drivers within lightweight protection domains inside the kernel address space, where hardware and software prevent them from corrupting the kernel. Nooks also tracks a driver’s use of kernel resources to facilitate automatic cleanup during recovery. To prove the viability of our approach, we implemented Nooks in the Linux operating system and used it to fault-isolate several device drivers. Our results show that Nooks offers a substantial increase in the reliability of operating systems, catching and quickly recovering from many faults that would otherwise crash the system. Under a wide range and number of fault conditions, we show that Nooks recovers automatically from 99% of the faults that otherwise cause Linux to crash.
Sinfonia: a new paradigm for building scalable distributed systems
In SOSP, 2007
Cited by 56 (6 self)
We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols—a major complication in existing distributed systems. Instead, developers just design and manipulate data structures within our service called Sinfonia. Sinfonia keeps data for applications on a set of memory nodes, each exporting a linear address space. At the core of Sinfonia is a novel minitransaction primitive that enables efficient and consistent access to data, while hiding the complexities that arise from concurrency and failures. Using Sinfonia, we implemented two very different and complex applications in a few months: a cluster file system and a group communication service. Our implementations perform well and scale to hundreds of machines.
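The minitransaction idea can be illustrated with a small sketch. The abstract does not spell out the primitive's interface, so everything here is an assumption made for illustration: the `MemoryNode` class, the `minitransaction` method, and the compare/read/write item shapes are hypothetical names, and a single in-process lock stands in for whatever coordination a real multi-node service would need.

```python
import threading


class MemoryNode:
    """Hypothetical in-process stand-in for one memory node exporting a linear address space."""

    def __init__(self, size):
        self._mem = bytearray(size)
        self._lock = threading.Lock()

    def minitransaction(self, compares=(), reads=(), writes=()):
        """Atomically check every compare item; if all match, do the reads and writes.

        compares: iterable of (addr, expected_bytes)
        reads:    iterable of (addr, length)
        writes:   iterable of (addr, data_bytes)
        Returns (committed, read_results).
        """
        with self._lock:
            # Abort without side effects if any comparison fails.
            for addr, expected in compares:
                if bytes(self._mem[addr:addr + len(expected)]) != expected:
                    return False, []
            # Reads observe the state before this minitransaction's writes.
            results = [bytes(self._mem[addr:addr + n]) for addr, n in reads]
            for addr, data in writes:
                self._mem[addr:addr + len(data)] = data
            return True, results
```

Under these assumptions, a compare-and-swap on one byte is a minitransaction with one compare item and one write item at the same address; a second attempt using the stale expected value returns `(False, [])` and leaves memory unchanged.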
A Coherent Distributed File Cache With Directory Write-behind
1993
Cited by 52 (6 self)
Extensive caching is a key feature of the Echo distributed file system. Echo client machines maintain coherent caches of file and directory data and properties, with write-behind (delayed write-back) of all cached information. Echo specifies ordering constraints on this write-behind, enabling applications to store and maintain consistent data structures in the file system even when crashes or network faults prevent some writes from being completed. In this paper we describe the Echo cache's coherence and ordering semantics, show how they can improve the performance and consistency of applications, and explain how they are implemented. We also discuss the general problem of reliably notifying applications and users when write-behind is lost; we addressed this problem as part of the Echo design but did not find a fully satisfactory solution.
Speculative execution in a distributed file system
ACM Trans. Comput. Syst., 2006
Cited by 49 (13 self)
Speculator provides Linux kernel support for speculative execution. It allows multiple processes to share speculative state by tracking causal dependencies propagated through interprocess communication. It guarantees correct execution by preventing speculative processes from externalizing output, e.g., sending a network message or writing to the screen, until the speculations on which that output depends have proven to be correct. Speculator improves the performance of distributed file systems by masking I/O latency and increasing I/O throughput. Rather than block during a remote operation, a file system predicts the operation’s result, then uses Speculator to checkpoint the state of the calling process and speculatively continue its execution based on the predicted result. If the prediction is correct, the checkpoint is discarded; if it is incorrect, the calling process is restored to the checkpoint, and the operation is retried. We have modified the client, server, and network protocol of two distributed file systems to use Speculator. For PostMark and Andrew-style benchmarks, speculative execution results in a factor of 2 performance improvement for NFS over local-area networks and an order of magnitude improvement over wide-area networks. For the same benchmarks, Speculator enables the Blue File System to provide the consistency of single-copy file semantics and the safety of synchronous I/O, yet still outperform current distributed file systems with weaker consistency and safety.
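The predict, checkpoint, continue, and roll-back cycle described above can be sketched in user-level Python. This is only an illustration of the control flow, not Speculator's kernel mechanism: `run_speculatively` and its parameters are hypothetical names, process state is modeled as a plain dictionary, and the remote operation runs after the continuation rather than overlapping with it.

```python
import copy


def run_speculatively(state, predict, remote_op, continuation):
    """Run `continuation` on a predicted result, rolling back if the guess was wrong.

    state:        mutable dict standing in for process state (an assumption)
    predict:      callable returning the guessed result of the remote operation
    remote_op:    callable returning the actual result
    continuation: callable (state, result) -> list of outputs to externalize
    Returns the outputs that may safely be released.
    """
    checkpoint = copy.deepcopy(state)       # checkpoint before speculating
    guess = predict()
    buffered = continuation(state, guess)   # outputs are held back, not released yet
    actual = remote_op()                    # a real system would overlap this with the continuation
    if actual == guess:
        return buffered                     # prediction held: drop checkpoint, release output
    state.clear()
    state.update(checkpoint)                # prediction failed: restore the checkpoint
    return continuation(state, actual)      # retry with the real result
```

The key property the sketch preserves is that no output produced under a guess is returned to the caller until the guess has been confirmed.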
Rethink the sync
In Proc. OSDI, 2006
Cited by 32 (6 self)
We introduce external synchrony, a new model for local file I/O that provides the reliability and simplicity of synchronous I/O, yet also closely approximates the performance of asynchronous I/O. An external observer cannot distinguish the output of a computer with an externally synchronous file system from the output of a computer with a synchronous file system. No application modification is required to use an externally synchronous file system: in fact, application developers can program to the simpler synchronous I/O abstraction and still receive excellent performance. We have implemented an externally synchronous file system for Linux, called xsyncfs. Xsyncfs provides the same durability and ordering guarantees as those provided by a synchronously mounted ext3 file system. Yet, even for I/O-intensive benchmarks, xsyncfs performance is within 7% of ext3 mounted asynchronously. Compared to ext3 mounted synchronously, xsyncfs is up to two orders of magnitude faster.
File System Performance and Transaction Support
1992
Cited by 28 (3 self)
This thesis considers two related issues: the impact of disk layout on file system throughput and the integration of transaction support in file systems. Historic file system designs have optimized for reading, as read throughput was the I/O performance bottleneck. Since increasing main-memory cache sizes effectively reduce disk read traffic [BAKER91], disk write performance has become the I/O performance bottleneck [OUST89]. This thesis presents both simulation and implementation analysis of the performance of read-optimized and write-optimized file systems. An example of a file system with a disk layout optimized for writing is a log-structured file system, where writes are bundled and written sequentially. Empirical evidence in [ROSE90], [ROSE91], and [ROSE92] indicates that a log-structured file sys...
Helios: Heterogeneous multiprocessing with satellite kernels
In Proceedings of the 22nd ACM Symposium on Operating Systems Principles, 2009
Cited by 18 (0 self)
Helios is an operating system designed to simplify the task of writing, deploying, and tuning applications for heterogeneous platforms. Helios introduces satellite kernels, which export a single, uniform set of OS abstractions across CPUs of disparate architectures and performance characteristics. Access to I/O services such as file systems is made transparent via remote message passing, which extends a standard microkernel message-passing abstraction to a satellite kernel infrastructure. Helios retargets applications to available ISAs by compiling from an intermediate language. To simplify deploying and tuning application performance, Helios exposes an affinity metric to developers. Affinity provides a hint to the operating system about whether a process would benefit from executing on the same platform as a service it depends upon. We developed satellite kernels for an XScale programmable I/O card and for cache-coherent NUMA architectures. We offloaded several applications and operating system components, often by changing only a single line of metadata. We show up to a 28% performance improvement by offloading tasks to the XScale I/O card. On a mail-server benchmark, we show a 39% improvement in performance by automatically splitting the application among multiple NUMA domains.
VINO: The 1994 Fall Harvest
1994
Cited by 17 (0 self)
Current operating systems are designed to provide least-common-denominator service to a variety of applications. They export few internal kernel facilities, and those which are exported have irregular interfaces. As a result, resource-intensive applications such as database management systems and multimedia applications are often poorly served by the operating system. These applications often go to great lengths to bypass normal kernel mechanisms to achieve acceptable performance. We describe a new kernel architecture, the VINO kernel, which addresses the limitations of conventional operating systems. The VINO design is driven by three principles: (1) Application Directed Policy: the operating system provides a collection of mechanisms, but applications dictate the policies applied to those mechanisms; (2) Kernel as Toolbox: applications can reuse the kernel's primitives; (3) Universal Resource Access: all resources are accessed through a single, common interface. VINO's power and...
Dynamic detection and prevention of race conditions in file accesses
In Proceedings of the 12th USENIX Security Symposium, 2003
Cited by 16 (0 self)
Permission is granted for noncommercial reproduction of the work for educational or research purposes.
Operating System Transactions
2008
Cited by 15 (4 self)
Operating systems should provide system transactions to user applications, in which user-level processes execute a series of system calls atomically and in isolation from other processes on the system. System transactions provide a simple tool for programmers to express safety conditions during concurrent execution. This paper describes TxOS, a variant of Linux 2.6.22, which is the first operating system to implement system transactions on commodity hardware with strong isolation and fairness between transactional and non-transactional system calls. System transactions provide a simple and expressive interface for user programs to avoid race conditions on system resources. For instance, system transactions eliminate time-of-check-to-time-of-use (TOCTTOU) race conditions in the file system, a class of security vulnerability that is difficult to eliminate with other techniques. System transactions also provide transactional semantics for user-level transactions that require system resources, allowing applications using hardware or software transactional memory systems to safely make system calls. While system transactions may reduce single-thread performance, they can yield more scalable performance. For example, enclosing link and unlink within a system transaction outperforms rename on Linux by 14% at 8 CPUs.
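For readers unfamiliar with TOCTTOU races, the following sketch shows the check-then-use pattern the abstract refers to, next to one conventional workaround. It does not use or imitate the TxOS interface, which the abstract does not describe; `racy_read` and `safer_read` are hypothetical helper names, and the workaround (open first, then inspect the opened descriptor) is a swapped-in standard technique rather than a system transaction.

```python
import os
import stat


def racy_read(path):
    """Check-then-use: the object behind `path` can change between access() and open()."""
    if os.access(path, os.R_OK):          # time of check
        with open(path, "rb") as f:       # time of use: may now be a different file
            return f.read()
    return None


def safer_read(path):
    """Open first, then check the descriptor, so check and use refer to the same object."""
    # O_NOFOLLOW is not available on every platform; fall back to 0 where it is missing.
    flags = os.O_RDONLY | getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(path, flags)
    try:
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            return None                   # refuse anything that is not a regular file
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Descriptor-based checks like this cover only a single file; the abstract's claim is that system transactions address the race for whole sequences of system calls.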

