Improving the reliability of commodity operating systems
, 2003
Cited by 192 (14 self)
... drivers remain a significant cause of system failures. In Windows XP, for example, drivers account for 85% of recently reported failures. This article describes Nooks, a reliability subsystem that seeks to greatly enhance operating system (OS) reliability by isolating the OS from driver failures. The Nooks approach is practical: rather than guaranteeing complete fault tolerance through a new (and incompatible) OS or driver architecture, our goal is to prevent the vast majority of driver-caused crashes with little or no change to the existing driver and system code. Nooks isolates drivers within lightweight protection domains inside the kernel address space, where hardware and software prevent them from corrupting the kernel. Nooks also tracks a driver’s use of kernel resources to facilitate automatic cleanup during recovery. To prove the viability of our approach, we implemented Nooks in the Linux operating system and used it to fault-isolate several device drivers. Our results show that Nooks offers a substantial increase in the reliability of operating systems, catching and quickly recovering from many faults that would otherwise crash the system. Under a wide range and number of fault conditions, we show that Nooks recovers automatically from 99% of the faults that otherwise cause Linux to crash.
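The resource-tracking idea in this abstract (record what a driver holds so it can be cleaned up after a failure) can be illustrated with a minimal sketch. This is not the Nooks implementation: the `ResourceTracker` class, its method names, and the driver and resource names in the usage note are all hypothetical, chosen only to show the bookkeeping pattern.

```python
from collections import defaultdict


class ResourceTracker:
    """Hypothetical tracker: remembers which resources each driver holds."""

    def __init__(self):
        # driver name -> list of (resource, release function), in acquisition order
        self._held = defaultdict(list)

    def acquire(self, driver, resource, release_fn):
        """Record that `driver` now holds `resource`; `release_fn` frees it."""
        self._held[driver].append((resource, release_fn))
        return resource

    def recover(self, driver):
        """Release everything a failed driver still holds, newest first."""
        released = []
        for resource, release_fn in reversed(self._held.pop(driver, [])):
            release_fn(resource)
            released.append(resource)
        return released
```

For example, after `acquire("e1000", "irq-11", free)` and `acquire("e1000", "dma-buf-0", free)`, calling `recover("e1000")` invokes `free` on `"dma-buf-0"` and then on `"irq-11"`, leaving resources held by other drivers untouched.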
Emstar: a software environment for developing and deploying wireless sensor networks
- In Proceedings of the 2004 USENIX Technical Conference
, 2004
Cited by 131 (21 self)
Recent work in wireless embedded networked systems has followed heterogeneous designs, incorporating a mixture of elements from extremely constrained 8- or 16-bit “Motes” to less resource-constrained 32-bit embedded “Microservers.” Emstar is a software environment for developing and deploying complex applications on such heterogeneous networks. Emstar is designed to leverage the additional resources of Microservers by trading off some performance for system robustness in sensor network applications. It enables fault isolation, fault tolerance, system visibility, in-field debugging, and resource sharing across multiple applications. To accomplish these objectives, Emstar is designed to run as a multiprocess system and consists of libraries that implement message-passing IPC primitives, services that support networking, sensing, and time synchronization, and tools that support simulation, emulation, and visualization of live systems, both real and simulated. We evaluate this work by discussing the Acoustic ENSBox, a platform for distributed acoustic sensing that we built using Emstar. We show that by leveraging existing Emstar services, we are able to significantly reduce development time. This work was made possible with support from The Center for Embedded Networked Sensing (CENS) under the NSF Cooperative Agreement CCR-0120778, and the UC MICRO program (grant
Splitting Interfaces: Making Trust Between Applications and Operating Systems Configurable
- In Proceedings of OSDI
, 2006
Cited by 39 (1 self)
In current commodity systems, applications have no way of limiting their trust in the underlying operating system (OS), leaving them at the complete mercy of an attacker who gains control over the OS. In this work, we describe the design and implementation of Proxos, a system that allows applications to configure their trust in the OS by partitioning the system call interface into trusted and untrusted components. The application developer specifies system call routing rules that indicate which system calls are to be handled by the untrusted commodity OS and which are to be handled by a trusted private OS. We find that rather than defining a new system call interface, routing system calls of an existing interface allows applications currently targeted towards commodity operating systems to isolate their most sensitive components from the commodity OS with only minor source code modifications. We have built a prototype of our system on top of the Xen Virtual Machine Monitor with Linux as the commodity OS. In practice, we find that the system call routing rules are short and simple, on the order of tens of lines of code. In addition, applications in Proxos incur only modest performance overhead, with most of the cost resulting from inter-VM context switches.
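The notion of routing rules that split one system call interface between a trusted and an untrusted OS can be sketched as a small lookup. This is an illustration only and does not reproduce the Proxos rule language: the rule format, the `/secure/` path prefix, and the names `RULES`, `route`, `TRUSTED`, and `UNTRUSTED` are assumptions made for the example.

```python
TRUSTED = "private_os"      # hypothetical label for the trusted private OS
UNTRUSTED = "commodity_os"  # hypothetical label for the commodity OS

# Hypothetical rules: (system call name, path prefix, destination).
RULES = [
    ("open", "/secure/", TRUSTED),
    ("read", "/secure/", TRUSTED),
    ("write", "/secure/", TRUSTED),
]


def route(syscall, path=None, rules=RULES, default=UNTRUSTED):
    """Return which OS should handle the call; unmatched calls use the default."""
    for name, prefix, destination in rules:
        if name == syscall and path is not None and path.startswith(prefix):
            return destination
    return default
```

Under these assumed rules, `route("open", "/secure/key.pem")` selects the private OS, while `route("open", "/tmp/scratch")` and `route("getpid")` fall through to the commodity OS.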
Proactive fault tolerance for HPC with Xen virtualization
- In ICS ’07: Proceedings of the 21st Annual International Conference on Supercomputing
, 2007
Cited by 36 (6 self)
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current techniques to tolerate faults focus on reactive schemes to recover from faults and generally rely on a checkpoint/restart mechanism. Yet, in today’s systems, node failures can often be anticipated by detecting a deteriorating health status. Instead of a reactive scheme for fault tolerance (FT), we are promoting a proactive one where processes automatically migrate from “unhealthy” nodes to healthy ones. Our approach relies on operating system virtualization techniques exemplified by but not limited to Xen. This paper contributes an automatic and transparent mechanism for proactive FT for arbitrary MPI applications. It leverages virtualization techniques combined with health monitoring
Pragmatic Nonblocking Synchronization for Real-Time Systems
, 2001
Cited by 36 (12 self)
We present a pragmatic methodology for designing nonblocking real-time systems. Our methodology uses a combination of lock-free and wait-free synchronization techniques and clearly states which technique should be applied in which situation.
Reducing TCB complexity for security-sensitive applications: Three case studies
- In Proceedings of EuroSys 2006
, 2006
Cited by 35 (4 self)
“The future of digital systems is complexity, and complexity is the worst enemy of security.” -- Bruce Schneier [40]. The large size and high complexity of security-sensitive applications and systems software are a primary cause of their poor testability and high vulnerability. One approach to alleviate this problem is to extract the security-sensitive parts of application and systems software, thereby reducing the size and complexity of software that needs to be trusted. At the system software level, we use the Nizza architecture, which relies on a kernelized trusted computing base (TCB) and on the reuse of legacy code using trusted wrappers to minimize the size of the TCB. At the application level, we extract the security-sensitive portions of an already existing application into an AppCore. The AppCore is executed as a trusted process in the Nizza architecture while the rest of the application executes on a virtualized, untrusted legacy operating system. In three case studies of real-world applications (e-commerce transaction client, VPN gateway, and digital signatures in an e-mail client), we achieved a considerable reduction in code size and complexity. In contrast to the few hundred thousand lines of current application software code running on millions of lines of systems software code, we have AppCores with tens of thousands of lines of code running on a hundred thousand lines of systems software code. We also show the performance penalty of AppCores to be modest (a few percent) compared to current software.
DROPS -- OS Support for Distributed Multimedia Applications
- In Proceedings of the Eighth ACM SIGOPS European Workshop
, 1998
Cited by 33 (14 self)
The characterising new requirement for distributed multimedia applications is the coexistence of dynamic real-time and non-real-time applications on hosts and networks. While some networks (e.g., ATM) in principle have the capability to reserve bandwidth on shared links, host systems usually do not. DROPS (Dresden Real-time OPerating System) is being built to remedy that situation by providing resource managers that allow the reservation of resources in advance and enforce those reservations. It allows the coexistence of timesharing applications (with no reservations) and real-time applications (with reservations). By outlining the principal architecture, some design decisions, and first results, the paper demonstrates how these objectives can be met using straightforward OS technology. It argues that middleware for diverse platforms cannot meet these objectives efficiently without proper core operating system support.
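The coexistence of reserved (real-time) and unreserved (timesharing) use of one resource can be illustrated with a simple admission check. This is a sketch under stated assumptions, not the DROPS design: the `ResourceManager` class, its integer capacity units, and the admission policy (reject any reservation that would exceed capacity) are hypothetical.

```python
class ResourceManager:
    """Hypothetical manager for one resource, measured in integer units."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.reserved = {}  # application name -> units reserved in advance

    def reserve(self, app, amount):
        """Admit the reservation only if total reservations stay within capacity."""
        if amount <= 0:
            raise ValueError("reservation must be positive")
        if sum(self.reserved.values()) + amount > self.capacity:
            return False  # admission refused; existing reservations are kept
        self.reserved[app] = self.reserved.get(app, 0) + amount
        return True

    def best_effort_share(self):
        """Units left over for timesharing applications with no reservation."""
        return self.capacity - sum(self.reserved.values())
```

With a capacity of 100 units, reserving 60 for a video player and 30 for an audio decoder succeeds, a further request for 20 is refused, and 10 units remain for unreserved timesharing work.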
Taming Linux
- In Proceedings of the 5th Annual Australasian Conference on Parallel And Real-Time Systems (PART ’98)
, 1998
Cited by 33 (9 self)
This paper describes the overall design, partial implementation, and brief performance evaluation of a system in which Linux and its applications run alongside real-time applications. The separation of the real-time and time-sharing subsystems is not restricted to the use of the CPU but is also enforced for other resources, namely main memory and caches. This paper details the changes needed to decouple the original Linux from real-time processes and analyzes the performance of the resulting system.
Reducing TCB size by using untrusted components -- small kernels versus virtual-machine monitors
- In Proc. of the 11th ACM SIGOPS European Workshop
, 2004
Cited by 23 (2 self)
Secure systems are best built on top of a small trusted operating system: The smaller the operating system, the easier it can be assured or verified for correctness. In this

