Results 1 - 10
of
57
Virtualizing Transactional Memory
, 2005
"... Writing concurrent programs is difficult because of the complexity of ensuring proper synchronization. Conventional lock-based synchronization suffers from wellknown limitations, so researchers have considered nonblocking transactions as an alternative. Recent hardware proposals have demonstrated ho ..."
Abstract
-
Cited by 224 (2 self)
- Add to MetaCart
Writing concurrent programs is difficult because of the complexity of ensuring proper synchronization. Conventional lock-based synchronization suffers from wellknown limitations, so researchers have considered nonblocking transactions as an alternative. Recent hardware proposals have demonstrated how transactions can achieve high performance while not suffering limitations of lock-based mechanisms. However, current hardware proposals require programmers to be aware of platform-specific resource limitations such as buffer sizes, scheduling quanta, as well as events such as page faults, and process migrations. If the transactional model is to gain wide acceptance, hardware support for transactions must be virtualized to hide these limitations in much the same way that virtual memory shields the programmer from platform-specific limitations of physical memory. This paper proposes Virtual Transactional Memory (VTM), a user-transparent system that shields the programmer from various platform-specific resource limitations. VTM maintains the performance advantage of hardware transactions, incurs low overhead in time, and has modest costs in hardware support. While many system-level challenges remain, VTM takes a step toward making transactional models more widely acceptable.
Flashback: A Lightweight Extension for Rollback and Deterministic Replay for Software Debugging
- In USENIX Annual Technical Conference, General Track
, 2004
"... Unfortunately, finding software bugs is a very challenging task because many bugs are hard to reproduce. While debugging a program, it would be very useful to rollback a crashed program to a previous execution point and deterministically re-execute the "buggy " code region. However, most p ..."
Abstract
-
Cited by 82 (6 self)
- Add to MetaCart
Unfortunately, finding software bugs is a very challenging task because many bugs are hard to reproduce. While debugging a program, it would be very useful to rollback a crashed program to a previous execution point and deterministically re-execute the "buggy " code region. However, most previous work on rollback and replay support was designed to survive hardware or operating system failures, and is therefore too heavyweight for the fine-grained rollback and replay needed for software debugging. This paper presents Flashback, a lightweight OS extension that provides fine-grained rollback and replay to help debug software. Flashback uses shadow processes to efficiently roll back in-memory state of a process, and logs a process ' interactions with the system to support deterministic replay. Both shadow processes and logging of system calls are implemented in a lightweight fashion specifically designed for the purpose of software debugging. We have implemented a prototype of Flashback in the Linux operating system. Our experimental results with micro-benchmarks and real applications show that Flashback adds little overhead and can quickly roll back a debugged program to a previous execution point and deterministically replay from that point.
Treating bugs as allergies -- a safe method to survive software failures
- IN SOSP
, 2005
"... Many applications demand availability. Unfortunately, software failures greatly reduce system availability. Previous approaches for surviving software failures suffer from several limitations, including requiring application restructuring, failing to address deterministic software bugs, unsafely spe ..."
Abstract
-
Cited by 69 (6 self)
- Add to MetaCart
Many applications demand availability. Unfortunately, software failures greatly reduce system availability. Previous approaches for surviving software failures suffer from several limitations, including requiring application restructuring, failing to address deterministic software bugs, unsafely speculating on program execution, and re-quiring a long recovery time. This paper
Virtual Log Based File Systems for a Programmable Disk
, 1999
"... In this paper, we study how to minimize the latency of small synchronous writes to disks. The basic approach is to write to free sectors that are near the current disk head location by leveraging the embedded processor core inside the disk. We develop a number of analytical models to demonstrate the ..."
Abstract
-
Cited by 60 (1 self)
- Add to MetaCart
In this paper, we study how to minimize the latency of small synchronous writes to disks. The basic approach is to write to free sectors that are near the current disk head location by leveraging the embedded processor core inside the disk. We develop a number of analytical models to demonstrate the performance potential of this approach. We then present the design of a virtual log, a log whose entries are not physically contiguous, and a variation of the log-structured file system based on this approach. The virtual log based file systems can e#ciently support small synchronous writes without extra hardware support while retaining the advantages of LFS including its potential to support transactional semantics. We compare our approach against traditional update-in-place and logging systems by modifying the Solaris kernel to serve as a simulation engine. Our evaluations show that random synchronous updates on an unmodified UFS execute up to an order of magnitude faster on a virtual log...
Enhancing Software Reliability with Speculative Threads
, 2002
"... This paper advocates the use of a monitor-and-recover programming paradigm to enhance the reliability of software, and proposes an architectural design that allows software and hardware to cooperate in making this paradigm more efficient and easier to program. We propose that programmers write moni ..."
Abstract
-
Cited by 55 (0 self)
- Add to MetaCart
This paper advocates the use of a monitor-and-recover programming paradigm to enhance the reliability of software, and proposes an architectural design that allows software and hardware to cooperate in making this paradigm more efficient and easier to program. We propose that programmers write monitoring functions assuming simple sequential execution semantics. Our architecture speeds up the computation by executing the monitoring functions speculatively in parallel with the main computation. For recovery, programmers can define fine-grain transactions whose side effects, including all register modifications and memory writes, can either be committed or aborted under program control. Transactions are implemented efficiently by treating them as speculative threads. Our experimental
Exploring Failure Transparency and the Limits of Generic Recovery
- In Proc. 4th USENIX Symposium on Operating Systems Design and Implementation
, 2000
"... Abstract: We explore the abstraction of failure transparency in which the operating system provides the illusion of failure-free operation. To provide failure transparency, an operating system must recover applications after hardware, operating system, and application failures, and must do so withou ..."
Abstract
-
Cited by 46 (7 self)
- Add to MetaCart
Abstract: We explore the abstraction of failure transparency in which the operating system provides the illusion of failure-free operation. To provide failure transparency, an operating system must recover applications after hardware, operating system, and application failures, and must do so without help from the programmer or unduly slowing failure-free performance. We describe two invariants that must be upheld to provide failure transparency: one that ensures sufficient application state is saved to guarantee the user cannot discern failures, and another that ensures sufficient application state is lost to allow recovery from failures affecting application state. We find that several real applications get failure transparency in the presence of simple stop failures with overhead of 0-12%. Less encouragingly, we find that applications violate one invariant in the course of upholding the other for more than 90 % of application faults and 3-15% of operating system faults, rendering transparent recovery impossible for these cases. 1.
Improving Availability with Recursive Micro-Reboots: A Soft-State System Case Study
, 2003
"... Even after decades of software engineering research, complex computer systems still fail. This paper makes the case for increasing research emphasis on dependability and, specifically, on improving availability by reducing time-to-recover. All software fails at some point, so systems must be able to ..."
Abstract
-
Cited by 35 (4 self)
- Add to MetaCart
Even after decades of software engineering research, complex computer systems still fail. This paper makes the case for increasing research emphasis on dependability and, specifically, on improving availability by reducing time-to-recover. All software fails at some point, so systems must be able to recover from failures. Recovery itself can fail too, so systems must know how to intelligently retry their recovery. We present here a recursive approach, in which a minimal subset of components is recovered first
The Common Case Transactional Behavior of Multithreaded Programs
- In Proceedings of the 12th International Conference on High-Performance Computer Architecture
, 2006
"... Transactional memory (TM) provides an easy-to-use and high-performance parallel programming model for the upcoming chip-multiprocessor systems. Several researchers have proposed alternative hardware and software TM implementations. However, the lack of transaction-based programs makes it difficult t ..."
Abstract
-
Cited by 34 (6 self)
- Add to MetaCart
Transactional memory (TM) provides an easy-to-use and high-performance parallel programming model for the upcoming chip-multiprocessor systems. Several researchers have proposed alternative hardware and software TM implementations. However, the lack of transaction-based programs makes it difficult to understand the merits of each proposal and to tune future TM implementations to the common case behavior of real application. This work addresses this problem by analyzing the common case transactional behavior for 35 multithreaded programs from a wide range of application domains. We identify transactions within the source code by mapping existing primitives for parallelism and synchronization management to transaction boundaries. The analysis covers basic characteristics such as transaction length, distribution of readset and write-set size, and the frequency of nesting and I/O operations. The measured characteristics provide key insights into the design of efficient TM systems for both nonblocking synchronization and speculative parallelization. 1.
HeRMES: High-Performance Reliable MRAM-Enabled Storage
- In Proceedings of the 8th IEEE Workshop on Hot Topics in Operating Systems (HotOS-VIII
, 2001
"... Magnetic RAM (MRAM) is a new memory technology with access and cost characteristics comparable to those of conventional dynamic RAM (DRAM) and the non-volatility of magnetic media such as disk. Simply replacing DRAM with MRAM will make main memory non-volatile, but it will not improve file system pe ..."
Abstract
-
Cited by 29 (7 self)
- Add to MetaCart
Magnetic RAM (MRAM) is a new memory technology with access and cost characteristics comparable to those of conventional dynamic RAM (DRAM) and the non-volatility of magnetic media such as disk. Simply replacing DRAM with MRAM will make main memory non-volatile, but it will not improve file system performance. However, effective use of MRAM in a file system has the potential to significantly improve performance over existing file systems. The HeRMES file system will use MRAM to dramatically improve file system performance by using it as a permanent store for both file system data and metadata. In particular, metadata operations, which make up over 50% of all file system requests [14], are nearly free in HeRMES because they do not require any disk accesses. Data requests will also be faster, both because of increased metadata request speed and because using MRAM as a non-volatile cache will allow HeRMES to better optimize data placement on disk. Though MRAM capacity is too small to replace disk entirely, HeRMES will use MRAM to provide high-speed access to relatively small units of data and metadata, leaving most file data stored on disk.
Sweeper: A lightweight endto-end system for defending against fast worms
- InProceedings of 2007 EuroSys Conference
"... The vulnerabilities that plague computers cause endless grief to users. Slammer compromised millions of hosts in minutes; a hit-list worm would take under a second. Recently proposed techniques respond better than manual approaches, but require expensive instrumentation, which limits deployment. Alt ..."
Abstract
-
Cited by 26 (3 self)
- Add to MetaCart
The vulnerabilities that plague computers cause endless grief to users. Slammer compromised millions of hosts in minutes; a hit-list worm would take under a second. Recently proposed techniques respond better than manual approaches, but require expensive instrumentation, which limits deployment. Although spreading “antibodies ” (e.g. signatures) ameliorates this limitation, hosts depending on antibodies are defenseless until inoculation; to the fastest hit-list worms this delay is crucial. Additionally, most recently proposed techniques cannot provide recovery to provide continuous service after an attack. We propose a novel solution called Sweeper that provides both fast and accurate post-attack analysis and efficient recovery with low normal execution overhead. Sweeper innovatively combines several techniques: (1) Sweeper uses lightweight monitoring techniques to detect a wide array of suspicious requests, providing a first level of defense. (2) By cleverly leveraging lightweight checkpointing, Sweeper postpones heavyweight monitoring until absolutely necessary — after an attack is detected. Sweeper rolls back and re-executes multiple times to dynamically apply heavyweight analysis techniques via dynamic binary instrumentation. Since only the execution involved in the attack is analyzed, the analysis is efficient, yet thorough. (3) Based on the analysis results, Sweeper automatically generates lowoverhead antibodies to prevent future attacks of the same vulnerability. (4) Finally, Sweeper again re-executes to perform fast recovery for continuous service. We implement Sweeper in a real system. Our experimental results with three real-world servers and four real security vulnerabilities show that Sweeper can detect an attack and generate antibodies in under 60 milliseconds. Our results also show that Sweeper imposes under 1 % overhead during normal execution, clearly suitable for widespread production deployment (especially since Sweeper also allows partial deployment). Finally, we analytically show that, for a

