Results 1 - 10
of
51
Vigilante: End-to-End Containment of Internet Worm Epidemics
, 2008
"... Worm containment must be automatic because worms can spread too fast for humans to respond. Recent work proposed network-level techniques to automate worm containment; these techniques have limitations because there is no information about the vulnerabilities exploited by worms at the network level. ..."
Abstract
-
Cited by 206 (5 self)
- Add to MetaCart
Worm containment must be automatic because worms can spread too fast for humans to respond. Recent work proposed network-level techniques to automate worm containment; these techniques have limitations because there is no information about the vulnerabilities exploited by worms at the network level. We propose Vigilante, a new end-to-end architecture to contain worms automatically that addresses these limitations. In Vigilante, hosts detect worms by instrumenting vulnerable programs to analyze infection attempts. We introduce dynamic data-flow analysis: a broad-coverage host-based algorithm that can detect unknown worms by tracking the flow of data from network messages and disallowing unsafe uses of this data. We also show how to integrate other host-based detection mechanisms into the Vigilante architecture. Upon detection, hosts generate self-certifying alerts (SCAs), a new type of security alert that can be inexpensively verified by any vulnerable host. Using SCAs, hosts can cooperate to contain an outbreak, without having to trust each other. Vigilante broadcasts SCAs over an overlay network that propagates alerts rapidly and resiliently. Hosts receiving an SCA protect themselves by generating filters with vulnerability condition slicing: an algorithm that performs dynamic analysis of the vulnerable program to identify control-flow conditions that lead
DieHard: probabilistic memory safety for unsafe languages
- in PLDI ’06
, 2006
"... Applications written in unsafe languages like C and C++ are vulnerable to memory errors such as buffer overflows, dangling pointers, and reads of uninitialized data. Such errors can lead to program crashes, security vulnerabilities, and unpredictable behavior. We present DieHard, a runtime system th ..."
Abstract
-
Cited by 93 (13 self)
- Add to MetaCart
Applications written in unsafe languages like C and C++ are vulnerable to memory errors such as buffer overflows, dangling pointers, and reads of uninitialized data. Such errors can lead to program crashes, security vulnerabilities, and unpredictable behavior. We present DieHard, a runtime system that tolerates these errors while probabilistically maintaining soundness. DieHard uses randomization and replication to achieve probabilistic memory safety by approximating an infinite-sized heap. DieHard’s memory manager randomizes the location of objects in a heap that is at least twice as large as required. This algorithm prevents heap corruption and provides a probabilistic guarantee of avoiding memory errors. For additional safety, DieHard can operate in a replicated mode where multiple replicas of the same application are run simultaneously. By initializing each replica with a different random seed and requiring agreement on output, the replicated version of Die-Hard increases the likelihood of correct execution because errors are unlikely to have the same effect across all replicas. We present analytical and experimental results that show DieHard’s resilience to a wide range of memory errors, including a heap-based buffer overflow in an actual application.
IRON file systems
- In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP ’05
, 2005
"... IRON FILE SYSTEMSVijayan Prabhakaran Disk drives are widely used as a primary medium for storing information.While commodity file systems trust disks to either work or fail completely, modern disks exhibit complex failure modes such as latent sector faults and block corrup-tions, where only portions ..."
Abstract
-
Cited by 74 (24 self)
- Add to MetaCart
IRON FILE SYSTEMSVijayan Prabhakaran Disk drives are widely used as a primary medium for storing information.While commodity file systems trust disks to either work or fail completely, modern disks exhibit complex failure modes such as latent sector faults and block corrup-tions, where only portions of a disk fail.
Exterminator: Automatically correcting memory errors with high probability
- In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, ACM
, 2007
"... Programs written in C and C++ are susceptible to memory errors, including buffer overflows and dangling pointers. These errors, which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously costly to repair. Tracking down their location in the source code is difficult ..."
Abstract
-
Cited by 31 (3 self)
- Add to MetaCart
Programs written in C and C++ are susceptible to memory errors, including buffer overflows and dangling pointers. These errors, which can lead to crashes, erroneous execution, and security vulnerabilities, are notoriously costly to repair. Tracking down their location in the source code is difficult, even when the full memory state of the program is available. Once the errors are finally found, fixing them remains challenging: even for critical security-sensitive bugs, the average time between initial reports and the issuance of a patch is nearly 1 month. We present Exterminator, a system that automatically corrects heap-based memory errors without programmer intervention. Exterminator exploits randomization to pinpoint errors with high precision. From this information, Exterminator derives runtime patches that fix these errors both in current and subsequent executions. In addition, Exterminator enables collaborative bug correction by merging patches generated by multiple users. We present analytical and empirical results that demonstrate Exterminator’s effectiveness at detecting and correcting both injected and real faults. 1.
Deadlock Immunity: Enabling Systems To Defend Against Deadlocks
"... Deadlock immunity is a property by which programs, once afflicted by a given deadlock, develop resistance against future occurrences of that and similar deadlocks. We describe a technique that enables programs to automatically gain such immunity without assistance from programmers or users. We imple ..."
Abstract
-
Cited by 23 (3 self)
- Add to MetaCart
Deadlock immunity is a property by which programs, once afflicted by a given deadlock, develop resistance against future occurrences of that and similar deadlocks. We describe a technique that enables programs to automatically gain such immunity without assistance from programmers or users. We implemented the technique for both Java and POSIX threads and evaluated it with several real systems, including MySQL, JBoss, SQLite, Apache ActiveMQ, Limewire, and Java JDK. The results demonstrate effectiveness against real deadlock bugs, while incurring modest performance overhead and scaling to 1024 threads. We therefore conclude that deadlock immunity offers programmers and users an attractive tool for coping with elusive deadlocks. 1
Triage: Diagnosing production run failures at the users site
- In Proc. of 21st SOSP
, 2007
"... Diagnosing production run failures is a challenging yet important task. Most previous work focuses on offsite diagnosis, i.e. development site diagnosis with the programmers present. This is insufficient for production-run failures as: (1) it is difficult to reproduce failures offsite for diagnosis; ..."
Abstract
-
Cited by 21 (3 self)
- Add to MetaCart
Diagnosing production run failures is a challenging yet important task. Most previous work focuses on offsite diagnosis, i.e. development site diagnosis with the programmers present. This is insufficient for production-run failures as: (1) it is difficult to reproduce failures offsite for diagnosis; (2) offsite diagnosis cannot provide timely guidance for recovery or security purposes; (3) it is infeasible to provide a programmer to diagnose every production run failure; and (4) privacy concerns limit the release of information (e.g. coredumps) to programmers. To address production-run failures, we propose a system, called Triage, that automatically performs onsite software failure diagnosis at the very moment of failure. It provides a detailed diagnosis report, including the failure nature, triggering conditions, related code and variables, the fault propagation chain, and potential fixes. Triage achieves this by leveraging lightweight reexecution support
Application-level correctness and its impact on fault tolerance
- In Proceedings of the 13th International Symposium on High Performance Computer Architecture
, 2007
"... Traditionally, fault tolerance researchers have required architectural state to be numerically perfect for program execution to be correct. However, in many programs, even if execution is not 100 % numerically correct, the program can still appear to execute correctly from the user’s perspective. He ..."
Abstract
-
Cited by 21 (1 self)
- Add to MetaCart
Traditionally, fault tolerance researchers have required architectural state to be numerically perfect for program execution to be correct. However, in many programs, even if execution is not 100 % numerically correct, the program can still appear to execute correctly from the user’s perspective. Hence, whether a fault is unacceptable or benign may depend on the level of abstraction at which correctness is evaluated, with more faults being benign at higher levels of abstraction, i.e. at the user or application level, compared to lower levels of abstraction, i.e. at the architecture level. The extent to which programs are more fault resilient at higher levels of abstraction is application dependent. Programs that produce inexact and/or approximate outputs can be very resilient at the application level. We call such programs soft computations, and we find they are common in multimedia workloads, as well as artificial intelligence (AI) workloads. Programs that compute exact numerical outputs offer less error resilience at the application level. However, we find all programs studied in this paper exhibit some enhanced fault resilience at the application level, including those that are traditionally considered exact computations– e.g., SPECInt CPU2000. This paper investigates definitions of program correctness that view correctness from the application’s standpoint rather than the architecture’s standpoint. Under application-level correctness, a program’s execution is deemed correct as long as the result it produces is acceptable to the user. To quantify user satisfaction, we rely on application-level fidelity metrics that capture userperceived program solution quality. We conduct a detailed fault susceptibility study that measures how much more fault resilient programs are when defining correctness at the
Diverse replication for single-machine Byzantine-fault tolerance
- In Submission
, 2008
"... New single-machine environments are emerging from abundant computation available through multiple cores and secure virtualization. In this paper, we describe the research challenges and opportunities around diversified replication as a method to increase the Byzantine-fault tolerance (BFT) of single ..."
Abstract
-
Cited by 13 (2 self)
- Add to MetaCart
New single-machine environments are emerging from abundant computation available through multiple cores and secure virtualization. In this paper, we describe the research challenges and opportunities around diversified replication as a method to increase the Byzantine-fault tolerance (BFT) of single-machine servers to software attacks or errors. We then discuss the design space of BFT protocols enabled by these new environments. 1
LeakSurvivor: Towards Safely Tolerating Memory Leaks for Garbage-Collected Languages
- In USENIX Annual Technical Conference
, 2008
"... Continuous memory leaks severely hurt program performance and software availability for garbage-collected programs. This paper presents a safe method, called LeakSurvivor, to tolerate continuous memory leaks at runtime for garbage-collected programs. Our main idea is to periodically swap out the “Po ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
Continuous memory leaks severely hurt program performance and software availability for garbage-collected programs. This paper presents a safe method, called LeakSurvivor, to tolerate continuous memory leaks at runtime for garbage-collected programs. Our main idea is to periodically swap out the “Potentially Leaked ” (PL) memory objects identified by leak detectors from the virtual memory to disks. As a result, the virtual memory space occupied by the PL objects can be reclaimed by garbage collectors and available for future uses. If a swapped-out PL object is accesses later, LeakSurvivor will restore it from disks to the memory for correct program execution. Furthermore, LeakSurvivor helps developers to prune false positives. We have built the prototype of LeakSurvivor on top of Jikes RVM 2.4.2, a high performance Java-in-Java virtual machine developed by IBM. We conduct the experiments with three Java applications including Eclipse, SPECjbb2000 and Jigsaw. Among them, Eclipse and Jigsaw contain memory leaks introduced by their developers, while SPECjbb2000 contain a memory leak injected by us. Our results show that LeakSurvivor effectively tolerates memory leaks for two applications (Eclipse and SPECjbb2000), i.e., no cumulative performance degradation and no software failures when facing continuous memory leaks at runtime. For Jigsaw, LeakSurvivor extends the program lifetime by two times and improves the performance by 46 % compared with native runs. Furthermore, when there are no memory leaks, LeakSurvivor imposes small runtime overhead, i.e., 2.5 % over the leak detector and 23.7 % over the native runs. 1
Samurai: Protecting Critical Data in Unsafe Languages ABSTRACT
"... Programs written in type-unsafe languages such as C and C++ incur costly memory errors that result in corrupted data structures, program crashes, and incorrect results. We present a data-centric solution to memory corruption called critical memory, a memory model that allows programmers to identify ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
Programs written in type-unsafe languages such as C and C++ incur costly memory errors that result in corrupted data structures, program crashes, and incorrect results. We present a data-centric solution to memory corruption called critical memory, a memory model that allows programmers to identify and protect data that is critical for correct program execution. Critical memory defines operations to consistently read and update critical data, and ensures that other non-critical updates in the program will not corrupt it. We also present Samurai, a runtime system that implements critical memory in software. Samurai uses replication and forward error correction to provide probabilistic guarantees of critical memory semantics. Because Samurai does not modify memory operations on non-critical data, the majority of memory operations in programs run at full speed, and Samurai is compatible with third party libraries. Using both applications, including a Web server, and libraries (an STL list class and a memory allocator), we evaluate the performance overhead and fault tolerance that Samurai provides. We find that Samurai is a useful and practical approach for the majority of the applications and libraries considered.

