Results 1 - 10
of
26
Triage: Diagnosing production run failures at the users site
- In Proc. of 21st SOSP
, 2007
"... Diagnosing production run failures is a challenging yet important task. Most previous work focuses on offsite diagnosis, i.e. development site diagnosis with the programmers present. This is insufficient for production-run failures as: (1) it is difficult to reproduce failures offsite for diagnosis; ..."
Abstract
-
Cited by 21 (3 self)
- Add to MetaCart
Diagnosing production run failures is a challenging yet important task. Most previous work focuses on offsite diagnosis, i.e. development site diagnosis with the programmers present. This is insufficient for production-run failures as: (1) it is difficult to reproduce failures offsite for diagnosis; (2) offsite diagnosis cannot provide timely guidance for recovery or security purposes; (3) it is infeasible to provide a programmer to diagnose every production run failure; and (4) privacy concerns limit the release of information (e.g. coredumps) to programmers. To address production-run failures, we propose a system, called Triage, that automatically performs onsite software failure diagnosis at the very moment of failure. It provides a detailed diagnosis report, including the failure nature, triggering conditions, related code and variables, the fault propagation chain, and potential fixes. Triage achieves this by leveraging lightweight reexecution support
Automatically Patching Errors in Deployed Software
, 2009
"... We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging information, or other external information, and without human intervention. ClearView (1) observes normal executions to ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging information, or other external information, and without human intervention. ClearView (1) observes normal executions to learn invariants that characterize the application’s normal behavior, (2) uses error detectors to monitor the execution to detect failures, (3) identifies violations of learned invariants that occur during failed executions, (4) generates candidate repair patches that enforce selected invariants by changing the state or the flow of control to make the invariant true, and (5) observes the continued execution of patched applications to select the most successful patch. ClearView is designed to correct errors in software with high availability requirements. Aspects of ClearView that make it particularly
Cobra: Fine-grained Malware Analysis using Stealth Localized-Executions
- In Proceedings of 2006 IEEE Symposium on Security and Privacy (Oakland’06
, 2006
"... Fine-grained code analysis in the context of malware is a complex and challenging task that provides insight into malware code-layers (polymorphic/metamorphic), its data encryption/decryption engine, its memory-layout etc., important pieces of information that can be used to detect and counter the m ..."
Abstract
-
Cited by 14 (0 self)
- Add to MetaCart
Fine-grained code analysis in the context of malware is a complex and challenging task that provides insight into malware code-layers (polymorphic/metamorphic), its data encryption/decryption engine, its memory-layout etc., important pieces of information that can be used to detect and counter the malware and its variants. Current research in fine-grained code analysis can be categorized into static and dynamic approaches. Static approaches have been tailored towards malware and allow exhaustive fine-grained malicious code analysis, but lack support for self-modifying code, have limitations related to code-obfuscations and face the undecidability problem. Given that most if not all malware employ self-modifying code and code-obfuscations, poses the need to analyze them at runtime using dynamic approaches. However, current dynamic approaches for fine-grained code analysis are not tailored specifically towards malware and lack support for multithreading, self-modifying and/or selfchecking code and are easily detected and countered by everevolving anti-analysis tricks employed by malicious code. To address this problem we propose a powerful dynamic fine-grained malicious code analysis framework codenamed Cobra, to combat malware that are becoming increasingly hard to analyze. Our goal is to provide a stealth, efficient, portable and easy-to-use framework supporting multithreading, self-modifying/self-checking code and any form of code obfuscation in both user- and kernel-mode on commodity operating systems. Cobra cannot be detected or countered and can be dynamically and selectively deployed on malware specific code-streams while allowing other code-streams to execute as is. We also illustrate the framework utility by describing our experience with a tool employing Cobra to analyze a real-world malware. 1.
Evaluating Indirect Branch Handling Mechanisms
- in Software Dynamic Translation Systems”, Int’l. Symp. on Code Generation and Optimization
, 2007
"... Software Dynamic Translation (SDT) systems are used for program instrumentation, dynamic optimization, security, intrusion detection, and many other uses. As noted by many researchers, a major source of SDT overhead is the execution of code which is needed to translate an indirect branch’s target ad ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
Software Dynamic Translation (SDT) systems are used for program instrumentation, dynamic optimization, security, intrusion detection, and many other uses. As noted by many researchers, a major source of SDT overhead is the execution of code which is needed to translate an indirect branch’s target address into the address of the translated destination block. This paper discusses the sources of indirect branch (IB) overhead in SDT systems and evaluates several techniques for overhead reduction. Measurements using SPEC CPU2000 show that the appropriate choice and configuration of IB translation mechanisms can significantly reduce the IB handling overhead. In addition, cross-architecture evaluation of IB handling mechanisms reveals that the most efficient implementation and configuration can be highly dependent on the implementation of the
Reducing exit stub memory consumption in code caches
- In High Performance Embedded Architectures and Compilers
, 2007
"... Abstract. The interest in translation-based virtual execution environments (VEEs) is growing with the recognition of their importance in a variety of applications. However, due to constrained memory and energy resources, developing a VEE for an embedded system presents a number of challenges. In thi ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
Abstract. The interest in translation-based virtual execution environments (VEEs) is growing with the recognition of their importance in a variety of applications. However, due to constrained memory and energy resources, developing a VEE for an embedded system presents a number of challenges. In this paper we focus on the VEE’s memory overhead, and in particular, the code cache. Both code traces and exit stubs are stored in a code cache. Exit stubs keep track of the branches off a trace, and we show they consume up to 66.7 % of the code cache. We present four techniques for reducing the space occupied by exit stubs, two of which assume unbounded code caches and the absence of code cache invalidations, and two without these restrictions. These techniques reduce space by 43.5 % and also improve performance by 1.5%. After applying our techniques, the percentage of space consumed by exit stubs in the resulting code cache was reduced to 41.4%. 1
How to do a million watchpoints: Efficient Debugging using Dynamic Instrumentation
"... Abstract. Application debugging is a tedious but inevitable chore in any software development project. An effective debugger can make programmers more productive by allowing them to pause execution and inspect the state of the process, or monitor writes to memory to detect data corruption. This pape ..."
Abstract
-
Cited by 5 (4 self)
- Add to MetaCart
Abstract. Application debugging is a tedious but inevitable chore in any software development project. An effective debugger can make programmers more productive by allowing them to pause execution and inspect the state of the process, or monitor writes to memory to detect data corruption. This paper introduces the new concept of Efficient Debugging using Dynamic Instrumentation (EDDI). The paper demonstrates for the first time the feasibility of using dynamic instrumentation ondemand to accelerate software debuggers, especially when the available hardware support is lacking or inadequate. As an example, EDDI can simultaneously monitor millions of memory locations without crippling the host processing platform. It does this in software and hence provides a portable debugging environment. It is also well suited for interactive debugging because of its low overhead. EDDI provides a scalable and extensible debugging framework that can substantially increase the feature set of current debuggers. 1
Binary Translation Using Peephole Superoptimizers
"... We present a new scheme for performing binary translation that produces code comparable to or better than existing binary translators with much less engineering effort. Instead of hand-coding the translation from one instruction set to another, our approach automatically learns translation rules usi ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
We present a new scheme for performing binary translation that produces code comparable to or better than existing binary translators with much less engineering effort. Instead of hand-coding the translation from one instruction set to another, our approach automatically learns translation rules using superoptimization techniques. We have implemented a PowerPC-x86 binary translator and report results on small and large computeintensive benchmarks. When compared to the native compiler, our translated code achieves median performance of 67 % on large benchmarks and in some small stress tests actually outperforms the native compiler. We also report comparisons with the open source binary translator Qemu and a commercial tool, Apple’s Rosetta. We consistently outperform the former and are comparable to or faster than the latter on all but one benchmark. 1
Practical Memory Checking with Dr. Memory
"... Abstract—Memory corruption, reading uninitialized memory, using freed memory, and other memory-related errors are among the most difficult programming bugs to identify and fix due to the delay and non-determinism linking the error to an observable symptom. Dedicated memory checking tools are invalua ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Abstract—Memory corruption, reading uninitialized memory, using freed memory, and other memory-related errors are among the most difficult programming bugs to identify and fix due to the delay and non-determinism linking the error to an observable symptom. Dedicated memory checking tools are invaluable for finding these errors. However, such tools are difficult to build, and because they must monitor all memory accesses by the application, they incur significant overhead. Accuracy is another challenge: memory errors are not always straightforward to identify, and numerous false positive error reports can make a tool unusable. A third obstacle to creating such a tool is that it depends on low-level operating system and architectural details, making it difficult to port to other platforms and difficult to target proprietary systems like Windows. This paper presents Dr. Memory, a memory checking tool that operates on both Windows and Linux applications. Dr. Memory handles the complex and not fully documented Windows environment, and avoids reporting false positive memory leaks that plague traditional leak locating algorithms. Dr. Memory employs efficient instrumentation techniques; a direct comparison with the state-of-the-art Valgrind Memcheck tool reveals that Dr. Memory is twice as fast as Memcheck on average and up to four times faster on individual benchmarks.
Dynamic Information Flow Tracking on Multicores
, 2008
"... Dynamic Information Flow Tracking (DIFT) is a promising technique for detecting software attacks. Due to the computationally intensive nature of the technique, prior efficient implementations [21, 6] rely on specialized hardware support whose only purpose is to enable DIFT. Alternatively, prior soft ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Dynamic Information Flow Tracking (DIFT) is a promising technique for detecting software attacks. Due to the computationally intensive nature of the technique, prior efficient implementations [21, 6] rely on specialized hardware support whose only purpose is to enable DIFT. Alternatively, prior software implementations are either too slow [17, 15] resulting in execution time increases as much as four fold for SPEC integer programs or they are not transparent [31] requiring source code modifications. In this paper, we propose the use of chip multiprocessors (CMP) to perform DIFT transparently and efficiently. We spawn a helper thread that is scheduled on a separate core and is only responsible for performing information flow tracking operations. This entails the communication of registers and flags between the main and helper threads. We explore software (shared memory) and hardware (dedicated interconnect) approaches to enable this communication. Finally, we propose a novel application of the DIFT infrastructure where, in addition to the detection of the software attack, DIFT assists in the process of identifying the cause of the bug in the code that enabled the exploit in the first place. We conducted detailed simulations to evaluate the overhead for performing DIFT and found that to be 48 % for SPEC integer programs.
Quantifying the Potential of Program Analysis Peripherals
"... Abstract—As programmers are asked to manage more complicated parallel machines, it is likely that they will become increasingly dependent on tools such as multi-threaded data race detectors, memory bounds checkers, dynamic dataflow trackers, and various performance profilers to understand and mainta ..."
Abstract
-
Cited by 3 (0 self)
- Add to MetaCart
Abstract—As programmers are asked to manage more complicated parallel machines, it is likely that they will become increasingly dependent on tools such as multi-threaded data race detectors, memory bounds checkers, dynamic dataflow trackers, and various performance profilers to understand and maintain their software. As these tools continue to grow in importance, it is worth exploring the potential for special purpose accelerators for these tasks, especially since commodity multi-cores can only provide limited speedups. Rather than performing all the instrumentation and analysis on the main processor, we explore the idea of using the increasingly highthroughput board level interconnect available on many systems to offload analysis to a parallel off-chip accelerator. There are many non-trivial technical issues in taking such an approach that may not appear in simulation, and to flush them out we have developed a prototype system that maps a DMA based analysis engine, sitting on a PCI-mounted FPGA, into the Valgrind instrumentation framework. Using this prototype we characterize the potential of such a system to both accelerate existing software development tools and enable a new class of heavyweight dynamic analysis. While many issues still remain with such an approach, we demonstrate that program analysis speedups of 29 % to 440 % could be achieved today with strictly off-the-shelf components on some of the state-of-the-art tools, and we carefully quantify the bottlenecks to illuminate several new opportunities for further architectural innovation. Keywords-Dynamic program analysis, Hardware support, Debugging and Security, Offchip accelerator

