Secure Execution Via Program Shepherding
, 2002
Cited by 215 (5 self)
We introduce program shepherding, a method for monitoring control flow transfers during program execution to enforce a security policy. Program shepherding provides three techniques as building blocks for security policies. First, shepherding can restrict execution privileges on the basis of code origins. This distinction can ensure that malicious code masquerading as data is never executed, thwarting a large class of security attacks. Second, shepherding can restrict control transfers based on instruction class, source, and target. For example, shepherding can forbid execution of shared library code except through declared entry points, and can ensure that a return instruction only targets the instruction after a call. Finally, shepherding guarantees that sandboxing checks placed around any type of program operation will never be bypassed. We have implemented these capabilities efficiently in a runtime system with minimal or no performance penalties. This system operates on unmodified native binaries, requires no special hardware or operating system support, and runs on existing IA-32 machines under both Linux and Windows.
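The return-target restriction described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the system's implementation: addresses are plain integers, every call instruction is assumed to be `CALL_SIZE` bytes long, and `ReturnPolicy` and `run_trace` are hypothetical names introduced for illustration.

```python
# Toy model of one shepherding-style policy: a return may only target the
# instruction immediately after some previously executed call.
# Assumption for illustration: every call instruction is CALL_SIZE bytes long.
CALL_SIZE = 5


class ReturnPolicy:
    """Tracks the addresses that a return instruction is allowed to target."""

    def __init__(self):
        self.valid_return_targets = set()

    def on_call(self, call_addr):
        # The instruction after the call is a legitimate return target.
        self.valid_return_targets.add(call_addr + CALL_SIZE)

    def check_return(self, target):
        return target in self.valid_return_targets


def run_trace(trace):
    """Return True if every return in the (kind, address) trace obeys the policy."""
    policy = ReturnPolicy()
    for kind, addr in trace:
        if kind == "call":
            policy.on_call(addr)
        elif kind == "ret" and not policy.check_return(addr):
            return False  # policy violation: return to a non-call-site address
    return True
```

A trace such as `[("call", 100), ("ret", 105)]` passes, while a return to an address that follows no call is rejected.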
Myths and realities: The performance impact of garbage collection
- In Proceedings of the ACM Conference on Measurement & Modeling Computer Systems
, 2004
Cited by 87 (25 self)
This paper explores and quantifies garbage collection behavior for three whole heap collectors and generational counterparts: copying semi-space, mark-sweep, and reference counting, the canonical algorithms from which essentially all other collection algorithms are derived. Efficient implementations in MMTk, a Java memory management toolkit, in IBM’s Jikes RVM share all common mechanisms to provide a clean experimental platform. Instrumentation separates collector and program behavior, and performance counters measure timing and memory behavior on three architectures. Our experimental design reveals key algorithmic features and how they match program characteristics to explain the direct and indirect costs of garbage collection as a function of heap size on the SPEC JVM benchmarks. For example, we find that the contiguous allocation of copying collectors attains significant locality benefits over free-list allocators. The reduced collection costs of the generational algorithms together with the locality benefit of contiguous allocation motivate a copying nursery for newly allocated objects. These benefits dominate the overheads of generational collectors compared with non-generational and no collection, disputing the myth that “no garbage collection is good garbage collection.” Performance is less sensitive to the mature space collection algorithm in our benchmarks. However, the locality and pointer mutation characteristics for a given program occasionally prefer copying or mark-sweep. This study is unique in its breadth of garbage collection algorithms and its depth of analysis. Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors—Memory management
Dynamic hot data stream prefetching for general-purpose programs
- In ACM SIGPLAN Conference on Programming Language Design and Implementation
, 2002
Cited by 87 (1 self)
Prefetching data ahead of use has the potential to tolerate the growing processor-memory performance gap by overlapping long latency memory accesses with useful computation. While sophisticated prefetching techniques have been automated for limited domains, such as scientific codes that access dense arrays in loop nests, a similar level of success has eluded general-purpose programs, especially pointer-chasing codes written in languages such as C and C++. We address this problem by describing, implementing and evaluating a dynamic prefetching scheme. Our technique runs on stock hardware, is completely automatic, and works for general-purpose programs, including pointer-chasing codes written in weakly-typed languages, such as C and C++. It operates in three phases. First, the profiling phase gathers a temporal data reference profile from a running program with low overhead. Next, the profiling is turned off and a fast analysis algorithm extracts hot data streams, which are data reference sequences that frequently repeat in the same order, from the temporal profile. Then, the system dynamically injects code at appropriate program points to detect and prefetch these hot data streams. Finally, the process enters the hibernation phase where no profiling or analysis is performed, and the program continues to execute with the added prefetch instructions. At the end of the hibernation phase, the program is deoptimized to remove the inserted checks and prefetch instructions, and control returns to the profiling phase. For long-running programs, this cycle of profiling, analysis and optimization, and hibernation will repeat multiple times. Our initial results from applying dynamic prefetching are promising, indicating overall execution time improvements of 5–19% for several memory-performance-limited SPECint2000 benchmarks running their largest (ref) inputs.
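The idea of a hot data stream (a reference sequence that repeats in the same order) and of prefetching once its prefix is detected can be sketched as follows. This is a naive sliding-window count chosen for brevity, not the paper's analysis algorithm; the function names, the fixed stream length, and the thresholds are all assumptions made for illustration.

```python
from collections import Counter


def hot_data_streams(trace, length=3, min_count=2):
    """Count every window of `length` consecutive references in the trace and
    keep those that occur at least `min_count` times."""
    counts = Counter(
        tuple(trace[i:i + length]) for i in range(len(trace) - length + 1)
    )
    return {seq: n for seq, n in counts.items() if n >= min_count}


def prefetch_targets(hot_streams, recent, prefix_len=2):
    """If the most recent references match the prefix of a hot stream,
    return the remaining addresses of that stream as prefetch candidates."""
    recent = tuple(recent[-prefix_len:])
    targets = []
    for seq in hot_streams:
        if seq[:prefix_len] == recent:
            targets.extend(seq[prefix_len:])
    return targets
```

For a trace in which `a, b, c` recurs, observing `a, b` again would yield `c` as the address to prefetch.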
Characterizing and Predicting Program Behavior and its Variability
- In International Conference on Parallel Architectures and Compilation Techniques
, 2003
Cited by 83 (3 self)
To reach the next level of performance and energy efficiency, optimizations are increasingly applied in a dynamic and adaptive manner. Current adaptive systems are typically reactive and optimize hardware or software in response to detecting a shift in program behavior. We argue that program behavior variability requires adaptive systems to be predictive rather than reactive. In order to be effective, systems need to adapt according to future rather than most recent past behavior. In this paper we explore the potential of incorporating prediction into adaptive systems. We study the time-varying behavior of programs using metrics derived from hardware counters on two different micro-architectures. Our evaluation shows that programs do indeed exhibit significant behavior variation even at a granularity of millions of instructions. In addition, while the actual behavior across metrics may be different, periodicity in the behavior is shared across metrics. We exploit these characteristics in the design of on-line statistical and table-based predictors. We introduce a new class of predictors, cross-metric predictors, that use one metric to predict another, thus making possible an efficient coupling of multiple predictors. We evaluate these predictors on the SPECcpu2000 benchmark suite and show that table-based predictors outperform statistical predictors by as much as 69% on benchmarks with high variability.
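The contrast between a reactive predictor and a table-based one on periodic behavior can be sketched as below. The abstract does not specify how the predictors are built, so everything here is an assumption for illustration: samples are taken to be already-quantized behavior levels, the table is keyed by the last `history_len` samples, and the class and function names are hypothetical.

```python
class LastValuePredictor:
    """Reactive baseline: predicts that the next sample equals the last one."""

    def __init__(self):
        self.last = None

    def predict(self):
        return self.last

    def update(self, value):
        self.last = value


class TablePredictor:
    """Maps the most recent `history_len` samples to the sample that followed
    them last time; falls back to the last value for unseen histories."""

    def __init__(self, history_len=2):
        self.history_len = history_len
        self.history = ()
        self.table = {}

    def predict(self):
        if self.history in self.table:
            return self.table[self.history]
        return self.history[-1] if self.history else None

    def update(self, value):
        # Only record an entry once a full-length history is available.
        if len(self.history) == self.history_len:
            self.table[self.history] = value
        self.history = (self.history + (value,))[-self.history_len:]


def accuracy(predictor, samples):
    """Fraction of samples predicted exactly, predicting before each update."""
    hits = 0
    for sample in samples:
        if predictor.predict() == sample:
            hits += 1
        predictor.update(sample)
    return hits / len(samples)
```

On a strictly periodic sequence such as `[1, 2, 3]` repeated, the last-value predictor never matches because consecutive samples always differ, whereas the table predictor is correct once each history has been seen.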
Oil and Water? High Performance Garbage Collection in Java with MMTk
- In ICSE 2004, 26th International Conference on Software Engineering
, 2004
Cited by 81 (18 self)
Increasingly popular languages such as Java and C# require efficient garbage collection. This paper presents the design, implementation, and evaluation of MMTk, a Memory Management Toolkit for and in Java. MMTk is an efficient, composable, extensible, and portable framework for building garbage collectors. MMTk uses design patterns and compiler cooperation to combine modularity and efficiency. The resulting system is more robust, easier to maintain, and has fewer defects than monolithic collectors. Experimental comparisons with monolithic Java and C implementations reveal MMTk has significant performance advantages as well. Performance-critical system software typically uses monolithic C at the expense of flexibility. Our results refute common wisdom that only this approach attains efficiency, and suggest that performance-critical software can embrace modular design and high-level languages.
Transactional Monitors for Concurrent Objects
, 2004
Cited by 66 (8 self)
Transactional monitors are proposed as an alternative to monitors based on mutual-exclusion synchronization for object-oriented programming languages. Transactional monitors have execution semantics similar to mutual-exclusion monitors but implement monitors as lightweight transactions that can be executed concurrently (or in parallel on multiprocessors). They alleviate many of the constraints that inhibit construction of transparently scalable and robust applications. We undertake
Safe futures for Java
- In Proceedings of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2005). ACM
, 2005
Cited by 59 (7 self)
A future is a simple and elegant abstraction that allows concurrency to be expressed often through a relatively small rewrite of a sequential program. In the absence of side-effects, futures serve as benign annotations that mark potentially concurrent regions of code. Unfortunately, when computation relies heavily on mutation as is the case in Java, its meaning is less clear, and much of its intended simplicity lost. This paper explores the definition and implementation of safe futures for Java. One can think of safe futures as truly transparent annotations on method calls, which designate opportunities for concurrency. Serial programs can be made concurrent simply by replacing standard method calls with future invocations. Most significantly, even though some parts of the program are executed concurrently and may indeed operate on shared data, the semblance of serial execution is nonetheless preserved. Thus, program reasoning is simplified since data dependencies present in a sequential program are not violated in a version augmented with safe futures. Besides presenting a programming model and API for safe futures, we formalize the safety conditions that must be satisfied to ensure equivalence between a sequential Java program and its future-annotated counterpart. A detailed implementation study is also provided. Our implementation exploits techniques such as object versioning and task revocation to guarantee necessary safety conditions. We also present an extensive experimental evaluation of our implementation to quantify overheads and limitations. Our experiments indicate that for programs with modest mutation rates on shared data, applications can use futures to profitably exploit parallelism, without sacrificing safety.
High-Level Adaptive Program Optimization with ADAPT
, 2001
Cited by 50 (7 self)
Compile-time optimization is often limited by a lack of target machine and input data set knowledge. Without this information, compilers may be forced to make conservative assumptions to preserve correctness and to avoid performance degradation. In order to cope with this lack of information at compile-time, adaptive and dynamic systems can be used to perform optimization at runtime when complete knowledge of input and machine parameters is available. This paper presents a compiler-supported high-level adaptive optimization system. Users describe, in a domain specific language, optimizations performed by stand-alone optimization tools and backend compiler flags, as well as heuristics for applying these optimizations dynamically at runtime. The ADAPT compiler reads these descriptions and generates application-specific runtime systems to apply the heuristics. To facilitate the usage of existing tools and compilers, overheads are minimized by decoupling optimization from execution. Our system, ADAPT, supports a range of paradigms proposed recently, including dynamic compilation, parameterization and runtime sampling. We demonstrate our system by applying several optimization techniques to a suite of benchmarks on two target machines. ADAPT is shown to consistently outperform statically generated executables, improving performance by as much as 70%.
Java without the Coffee Breaks: A Nonintrusive Multiprocessor Garbage Collector
- In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) (Snowbird
, 2001
Cited by 50 (10 self)
The deployment of Java as a concurrent programming language has created a critical need for high-performance, concurrent, and incremental multiprocessor garbage collection. We present the Recycler, a fully concurrent pure reference counting garbage collector that we have implemented in the Jalapeno Java virtual machine running on shared memory multiprocessors.
Vertical Profiling: Understanding the Behavior of Object-Oriented Applications
Cited by 47 (14 self)
Object-oriented programming languages provide a rich set of features that provide significant software engineering benefits. The increased productivity provided by these features comes at a justifiable cost in a more sophisticated runtime system whose responsibility is to implement these features efficiently. However, the virtualization introduced by this sophistication provides a significant challenge to understanding complete system performance, not found in traditionally compiled languages, such as C or C++. Thus, understanding system performance of such a system requires profiling that spans all levels of the execution stack, such as the hardware, operating system, virtual machine, and application.

