Results 11 - 20
of
48
Dynamic software updates: a VM-centric approach
- PLDI'09
, 2009
"... Software evolves to fix bugs and add features. Stopping and restarting programs to apply changes is inconvenient and often costly. Dynamic software updating (DSU) addresses this problem by updating programs while they execute, but existing DSU systems for managed languages do not support many update ..."
Abstract
-
Cited by 16 (3 self)
- Add to MetaCart
Software evolves to fix bugs and add features. Stopping and restarting programs to apply changes is inconvenient and often costly. Dynamic software updating (DSU) addresses this problem by updating programs while they execute, but existing DSU systems for managed languages do not support many updates that occur in practice and are inefficient. This paper presents the design and implementation of JVOLVE, a DSU-enhanced Java VM. Updated programs may add, delete, and replace fields and methods anywhere within the class hierarchy. JVOLVE implements these updates by adding to and coordinating VM classloading, just-in-time compilation, scheduling, return barriers, on-stack replacement, and garbage collection. JVOLVE is safe: its use of bytecode verification and VM thread synchronization ensures that an update will always produce type-correct executions. JVOLVE is flexible: it can support 20 of 22 updates to three open-source programs—Jetty web server, JavaEmailServer, and CrossFTP server—based on actual releases occurring over 1 to 2 years. JVOLVE is efficient: performance experiments show that JVOLVE incurs no overhead during steady-state execution. These results demonstrate that this work is a significant step towards practical support for dynamic updates in virtual machines for managed languages.
Intelligent Selection of Application-Specific Garbage Collectors Abstract
"... Java program execution times vary greatly with different garbage collection algorithms. Until now, it has not been possible to determine the best GC algorithm for a particular program without exhaustively profiling that program for all available GC algorithms. This paper presents a new approach. We ..."
Abstract
-
Cited by 10 (1 self)
- Add to MetaCart
Java program execution times vary greatly with different garbage collection algorithms. Until now, it has not been possible to determine the best GC algorithm for a particular program without exhaustively profiling that program for all available GC algorithms. This paper presents a new approach. We use machine learning techniques to build a prediction model that, given a single profile run of a previously unseen Java program, can predict a good GC algorithm for that program. We implement this technique in Jikes RVM and test it on several standard benchmark suites. Our technique achieves 5 % speedup in overall execution time (averaged across all test programs for all heap sizes) compared with selecting the default GC algorithm in every trial. We present further experiments to show that an oracle predictor could achieve an average 17 % speedup on the same experiments. In addition, we provide evidence to suggest that GC behaviour is sometimes independent of program inputs. These observations lead us to propose that intelligent selection of GC algorithms is suitably straightforward, efficient and effective to merit further exploration regarding its potential inclusion in the general Java software deployment process. Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors—Memory management (garbage collection)
Online optimizations driven by hardware performance monitoring
- In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI
, 2007
"... Hardware performance monitors provide detailed direct feedback about application behavior and are an additional source of information that a compiler may use for optimization. A JIT compiler is in a good position to make use of such information because it is running on the same platform as the user ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
Hardware performance monitors provide detailed direct feedback about application behavior and are an additional source of information that a compiler may use for optimization. A JIT compiler is in a good position to make use of such information because it is running on the same platform as the user applications. As hardware platforms become more and more complex, it becomes more and more difficult to model their behavior. Profile information that captures general program properties (like execution frequency of methods or basic blocks) may be useful, but does not capture sufficient information about the execution platform. Machine-level performance data obtained from a hardware performance monitor can not only direct the compiler to those parts of the program that deserve its attention but also determine if an optimization step actually improved the performance of the application. This paper presents an infrastructure based on a dynamic compiler+runtime environment for Java that incorporates machine-level information as an additional kind of feedback for the compiler and runtime environment. The low-overhead monitoring system provides fine-grained performance data that can be tracked back to individual Java bytecode instructions. As an example, the paper presents results for object co-allocation in a generational garbage collector that optimizes spatial locality of objects on-line using measurements about cache misses. In the best case, the execution time is reduced by 14 % and L1 cache misses by 28%.
Leak Pruning
- ASPLOS'09
, 2009
"... Managed languages improve programmer productivity with type safety and garbage collection, which eliminate memory errors such as dangling pointers, double frees, and buffer overflows. However, programs may still leak memory if programmers forget to eliminate the last reference to an object that will ..."
Abstract
-
Cited by 8 (4 self)
- Add to MetaCart
Managed languages improve programmer productivity with type safety and garbage collection, which eliminate memory errors such as dangling pointers, double frees, and buffer overflows. However, programs may still leak memory if programmers forget to eliminate the last reference to an object that will not be used again. Leaks slow programs by increasing collector workload and frequency. Growing leaks crash programs. Instead of crashing, leak pruning extends program availability by predicting and reclaiming leaked objects at run time. Whereas garbage collection over-approximates live objects using reachability, leak pruning predicts dead objects and reclaims them based on how stale they are and the size of stale data structures. Leak pruning preserves semantics because it waits for heap exhaustion before reclaiming objects and then poisons references to objects it reclaims. If the program later tries to access these objects, the virtual machine (VM) throws an internal error. We implement leak pruning in a Java VM, show its overhead is low, and evaluate it on 10 leaking programs. Leak pruning does not help two programs, executes four substantial programs 1.6-35X longer, and executes four programs, including two leaks in Eclipse, for at least 24 hours. In the worst case, leak pruning defers fatal errors. In the best case, programs with unbounded memory requirements execute indefinitely and correctly in bounded memory with consistent throughput.
Bosschere. 64-bit versus 32-bit virtual machines for Java
- Software: Practice and Experience
"... The Java language is popular because of its platform independence, making it useful in a lot of technologies ranging from embedded devices to high-performance systems. The platform-independent property of Java, which is visible at the Java bytecode level, is only made possible thanks to the availabi ..."
Abstract
-
Cited by 7 (5 self)
- Add to MetaCart
The Java language is popular because of its platform independence, making it useful in a lot of technologies ranging from embedded devices to high-performance systems. The platform-independent property of Java, which is visible at the Java bytecode level, is only made possible thanks to the availability of a Virtual Machine (VM), which needs to be designed specifically for each underlying hardware platform. More specifically, the same Java bytecode should run properly on a 32-bit or a 64-bit VM. In this paper, we compare the behavioral characteristics of 32-bit and 64-bit VMs using a large set of Java benchmarks. This is done using the Jikes Research VM as well as the IBM JDK 1.4.0 production VM on a PowerPC-based IBM machine. By running the PowerPC machine in both 32-bit and 64-bit mode we are able to compare 32-bit and 64-bit VMs. We conclude that the space an object takes in the heap in 64-bit mode is 39.3% larger on average than in 32-bit mode. We identify three reasons for this: (i) the larger pointer size, (ii) the increased header and (iii) the increased alignment. The minimally required heap size is 51.1 % larger on average in 64-bit than in 32-bit mode. From our experimental setup using hardware performance monitors, we observe that 64-bit computing typically results in a significantly larger number of data cache misses at all levels of the memory hierarchy. In addition, we observe that when a sufficiently large heap is available, the IBM JDK 1.4.0 VM is 1.7 % slower on average in 64-bit mode than in 32-bit mode. Copyright c ○ 2005
Isla Vista Heap Sizing: Using Feedback to Avoid Paging
- In Proceedings of the International Symposium on Code Generation and Optimization (CGO
, 2007
"... Managed runtime environments (MREs) employ garbage collection (GC) for automatic memory management. However, GC induces pressure on the virtual memory (VM) manager, since it may touch pages that are not related to the working set of the application. Paging due to GC can significantly hurt performanc ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
Managed runtime environments (MREs) employ garbage collection (GC) for automatic memory management. However, GC induces pressure on the virtual memory (VM) manager, since it may touch pages that are not related to the working set of the application. Paging due to GC can significantly hurt performance, even when the application’s working set fits into physical memory. We present a feedback-directed heap resizing mechanism to avoid GC-induced paging, using information from the operating system (OS). We avoid costly GCs when there is physical memory available, and trade off GC for paging when memory is constrained Our mechanism is simple and uses allocation stall events during GC alone to trigger heap resizing, without user participation or OS kernel modification. Our system enables significant performance improvements when real memory is restricted and similar to, or better performance than, the current state-of-the-art MRE, when memory is unconstrained. 1.
Garbage collection should be lifetime aware
- IN: IMPLEMENTATION, COMPILATION, OPTIMIZATION OF OBJECT-ORIENTED LANGUAGES, PROGRAMS AND SYSTEMS
, 2006
"... We argue that garbage collection should be more closely tied to object demographics. We show that this behaviour is sufficiently distinctive to make exploitation feasible and describe a novel GC framework that exploits object lifetime analysis yet tolerates imprecision. We argue for future collector ..."
Abstract
-
Cited by 6 (2 self)
- Add to MetaCart
We argue that garbage collection should be more closely tied to object demographics. We show that this behaviour is sufficiently distinctive to make exploitation feasible and describe a novel GC framework that exploits object lifetime analysis yet tolerates imprecision. We argue for future collectors based on combinations of approximate analyses and dynamic sampling.
Java Performance Evaluation through Rigorous Replay Compilation
- In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications
, 2008
"... A managed runtime environment, such as the Java virtual machine, is non-trivial to benchmark. Java performance is affected in various complex ways by the application and its input, as well as by the virtual machine (JIT optimizer, garbage collector, thread scheduler, etc.). In addition, nondetermini ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
A managed runtime environment, such as the Java virtual machine, is non-trivial to benchmark. Java performance is affected in various complex ways by the application and its input, as well as by the virtual machine (JIT optimizer, garbage collector, thread scheduler, etc.). In addition, nondeterminism due to timer-based sampling for JIT optimization, thread scheduling, and various system effects further complicate the Java performance benchmarking process. Replay compilation is a recently introduced Java performance analysis methodology that aims at controlling nondeterminism to improve experimental repeatability. The key idea of replay compilation is to control the compilation load during experimentation by inducing a pre-recorded compilation plan at replay time. Replay compilation also enables teasing apart performance effects of the application versus the virtual machine. This paper argues that in contrast to current practice which uses a single compilation plan at replay time, multiple compilation plans add statistical rigor to the replay compilation methodology. By doing so, replay compilation better accounts for the variability observed in compilation load across compilation plans. In addition, we propose matchedpair comparison for statistical data analysis. Matched-pair comparison considers the performance measurements per compilation plan before and after an innovation of interest as a pair, which enables limiting the number of compilation plans needed for accurate performance analysis compared to statistical analysis assuming unpaired measurements.
A localized tracing scheme applied to garbage collection
- In APLAS 2006
, 2006
"... Abstract. We present a method to visit all nodes in a forest of data structures while taking into account object placement. We call the technique a Localized Tracing Scheme as it improves locality of reference during object tracing activity. The method organizes the heap into regions and uses trace ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Abstract. We present a method to visit all nodes in a forest of data structures while taking into account object placement. We call the technique a Localized Tracing Scheme as it improves locality of reference during object tracing activity. The method organizes the heap into regions and uses trace queues to defer and group tracing of remote objects. The principle of localized tracing reduces memory traffic and can be used as an optimization to improve performance at several levels of the memory hierarchy. The method is applicable to a wide range of uniprocessor garbage collection algorithms as well as to shared memory multiprocessor collectors. Experiments with a mark-and-sweep collector show performance improvements up to 75 % at the virtual memory level. 1
Profile-based pretenuring
- ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 2007
"... Pretenuring can reduce copying costs in garbage collectors by allocating long-lived objects into regions that the garbage collector will rarely, if ever, collect. We extend previous work on pretenuring as follows: (1) We produce pretenuring advice that is neutral with respect to the garbage collecto ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
Pretenuring can reduce copying costs in garbage collectors by allocating long-lived objects into regions that the garbage collector will rarely, if ever, collect. We extend previous work on pretenuring as follows: (1) We produce pretenuring advice that is neutral with respect to the garbage collector algorithm and configuration. We thus can and do combine advice from different applications. We find for our benchmarks that predictions using object lifetimes at each allocation site in Java programs are accurate, which simplifies the pretenuring implementation. (2) We gather and apply advice to both applications and Jikes RVM, a compiler and runtime system for Java written in Java. Our results demonstrate that building combined advice into Jikes RVM from different application executions improves performance, regardless of the application Jikes RVM is compiling and executing. This build-time advice thus gives user applications some benefits of pretenuring, without any application profiling. No previous work uses profile feedback to pretenure in the runtime system. (3) We find that application-only advice also consistently improves performance, but that the combination of build-time and application-specific advice is almost always noticeably better. (4) Our same advice improves the performance of generational, Older First, and Beltway collectors, illustrating that it is collector neutral. (5) We include an immortal allocation space in addition

