Dynamic metrics for Java
- In Proceedings of the 18th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
, 2003
Dynamic Selection of Application-Specific Garbage Collectors
, 2004
Abstract - Cited by 30 (5 self)
Much prior work has shown that the performance enabled by garbage collection (GC) systems is highly dependent upon the behavior of the application as well as on the available resources. That is, no single GC enables the best performance for all programs and all heap sizes. To address this limitation, we present the design, implementation, and empirical evaluation of a novel Java Virtual Machine (JVM) extension that facilitates dynamic switching between a number of very different and popular garbage collectors. We also show how to exploit this functionality using annotation-guided GC selection and evaluate the system using a large number of benchmarks. In addition, we implement and evaluate a simple heuristic to investigate the efficacy of switching automatically. Our results show that, on average, our annotation-guided system introduces less than 4% overhead and improves performance by 24% over the worst-performing GC (across heap sizes) and by 7% over always using the popular Generational/Mark-Sweep hybrid.
The Open Runtime Platform: A Flexible High-Performance Managed Runtime Environment
- Intel Technology Journal
, 2003
Abstract - Cited by 25 (8 self)
The Open Runtime Platform (ORP) is a managed runtime environment (MRTE) that features exact generational garbage collection, fast thread synchronization, and multiple coexisting just-in-time compilers (JITs). ORP was designed for flexibility in order to support experiments in dynamic compilation, garbage collection, synchronization, and other technologies. It can be built to run either Java or Common Language Infrastructure (CLI) applications, to run under the Windows or Linux operating systems, and to run on the IA-32 or Itanium processor family (IPF) architectures.
Coupling On-Line and Off-Line Profile Information to Improve Program Performance
, 2002
Abstract - Cited by 21 (8 self)
Dynamic compilation and optimization are widely used for Internet computing, in which an intermediate form of the code is compiled to native code during execution.
NWSLite: A Light-Weight Prediction Utility for Mobile Devices
, 2004
Abstract - Cited by 16 (4 self)
Computation off-loading, i.e., remote execution, has been shown to be effective for extending the computational power and battery life of resource-restricted devices, e.g., hand-held, wearable, and pervasive computers. Remote execution systems must predict the cost of executing both locally and remotely to determine when off-loading will be most beneficial. These costs, however, are dependent upon the execution behavior of the task being considered and the highly variable performance of the underlying resources, e.g., CPU (local and remote), bandwidth, and network latency. As such, remote execution systems must employ sophisticated prediction techniques that accurately guide computation off-loading. Moreover, these techniques must be efficient, i.e., they cannot consume significant resources such as energy and execution time, since they are performed on the mobile device.
Code Annotation for Safe and Efficient Dynamic Object Resolution
, 2003
Abstract - Cited by 9 (7 self)
The execution time of object-oriented programs can be drastically reduced by transforming "non-escaping" objects into collections of their component scalar data fields. But for languages that support dynamic linking, this kind of optimization (which we call "object resolution") can usually only be performed at runtime, when the entire program is available for analysis. In such cases, the resulting performance increases will be offset by the additional costs that arise during the analysis and restructuring phases.
Using phase behavior in scientific application to guide Linux operating system customization
- In Workshop on Next Generation Software at IEEE International Parallel and Distributed Processing Symposium (IPDPS)
, 2005
Abstract - Cited by 8 (5 self)
In this paper, we present the design of a system that automatically generates application-specific Linux images for scientific applications that execute using batched cluster resources. Key to our approach is the use of recurring patterns in program performance, i.e., phase behavior, that can potentially be exploited to guide automatic Linux customization and to enable significantly higher levels of program performance. We give an overview of the project and present a set of preliminary results that show the potential of our approach.
Annotations for Portable Intermediate Languages
, 2001
Abstract - Cited by 6 (1 self)
This paper identifies high-level program properties that can be discovered by static analysis in a compiler front end, and that are useful for classical low-level optimizations. We suggest how intermediate language code could be annotated to convey these properties to the code generator.
Write barrier elision for concurrent garbage collectors
- In Proceedings of the 4th international symposium on Memory management
, 2004
Abstract - Cited by 6 (0 self)
Concurrent garbage collectors require write barriers to preserve consistency, but these barriers impose significant direct and indirect costs. While there has been a lot of work on optimizing write barriers, we present the first study of their elision in a concurrent collector. We show conditions under which write barriers are redundant, and describe how these conditions can be applied to both incremental-update and snapshot-at-the-beginning barriers. We then evaluate the potential for write barrier elimination with a trace-based limit study, which shows that a significant percentage of write barriers are redundant. On average, 54% of incremental barriers and 83% of snapshot barriers are unnecessary.
Java Performance Evaluation through Rigorous Replay Compilation
- In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications
, 2008
Abstract - Cited by 6 (0 self)
A managed runtime environment, such as the Java virtual machine, is non-trivial to benchmark. Java performance is affected in various complex ways by the application and its input, as well as by the virtual machine (JIT optimizer, garbage collector, thread scheduler, etc.). In addition, nondeterminism due to timer-based sampling for JIT optimization, thread scheduling, and various system effects further complicates the Java performance benchmarking process. Replay compilation is a recently introduced Java performance analysis methodology that aims at controlling nondeterminism to improve experimental repeatability. The key idea of replay compilation is to control the compilation load during experimentation by inducing a pre-recorded compilation plan at replay time. Replay compilation also enables teasing apart performance effects of the application versus the virtual machine. This paper argues that, in contrast to current practice, which uses a single compilation plan at replay time, multiple compilation plans add statistical rigor to the replay compilation methodology. By doing so, replay compilation better accounts for the variability observed in compilation load across compilation plans. In addition, we propose matched-pair comparison for statistical data analysis. Matched-pair comparison considers the performance measurements per compilation plan before and after an innovation of interest as a pair, which enables limiting the number of compilation plans needed for accurate performance analysis compared to statistical analysis assuming unpaired measurements.
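The matched-pair idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's code: the function names and the timing values are hypothetical, and the standard error shown is the textbook paired-difference formula, assumed here for illustration. Each list position stands for one compilation plan measured before and after a change.

```python
import math
import statistics


def paired_summary(before, after):
    """Return mean per-plan difference, its standard error, and the t statistic."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # One difference per compilation plan: plan-to-plan variation cancels out.
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    t_stat = mean_diff / std_err if std_err > 0 else float("nan")
    return mean_diff, std_err, t_stat


def unpaired_std_err(before, after):
    """Standard error of the difference of means when pairing is ignored."""
    return math.sqrt(statistics.variance(before) / len(before)
                     + statistics.variance(after) / len(after))


# Hypothetical execution times in seconds, one entry per compilation plan.
before = [10.0, 12.0, 14.0, 16.0]
after = [9.0, 11.5, 13.0, 15.5]

mean_diff, std_err, t_stat = paired_summary(before, after)
print(f"mean difference: {mean_diff:.3f} s")
print(f"paired std. error: {std_err:.3f}")
print(f"unpaired std. error: {unpaired_std_err(before, after):.3f}")
```

With these made-up numbers the paired standard error is much smaller than the unpaired one, because the large spread between compilation plans is removed before the variance is computed. That is the sense in which pairing can reduce the number of plans needed for a given precision.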

