Vertical Profiling: Understanding the Behavior of Object-Oriented Applications
Cited by 47 (14 self)
Object-oriented programming languages provide a rich set of features that provide significant software engineering benefits. The increased productivity provided by these features comes at a justifiable cost in a more sophisticated runtime system whose responsibility is to implement these features efficiently. However, the virtualization introduced by this sophistication provides a significant challenge to understanding complete system performance, not found in traditionally compiled languages, such as C or C++. Thus, understanding system performance of such a system requires profiling that spans all levels of the execution stack, such as the hardware, operating system, virtual machine, and application.
A dynamic optimization framework for a Java just-in-time compiler
, 2001
Cited by 42 (7 self)
The high performance implementation of Java Virtual Machines (JVM) and Just-In-Time (JIT) compilers is directed toward adaptive compilation optimizations on the basis of online runtime profile information. This paper describes the design and implementation of a dynamic optimization framework in a production-level Java JIT compiler. Our approach is to employ a mixed mode interpreter and a three level optimizing compiler, supporting quick, full, and special optimization, each of which has a different set of tradeoffs between compilation overhead and execution speed. A lightweight sampling profiler operates continuously during the entire program's execution. When necessary, detailed information on runtime behavior is collected by dynamically generating instrumentation code which can be installed to and uninstalled from the specified recompilation target code. Value profiling with this instrumentation mechanism allows fully automatic code specialization to be performed on the basis of specific parameter values or global data at the highest optimization level. The experimental results show that our approach offers high performance and a low code expansion ratio in both program startup and steady state measurements in comparison to the compile-only approach, and that the code specialization can also contribute modest performance improvements.
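The abstract above describes a sampling profiler that drives methods from a mixed mode interpreter through quick, full, and special optimization. The following is a minimal sketch of that idea only, not the paper's implementation: the class name `TieredController`, the method `record_sample`, and the threshold values are all hypothetical, and a real system would weigh compilation cost against expected benefit rather than use fixed counts.

```python
# Execution levels named after the abstract; the thresholds are hypothetical.
LEVELS = ["interpreted", "quick", "full", "special"]
# THRESHOLDS[i] = number of profiler samples needed to leave level i (assumed values).
THRESHOLDS = [5, 20, 50]


class TieredController:
    """Hypothetical controller: promotes a method one level at a time
    as the sampling profiler keeps observing it on the call stack."""

    def __init__(self):
        self.samples = {}  # method name -> number of samples seen so far
        self.level = {}    # method name -> index into LEVELS

    def record_sample(self, method):
        """Record one profiler sample and return the method's current level name."""
        count = self.samples.get(method, 0) + 1
        self.samples[method] = count
        current = self.level.get(method, 0)
        # Promote when the sample count reaches the threshold for the current level.
        if current < len(THRESHOLDS) and count >= THRESHOLDS[current]:
            self.level[method] = current + 1
        return LEVELS[self.level.get(method, 0)]
```

Calling `record_sample("m")` repeatedly moves the hypothetical method `m` from interpreted execution to each optimization level in turn as its sample count grows.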
The Case for Profile-Directed Selection of Garbage Collectors
, 2000
Cited by 26 (3 self)
Many garbage-collected systems use a single garbage collection algorithm across all applications. It has long been known that this can produce poor performance on applications for which that collector is not well suited. In some systems, such as those that execute stand-alone executables, an appropriate collector for each application can be selected from a pool of available collectors and tuned by using profile information. In a study of 20 benchmarks and several collectors operating with the Marmot optimizing Java-to-native compiler, for every collector there was at least one benchmark that would have been at least 15% faster with a more appropriate collector. The collectors are a copying collector, a generational copying collector which is combined with each of 4 different write barriers, and the null collector which allocates but never collects. A detailed analysis of storage management costs shows how they vary by application and collector. 1. INTRODUCTION Automatic storage management eliminates a significant so...
Online performance auditing: using hot optimizations without getting burned
- In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation
, 2006
Cited by 24 (2 self)
As hardware complexity increases and virtualization is added at more layers of the execution stack, predicting the performance impact of optimizations becomes increasingly difficult. Production compilers and virtual machines invest substantial development effort in performance tuning to achieve good performance for a range of benchmarks. Although optimizations typically perform well on average, they often have unpredictable impact on running time, sometimes degrading performance significantly. Today’s VMs perform sophisticated feedback-directed optimizations, but these techniques do not address performance degradations, and they actually make the situation worse by making the system more unpredictable. This paper presents an online framework for evaluating the effectiveness of optimizations, enabling an online system to automatically identify and correct performance anomalies that occur at runtime. This work opens the door for a fundamental shift in the way optimizations are developed and tuned for online systems, and may allow the body of work in offline empirical optimization search to be applied automatically at runtime. We present our implementation and evaluation of this system in a product Java VM.
Statistically rigorous Java performance evaluation
- In Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)
, 2007
Cited by 23 (3 self)
Java performance is far from being trivial to benchmark because it is affected by various factors such as the Java application, its input, the virtual machine, the garbage collector, the heap size, etc. In addition, non-determinism at run-time causes the execution time of a Java program to differ from run to run. There are a number of sources of non-determinism such as Just-In-Time (JIT) compilation and optimization in the virtual machine (VM) driven by timer-based method sampling, thread scheduling, garbage collection, and various system effects. There exist a wide variety of Java performance evaluation methodologies used by researchers and benchmarkers. These methodologies differ from each other in a number of ways. Some report average performance over a number of runs of the same experiment; others report the best or second best performance observed; yet others report the worst. Some iterate the benchmark multiple times within a single VM invocation; others consider multiple VM invocations and iterate a single benchmark execution; yet others consider multiple VM invocations and iterate the benchmark multiple times. This paper shows that prevalent methodologies can be misleading, and can even lead to incorrect conclusions. The reason is that the data analysis is not statistically rigorous. In this paper, we present a survey of existing Java performance evaluation methodologies and discuss the importance of statistically rigorous data analysis for dealing with non-determinism. We advocate approaches to quantify startup as well as steady-state performance, and, in addition, we provide the JavaStats software to automatically obtain performance numbers in a rigorous manner. Although this paper focuses on Java performance evaluation, many of the issues addressed in this paper also apply to other programming languages and systems that build on a managed runtime system.
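One common way to report run-to-run variation instead of a single best or average number is a confidence interval for the mean. The sketch below is an illustration of that general technique and is not taken from JavaStats; the function names `confidence_interval` and `intervals_overlap` are hypothetical, and the normal approximation it uses is an assumption that is only reasonable for a fairly large number of runs (a Student t quantile would be the usual choice for small samples).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, confidence=0.95):
    """Return (low, high) bounds for the mean of `samples`.

    Assumption: uses a normal (z) approximation, which suits large
    sample counts; small samples call for a Student t quantile instead.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two measurements")
    m = mean(samples)
    s = stdev(samples)  # sample standard deviation (n - 1 denominator)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s / sqrt(n)
    return m - half_width, m + half_width


def intervals_overlap(a, b):
    """True if the two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]
```

If the intervals for two configurations overlap, the measurements alone do not support the claim that one is faster than the other.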
MicroPhase: An Approach to Proactively Invoking Garbage Collection for Improved Performance
, 2007
Cited by 11 (0 self)
To date, the most commonly used criterion for invoking garbage collection (GC) is based on heap usage; that is, garbage collection is invoked when the heap or an area inside the heap is full. This approach can suffer from two performance shortcomings, untimely garbage collection invocations and large volumes of surviving objects. In this work, we explore a new GC triggering approach called MicroPhase that exploits two observations, (i) allocation requests occur in phases and (ii) the phase boundaries coincide with times when most objects also die, to proactively invoke garbage collection yielding high efficiency. We extended the HotSpot virtual machine from Sun Microsystems to support MicroPhase and conducted experiments using 20 benchmarks. The experimental results indicate that our technique can reduce the GC times in 19 applications. The differences in GC overhead range from an increase of 1% to a decrease of 26% when the heap is set to be twice the maximum live-size. As a result, MicroPhase can improve the overall performance of 13 benchmarks. The performance differences range from a degradation of 2.5% to an improvement of 14%.
Array bounds check elimination for the Java HotSpot client compiler
- In PPPJ ’07: Proceedings of the 5th International Symposium on Principles and Practice of Programming in Java
, 2007
Cited by 10 (0 self)
Whenever an array element is accessed, Java virtual machines execute a compare instruction to ensure that the index value is within the valid bounds. This reduces the execution speed of Java programs. Array bounds check elimination identifies situations in which such checks are redundant and can be removed. We present an array bounds check elimination algorithm for the Java HotSpot™ VM based on static analysis in the just-in-time compiler. The algorithm works on an intermediate representation in static single assignment form and maintains conditions for index expressions. It fully removes bounds checks if it can be proven that they never fail. Whenever possible, it moves bounds checks out of loops. The static number of checks remains the same, but a check inside a loop is likely to be executed more often. If such a check fails, the executing program falls back to interpreted mode, avoiding the problem that an exception is thrown at the wrong place. The evaluation shows a speedup near to the theoretical maximum for the scientific SciMark benchmark suite (40% on average). The algorithm also improves the execution speed for the SPECjvm98 benchmark suite (2% on average, 12% maximum).
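The idea of moving a bounds check out of a loop, with a fallback when the hoisted check fails, can be sketched at source level as follows. This is an illustration only and not the HotSpot algorithm, which works on compiler intermediate representation: the function names `sum_checked` and `sum_hoisted` are hypothetical, the explicit `if` stands in for the compare instruction, and the fully checked loop stands in for the interpreter fallback. Python itself still checks every `a[i]`, so the sketch shows the shape of the transformation, not a speedup.

```python
def sum_checked(a, n):
    """Sum a[0..n-1] with an explicit bounds check on every iteration."""
    total = 0
    for i in range(n):
        if not (0 <= i < len(a)):  # per-iteration check, as in unoptimized code
            raise IndexError(i)
        total += a[i]
    return total


def sum_hoisted(a, n):
    """Same result, but with one check hoisted in front of the loop."""
    if n > len(a):
        # Hoisted check failed: fall back to the fully checked version so the
        # exception is raised at the same index as in the original loop.
        return sum_checked(a, n)
    total = 0
    for i in range(n):
        total += a[i]  # stands in for an access with no bounds check
    return total
```

The hoisted version performs one comparison per call instead of one per iteration, and the fallback keeps the point at which the exception occurs unchanged.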
Toward a definition of run-time object-oriented metrics
- 7th ECOOP Workshop on Quantitative Approaches in Object-Oriented Engineering
, 2003
Cited by 8 (4 self)
This position paper outlines a programme of research based on the quantification of run-time elements of Java programs. In particular, we adapt two common object-oriented metrics, coupling and cohesion, so that they can be applied at run-time. We demonstrate some preliminary results of our analysis on programs from the SPEC JVM98 benchmark suite. Index Terms: Area V (Metrics Validation): Formal and empirical validation of OO metrics, Standard data sets for metrics validation.
Libra: A library operating system for a jvm in a virtualized execution environment
- In VEE (Virtual Execution Environments)
, 2007
Cited by 7 (1 self)
If the operating system could be specialized for every application, many applications would run faster. For example, Java virtual machines (JVMs) provide their own threading model and memory protection, so general-purpose operating system implementations of these abstractions are redundant. However, traditional means of transforming existing systems into specialized systems are difficult to adopt because they require replacing the entire operating system. This paper describes Libra, an execution environment specialized for IBM’s J9 JVM. Libra does not replace the entire operating system. Instead, Libra and J9 form a single statically-linked image that runs in a hypervisor partition. Libra provides the services necessary to achieve good performance for the Java workloads of interest but relies on an instance of Linux in another hypervisor partition to provide a networking stack, a filesystem, and other services. The expense of remote calls is offset by the fact that Libra’s services can be customized for a particular workload; for example, on the Nutch search engine, we show that two simple customizations improve application throughput by a factor of 2.7.
Java Performance Evaluation through Rigorous Replay Compilation
- In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications
, 2008
Cited by 6 (0 self)
A managed runtime environment, such as the Java virtual machine, is non-trivial to benchmark. Java performance is affected in various complex ways by the application and its input, as well as by the virtual machine (JIT optimizer, garbage collector, thread scheduler, etc.). In addition, nondeterminism due to timer-based sampling for JIT optimization, thread scheduling, and various system effects further complicates the Java performance benchmarking process. Replay compilation is a recently introduced Java performance analysis methodology that aims at controlling nondeterminism to improve experimental repeatability. The key idea of replay compilation is to control the compilation load during experimentation by inducing a pre-recorded compilation plan at replay time. Replay compilation also enables teasing apart performance effects of the application versus the virtual machine. This paper argues that in contrast to current practice which uses a single compilation plan at replay time, multiple compilation plans add statistical rigor to the replay compilation methodology. By doing so, replay compilation better accounts for the variability observed in compilation load across compilation plans. In addition, we propose matched-pair comparison for statistical data analysis. Matched-pair comparison considers the performance measurements per compilation plan before and after an innovation of interest as a pair, which enables limiting the number of compilation plans needed for accurate performance analysis compared to statistical analysis assuming unpaired measurements.
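Matched-pair comparison, as described above, analyzes the per-plan difference between the "before" and "after" measurements rather than the two groups separately. The sketch below illustrates the general technique and is not the paper's code; the function name `matched_pair_interval` and the sample numbers in the usage note are hypothetical, and the normal approximation is an assumption (a Student t quantile is the usual choice when only a few compilation plans are available).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def matched_pair_interval(before, after, confidence=0.95):
    """Confidence interval for the mean of (before[i] - after[i]).

    Each index i is one compilation plan measured with and without the
    change under study. Assumption: normal (z) approximation.
    """
    if len(before) != len(after):
        raise ValueError("each plan needs one 'before' and one 'after' value")
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two compilation plans")
    d = mean(diffs)
    s = stdev(diffs)  # spread of the differences, not of the raw timings
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s / sqrt(n)
    return d - half_width, d + half_width
```

With hypothetical timings `before = [10.0, 20.0, 30.0, 40.0]` and `after = [9.0, 18.5, 29.0, 38.5]`, the raw timings vary widely across plans but the differences do not, so the interval on the differences is narrow and excludes zero.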

