Results 1 - 10 of 19
Adaptive Optimization in the Jalapeño JVM
- In ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA)
, 2000
Cited by 149 (10 self)
Characterizing the Memory Behavior of Java Workloads: A Structured View and Opportunities for Optimizations
, 2000
Cited by 54 (3 self)
This paper studies the memory behavior of important Java workloads used in benchmarking Java Virtual Machines (JVMs), based on instrumentation of both application and library code in a state-of-the-art JVM, and provides structured information about these workloads to help guide systems' design. We begin by characterizing the inherent memory behavior of the benchmarks, such as information on the breakdown of heap accesses among different categories and on the hotness of references to fields and methods. We then provide detailed information about misses in the data TLB and caches, including the distribution of misses over different kinds of accesses and over different methods. In the process, we make interesting discoveries about TLB behavior and limitations of data prefetching schemes discussed in the literature in dealing with pointer-intensive Java codes. Throughout this paper, we develop a set of recommendations to computer architects and compiler writers on how to optimize computer systems and system software to run Java programs more efficiently. This paper also makes the first attempt to compare the characteristics of SPECjvm98 to those of a server-oriented benchmark, pBOB, and explain why the current set of SPECjvm98 benchmarks may not be adequate for a comprehensive and objective evaluation of JVMs and just-in-time (JIT) compilers. We discover that the fraction of accesses to array elements is quite significant, demonstrate that the number of "hot spots" in the benchmarks is small, and show that field reordering cannot yield significant performance gains. We also show that even a fairly large L2 data cache is not effective for many Java benchmarks. We observe that instructions used to prefetch data into the L2 data cache are often squashed because of high TLB miss ...
Workload Characterization of Multithreaded Java Servers
- In IEEE International Symposium on Performance Analysis of Systems and Software
, 2001
Cited by 24 (2 self)
Java has gained popularity in the commercial server arena, but the characteristics of Java server applications are not well understood. In this research, we characterize the behavior of two Java server benchmarks, VolanoMark and SPECjbb2000, on a Pentium III system with the latest Java HotSpot Server VM. We compare Java server applications with SPECint2000 and also investigate the impact of multithreading by increasing the number of clients. Java servers are seen to exhibit poor instruction access behavior, including a high instruction miss rate, high ITLB miss rate, high BTB miss rate and, as a result, high I-stream stalls. With an increasing number of threads, the instruction behavior improves, suggesting increased locality of access. But the resource stalls increase and eventually dwarf the diminishing I-stream stalls. With more clients, the instruction count per unit of work increases and becomes a hindrance to the scalability of the servers.
Using complete system simulation to characterize SPECjvm98 benchmarks
- In Proceedings of International Conference on Supercomputing
, 2000
Cited by 18 (6 self)
Complete system simulation to understand the influence of architecture and operating systems on application execution has been identified as crucial for systems design. While there have been previous attempts at understanding the architectural impact of Java programs, there has been no prior work investigating the operating system (kernel) activity during their execution. This problem is particularly interesting in the context of Java, since kernel services can be invoked not only by the application but also by the underlying Java Virtual Machine (JVM) implementation that runs these programs. Further, the JVM style (JIT compiler or interpreter) and the manner in which the different JVM components (such as the garbage collector and class loader) are exercised can have a significant impact on kernel activity. To investigate these issues, this research uses complete system
An Empirical Study of Selective Optimization
- In 13th International Workshop on Languages and Compilers for Parallel Computing
, 2000
Cited by 16 (3 self)
This paper describes an empirical study of the SPECjvm98 benchmarks, using the Jalapeño virtual machine. The study employs two compilers, a nonoptimizing compiler that is initially used to compile all application methods, and an optimizing compiler that is selectively used to recompile a parameterized set of hot methods based on past profiling. We view this study as a step in examining the feasibility of adaptive optimization in this environment. The results show promise for adaptive optimization. In particular, they show that the combined time (execution and compilation) of selective opt-compilation can be less than the execution time of no opt-compilation and the combined time of full opt-compilation. The results also show that the combined time of selective opt-compilation can be competitive with static compilation (full opt-compilation not counting compilation time) for the SPECjvm98 benchmarks with input size 100.
Understanding the behavior of compiler optimizations
, 2004
Cited by 10 (5 self)
Compiler writers usually follow some known rules of thumb on the effectiveness of optimizations when implementing compilers. While many of these rules may be correct, it is a far better methodology to base implementation decisions on a scientific evaluation of optimizations. To this end, we present an exploration of the costs and benefits of optimizations implemented in Jikes RVM, a research virtual machine that includes an aggressive optimizing compiler. We measure and report the performance impact due to optimizations, both when the optimizations are used by themselves and when they are used with other optimizations. To understand why optimizations do or do not improve performance, we wrote kernel programs to test and explore the behavior of each optimization. To increase the generality of our results, we report measurements on two architectures (IA32 and PowerPC). Based on our findings, we present a set of recommendations for compiler writers. KEY WORDS: Java; compiler optimizations; performance evaluation of optimizations
Understanding Control Flow Transfer and its Predictability in Java Processing
, 2001
Cited by 7 (4 self)
An in-depth look and understanding of control flow transfer and its predictability can guide architects to adapt control flow prediction hardware in Java processing or finely tune the performance of JVM software on general purpose machines. To our knowledge, this paper provides the first insight into branch behavior on a standard Java Virtual Machine with real workloads. Employing a complete system simulation environment, we profile branch execution characteristics and quantify the performance of a wide range of prediction schemes on both user and kernel code. The impact of different JVM styles (JIT compiler and interpreter) on branch behavior is also studied. We find that: (1) Kernel branches constitute a significant portion of total branch execution in Java processing; (2) Kernel and user code favor different prediction mechanisms; (3) Java processing exercises a fairly large number of branch sites and a large control flow footprint compared with the execution of benchmarks such as SPECInt95; (4) A major part of the dynamic indirect branches are multiple target (polymorphic) branches. Target addresses of indirect branches, especially those in interpreting mode, are highly interleaved and cause high BTB misprediction.
Characterization of Value Locality in Java Programs
- In Proceedings of the Workshop on Workload Characterization, ICCD
, 2000
Cited by 5 (0 self)
Recent works have shown that there is significant repetition of instruction result values in RISC programs. This phenomenon was termed value locality. This paper gains an initial understanding of value locality in the context of Java programs. Java programs are different from typical C-compiled RISC programs because 1) they are highly object-oriented with many dynamically linked method calls, and 2) they are compiled into bytecodes for the stack-based Java Virtual Machine architecture. This work shows that there is considerable value locality in method arguments and return values among repeated invocations of the same methods in the SPEC JVM98 benchmarks. The paper also shows similarly ample value locality of the stack source and destination operand values of repeatedly executed bytecode instructions. The paper concludes by leveraging the strong semantic information provided by bytecodes to assess the value locality of different types of values.
Workload Characterization of Java Server Applications on Two PowerPC Processors
- In Proceedings of the Third Annual Austin Center for Advanced Studies Conference
, 2002
Cited by 5 (0 self)
Java has become fairly popular on commercial servers in recent years. However, the behavior of Java server applications has not been studied extensively. We characterize two Java server benchmarks, SPECjbb2000 and VolanoMark 2.1.2, on two IBM PowerPC architectures, the RS64-III and the POWER3-II, and compare them to more traditional workloads as represented by selected benchmarks from SPECint2000. We find that our Java server benchmarks have generally the same characteristics on both platforms: in particular, high instruction cache, ITLB, and BTAC (Branch Target Address Cache) miss rates. These benchmarks also exhibit high L2 miss rates due mostly to data loads. Instruction cache and L2 misses are seen to be the primary contributors to CPI.
Workload Characterization of Multithreaded Java Servers on Two PowerPC Processors
- 4th Annual IEEE International Workshop on Workload Characterization
, 2001
Cited by 5 (1 self)
Java has, in recent years, become fairly popular as a platform for commercial servers. However, the behavior of Java server applications has not been studied extensively.

