Results 1 - 10
of
36
Optimizing Java Bytecode using the Soot Framework: Is it Feasible?
- In Proceedings of CC’00, International Conference on Compiler Construction (2000), Springer-Verlag (LNCS
, 2000
"... . This paper presents Soot, a framework for optimizing Java TM bytecode. The framework is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of Java's stack-based bytecode; Jimple, a typed three-address intermedi ..."
Abstract
-
Cited by 86 (14 self)
- Add to MetaCart
. This paper presents Soot, a framework for optimizing Java TM bytecode. The framework is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of Java's stack-based bytecode; Jimple, a typed three-address intermediate representation suitable for optimization; and Grimp, an aggregated version of Jimple. Our approach to class le optimization is to rst convert the stack-based bytecode into Jimple, a three-address form more amenable to traditional program optimization, and then convert the optimized Jimple back to bytecode. In order to demonstrate that our approach is feasible, we present experimental results showing the eects of processing class les through our framework. In particular, we study the techniques necessary to effectively translate Jimple back to bytecode, without losing performance. Finally, we demonstrate that class le optimization can be quite eective by showing the resul...
Soot - a Java Bytecode Optimization Framework
, 1999
"... This paper presents Soot, a framework for optimizing Java bytecode. The framework is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of bytecode which is simple to manipulate; Jimple, a typed 3-address intermedi ..."
Abstract
-
Cited by 84 (0 self)
- Add to MetaCart
This paper presents Soot, a framework for optimizing Java bytecode. The framework is implemented in Java and supports three intermediate representations for representing Java bytecode: Baf, a streamlined representation of bytecode which is simple to manipulate; Jimple, a typed 3-address intermediate representation suitable for optimization; and Grimp, an aggregated version of Jimple suitable for decompilation. We describe the motivation for each representation, and the salient points in translating from one representation to another. In order to demonstrate the usefulness of the framework, we have implemented intraprocedural and whole program optimizations. To show that whole program bytecode optimization can give performance improvements, we provide experimental results for 12 large benchmarks, including 8 SPECjvm98 benchmarks running on JDK 1.2 for GNU/Linuxtm . These results show up to 8% improvement when the optimized bytecode is run using the interpreter and up to 21% when run...
Using Annotations to Reduce Dynamic Optimization Time
, 2001
"... Dynamic compilation and optimization are widely used in heterogenous computing environments, in which an intermediate form of the code is compiled to native code during execution. An important tradeoff exists between the amount of time spent dynamically optimizing the program and the running time of ..."
Abstract
-
Cited by 48 (13 self)
- Add to MetaCart
Dynamic compilation and optimization are widely used in heterogenous computing environments, in which an intermediate form of the code is compiled to native code during execution. An important tradeoff exists between the amount of time spent dynamically optimizing the program and the running time of the program. The time to perform dynamic optimizations can cause significant delays during execution and also prohibit performance gains that result from more complex optimization.
A brief history of just-in-time
- ACM Computing Surveys
, 2003
"... Software systems have been using “just-in-time ” compilation (JIT) techniques since the 1960s. Broadly, JIT compilation includes any translation performed dynamically, after a program has started execution. We examine the motivation behind JIT compilation and constraints imposed on JIT compilation s ..."
Abstract
-
Cited by 42 (1 self)
- Add to MetaCart
Software systems have been using “just-in-time ” compilation (JIT) techniques since the 1960s. Broadly, JIT compilation includes any translation performed dynamically, after a program has started execution. We examine the motivation behind JIT compilation and constraints imposed on JIT compilation systems, and present a classification scheme for
Continuous Program Optimization: A Case Study
- ACM Transactions on Programming Languages and Systems
, 2003
"... This paper presents a system that provides code generation at load-time and continuous program optimization at run-time. First, the architecture of the system is presented. Then, two optimization techniques are discussed that were developed specifically in the context of continuous optimization. The ..."
Abstract
-
Cited by 38 (7 self)
- Add to MetaCart
This paper presents a system that provides code generation at load-time and continuous program optimization at run-time. First, the architecture of the system is presented. Then, two optimization techniques are discussed that were developed specifically in the context of continuous optimization. The first of these optimizations continually adjusts the storage layouts of dynamic data structures to maximize data cache locality, while the second performs profile-driven instruction re-scheduling to increase instruction-level parallelism. These two optimizations have very di#erent cost/benefit ratios, presented in a series of benchmarks. The paper concludes with an outlook to future research directions and an enumeration of some remaining research problems. The empirical results presented in this paper make a case in favor of continuous optimization, but indicate that it needs to be applied judiciously. In many situations, the costs of dynamic optimizations outweigh their benefit, so that no break-even point is ever reached. In favorable circumstances, on the other hand, speed-ups of over 120% have been observed. It appears as if the main beneficiaries of continuous optimization are shared libraries, which at di#erent times can be optimized in the context of the currently dominant client application.
An Architectural Framework for Run-Time Optimization
- IEEE Transactions on Computers
, 2001
"... Wide-issue processors continue to achieve higher performance by exploiting greater instruction-level parallelism. Dynamic techniques such as out-of-order execution and hardware speculation have proven effective at increasing instruction throughput. Run-time optimization promises to provide an even ..."
Abstract
-
Cited by 30 (2 self)
- Add to MetaCart
Wide-issue processors continue to achieve higher performance by exploiting greater instruction-level parallelism. Dynamic techniques such as out-of-order execution and hardware speculation have proven effective at increasing instruction throughput. Run-time optimization promises to provide an even higher level of performance by adaptively applying aggressive code transformations on a larger scope. This paper presents a new hardware mechanism for generating and deploying run-time optimized code. The mechanism can be viewed as a filtering system, that resides in the retirement stage of the processor pipeline, accepts an instruction execution stream as input, and produces instruction profiles and sets of linked, optimized traces as output. The code deployment mechanism uses an extension to the branch prediction mechanism to migrate execution into the new code without modifying the original code. These new components do not add delay to the execution of the program except during short bursts of reoptimization. This technique provides a strong platform for run-time optimization because the hot execution regions are extracted, optimized, and written to main memory for execution and because these regions persist across context switches. The current design of the framework supports a suite of optimizations including partial function inlining (even into shared libraries), code straightening optimizations, loop unrolling, and peephole optimizations. 1
Architectural Issues in Java Runtime Systems
, 1999
"... The Java Virtual Machine (JVM) is the corner stone of Java technology, and its efficiency in executing the portable Java bytecodes is crucial for the success of this technology. Interpretation, Just-In-Time (JIT) compilation, and hardware realization are well known solutions for a JVM, and previous ..."
Abstract
-
Cited by 27 (10 self)
- Add to MetaCart
The Java Virtual Machine (JVM) is the corner stone of Java technology, and its efficiency in executing the portable Java bytecodes is crucial for the success of this technology. Interpretation, Just-In-Time (JIT) compilation, and hardware realization are well known solutions for a JVM, and previous research has proposed optimizations for each of these techniques. However, each technique has its pros and cons and may not be uniformly attractive for all hardware platforms. Instead, an understanding of the architectural implications of JVM implementations with real applications, can be crucial to the development of enabling technologies for efficient Java runtime system development on a wide range of platforms (from resource-rich servers to resource-constrained hand-held/embedded systems). Towards this goal, this paper examines architectural issues, from both the hardware and JVM implementation perspectives. It specifically explores the potential of a smart JIT compiler strategy that can ...
The Open Runtime Platform: A Flexible High-Performance Managed Runtime Environment
- Intel Technology Journal
, 2003
"... managed runtime environment (MRTE) that features exact generational garbage collection, fast thread synchronization, and multiple coexisting just-in-time compilers (JITs). ORP was designed for flexibility in order to support experiments in dynamic compilation, garbage collection, synchronization, an ..."
Abstract
-
Cited by 25 (8 self)
- Add to MetaCart
managed runtime environment (MRTE) that features exact generational garbage collection, fast thread synchronization, and multiple coexisting just-in-time compilers (JITs). ORP was designed for flexibility in order to support experiments in dynamic compilation, garbage collection, synchronization, and other technologies. It can be built to run either Java or Common Language Infrastructure (CLI) applications, to run under the Windows or Linux operating systems, and to run on the IA-32 or Itanium processor family (IPF) architectures.
LLVA: A Low-level Virtual Instruction Set Architecture
- IN MICRO-36
, 2003
"... A virtual instruction set architecture (V-ISA) implemented via a processor-specific software translation layer can provide great flexibility to processor designers. Recent examples such as Crusoe and DAISY, however, have used existing hardware instruction sets as virtual ISAs, which complicates tran ..."
Abstract
-
Cited by 20 (3 self)
- Add to MetaCart
A virtual instruction set architecture (V-ISA) implemented via a processor-specific software translation layer can provide great flexibility to processor designers. Recent examples such as Crusoe and DAISY, however, have used existing hardware instruction sets as virtual ISAs, which complicates translation and optimization. In fact, there has been little research on specific designs for a virtual ISA for processors. This paper proposes a novel virtual ISA (LLVA) and a translation strategy for implementing it on arbitrary hardware. The instruction set is typed, uses an infinite virtual register set in Static Single Assignment form, and provides explicit control-flow and dataflow information, and yet uses low-level operations closely matched to traditional hardware. It includes novel mechanisms to allow more flexible optimization of native code, including a flexible exception model and minor constraints on self-modifying code. We propose a translation strategy that enables offline translation and transparent offline caching of native code and profile information, while remaining completely OS-independent. It also supports optimizations directly on the representation at install-time, runtime, and offline between executions. We show experimentally that the virtual ISA is compact, it is closely matched to ordinary hardware instruction sets, and permits very fast code generation, yet has enough high-level information to permit sophisticated program analyses and optimizations.
An Evaluation of Java for Numerical Computing
- In Proceedings of ISCOPE'98
, 1998
"... This paper describes the design and implementation of high performance numerical software in Java. Our primary goals are to characterize the performance of object-oriented numerical software written in Java and to investigate whether Java is a suitable language for such endeavors. We have implemente ..."
Abstract
-
Cited by 20 (1 self)
- Add to MetaCart
This paper describes the design and implementation of high performance numerical software in Java. Our primary goals are to characterize the performance of object-oriented numerical software written in Java and to investigate whether Java is a suitable language for such endeavors. We have implemented JLAPACK, a subset of the LAPACK library in Java. LAPACK is a high-performance Fortran 77 library used to solve common linear algebra problems. JLAPACK is an object-oriented library, using encapsulation, inheritance, and exception handling. It performs within a factor of four of the optimized Fortran version for certain platforms and test cases. When used with the native BLAS library, JLAPACK performs comparably with the Fortran version using the native BLAS library. We conclude that high-performance numerical software could be written in Java if a handful of concerns about language features and compilation strategies are adequately addressed.

