Oil and Water? High Performance Garbage Collection in Java with MMTk
- In ICSE 2004, 26th International Conference on Software Engineering
, 2004
Cited by 81 (18 self)
Increasingly popular languages such as Java and C# require efficient garbage collection. This paper presents the design, implementation, and evaluation of MMTk, a Memory Management Toolkit for and in Java. MMTk is an efficient, composable, extensible, and portable framework for building garbage collectors. MMTk uses design patterns and compiler cooperation to combine modularity and efficiency. The resulting system is more robust, easier to maintain, and has fewer defects than monolithic collectors. Experimental comparisons with monolithic Java and C implementations reveal MMTk has significant performance advantages as well. Performance critical system software typically uses monolithic C at the expense of flexibility. Our results refute common wisdom that only this approach attains efficiency, and suggest that performance critical software can embrace modular design and high-level languages.
Barriers: Friend or Foe?
, 2004
Cited by 32 (5 self)
Modern garbage collectors rely on read and write barriers imposed on heap accesses by the mutator, to keep track of references between different regions of the garbage collected heap, and to synchronize actions of the mutator with those of the collector. It has been a long-standing untested assumption that barriers impose significant overhead to garbage-collected applications. As a result, researchers have devoted effort to development of optimization approaches for elimination of unnecessary barriers, or proposed new algorithms for garbage collection that avoid the need for barriers while retaining the capability for independent collection of heap partitions. On the basis of the results presented here, we dispel the assumption that barrier overhead should be a primary motivator for such efforts. We present a
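As an illustration of the kind of write barrier the abstract above refers to, here is a minimal sketch of a generational barrier that records old-to-young pointers in a remembered set. It is not taken from the paper: the class `BarrierSketch`, the `Obj` type with its single `field`, and the `inNursery` flag are all hypothetical names invented for this example, and a real collector would emit the check inline in compiled code rather than call a method.

```java
import java.util.HashSet;
import java.util.Set;

public class BarrierSketch {
    /** Hypothetical heap object with one reference field. */
    public static final class Obj {
        final boolean inNursery; // true if allocated in the young region
        Obj field;

        public Obj(boolean inNursery) {
            this.inNursery = inNursery;
        }
    }

    // Mature objects that may point into the nursery. Obj does not
    // override equals/hashCode, so membership is by identity.
    private final Set<Obj> rememberedSet = new HashSet<>();

    /** Every reference store by the mutator goes through this barrier. */
    public void writeField(Obj src, Obj target) {
        // Only old-to-young pointers cross the region boundary in the
        // direction a nursery collection needs to know about.
        if (!src.inNursery && target != null && target.inNursery) {
            rememberedSet.add(src);
        }
        src.field = target;
    }

    public int rememberedCount() {
        return rememberedSet.size();
    }

    public static void main(String[] args) {
        BarrierSketch heap = new BarrierSketch();
        Obj old = new Obj(false);
        Obj young = new Obj(true);
        heap.writeField(young, old); // young-to-old: nothing recorded
        heap.writeField(old, young); // old-to-young: source is remembered
        System.out.println("remembered objects: " + heap.rememberedCount());
    }
}
```

The cost under discussion in the abstract is the extra test (and occasional set insertion) on each reference store; the sketch only shows where that work sits, not how expensive it is.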
Quantifying the performance of garbage collection vs. explicit memory management
- In: Proc. ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)
, 2005
Cited by 31 (5 self)
Garbage collection yields numerous software engineering benefits, but its quantitative impact on performance remains elusive. One can compare the cost of conservative garbage collection to explicit memory management in C/C++ programs by linking in an appropriate collector. This kind of direct comparison is not possible for languages designed for garbage collection (e.g., Java), because programs in these languages naturally do not contain calls to free. Thus, the actual gap between the time and space performance of explicit memory management and precise, copying garbage collection remains unknown. We introduce a novel experimental methodology that lets us quantify the performance of precise garbage collection versus explicit memory management. Our system allows us to treat unaltered Java programs as if they used explicit memory management by relying
Statistically rigorous Java performance evaluation
- In Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA)
, 2007
Cited by 23 (3 self)
Java performance is far from being trivial to benchmark because it is affected by various factors such as the Java application, its input, the virtual machine, the garbage collector, the heap size, etc. In addition, non-determinism at run-time causes the execution time of a Java program to differ from run to run. There are a number of sources of non-determinism such as Just-In-Time (JIT) compilation and optimization in the virtual machine (VM) driven by timer-based method sampling, thread scheduling, garbage collection, and various system effects. There exist a wide variety of Java performance evaluation methodologies used by researchers and benchmarkers. These methodologies differ from each other in a number of ways. Some report average performance over a number of runs of the same experiment; others report the best or second best performance observed; yet others report the worst. Some iterate the benchmark multiple times within a single VM invocation; others consider multiple VM invocations and iterate a single benchmark execution; yet others consider multiple VM invocations and iterate the benchmark multiple times. This paper shows that prevalent methodologies can be misleading, and can even lead to incorrect conclusions. The reason is that the data analysis is not statistically rigorous. In this paper, we present a survey of existing Java performance evaluation methodologies and discuss the importance of statistically rigorous data analysis for dealing with non-determinism. We advocate approaches to quantify startup as well as steady-state performance, and, in addition, we provide the JavaStats software to automatically obtain performance numbers in a rigorous manner. Although this paper focuses on Java performance evaluation, many of the issues addressed in this paper also apply to other programming languages and systems that build on a managed runtime system.
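To make "statistically rigorous data analysis" concrete, the sketch below reports a mean with a confidence interval over several VM invocations instead of a single best or average number. It is an assumption-laden illustration, not the JavaStats tool: the class `RunStats`, its method names, and the five timings are made up, and the critical value 2.776 is the standard two-sided 95% Student t value for 4 degrees of freedom (5 runs).

```java
public class RunStats {
    /** Arithmetic mean of the measurements. */
    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 in the denominator). */
    public static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0.0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return Math.sqrt(squares / (xs.length - 1));
    }

    /** Half-width of the confidence interval for a given critical value. */
    public static double halfWidth(double[] xs, double critical) {
        return critical * sampleStdDev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        // Made-up execution times in seconds, one per VM invocation.
        double[] seconds = {10.0, 12.0, 11.0, 13.0, 9.0};
        double t95 = 2.776; // Student t, 4 degrees of freedom, 95% two-sided
        double m = mean(seconds);
        double h = halfWidth(seconds, t95);
        System.out.printf("mean = %.2f s, 95%% CI = [%.2f, %.2f]%n", m, m - h, m + h);
    }
}
```

When comparing two configurations this way, overlapping intervals are a signal that the measured difference may be noise rather than a real effect.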
A unified theory of garbage collection
- In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications
, 2004
Cited by 8 (2 self)
Tracing and reference counting are uniformly viewed as being fundamentally different approaches to garbage collection that possess very distinct performance properties. We have implemented high-performance collectors of both types, and in the process observed that the more we optimized them, the more similarly they behaved, and that they seem to share some deep structure. We present a formulation of the two algorithms that shows that they are in fact duals of each other. Intuitively, the difference is that tracing operates on live objects, or “matter”, while reference counting operates on dead objects, or “anti-matter”. For every operation performed by the tracing collector, there is a precisely corresponding anti-operation performed by the reference counting collector. Using this framework, we show that all high-performance collectors (for example, deferred reference counting and generational collection) are in fact hybrids of tracing and reference counting. We develop a uniform cost-model for the collectors to quantify the trade-offs that result from choosing different hybridizations of tracing and reference counting. This allows the correct scheme to be selected based on system performance requirements and the expected properties of the target application.
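The "matter" versus "anti-matter" intuition can be sketched on a toy heap. This is a simplified illustration under stated assumptions rather than the paper's formulation: the class `GcDuality`, the integer object ids, and the map-based heap are hypothetical, and every object is assumed to appear as a key in the heap map. Tracing starts at the roots and propagates liveness; reference counting starts at objects whose count is zero and propagates decrements.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GcDuality {
    /** Tracing: work outward from the roots over live objects. */
    public static Set<Integer> trace(Map<Integer, List<Integer>> heap, Set<Integer> roots) {
        Set<Integer> live = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            int obj = work.pop();
            if (live.add(obj)) {
                work.addAll(heap.getOrDefault(obj, List.of()));
            }
        }
        return live;
    }

    /** Reference counting: work outward from zero-count (dead) objects. */
    public static Set<Integer> refCountDead(Map<Integer, List<Integer>> heap, Set<Integer> roots) {
        // A root reference contributes one count, as a heap reference does.
        Map<Integer, Integer> counts = new HashMap<>();
        for (int obj : heap.keySet()) {
            counts.put(obj, roots.contains(obj) ? 1 : 0);
        }
        for (List<Integer> fields : heap.values()) {
            for (int target : fields) {
                counts.merge(target, 1, Integer::sum);
            }
        }
        Deque<Integer> work = new ArrayDeque<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() == 0) {
                work.push(e.getKey());
            }
        }
        Set<Integer> dead = new HashSet<>();
        while (!work.isEmpty()) {
            int obj = work.pop();
            dead.add(obj);
            // Freeing an object removes its outgoing references.
            for (int target : heap.getOrDefault(obj, List.of())) {
                if (counts.merge(target, -1, Integer::sum) == 0) {
                    work.push(target);
                }
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> heap = new HashMap<>();
        heap.put(1, List.of(2));
        heap.put(2, List.of());
        heap.put(3, List.of(4));
        heap.put(4, List.of());
        Set<Integer> roots = Set.of(1);
        System.out.println("live by tracing: " + trace(heap, roots));
        System.out.println("dead by counting: " + refCountDead(heap, roots));
    }
}
```

On an acyclic heap like this one the two sets are complements. With an unreachable cycle, the counting pass in this sketch never sees a zero count and leaves the cycle unreclaimed, which is the well-known reason pure reference counting needs a backup mechanism.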
Investigating the throughput degradation behavior of Java application servers: A view from inside the virtual machine
- In: Proceedings of the 4th International Conference on Principles and Practices of Programming in Java
, 2006
Cited by 6 (2 self)
Application servers are gaining popularity as a way for businesses to conduct day-to-day operations. Currently, the most adopted technologies for application servers are Java and .NET. While strong emphasis has been placed on the performance and throughput of these servers, only a few research efforts have focused on their degradation behaviors, specifically on how they perform under stress and on the factors that affect their throughput degradation. As a preliminary study, we conducted experiments to observe the throughput degradation behavior of Java application servers and found that the throughput degrades ungracefully. Thus, the goal of this work is three-fold: (i) identifying the primary factors that cause poor throughput degradation, (ii) investigating how these factors affect the throughput degradation, and (iii) observing how changes of algorithms and policies governing these factors affect the throughput degradation. Categories and Subject Descriptors: D.3.4 [Programming Language]: Processors - Memory management (garbage collection)
Generating object lifetime traces with Merlin
- ACM Transactions on Programming Languages and Systems (TOPLAS)
, 2006
Cited by 6 (2 self)
Programmers are writing a rapidly growing number of programs in object-oriented languages, such as Java and C#, that require garbage collection. Garbage collection traces and simulation speed up research by enabling deeper understandings of object lifetime behavior and quick exploration and design of new garbage collection algorithms. When generating perfect traces, the brute-force method of computing object lifetimes requires a whole-heap garbage collection at every potential collection point in the program. Because this process is prohibitively expensive, researchers often use granulated traces by collecting only periodically, for example, every 32 KB of allocation. We extend the state of the art for simulating garbage collection algorithms in two ways. First, we develop a systematic methodology for simulation studies of copying garbage collection and present results showing the effects of trace granularity on these simulations. We show that trace granularity often distorts simulated garbage collection results compared with perfect traces. Second, we present and measure the performance of a new algorithm called Merlin for computing object lifetimes.
The Mapping Collector: Virtual Memory Support for Generational, Parallel, and Concurrent Compaction
, 2008
Cited by 4 (3 self)
Parallel and concurrent garbage collectors are increasingly employed by managed runtime environments (MREs) to maintain scalability, as multi-core architectures and multi-threaded applications become pervasive. Moreover, state-of-the-art MREs commonly implement compaction to eliminate heap fragmentation and enable fast linear object allocation. Our empirical analysis of object demographics reveals that unreachable objects in the heap tend to form clusters large enough to be effectively managed at the granularity of virtual memory pages. Even though processes can manipulate the mapping of the virtual address space through the standard operating system (OS) interface on most platforms, extant parallel/concurrent compactors do not do so to exploit this clustering behavior and instead achieve compaction by performing relatively expensive object moving and pointer adjustment. We introduce the Mapping Collector (MC), which leverages virtual memory operations to reclaim and consolidate free space without moving objects and updating pointers. MC is a nearly-single-phase compactor that is simpler and more efficient than previously reported compactors that comprise two to four phases. Through effective MRE-OS coordination, MC maintains the simplicity of a non-moving collector while providing efficient parallel and concurrent compaction. We implement both stop-the-world and concurrent MC in a generational garbage collection framework within the open-source HotSpot Java Virtual Machine. Our experimental evaluation using a multiprocessor indicates that MC significantly increases throughput and scalability as well as reduces pause times, relative to state-of-the-art, parallel and concurrent compactors.
Systematic Dynamic Memory Management Design Methodology for Reduced Memory Footprint
Cited by 3 (3 self)
New portable consumer embedded devices must execute multimedia and wireless network applications that demand an extensive memory footprint. Moreover, they must rely heavily on Dynamic Memory (DM) due to the unpredictability of the input data (e.g. 3D stream features) and system behavior (e.g. the number of applications running concurrently, as defined by the user). Within this context, consistent design methodologies that can efficiently tackle the complex DM behavior of these multimedia and network applications are in great need. In this paper, we present a new methodology for designing custom DM management mechanisms with a reduced memory footprint for this kind of dynamic application. First, our methodology describes the large design space of DM management decisions for multimedia and wireless network applications. Then, we propose a suitable way to traverse the aforementioned design space and construct custom DM managers that minimize the DM used by these highly dynamic applications. As a result, our methodology improves memory footprint by 60% on average in real case studies over the current state-of-the-art DM managers used for these types of dynamic applications.
Caching strategies for improving generational garbage collection in Smalltalk
- North Carolina State University
, 2003
Cited by 2 (0 self)
In generational garbage collection, the youngest generation of the heap is frequently traversed during garbage collection. Due to randomness of the traversal, memory access patterns are unpredictable and cache performance becomes crucial to garbage-collection efficiency. Our proposal to improve cache performance of garbage collection is to “pin” the youngest generation (sometimes called the nursery) in the cache, converting all nursery accesses to cache hits. To make the nursery fit inside the cache, we reduce its size, and, to prevent its eviction from the cache, we configure the operating system’s page-fault handler to disallow any page allocation that would cause cache conflicts with the nursery. We evaluated our scheme on a copying-style generational garbage collector using IBM VisualAge Smalltalk and the Jikes research virtual machine. Our simulation results indicate that the increase in frequency of garbage collection due to a smaller nursery is overshadowed by gains of converting all nursery accesses to cache hits.

