Beltway: Getting Around Garbage Collection Gridlock
- PLDI'02
, 2002
Cited by 59 (16 self)
We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware.
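The belt and increment structure described in this abstract can be modeled as a small toy. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `Belt`, `collect_oldest`, `increment_capacity` and `is_live` are hypothetical, increments are plain Python lists rather than memory regions, and the promotion rule (survivors move to the next higher belt) is only one possible configuration.

```python
from collections import deque


class Belt:
    """A FIFO queue of increments; each increment is a list of objects."""

    def __init__(self, increment_capacity):
        self.increment_capacity = increment_capacity
        self.increments = deque()

    def add(self, obj):
        # Objects always enter the youngest (rightmost) increment;
        # a new increment is opened when the youngest one is full.
        if not self.increments or len(self.increments[-1]) >= self.increment_capacity:
            self.increments.append([])
        self.increments[-1].append(obj)


def collect_oldest(belts, index, is_live):
    """Collect the oldest increment of belts[index] and copy survivors out.

    Assumption: survivors move to the next higher belt, or re-enter the
    same belt when it is already the highest one.
    """
    belt = belts[index]
    if not belt.increments:
        return 0
    victim = belt.increments.popleft()  # first in, first out
    target = belts[min(index + 1, len(belts) - 1)]
    survivors = [obj for obj in victim if is_live(obj)]
    for obj in survivors:
        target.add(obj)
    return len(victim) - len(survivors)  # number of objects reclaimed


# Two belts: a nursery with small increments and an older belt.
belts = [Belt(increment_capacity=2), Belt(increment_capacity=4)]
for name in ("a", "b", "c"):
    belts[0].add(name)
reclaimed = collect_oldest(belts, 0, is_live=lambda obj: obj != "a")
```

In this toy, a single belt with a single increment behaves like one space of a copying collector, and two belts with promotion resemble a generational arrangement. Which configurations correspond to which collector is a claim of the paper, not something this sketch demonstrates.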
A Generational Mostly-concurrent Garbage Collector
- IN PROCEEDINGS OF THE INTERNATIONAL SYMPOSIUM ON MEMORY MANAGEMENT
, 2000
Cited by 43 (6 self)
This paper reports our experiences with a mostly-concurrent incremental garbage collector, implemented in the context of a high performance virtual machine for the Java™ programming language. The garbage collector is based on the "mostly parallel" collection algorithm of Boehm et al., and can be used as the old generation of a generational memory system. It overloads efficient write-barrier code already generated to support generational garbage collection to also identify objects that were modified during concurrent marking. These objects must be rescanned to ensure that the concurrent marking phase marks all live objects. This algorithm minimises maximum garbage collection pause times, while having only a small impact on the average garbage collection pause time and overall execution time. We support our claims with experimental results, for both a synthetic benchmark and real programs.
Towards flexible and safe technology for runtime evolution of Java language applications
- In Proceedings of the Workshop on Engineering Complex Object-Oriented Systems for Evolution, in association with OOPSLA 2001 International Conference
, 2001
Cited by 30 (2 self)
There is a class of important computer applications that must run without interruption and yet must be changed from time to time to fix bugs or upgrade functionality. In this paper, we present the initial runtime evolution framework that we have developed for the HotSpot Java Virtual Machine, which allows us to change running applications on the fly, without interruption. We describe our staged implementation plan, where stages correspond to increasing levels of implementation complexity, yet each stage provides a reasonably complete set of facilities. The first stage (support for changes to method bodies only) has already been implemented and is to be included in the forthcoming Java 2 Platform release. We discuss multiple policies for dealing with active methods of running applications, present our thoughts on instance conversion implementation, and suggest that runtime evolution technology can be used for dynamic fine-grain profiling of applications.
GC Points in a Threaded Environment
- SMLI TR-98-70. Sun Microsystems
, 1998
Cited by 17 (0 self)
Many garbage-collected systems, including most that involve a stop-the-world phase, restrict GC to so-called GC points. In single-threaded environments, GC points carry no overhead: when a GC must be done, the single thread is already at a GC point. In multi-threaded environments, however, only the thread that triggers the GC by failing an allocation will be at a GC point. Other threads must be rolled forward to their next GC point before the GC can take place. We compare, in the context of a high-performance Java™ virtual machine, two approaches to advancing threads to a GC point, polling and code patching, while keeping all other factors constant. Code patching outperforms polling by an average of 4.7% and sometimes by as much as 11.2%, while costing only slightly more compiled code space. Put differently, since most programs spend less than 1/5 of the time in GC, a 4.7% bottom-line speedup amounts to more than a 20% reduction in the GC-related costs. Patching is, however, more di...
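The arithmetic behind the "more than 20%" figure can be checked in a few lines. This is a sketch under two assumptions that the abstract does not spell out: the 4.7% speedup is treated as a fraction of total execution time, and all of it is attributed to GC-related work. The function name `gc_cost_reduction` is hypothetical.

```python
def gc_cost_reduction(total_speedup, gc_fraction):
    """Share of GC-related cost removed, if the whole speedup comes out of GC time."""
    if not 0 < gc_fraction <= 1:
        raise ValueError("gc_fraction must be in (0, 1]")
    return total_speedup / gc_fraction


# Upper bound from the abstract: GC takes less than 1/5 of the run time.
reduction = gc_cost_reduction(0.047, 0.20)
print(f"{reduction:.1%}")
```

With GC at exactly one fifth of the run time the reduction is about 23.5%. A smaller GC share makes the relative reduction larger, which is why the abstract can state the result as a lower bound.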
Efficient Object Sampling Via Weak References
, 2000
Cited by 9 (0 self)
The performance of automatic memory management may be improved if the policies used in allocating and collecting objects had knowledge of the lifetimes of objects. To date, approaches to the pretenuring of objects in older generations have relied on profile-driven feedback gathered from trace runs. This feedback has been used to specialize allocation sites in a program. These approaches suffer from a number of limitations. We propose an alternative that, through efficient sampling of objects, allows for on-line adaptation of allocation sites to improve the efficiency of the memory system. In doing so, we make use of a facility already present in many collectors such as those found in Java™ virtual machines: weak references. By judiciously tracking a subset of allocated objects with weak references, we are able to gather the necessary statistics to make better object-placement decisions.
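The sampling idea can be illustrated with Python's standard `weakref` module, which offers a comparable facility. This is only a sketch of the general approach, not the paper's mechanism: the class `SiteSampler`, the fixed sampling `period`, the site labels and the `Node` class are all hypothetical, and the example relies on CPython reclaiming unreferenced objects immediately.

```python
import weakref
from collections import Counter


class SiteSampler:
    """Tracks every `period`-th allocation per site with a weak reference."""

    def __init__(self, period):
        self.period = period
        self.allocated = Counter()  # allocations seen per site
        self.sampled = Counter()    # objects sampled per site
        self.died = Counter()       # sampled objects that were later reclaimed
        self._refs = {}             # keeps the weak references themselves alive

    def record(self, site, obj):
        self.allocated[site] += 1
        if self.allocated[site] % self.period != 0:
            return obj
        self.sampled[site] += 1

        def on_death(ref, site=site):
            # Called by the runtime when the sampled object is reclaimed.
            self.died[site] += 1
            self._refs.pop(id(ref), None)

        ref = weakref.ref(obj, on_death)
        self._refs[id(ref)] = ref
        return obj

    def survival_rate(self, site):
        """Fraction of sampled objects from `site` that are still alive."""
        if self.sampled[site] == 0:
            return None
        return 1 - self.died[site] / self.sampled[site]


class Node:
    """Stand-in for an application object (hypothetical)."""


sampler = SiteSampler(period=2)
long_lived = [sampler.record("cache", Node()) for _ in range(4)]
for _ in range(4):
    sampler.record("temp", Node())  # result dropped, so the object dies at once in CPython
```

A site whose sampled objects mostly survive, such as `"cache"` here, would be a candidate for allocation directly into an older generation. How the statistics feed back into allocation decisions is the subject of the paper and is not modeled in this sketch.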
A parallel, incremental, mostly concurrent garbage collector for servers
- ACM Transactions on Programming Languages and Systems
Cited by 5 (1 self)
Multithreaded applications with multi-gigabyte heaps running on modern servers provide new challenges for garbage collection (GC). The challenges for “server-oriented” GC include: ensuring short pause times on a multi-gigabyte heap while minimizing throughput penalty, good scaling on multiprocessor hardware, and keeping the number of expensive multi-cycle fence instructions required by weak ordering to a minimum. We designed and implemented a collector that meets these demands, building on the mostly concurrent garbage collector proposed by Boehm et al. Our collector incorporates new ideas into the original collector. We make it parallel and incremental; we employ concurrent low-priority background GC threads to take advantage of processor idle time; we propose novel algorithmic improvements to the basic mostly concurrent algorithm, improving its efficiency and shortening its pause times; and finally, we use advanced techniques, such as a low-overhead work packet mechanism, to enable full parallelism among the incremental and concurrent collecting threads and ensure load balancing. We compared the new collector to the mature, well-optimized, parallel, stop-the-world mark-sweep collector already in the IBM JVM. When allowed to run aggressively, using 72% of the CPU during a short concurrent phase, our collector prototype reduces the maximum pause time from 161 ms to 46 ms while losing only 11.5% throughput when running the SPECjbb2000 benchmark on a 600 MB heap on an 8-way 1.1 GHz PowerPC machine. When the collector is limited to non-intrusive operation using only 29% of the CPU, the maximum pause time obtained is 79 ms and the loss in throughput is 15.4%.
Concurrent Remembered Set Refinement in Generational Garbage Collection
- In USENIX Java Virtual Machine Research and Technology Symposium (JVM’02)
, 2002
The Moxie JVM Experience
Cited by 2 (2 self)
By January 1998, only two years after the launch of the first Java virtual machine, almost all JVMs in use today had been architected. In the nine years since, technology has advanced enormously in the underlying hardware, in language implementation, and in the application domain. Although JVM technology has moved forward in leaps and bounds, basic design decisions made in the 1990s have anchored JVM implementation. The Moxie project set out to explore the question: ‘How would we design a JVM from scratch knowing what we know today?’ Amid the mass of design questions we faced, the tension between performance and flexibility was pervasive, persistent and problematic. In this experience paper we describe the Moxie project and its lessons, a process which began with consulting experts from industry and academia, and ended with a fully working prototype.
Modular Integration of Memory Management in Programming Language System
, 2002
This paper proposes an implementation scheme for programming language interpreters that addresses both the modular design of the memory management system and execution efficiency. Modularity lets the language designer and the application programmer incorporate the most effective algorithm in their systems. Modular implementations of memory management systems can be used as a base for empirical analysis of various memory management systems. The proposed system offers an abstract interface that can be used to implement various memory management algorithms. Our specialization technique takes the implementation of the interpreter and the application bytecode and generates specialized versions of them whose memory management interface calls target the given memory management system. Benchmark results show that the specialization technique eliminates the 5-10% overhead and that our system works as efficiently as a hand-tuned programming language system.
Sun Microsystems
Generational garbage collection divides a heap up into two or more generations, and usually collects a youngest generation most frequently. Collection of the youngest generation requires identification of pointers into that generation from older generations; a data structure that supports such identification is called a remembered set. Various remembered set mechanisms have been proposed; these generally require mutator code to execute a write barrier when modifying pointer fields. Remembered set data structures can vary in their precision: an imprecise structure requires the garbage collector to do more work to find old-to-young pointers. Generally there is a tradeoff between remembered set precision and barrier cost: a more precise remembered set requires a more elaborate barrier. Many current systems tend to favor more efficient barriers in this tradeoff, as shown by the widespread popularity of relatively imprecise card marking techniques. This imprecision becomes increasingly costly as the ratio between old- and young-generation sizes grows. We propose a technique that maintains more precise remembered sets that scale with old-generation size, using a barrier whose cost is not significantly greater than card marking.
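The precision tradeoff described above can be made concrete with a toy comparison of two barrier styles. This sketch illustrates the tradeoff only and is not the technique the paper proposes: the class names, the 512-byte `CARD_SIZE` and the use of integer field addresses are illustrative assumptions.

```python
CARD_SIZE = 512  # bytes per card; an illustrative value


class CardTable:
    """Imprecise remembered set: one dirty flag per fixed-size card."""

    def __init__(self, heap_bytes):
        self.dirty = bytearray((heap_bytes + CARD_SIZE - 1) // CARD_SIZE)

    def write_barrier(self, field_addr):
        # Cheap barrier: one index computation and one byte store.
        self.dirty[field_addr // CARD_SIZE] = 1

    def bytes_to_scan(self):
        # The collector must scan every dirty card in full.
        return sum(self.dirty) * CARD_SIZE


class PreciseRememberedSet:
    """Precise remembered set: records each modified field address."""

    def __init__(self):
        self.fields = set()

    def write_barrier(self, field_addr):
        # Costlier barrier, but the collector visits only recorded fields.
        self.fields.add(field_addr)

    def fields_to_scan(self):
        return len(self.fields)


cards = CardTable(heap_bytes=4096)
precise = PreciseRememberedSet()
for addr in (8, 16, 1032):  # three pointer stores into the old generation
    cards.write_barrier(addr)
    precise.write_barrier(addr)
```

Here three stores dirty two cards, so the card-marking collector scans 1024 bytes to find them, while the precise set points at exactly three fields. The gap widens as the old generation grows, which is the cost the abstract refers to.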

