Results 1 -
8 of
8
Controlling garbage collection and heap growth to reduce execution time of Java applications
- In ACM Conference on ObjectOriented Programming, Systems, Languages, and Applications (OOPSLA’01
, 2001
"... ABSTRACT In systems that support garbage collection, a tension exists between collecting garbage too frequently and not collecting garbage frequently enough. Garbage collection that occurs too frequently may introduce unnecessary overheads at the risk of not collecting much garbage during each cycle ..."
Abstract
-
Cited by 26 (0 self)
- Add to MetaCart
ABSTRACT In systems that support garbage collection, a tension exists between collecting garbage too frequently and not collecting garbage frequently enough. Garbage collection that occurs too frequently may introduce unnecessary overheads at the risk of not collecting much garbage during each cycle. On the other hand, collecting garbage too infrequently can result in applications that execute with a large amount of virtual memory (i.e., with a large footprint) and suffer from increased execution times due to paging. In this paper, we use a large collection of JavaTMapplications and the highly tuned and widely used Boehm-DemersWeiser (BDW) conservative mark-and-sweep garbage collector to experimentally examine the extent to which the frequency of garbage collection impacts an application's execution time, footprint, and pause times. We use these results to devise some guidelines for controlling garbage collection and heap growth in a conservative garbage collector in order to minimize application execution times. Then we describe new strategies for controlling garbage collection and heap growth that impact not only the frequency with which garbage collection occurs but also the points at which garbage collection occurs. Experimental results demonstrate that, when compared with the existing approach used in the standard BDW collector, our new strategy can significantly reduce application execution times. Our goal is to obtain a better understanding of how to control garbage collection and heap growth for an individual application executing in isolation. These results can be applied in a number of high-performance computing and server environments, in addition to some single-user environments. This work should also provide insights into how
Immix: A Mark-Region Garbage Collector with Space Efficiency, Fast Collection, and Mutator Performance
, 2008
"... Programmers are increasingly choosing managed languages for modern applications, which tend to allocate many short-to-medium lived small objects. The garbage collector therefore directly determines program performance by making a classic space-time tradeoff that seeks to provide space efficiency, fa ..."
Abstract
-
Cited by 19 (9 self)
- Add to MetaCart
Programmers are increasingly choosing managed languages for modern applications, which tend to allocate many short-to-medium lived small objects. The garbage collector therefore directly determines program performance by making a classic space-time tradeoff that seeks to provide space efficiency, fast reclamation, and mutator performance. The three canonical tracing garbage collectors: semi-space, mark-sweep, and mark-compact each sacrifice one objective. This paper describes a collector family, called mark-region, and introduces opportunistic defragmentation, which mixes copying and marking in a single pass. Combining both, we implement immix, a novel high performance garbage collector that achieves all three performance objectives. The key insight is to allocate and reclaim memory in contiguous regions, at a coarse block grain when possible and otherwise in groups of finer grain lines. We show that immix outperforms existing canonical algorithms, improving total application performance by 7 to 25 % on average across 20 benchmarks. As the mature space in a generational collector, immix matches or beats a highly tuned generational collector, e.g. it improves jbb2000 by 5%. These innovations and the identification of a new family of collectors open new opportunities for garbage collector design.
Correctness-preserving derivation of concurrent garbage collection algorithms
- Available at http://www.worldbank.org/en_breve Jalan, Jyotsna and Martin Ravallion. 2001. “Does piped water reduce diarrhea for children in Rural India.” Policy Research Working Paper
, 2006
"... Constructing correct concurrent garbage collection algorithms is notoriously hard. Numerous such algorithms have been proposed, implemented, and deployed – and yet the relationship among them in terms of speed and precision is poorly understood, and the validation of one algorithm does not carry ove ..."
Abstract
-
Cited by 8 (2 self)
- Add to MetaCart
Constructing correct concurrent garbage collection algorithms is notoriously hard. Numerous such algorithms have been proposed, implemented, and deployed – and yet the relationship among them in terms of speed and precision is poorly understood, and the validation of one algorithm does not carry over to others. As programs with low latency requirements written in garbagecollected languages become part of society’s mission-critical infrastructure, it is imperative that we raise the level of confidence in the correctness of the underlying system, and that we understand the trade-offs inherent in our algorithmic choice. In this paper we present correctness-preserving transformations that can be applied to an initial abstract concurrent garbage collection algorithm which is simpler, more precise, and easier to prove correct than algorithms used in practice — but also more expensive and with less concurrency. We then show how both pre-existing and new algorithms can be synthesized from the abstract algorithm by a series of our transformations. We relate the algorithms formally using a new definition of precision, and informally with respect to overhead and concurrency. This provides many insights about the nature of concurrent collection, allows the direct synthesis of new and useful algorithms, reduces the burden of proof to a single simple algorithm, and lays the groundwork for the automated synthesis of correct concurrent collectors. 1.
Derivation and evaluation of concurrent collectors
- In Proceedings of the Nineteenth European Conference on Object-Oriented Programming
, 2005
"... Abstract. There are many algorithms for concurrent garbage collection, but they are complex to describe, verify, and implement. This has resulted in a poor understanding of the relationships between the algorithms, and has precluded systematic concurrent garbage collection algorithm, and show how ex ..."
Abstract
-
Cited by 3 (3 self)
- Add to MetaCart
Abstract. There are many algorithms for concurrent garbage collection, but they are complex to describe, verify, and implement. This has resulted in a poor understanding of the relationships between the algorithms, and has precluded systematic concurrent garbage collection algorithm, and show how existing snapshot and incremental update collectors, can be derived from the abstract algorithm by reducing precision. We also derive a new hybrid algorithm that reduces floating garbage while terminating quickly. We have implemented a concurrent collector framework and the resulting algorithms in IBM’s J9 Java virtual machine product and compared their performance in terms of space, time, and incrementality. The results show that incremental update algorithms sometimes reduce memory requirements (on 3 of 5 benchmarks) but they also sometimes take longer due to recomputation in the termination phase (on 4 of 5 benchmarks). Our new hybrid algorithm has memory requirements similar to the incremental update collectors while avoiding recomputation in the termination phase. 1
Cross-Language, Type-Safe, and Transparent Object Sharing For Co-Located Managed Runtimes
"... As software becomes increasingly complex and difficult to analyze, it is more and more common for developers to use high-level, type-safe, object-oriented (OO) programming languages and to architect systems that comprise multiple components. Different components are often implemented in different pr ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
As software becomes increasingly complex and difficult to analyze, it is more and more common for developers to use high-level, type-safe, object-oriented (OO) programming languages and to architect systems that comprise multiple components. Different components are often implemented in different programming languages. In state-of-the-art multicomponent, multi-language systems, cross-component communication relies on remote procedure calls (RPC) and message passing. As components are increasingly co-located on the same physical machine to ensure high utilization of multi-core systems, there is a growing potential for using shared memory for cross-language cross-runtime communication. We present the design and implementation of Co-Located Runtime Sharing (CoLoRS), a system that enables crosslanguage,
A Lock-Free, Concurrent, and Incremental Stack Scanning for Garbage Collectors
"... Two major efficiency parameters for garbage collectors are the throughput overheads and the pause times that they introduce. Highly responsive systems need to use collectors with as short as possible pause times. Pause lengths have decreased significantly during the years, especially through the use ..."
Abstract
- Add to MetaCart
Two major efficiency parameters for garbage collectors are the throughput overheads and the pause times that they introduce. Highly responsive systems need to use collectors with as short as possible pause times. Pause lengths have decreased significantly during the years, especially through the use of concurrent garbage collectors. For modern concurrent collectors, the longest pause is typically created by the need to atomically scan the runtime stack. All practical concurrent collectors that we are aware of must obtain a snapshot of the pointers on each thread’s runtime stack, in order to reclaim objects correctly. To further reduce the length of the collector pauses, incremental stack scans were proposed. However, previous such methods employ locks to stop the mutator from accessing a stack frame while it is being scanned. Thus, these methods introduce a potential long and unpredictable pauses for a mutator thread. In this work we propose the first concurrent, incremental, and lock-free stack scanning for garbage collectors, allowing high responsiveness and support for programs that employ fine-synchronization to avoid locks. Our solution can be employed by all concurrent collectors that we are aware of, it is lock-free, it imposes a negligible overhead on the program execution, and it supports the special in-stack references existing in languages like C#.
Concurrent Collection as an Operating System Service for Cross-Runtime Cross-Language Memory Management
"... We present GC-as-a-Service (GaS), a cross-runtime, crosslanguage garbage collection (GC) library that can be used to simplify the implementation of runtime systems, and that exploits available multicore technologies. GaS decouples GC from other runtime components and exposes a fine-grain API for use ..."
Abstract
- Add to MetaCart
We present GC-as-a-Service (GaS), a cross-runtime, crosslanguage garbage collection (GC) library that can be used to simplify the implementation of runtime systems, and that exploits available multicore technologies. GaS decouples GC from other runtime components and exposes a fine-grain API for use by GC-cooperative runtimes of different programming languages for heap memory management. GaS provides concurrent, on-the-fly GC and avoids moving objects for use as a precise or conservative collector. We integrate GaS within production-quality runtime systems for Python and Java. Our experimental evaluation shows that using GaS as an alternative to tightly integrated GC introduces modest overhead and that GaS reduces pause times significantly for Python and Java programs.
Technical Report no. 2008-69 NB-FEB: An Easy-to-Use and Scalable Universal Synchronization Primitive for Parallel Programming
, 811
"... This paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale many-core architectures. The universal synchronization primitives that have been deployed widely in conventional architectures, are the compare-and-swap (CAS) and ..."
Abstract
- Add to MetaCart
This paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale many-core architectures. The universal synchronization primitives that have been deployed widely in conventional architectures, are the compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) primitives. However, such synchronization primitives are expected to reach their scalability limits in the evolution to many-core architectures with thousands of cores. We introduce a non-blocking full/empty bit primitive, or NB-FEB for short, as a promising synchronization primitive for parallel programming on may-core architectures. We show that the NB-FEB primitive is universal, scalable, feasible and convenient to use. NB-FEB, together with registers, can solve the consensus problem for an arbitrary number of processes

