Results 1 - 10
of
14
Beltway: Getting Around Garbage Collection Gridlock
- PLDI'02
, 2002
"... We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts in ..."
Abstract
-
Cited by 59 (16 self)
- Add to MetaCart
We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware.
Ulterior Reference Counting: Fast Garbage Collection without a Long Wait
- IN OOPSLA 2003 ACM CONFERENCE ON OBJECT-ORIENTED PROGRAMMING, SYSTEMS, LANGUAGES AND APPLICATIONS
, 2003
"... General purpose garbage collectors have yet to combine short pause times with high throughput. For example, generational collectors can achieve high throughput. They have modest average pause times, but occasionally collect the whole heap and consequently incur long pauses. At the other extreme, con ..."
Abstract
-
Cited by 31 (7 self)
- Add to MetaCart
General purpose garbage collectors have yet to combine short pause times with high throughput. For example, generational collectors can achieve high throughput. They have modest average pause times, but occasionally collect the whole heap and consequently incur long pauses. At the other extreme, concurrent collectors, including reference counting, attain short pause times but with significant performance penalties. This paper introduces a new hybrid collector that combines copying generational collection for the young objects and reference counting the old objects to achieve both goals. It restricts copying and reference counting to the object demographics for which they perform well. Key to our algorithm is a generalization of deferred reference counting we call Ulterior Reference Counting. Ulterior reference counting safely ignores mutations to select heap objects. We compare a generational reference counting hybrid with pure reference counting, pure marksweep, and hybrid generational mark-sweep collectors. This new collector combines excellent throughput, matching a high performance generational mark-sweep hybrid, with low maximum pause times.
Mark-Copy: Fast copying GC with less space overhead
- OOPSLA'03
, 2003
"... Copying garbage collectors have a number of advantages over noncopying collectors, including cheap allocation and avoiding fragmentation. However, in order to provide completeness (the guarantee to reclaim each garbage object eventually), standard copying collectors require space equal to twice the ..."
Abstract
-
Cited by 28 (1 self)
- Add to MetaCart
Copying garbage collectors have a number of advantages over noncopying collectors, including cheap allocation and avoiding fragmentation. However, in order to provide completeness (the guarantee to reclaim each garbage object eventually), standard copying collectors require space equal to twice the size of the maximum live data for a program. We present a mark-copy collection algorithm (MC) that extends generational copying collection and significantly reduces the heap space required to run a program. MC reduces space overhead by 75–85 % compared with standard copying garbage collectors, increasing the range of applications that can use copying garbage collection. We show that when MC is given the same amount of space as a generational copying collector, it improves total execution time of Java benchmarks significantly in tight heaps, and by 5–10 % in moderate size heaps. We also compare the performance of MC with a (non-generational) mark-sweep collector and a hybrid copying/mark-sweep generational collector. We find that MC can run in heaps comparable in size to the minimum heap space required by mark-sweep. We also find that for most benchmarks MC is significantly faster than mark-sweep in small and moderate size heaps. When compared with the hybrid collector, MC improves total execution time by about 5 % for some benchmarks, partly by increasing the speed of execution of the application code.
Cork: Dynamic memory leak detection for garbage-collected languages
- In POPL
, 2007
"... A memory leak in a garbage-collected program occurs when the program inadvertently maintains references to objects that it no longer needs. Memory leaks cause systematic heap growth, degrading performance and resulting in program crashes after perhaps days or weeks of execution. Prior approaches for ..."
Abstract
-
Cited by 21 (1 self)
- Add to MetaCart
A memory leak in a garbage-collected program occurs when the program inadvertently maintains references to objects that it no longer needs. Memory leaks cause systematic heap growth, degrading performance and resulting in program crashes after perhaps days or weeks of execution. Prior approaches for detecting memory leaks rely on heap differencing or detailed object statistics which store state proportional to the number of objects in the heap. These overheads preclude their use on the same processor for deployed long-running applications. This paper introduces a dynamic heap-summarization technique based on type that accurately identifies leaks, is space efficient (adding less than 1 % to the heap), and is time efficient (adding 2.3% on average to total execution time). We implement this approach in Cork which utilizes dynamic type information and garbage collection to summarize the live objects in a type points-from graph (TPFG) whose nodes (types) and edges (references between types) are annotated with volume. Cork compares TPFGs across multiple collections, identifies growing data structures, and computes a type slice for the user. Cork is accurate: it identifies systematic heap growth with no false positives in 4 of 15 benchmarks we tested. Cork’s slice report enabled us (non-experts) to quickly eliminate growing data structures in SPECjbb2000 and Eclipse, something their developers had not previously done. Cork is accurate, scalable, and efficient enough to consider using online. Categories and Subject Descriptors D.2.5 [Software Engineering]: Testing and Debugging—Debugging aids
Abstract Effective Prefetch for Mark-Sweep Garbage Collection ∗
"... Garbage collection is a performance-critical feature of most modern object oriented languages, and is characterized by poor locality since it must traverse the heap. In this paper we show that by combining two very simple ideas we can significantly improve the performance of the canonical mark-sweep ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
Garbage collection is a performance-critical feature of most modern object oriented languages, and is characterized by poor locality since it must traverse the heap. In this paper we show that by combining two very simple ideas we can significantly improve the performance of the canonical mark-sweep collector, resulting in improvements in application performance. We make three main contributions: 1) we develop a methodology and framework for accurately and deterministically analyzing the tracing loop at the heart of the collector, 2) we offer a number of insights and improvements over conventional design choices for mark-sweep collectors, and 3) we find that two simple ideas — edge order traversal and software prefetch — combine to greatly improve garbage collection performance although each is unproductive in isolation.
Explaining failures of program analyses
- SIGPLAN Not
"... With programs getting larger and often more complex with each new release, programmers need all the help they can get in understanding and transforming programs. Fortunately, modern development environments, such as Eclipse, incorporate tools for understanding, navigating, and transforming programs. ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
With programs getting larger and often more complex with each new release, programmers need all the help they can get in understanding and transforming programs. Fortunately, modern development environments, such as Eclipse, incorporate tools for understanding, navigating, and transforming programs. These tools typically use program analyses to extract relevant properties of programs. These tools are often invaluable to developers; for example, many programmers use refactoring tools regularly. However, poor results by the underlying analyses can compromise a tool’s usefulness. For example, a bug finding tool may produce too many false positives if the underlying analysis is overly conservative, and thus overwhelm the user with too many possible errors in the program. In such cases it would be invaluable for the tool to explain to the user why it believes that each bug exists. Armed with this knowledge, the user can decide which bugs are worth pursing and which are false positives. The contributions of this paper are as follows: (i) We describe requirements on the structure of an analysis so that we can produce reasons when the analysis fails; the user of the analysis determines whether or not an analysis’s results constitute failure. We also describe a simple language that enforces these requirements; (ii) We describe how to produce necessary and sufficient reasons for analysis failure; (iii) We evaluate our system with respect to a number of analyses and programs and find that most reasons are small (and thus usable) and that our system is fast enough for interactive use.
Accurate, Efficient, and Adaptive Calling Context Profiling
- PLDI '06
, 2006
"... Calling context profiles are used in many inter-procedural code optimizations and in overall program understanding. Unfortunately, the collection of profile information is highly intrusive due to the high frequency of method calls in most applications. Previously proposed calling-context profiling m ..."
Abstract
-
Cited by 3 (1 self)
- Add to MetaCart
Calling context profiles are used in many inter-procedural code optimizations and in overall program understanding. Unfortunately, the collection of profile information is highly intrusive due to the high frequency of method calls in most applications. Previously proposed calling-context profiling mechanisms consequently suffer from either low accuracy, high overhead, or both. We have developed a new approach for building the calling context tree at runtime, called adaptive bursting. By selectively inhibiting redundant profiling, this approach dramatically reduces overhead while preserving profile accuracy. We first demonstrate the drawbacks of previously proposed calling context profiling mechanisms. We show that a low-overhead solution using sampled stack-walking alone is less than 50 % accurate, based on degree of overlap with a complete calling-context tree. We also show that a static bursting approach collects a highly accurate profile, but causes an unacceptable application slowdown. Our adaptive solution achieves 85 % degree of overlap and provides an 88% hot-edge coverage when using a 0.1 hot-edge threshold, while dramatically reducing overhead compared to the static bursting approach.
Correcting the Dynamic Call Graph Using Control Flow Constraints
- In International Conference on Compiler Construction
, 2007
"... Abstract. To reason about programs, dynamic optimizers and analysis tools use sampling to collect a dynamic call graph (DCG). However, sampling has not achieved high accuracy with low runtime overhead. As object-oriented programmers compose increasingly complex programs, inaccurate call graphs will ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
Abstract. To reason about programs, dynamic optimizers and analysis tools use sampling to collect a dynamic call graph (DCG). However, sampling has not achieved high accuracy with low runtime overhead. As object-oriented programmers compose increasingly complex programs, inaccurate call graphs will inhibit analysis and optimizations. This paper demonstrates how to use static and dynamic control flow graph (CFG) constraints to improve the accuracy of the DCG. We introduce the frequency dominator (FDOM), a novel CFG relation that extends the dominator relation to expose static relative execution frequencies of basic blocks. We combine conservation of flow and dynamic CFG basic block profiles to further improve the accuracy of the DCG. Together these approaches add minimal overhead (1%) and achieve 85 % accuracy compared to a perfect call graph for SPEC JVM98 and DaCapo benchmarks. Compared to sampling alone, accuracy improves by 12 to 36%. These results demonstrate that static and dynamic control-flow information offer accurate information for efficiently improving the DCG. 1
A Case For Low-Complexity MP Architectures
- In Proceedings of the Conference on Supercomputing
, 2007
"... Advances in semiconductor technology have driven sharedmemory servers toward processors with multiple cores per die and multiple threads per core. This paper presents simple hardware primitives enabling flexible and low-complexity multi-chip designs supporting an efficient inter-node coherence proto ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Advances in semiconductor technology have driven sharedmemory servers toward processors with multiple cores per die and multiple threads per core. This paper presents simple hardware primitives enabling flexible and low-complexity multi-chip designs supporting an efficient inter-node coherence protocol implemented in software. We argue that our primitives and the example design presented in this paper have lower hardware overhead, have easier (and later) verification requirements, and provide the opportunity for flexible coherence protocols and simpler protocol bug corrections than traditional designs. Our evaluation is based on detailed full-system simulations of modern chip-multiprocessors and both commercial and HPC workloads. We compare a low-complexity system based on the proposed primitives with aggressive hardware multi-chip shared-memory systems and show that the performance is competitive across a large design space. 1.
Wake Up and Smell the Coffee: Evaluation Methodology for the 21st Century
"... Evaluation methodology underpins all innovation in experimental computer science. It requires relevant workloads, appropriate experimental design, and rigorous analysis. Unfortunately, methodology is not keeping pace with the changes in our field. The rise of managed languages such as Java, C#, and ..."
Abstract
- Add to MetaCart
Evaluation methodology underpins all innovation in experimental computer science. It requires relevant workloads, appropriate experimental design, and rigorous analysis. Unfortunately, methodology is not keeping pace with the changes in our field. The rise of managed languages such as Java, C#, and Ruby in the past decade and the imminent rise of commodity multicore architectures for the next decade pose new methodological challenges that are not yet widely understood. This paper explores the consequences of our collective inattention to methodology on innovation, makes recommendations for addressing this problem in one domain, and provides guidelines for other domains. We describe benchmark suite design, experimental design, and analysis for evaluating Java applications. For example, we introduce new criteria for measuring and selecting diverse applications for a benchmark suite. We show that the complexity and nondeterminism of the Java runtime system make experimental design a first-order consideration, and we recommend mechanisms for addressing complexity and nondeterminism. Drawing on these results, we suggest how to adapt methodology more broadly. To continue to deliver innovations, our field needs to significantly increase participation in and funding for developing sound methodological foundations. 1.

