• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

Intelligent Selection of Application-Specific Garbage Collectors Abstract

by Jeremy Singer, Gavin Brown, Ian Watson, John Cavazos
Add To MetaCart

Tools

Sorted by:
Results 1 - 9 of 9

Cross-Input Learning and Discriminative Prediction in Evolvable Virtual Machines

by Feng Mao, Xipeng Shen
"... Abstract—Modern languages like Java and C # rely on dynamic optimizations in virtual machines for better performance. Current dynamic optimizations are reactive. Their performance is constrained by the dependence on runtime sampling and the partial knowledge of the execution. This work tackles the p ..."
Abstract - Cited by 6 (5 self) - Add to MetaCart
Abstract—Modern languages like Java and C # rely on dynamic optimizations in virtual machines for better performance. Current dynamic optimizations are reactive. Their performance is constrained by the dependence on runtime sampling and the partial knowledge of the execution. This work tackles the problems by developing a set of techniques that make a virtual machine evolve across production runs. The virtual machine incrementally learns the relation between program inputs and optimization strategies so that it proactively predicts the optimizations suitable for a new run. The prediction is discriminative, guarded by confidence measurement through dynamic self-evaluation. We employ an enriched extensible specification language to resolve the complexities in program inputs. These techniques, implemented in Jikes RVM, produce significant performance improvement on a set of Java applications.

Online phase-adaptive data layout selection

by Chengliang Zhang, Martin Hirzel - In Proceedings of the European Conference on Object-Oriented Programming , 2008
"... Abstract. Good data layouts improve cache and TLB performance of object-oriented software, but unfortunately, selecting an optimal data layout a priori is NP-hard. This paper introduces layout auditing, a technique that selects the best among a set of layouts online (while the program is running). L ..."
Abstract - Cited by 5 (0 self) - Add to MetaCart
Abstract. Good data layouts improve cache and TLB performance of object-oriented software, but unfortunately, selecting an optimal data layout a priori is NP-hard. This paper introduces layout auditing, a technique that selects the best among a set of layouts online (while the program is running). Layout auditing randomly applies different layouts over time and observes their performance. As it becomes confident about which layout performs best, it selects that layout with higher probability. But if a phase shift causes a different layout to perform better, layout auditing learns the new best layout. We implemented our technique in a product Java virtual machine, using copying generational garbage collection to produce different layouts, and tested it on 20 long-running benchmarks and 4 hardware platforms. Given any combination of benchmark and platform, layout auditing consistently performs close to the best layout for that combination, without requiring offline training. 1

Influence of program inputs on the selection of garbage collectors

by Feng Mao, Eddy Z. Zhang, Xipeng Shen - In Proceedings of the International Conference on Virtual Execution Environments (VEE , 2009
"... Many studies have shown that the best performer among a set of garbage collectors tends to be different for different applications. Researchers have proposed application-specific selection of garbage collectors. In this work, we concentrate on a second dimension of the problem: the influence of prog ..."
Abstract - Cited by 4 (2 self) - Add to MetaCart
Many studies have shown that the best performer among a set of garbage collectors tends to be different for different applications. Researchers have proposed application-specific selection of garbage collectors. In this work, we concentrate on a second dimension of the problem: the influence of program inputs on the selection of garbage collectors. We collect tens to hundreds of inputs for a set of Java benchmarks, and measure their performance on Jikes RVM with different heap sizes and garbage collectors. A rigorous statistical analysis produces four-fold insights. First, inputs influence the relative performance of garbage collectors significantly, causing large variations to the top set of garbage collectors across inputs. Profiling one or few runs is thus inadequate for selecting the garbage collector that works well for most inputs. Second, when the heap size ratio is fixed, one or two types of garbage collectors are enough to stimulate the top performance of the program on all inputs. Third, for some programs, the heap size ratio significantly affects the relative performance of different types of garbage collectors. For the selection of garbage collectors on those programs, it is necessary to have a cross-input predictive model that predicts the minimum possible heap size of the execution on an arbitrary input. Finally, based on regression techniques, we demonstrate the predictability of the minimum possible heap size, indicating the potential feasibility of the input-specific selection of garbage collectors.

Blind Optimization for Exploiting Hardware Features

by Dan Knights, Todd Mytkowicz, Peter F. Sweeney, Michael C. Mozer, Amer Diwan
"... Abstract. Software systems typically exploit only a small fraction of the realizable performance from the underlying microprocessors. While there has been much work on hardware-aware optimizations, two factors limit their benefit. First, microprocessors are so complex that it is unlikely that even a ..."
Abstract - Add to MetaCart
Abstract. Software systems typically exploit only a small fraction of the realizable performance from the underlying microprocessors. While there has been much work on hardware-aware optimizations, two factors limit their benefit. First, microprocessors are so complex that it is unlikely that even an aggressively optimizing compiler will be able to satisfy all the constraints necessary to obtain the best performance. Thus, most optimizations use a simplified model of the hardware (e.g., they may be cache-aware but they may ignore other hardware structures, such as TLBs, etc.). Second, hardware manufacturers do not reveal all details of their microprocessors so even if the authors of optimizations wanted to simultaneously optimize for all components of the hardware, they may be unable to do so because they are working with limited knowledge. This paper presents and evaluates our blind optimization approach which provides a way to get around these issues. Blind optimization uses the insight that we can generate many variants of an application by altering semantic preserving parameters of an application; for example our variants can cover the space of code and data layout by shifting the positions of code and data in memory. Our optimization strategy attempts to find a variant that performs well with respect to an optimization objective. We show that even our first implementation of blind optimization speeds up a number of programs from the SPECint 2006 benchmark suite. 1

Economics, Measurement

by Jeremy Singer, Richard Jones, Gavin Brown, Mikel Luján
"... This paper argues that economic theory can improve our understanding of memory management. We introduce the allocation curve, as an analogue of the demand curve from microeconomics. An allocation curve for a program characterises how the amount of garbage collection activity required during its exec ..."
Abstract - Add to MetaCart
This paper argues that economic theory can improve our understanding of memory management. We introduce the allocation curve, as an analogue of the demand curve from microeconomics. An allocation curve for a program characterises how the amount of garbage collection activity required during its execution varies in relation to the heap size associated with that program. The standard treatment of microeconomic demand curves (shifts and elasticity) can be applied directly and intuitively to our new allocation curves. As an application of this new theory, we show how allocation elasticity can be used to control the heap growth rate for variable sized heaps in Jikes RVM.

Evolvable Virtual Machines

by William Mary, Feng Mao, Xipeng Shen, Feng Mao, Xipeng Shen
"... Modern languages like Java and C # rely on dynamic optimizations in virtual machines for better performance. Current dynamic optimizations are reactive. Their performance is constrained by the dependence on runtime sampling and the partial knowledge of the execution. This work tackles the problems b ..."
Abstract - Add to MetaCart
Modern languages like Java and C # rely on dynamic optimizations in virtual machines for better performance. Current dynamic optimizations are reactive. Their performance is constrained by the dependence on runtime sampling and the partial knowledge of the execution. This work tackles the problems by developing a set of techniques that make a virtual machine evolve across runs. The virtual machine incrementally learns the relation between program inputs and optimization strategies so that it proactively predicts the optimizations suitable for a new run. The prediction is discriminative, guarded by confidence measurement through dynamic self-evaluation. The complexity in program inputs are resolved by an extensible specification language. Experiments demonstrate the effectiveness of the techniques in enhancing the optimizor in a virtual machine increasingly with certain guarantees. 1

The Study and Handling of Program Inputs in the Selection of Garbage Collectors ∗

by Xipeng Shen, Feng Mao, Kai Tian, Eddy Zheng Zhang
"... Many studies have shown that the best performer among a set of garbage collectors tends to be different for different applications. Researchers have proposed applicationspecific selection of garbage collectors. In this work, we concentrate on a second dimension of the problem: the influence of progr ..."
Abstract - Add to MetaCart
Many studies have shown that the best performer among a set of garbage collectors tends to be different for different applications. Researchers have proposed applicationspecific selection of garbage collectors. In this work, we concentrate on a second dimension of the problem: the influence of program inputs on the selection of garbage collectors. We collect tens to hundreds of inputs for a set of Java benchmarks, and measure their performance on Jikes RVM with different heap sizes and garbage collectors. A rigorous statistical analysis produces four-fold insights. First, inputs influence the relative performance of garbage collectors significantly, causing large variations to the top set of garbage collectors across inputs. Profiling one or few runs is thus inadequate for selecting the garbage collector that works well for most inputs. Second, when the heap size ratio is fixed, one or two types of garbage collectors are enough to stimulate the top performance of the program on all inputs. Third, for some programs, the heap size ratio significantly affects the relative performance of different types of garbage collectors. For the selection of garbage collectors on those programs, it is necessary to have a cross-input predictive model that predicts the minimum possible heap size of the execution on an arbitrary input. Finally, by adopting statistical learning techniques, we investigate the cross-input predictability of the influence. Experimental results demonstrate that with regression and classification techniques, it is possible to predict the best garbage collector (along with

A Step Towards Transparent Integration of Input-Consciousness into Dynamic Program Optimizations

by Kai Tian, Eddy Z. Zhang, Xipeng Shen
"... Dynamic program optimizations are critical for the efficiency of applications in managed programming languages and scripting languages. Recent studies have shown that exploitation of program inputs may enhance the effectiveness of dynamic optimizations significantly. However, current solutions for e ..."
Abstract - Add to MetaCart
Dynamic program optimizations are critical for the efficiency of applications in managed programming languages and scripting languages. Recent studies have shown that exploitation of program inputs may enhance the effectiveness of dynamic optimizations significantly. However, current solutions for enabling the exploitation require either programmers’ annotations or intensive offline profiling, impairing the practical adoption of the techniques. This current work examines the basic feasibility of transparent integration of input-consciousness into dynamic program optimizations, particularly in managed execution environments. It uses transparent learning across production runs as the basic vehicle, and investigates the implications of cross-run learning on each main component of inputconscious

Technical Report no. 2008-69 NB-FEB: An Easy-to-Use and Scalable Universal Synchronization Primitive for Parallel Programming

by Phuong Hoai, Tsigas Otto, J. Anshus , 811
"... This paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale many-core architectures. The universal synchronization primitives that have been deployed widely in conventional architectures, are the compare-and-swap (CAS) and ..."
Abstract - Add to MetaCart
This paper addresses the problem of universal synchronization primitives that can support scalable thread synchronization for large-scale many-core architectures. The universal synchronization primitives that have been deployed widely in conventional architectures, are the compare-and-swap (CAS) and load-linked/store-conditional (LL/SC) primitives. However, such synchronization primitives are expected to reach their scalability limits in the evolution to many-core architectures with thousands of cores. We introduce a non-blocking full/empty bit primitive, or NB-FEB for short, as a promising synchronization primitive for parallel programming on may-core architectures. We show that the NB-FEB primitive is universal, scalable, feasible and convenient to use. NB-FEB, together with registers, can solve the consensus problem for an arbitrary number of processes
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University