RapidMRC: Approximating L2 Miss Rate Curves on Commodity Systems for Online Optimizations
, 2009
Cited by 18 (2 self)
Miss rate curves (MRCs) are useful in a number of contexts. In our research, online L2 cache MRCs enable us to dynamically identify optimal cache sizes when cache-partitioning a shared-cache multicore processor. Obtaining L2 MRCs has generally been assumed to be expensive when done in software and consequently, their usage for online optimizations has been limited. To address these problems and opportunities, we have developed a low-overhead software technique to obtain L2 MRCs online on current processors, exploiting features available in their performance monitoring units so that no changes to the application source code or binaries are required. Our technique, called RapidMRC, requires a single probing period of roughly 221 million processor cycles (147 ms), and subsequently 124 million cycles (83 ms) to process the data. We demonstrate its accuracy by comparing the obtained MRCs to the actual L2 MRCs of 30 applications taken from SPECcpu2006, SPECcpu2000, and SPECjbb2000. We show that RapidMRC can be applied to sizing cache partitions, helping to achieve performance improvements of up to 27%.
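To illustrate how MRCs can drive partition sizing, here is a minimal sketch in Python. It assumes two applications share the cache, that each MRC is already available as a list indexed by the number of partition units, and that the goal is to minimize the sum of the two miss rates. The function name `best_partition`, the unit granularity, and the objective are all hypothetical choices for illustration; the abstract does not state how RapidMRC's partitions are actually selected.

```python
from typing import Sequence, Tuple


def best_partition(
    mrc_a: Sequence[float],
    mrc_b: Sequence[float],
    total_units: int,
) -> Tuple[int, int, float]:
    """Pick the split of `total_units` that minimizes the combined miss rate.

    mrc_x[k] is the miss rate of application x when it owns k partition
    units (index 0 is unused here because each application gets at least
    one unit). This exhaustive search is an assumed policy, not the
    paper's method.
    """
    if total_units < 2:
        raise ValueError("need at least one unit per application")
    if len(mrc_a) < total_units or len(mrc_b) < total_units:
        raise ValueError("each MRC must cover 1..total_units-1 units")

    best = None
    for units_a in range(1, total_units):
        units_b = total_units - units_a
        combined = mrc_a[units_a] + mrc_b[units_b]
        # Keep the first split that achieves the lowest combined miss rate.
        if best is None or combined < best[2]:
            best = (units_a, units_b, combined)
    return best
```

With hypothetical curves `[1.0, 0.50, 0.20, 0.18, 0.17]` and `[1.0, 0.40, 0.35, 0.10, 0.09]` over four units, the search returns the 2/2 split. The first application gains little beyond two units, and the second cannot reach its sharp drop at three units without starving the first.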
Tolerating Memory Leaks
- In ACM Conference on Object-Oriented Programming, Systems, Languages, and Applications
, 2008
Cited by 12 (4 self)
Type safety and garbage collection in managed languages eliminate memory errors such as dangling pointers, double frees, and
LeakSurvivor: Towards Safely Tolerating Memory Leaks for Garbage-Collected Languages
- In USENIX Annual Technical Conference
, 2008
Cited by 9 (2 self)
Continuous memory leaks severely hurt program performance and software availability for garbage-collected programs. This paper presents a safe method, called LeakSurvivor, to tolerate continuous memory leaks at runtime for garbage-collected programs. Our main idea is to periodically swap out the “Potentially Leaked” (PL) memory objects identified by leak detectors from virtual memory to disk. As a result, the virtual memory space occupied by the PL objects can be reclaimed by the garbage collector and made available for future use. If a swapped-out PL object is accessed later, LeakSurvivor restores it from disk to memory for correct program execution. Furthermore, LeakSurvivor helps developers prune false positives. We have built a prototype of LeakSurvivor on top of Jikes RVM 2.4.2, a high-performance Java-in-Java virtual machine developed by IBM. We conduct experiments with three Java applications: Eclipse, SPECjbb2000, and Jigsaw. Among them, Eclipse and Jigsaw contain memory leaks introduced by their developers, while SPECjbb2000 contains a memory leak injected by us. Our results show that LeakSurvivor effectively tolerates memory leaks for two applications (Eclipse and SPECjbb2000), i.e., there is no cumulative performance degradation and no software failure when facing continuous memory leaks at runtime. For Jigsaw, LeakSurvivor extends the program lifetime by two times and improves performance by 46% compared with native runs. Furthermore, when there are no memory leaks, LeakSurvivor imposes a small runtime overhead, i.e., 2.5% over the leak detector and 23.7% over the native runs.
Leak Pruning
- ASPLOS'09
, 2009
Cited by 8 (4 self)
Managed languages improve programmer productivity with type safety and garbage collection, which eliminate memory errors such as dangling pointers, double frees, and buffer overflows. However, programs may still leak memory if programmers forget to eliminate the last reference to an object that will not be used again. Leaks slow programs by increasing collector workload and frequency. Growing leaks crash programs. Instead of crashing, leak pruning extends program availability by predicting and reclaiming leaked objects at run time. Whereas garbage collection over-approximates live objects using reachability, leak pruning predicts dead objects and reclaims them based on how stale they are and the size of stale data structures. Leak pruning preserves semantics because it waits for heap exhaustion before reclaiming objects and then poisons references to objects it reclaims. If the program later tries to access these objects, the virtual machine (VM) throws an internal error. We implement leak pruning in a Java VM, show its overhead is low, and evaluate it on 10 leaking programs. Leak pruning does not help two programs, executes four substantial programs 1.6-35X longer, and executes four programs, including two leaks in Eclipse, for at least 24 hours. In the worst case, leak pruning defers fatal errors. In the best case, programs with unbounded memory requirements execute indefinitely and correctly in bounded memory with consistent throughput.
Program Locality Analysis Using Reuse Distance
, 2009
Cited by 8 (4 self)
On modern computer systems, the memory performance of an application depends on its locality. For a single execution, locality-correlated measures like average miss rate or working-set size have long been analyzed using reuse distance—the number of distinct locations accessed between consecutive accesses to a given location. This article addresses the analysis problem at the program level, where the size of data and the locality of execution may change significantly depending on the input. The article presents two techniques that predict how the locality of a program changes with its input. The first is approximate reuse-distance measurement, which is asymptotically faster than exact methods while providing a guaranteed precision. The second is statistical prediction of locality in all executions of a program based on the analysis of a few executions. The prediction process has three steps: dividing data accesses into groups, finding the access patterns in each group, and building parameterized models. The resulting prediction may be used on-line with the help of distance-based sampling. When evaluated on fifteen benchmark applications, the new techniques predicted program locality with good accuracy, even for test executions that are orders of magnitude larger than the training executions. The two techniques are among the first to enable quantitative analysis of whole-program locality and
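The definition of reuse distance given above can be made concrete with a short Python sketch. It uses a naive exact measurement over an LRU-ordered stack, which is not the asymptotically faster approximate method the article proposes. The function names `reuse_distances` and `lru_miss_rate` are hypothetical, and the miss-rate helper assumes a fully associative LRU cache whose capacity is counted in distinct locations.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, List, Optional


def reuse_distances(trace: Iterable[Hashable]) -> List[Optional[int]]:
    """Return the reuse distance of every access in `trace`.

    The reuse distance is the number of distinct other locations accessed
    since the previous access to the same location; first-time accesses
    get None. This is a naive exact method costing O(N * M) for N
    accesses and M distinct locations.
    """
    stack: "OrderedDict[Hashable, bool]" = OrderedDict()  # least to most recent
    distances: List[Optional[int]] = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Locations after addr in recency order were touched since its last use.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_miss_rate(distances: List[Optional[int]], capacity: int) -> float:
    """Miss rate of a fully associative LRU cache holding `capacity` locations.

    An access hits exactly when its reuse distance is smaller than the capacity.
    """
    misses = sum(1 for d in distances if d is None or d >= capacity)
    return misses / len(distances)
```

For the trace `a b c a b b`, the distances are `None, None, None, 2, 2, 0`. A cache of three locations therefore misses only on the three cold accesses, while a cache of two locations misses on five of the six accesses.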
Isla Vista Heap Sizing: Using Feedback to Avoid Paging
- In Proceedings of the International Symposium on Code Generation and Optimization (CGO)
, 2007
Cited by 7 (2 self)
Managed runtime environments (MREs) employ garbage collection (GC) for automatic memory management. However, GC induces pressure on the virtual memory (VM) manager, since it may touch pages that are not related to the working set of the application. Paging due to GC can significantly hurt performance, even when the application’s working set fits into physical memory. We present a feedback-directed heap resizing mechanism to avoid GC-induced paging, using information from the operating system (OS). We avoid costly GCs when there is physical memory available, and trade off GC for paging when memory is constrained. Our mechanism is simple and uses allocation stall events during GC alone to trigger heap resizing, without user participation or OS kernel modification. Our system enables significant performance improvements when real memory is restricted, and performance similar to, or better than, that of the current state-of-the-art MRE when memory is unconstrained.
Efficiently and precisely locating memory leaks and bloat
- Conference on Programming Language Design and Implementation
, 2009
Cited by 6 (0 self)
Inefficient use of memory, including leaks and bloat, remains a significant challenge for C and C++ developers. Applications with these problems become slower over time as their working set grows and can become unresponsive. At the same time, memory leaks and bloat remain notoriously difficult to debug, and account for a large number of reported bugs in mature applications. Previous tools for diagnosing memory inefficiencies (based on garbage collection, binary rewriting, or code sampling) impose high overheads (up to 100X) or generate many false alarms. This paper presents Hound, a runtime system that helps track down the sources of memory leaks and bloat in C and C++ applications. Hound employs data sampling, a staleness-tracking approach based on a novel heap organization, to make it both precise and efficient. Hound has no false positives, and its runtime and space overheads are low enough that it can be used in deployed applications. We demonstrate Hound’s efficacy across a suite of synthetic benchmarks and real applications.
Virtual Machine Memory Access Tracing With Hypervisor Exclusive Cache
Cited by 5 (1 self)
consolidation can benefit from the prediction of VM page miss rate at each candidate memory size. Such prediction is challenging for the hypervisor (or VM monitor) because it lacks knowledge of the VM's memory access pattern. This paper explores an approach in which the hypervisor takes over the management of part of the VM memory, so that all accesses that miss the remaining VM memory can be transparently traced by the hypervisor. For online memory access tracing, the overhead should be small compared with the case in which all allocated memory is directly managed by the VM. To save memory space, the hypervisor manages its memory portion as an exclusive cache (i.e., containing only data that is not in the remaining VM memory). To minimize I/O overhead, data evicted from a VM enters the cache directly from VM memory (as opposed to entering from secondary storage). We guarantee cache correctness by caching only memory pages whose current contents provably match those of the corresponding storage locations. Based on our design, we show that when the VM evicts pages in LRU order, employing the hypervisor cache does not introduce any additional I/O overhead in the system. We implemented the proposed scheme on the Xen para-virtualization platform. Our experiments with microbenchmarks and four real data-intensive services (SPECweb99, index searching, TPC-C, and TPC-H) illustrate the overhead of our hypervisor cache and the accuracy of cache-driven VM page miss rate prediction. We also present results on adaptive VM memory allocation with performance assurance.
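The exclusive-cache idea can be sketched with a toy simulation in Python. It is a simplified model under stated assumptions, not the paper's Xen implementation: pages are plain identifiers, both the VM memory and the hypervisor cache evict in LRU order, and dirty pages and the content-matching correctness check are ignored. The names `exclusive_cache_run` and `lru_storage_misses` are hypothetical.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, Tuple


def lru_storage_misses(trace: Iterable[Hashable], capacity: int) -> int:
    """Count misses of a single LRU-managed memory holding `capacity` pages."""
    cache: "OrderedDict[Hashable, bool]" = OrderedDict()
    misses = 0
    for page in trace:
        if page in cache:
            cache.move_to_end(page)
        else:
            misses += 1
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return misses


def exclusive_cache_run(
    trace: Iterable[Hashable], vm_pages: int, hv_pages: int
) -> Tuple[int, int]:
    """Simulate VM memory backed by an exclusive hypervisor cache.

    Returns (hypervisor_cache_hits, storage_misses). Every access that
    misses VM memory is visible to the hypervisor, which is what makes
    tracing possible in this model.
    """
    vm: "OrderedDict[Hashable, bool]" = OrderedDict()
    hv: "OrderedDict[Hashable, bool]" = OrderedDict()
    hv_hits = 0
    storage_misses = 0
    for page in trace:
        if page in vm:
            vm.move_to_end(page)
            continue
        if page in hv:
            del hv[page]  # exclusive: the page leaves the cache when the VM reloads it
            hv_hits += 1
        else:
            storage_misses += 1
        vm[page] = True
        if len(vm) > vm_pages:
            victim, _ = vm.popitem(last=False)
            hv[victim] = True  # the evicted page enters the cache directly from VM memory
            if len(hv) > hv_pages:
                hv.popitem(last=False)
    return hv_hits, storage_misses
```

In this model, a VM with two pages plus a two-page hypervisor cache incurs the same number of storage misses as a single four-page LRU memory. This mirrors the claim that the cache adds no extra I/O when the VM evicts in LRU order.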
XMem: Type-Safe, Transparent, Shared Memory for Cross-Runtime Communication and Coordination
Cited by 5 (2 self)
Developers commonly build contemporary enterprise applications using type-safe, component-based platforms, such as J2EE, and architect them to comprise multiple tiers, such as a web container, application server, and database engine. Administrators increasingly execute each tier in its own managed runtime environment (MRE) to improve reliability and to manage system complexity through the fault containment and modularity offered by isolated MRE instances. Such isolation, however, necessitates expensive cross-tier communication based on protocols such as object serialization and remote procedure calls. Administrators commonly co-locate communicating MREs on a single host to reduce communication overhead and to better exploit increasing numbers of available processing cores. However, state-of-the-art MREs offer no support for more efficient communication between co-located
AS-GC: An efficient generational garbage collector for Java application servers
- In Proceedings of the European Conference on Object-Oriented Programming (ECOOP)
, 2007
Cited by 4 (2 self)
A generational collection strategy utilizing a single nursery cannot efficiently manage objects in application servers due to variance in their lifespans. In this paper, we introduce an optimization technique designed for application servers that exploits an observation that remotable objects are commonly used as gateways for client requests. Objects instantiated as part of these requests (remote objects) often live longer than objects not created to serve these remote requests (local objects). Thus, our scheme creates remote and local objects in two separate nurseries; each is properly sized to match the lifetime characteristics of the residing objects. We extended the generational collector in HotSpot to support the proposed optimization and found that, given the same heap size, the proposed scheme can improve the maximum throughput of an application server by 14% over the default collector. It also allows the application server to handle a 10% higher workload prior to memory exhaustion.

