Dynamic storage allocation: A survey and critical review
, 1995
Cited by 187 (6 self)
Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocators are to perform well in practice.
Packing Schemes for Gang Scheduling
- In Job Scheduling Strategies for Parallel Processing
, 1996
Cited by 97 (20 self)
Jobs that do not require all processors in the system can be packed together for gang scheduling. We examine accounting traces from several parallel computers to show that indeed many jobs have small sizes and can be packed together. We then formulate a number of such packing algorithms, and evaluate their effectiveness using simulations based on our workload study. The results are that two algorithms are the best: either perform the mapping based on a buddy system of processors, or use migration to re-map the jobs more tightly whenever a job arrives or terminates. Other approaches, such as mapping to the least loaded PEs, proved to be counterproductive. The buddy system approach depends on the capability to gang-schedule jobs in multiple slots, if there is space. The migration algorithm is more robust, but is expected to suffer greatly due to the overhead of the migration itself. In either case fragmentation is not an issue, and utilization may top 90% with sufficiently...
The Memory Fragmentation Problem: Solved
- Proceedings of the First International Symposium on Memory Management, ACM
, 1998
Cited by 60 (1 self)
We show that for 8 real and varied C and C++ programs, several conventional dynamic storage allocators provide near-zero fragmentation, once we account for overheads due to implementation details such as headers, alignment, etc. This substantially strengthens our previous results showing that the memory fragmentation problem has generally been misunderstood, and that good allocator policies can provide good memory usage for most programs. The new results indicate that for most programs, excellent allocator policies are readily available, and efficiency of implementation is the major challenge. While we believe that our experimental results are state-of-the-art and our methodology is superior to most previous work, more work should be done to identify and study unusual problematic program behaviors not represented in our sample.
Practical, transparent operating system support for superpages
- SIGOPS Oper. Syst. Rev
, 2002
Cited by 32 (3 self)
Most general-purpose processors provide support for memory pages of large sizes, called superpages. Superpages enable each entry in the translation lookaside buffer (TLB) to map a large physical memory region into a virtual address space. This dramatically increases TLB coverage, reduces TLB misses, and promises performance improvements for many applications. However, supporting superpages poses several challenges to the operating system, in terms of superpage allocation and promotion tradeoffs, fragmentation control, etc. We analyze these issues, and propose the design of an effective superpage management system. We implement it in FreeBSD on the Alpha CPU, and evaluate it on real workloads and benchmarks. We obtain substantial performance benefits, often exceeding 30%; these benefits are sustained even under stressful workload scenarios.
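To make the "TLB coverage" claim in the abstract above concrete, here is a small arithmetic sketch. It assumes coverage is simply the number of TLB entries times the page size each entry maps. The entry count (128) and the page sizes (8 KiB base page, 4 MiB superpage) are illustrative assumptions, not figures taken from the abstract.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_coverage(entries: int, page_size: int) -> int:
    """Bytes of address space reachable without a TLB miss,
    assuming every entry maps one page of `page_size` bytes."""
    return entries * page_size


# Hypothetical TLB with 128 entries (assumed value).
base = tlb_coverage(128, 8 * KIB)    # all entries map 8 KiB base pages
large = tlb_coverage(128, 4 * MIB)   # all entries map 4 MiB superpages

print(base // MIB, "MiB with base pages")
print(large // MIB, "MiB with superpages")
```

Under these assumed numbers, coverage grows by the ratio of the page sizes (512x), which is the effect the abstract describes qualitatively.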
Non-Compacting Memory Allocation and Real-Time Garbage Collection
, 1996
Cited by 26 (2 self)
Garbage collection is the automatic reclamation of computer storage [Knu73, Coh81, Wil92, Wil95]. While in many systems, programmers must explicitly reclaim heap memory at some point in their program by using a "free" or "dispose" statement, garbage collected systems free the programmer from this burden. In spite of its obvious attractiveness for many applications, garbage collection for real-time programs is not popular. This is largely due to the perceived cost and disruptiveness of garbage collection in general, and of incremental garbage collection in particular. Most existing "real-time" garbage collectors are not in fact usefully real-time, largely due to the use of a read barrier to trigger incremental copying of data structures being traversed by the running application. This may slow down running applications unpredictably, even though individual increments of garbage collection work are small and bounded. We have developed a hard real-time garbage collector which us...
Scalability of Dynamic Storage Allocation Algorithms
, 1996
Cited by 14 (2 self)
Dynamic storage allocation has a significant impact on computer performance. A dynamic storage allocator manages space for objects whose lifetimes are not known by the system at the time of their creation. A good dynamic storage allocator should utilize storage efficiently and satisfy requests in as few instructions as possible. A dynamic storage allocator on a multiprocessor should have the ability to satisfy multiple requests concurrently. This paper examines parallel dynamic storage allocation algorithms and how performance scales with increasing numbers of processors. The highest throughputs and lowest instruction counts are achieved with multiple free list fit I. The best memory utilization is achieved using a best fit system.
Efficient implementation of the first-fit strategy for dynamic storage allocation
- ACM Transactions on Programming Languages and Systems
, 1989
Cited by 11 (0 self)
We describe an algorithm that efficiently implements the first-fit strategy for dynamic storage allocation. The algorithm imposes a storage overhead of only one word per allocated block (plus a few percent of the total space used for dynamic storage), and the time required to allocate or free a block is O(log W), where W is the maximum number of words allocated dynamically. The algorithm is faster than many commonly used algorithms, especially when many small blocks are allocated, and has good worst-case behavior. It is relatively easy to implement and could be used internally by an operating system or to provide run-time support for high-level languages such as Pascal and Ada. A Pascal implementation is given in the Appendix.
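For readers unfamiliar with the first-fit policy named in the abstract above, the following is a minimal sketch of the policy only. It uses a plain linear scan over an address-ordered free list, so its cost is proportional to the number of free blocks; it is not the O(log W) algorithm the paper describes. The function name `first_fit` and the `(address, size)` tuple layout are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def first_fit(free_blocks: List[Tuple[int, int]], request: int) -> Optional[int]:
    """Allocate `request` words from the first free block that is large enough.

    `free_blocks` is a list of (address, size) pairs sorted by address and is
    updated in place. Returns the allocated address, or None if nothing fits.
    """
    for i, (addr, size) in enumerate(free_blocks):
        if size >= request:
            if size == request:
                # Exact fit: the whole block is consumed.
                free_blocks.pop(i)
            else:
                # Split: hand out the front, keep the remainder free.
                free_blocks[i] = (addr + request, size - request)
            return addr
    return None


free = [(0, 4), (10, 8), (30, 16)]
print(first_fit(free, 6), free)
```

Here the request for 6 words skips the 4-word block at address 0 and is carved from the block at address 10, leaving a 2-word remainder at address 16.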
Power Exploration for Dynamic Data Types through Virtual Memory Management Refinement
- In Proceedings of the International Symposium on Low Power Electronics and Design
, 1998
Cited by 10 (2 self)
In this paper we present our novel power exploration methodology for applications with dynamic data types. Our methodology is crucial to obtain effective solutions in an embedded (HW or SW) processor context. The contributions are twofold. First, we define the complete search space for Virtual Memory Management (VMM) mechanisms in a structured way with orthogonal decision trees. Second, we present our systematic methodology for exploration of the maximal power that takes into account characteristics of the application to heavily prune the search space guiding the choices of a VMM mechanism. Finally, we demonstrate for two industrial examples that power can vary considerably depending on the VMM chosen. Moreover, these experiments show the effectiveness of our exploration methodology.
1 Introduction. We target applications that require manipulation of large amounts of data that are dynamically created and destroyed at run time, such as protocol processing applications. These applications...
Resource Allocation Schemes for Gang Scheduling
- Proceedings of 6th Workshop on Job Scheduling Strategies for Parallel Processing, Cancun
, 2000
Cited by 8 (2 self)
Gang scheduling is currently the most popular scheduling scheme for parallel processing in a time shared environment. In this paper we first describe the ideas of job re-packing and workload tree for efficiently allocating resources to enhance the performance of gang scheduling. We then present some experimental results obtained by implementing four different resource allocation schemes. These results show how the ideas, such as re-packing jobs, running jobs in multiple slots and minimising the average number of time slots in the system, affect system and job performance when incorporated into the buddy based allocation scheme for gang scheduling.
Fast Allocation and Deallocation with an Improved Buddy System
- In Proceedings of the 19th Conference on the Foundations of Software Technology and Theoretical Computer Science (FST & TCS'99), Lecture Notes in Computer Science
, 1999
Cited by 6 (2 self)
We propose several modifications to the binary buddy system for managing dynamic allocation of memory blocks whose sizes are powers of two. The standard buddy system allocates and deallocates blocks in $\Theta(\lg n)$ time in the worst case (and on an amortized basis), where $n$ is the size of the memory. We present two schemes that improve the running time to $O(1)$ time, where the time bound for deallocation is amortized. The first scheme uses one word of extra storage compared to the standard buddy system, but may fragment memory more than necessary. The second scheme has essentially the same fragmentation as the standard buddy system, and uses $O(2^{(1+\sqrt{\lg n}) \lg\lg n})$ bits of auxiliary storage, which is $\omega(\lg^k n)$ but $o(n^{\varepsilon})$ for all $k \ge 1$ and $\varepsilon > 0$. Finally, we present simulation results estimating the effect of the excess fragmentation in the first scheme.
1 Introduction. The binary buddy system [13] is a well-known system for maintaining a dynamic collection of me...
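As background for the entry above, here is a minimal sketch of the standard binary buddy system, the baseline the abstract says takes logarithmic time. It is not either of the paper's improved O(1) schemes. The class name `BuddyAllocator`, the helper `buddy_of`, and the dictionary of per-size free sets are all assumptions chosen for illustration. The one fact the sketch relies on is that a block of power-of-two size `s` at an offset that is a multiple of `s` has its buddy at `offset XOR s`.

```python
def buddy_of(offset: int, size: int) -> int:
    """Offset of the buddy of the block at `offset` with power-of-two `size`."""
    return offset ^ size


class BuddyAllocator:
    def __init__(self, total: int):
        assert total > 0 and total & (total - 1) == 0, "total must be a power of two"
        self.total = total
        # free[size] holds the offsets of the free blocks of that size.
        self.free = {total: {0}}

    def alloc(self, request: int):
        """Return (offset, size) of a block of at least `request` units."""
        size = 1
        while size < request:          # round the request up to a power of two
            size *= 2
        cand = size
        while cand <= self.total and not self.free.get(cand):
            cand *= 2                  # look for the smallest large-enough free block
        if cand > self.total:
            raise MemoryError("no free block large enough")
        offset = self.free[cand].pop()
        while cand > size:             # split, freeing the upper half each time
            cand //= 2
            self.free.setdefault(cand, set()).add(offset + cand)
        return offset, size

    def release(self, offset: int, size: int) -> None:
        """Free a block and merge it with its buddy while the buddy is also free."""
        while size < self.total:
            buddy = buddy_of(offset, size)
            peers = self.free.get(size, set())
            if buddy not in peers:
                break
            peers.remove(buddy)
            offset = min(offset, buddy)
            size *= 2
        self.free.setdefault(size, set()).add(offset)


heap = BuddyAllocator(16)
a = heap.alloc(3)   # rounded up to a 4-unit block
b = heap.alloc(4)
heap.release(*a)
heap.release(*b)    # merges back up to the full 16-unit block
print(a, b, heap.free[16])
```

Both the search in `alloc` and the merge loop in `release` can walk through every size class, which is where the logarithmic worst case of the standard scheme comes from.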

