Results 1 - 10 of 50
Dynamic storage allocation: A survey and critical review, 1995
Cited by 187 (6 self)
Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocators are to perform well in practice.
Cache-Conscious Data Placement
In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, 1998
Cited by 131 (3 self)
As the gap between memory and processor speeds continues to widen, cache efficiency is an increasingly important component of processor performance. Compiler techniques have been used to improve instruction cache performance by mapping code with temporal locality to different cache blocks in the virtual address space, eliminating cache conflicts. These code placement techniques can be applied directly to the problem of placing data for improved data cache performance. In this paper we present a general framework for Cache-Conscious Data Placement. This is a compiler-directed approach that creates an address placement for the stack (local variables), global variables, heap objects, and constants in order to reduce data cache misses. The placement of data objects is guided by a temporal relationship graph between objects generated via profiling. Our results show that profile-driven data placement significantly reduces the data miss rate by 24% on average. 1 Introduction Much effort has b...
Memory Management with Explicit Regions, 1998
Cited by 115 (4 self)
Much research has been devoted to studies of and algorithms for memory management based on garbage collection or explicit allocation and deallocation. An alternative approach, region-based memory management, has been known for decades, but has not been well studied. In a region-based system each allocation specifies a region, and memory is reclaimed by destroying a region, freeing all the storage allocated therein. We show that on a suite of allocation-intensive C programs, regions are competitive with malloc/free and sometimes substantially faster. We also show that regions support safe memory management with low overhead. Experience with our benchmarks suggests that modifying many existing programs to use regions is not difficult. 1 Introduction The two most popular memory management techniques are explicit allocation and deallocation, as in C's malloc/free, and various forms of garbage collection [Wil92]. Both have well-known advantages and disadvantages, discussed further below. A t...
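The region idea in the abstract above (every allocation names a region, and destroying the region frees everything in it) can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names `region_create`, `region_alloc`, `region_destroy`, and the 4096-byte chunk size are hypothetical and are not the interface from the paper, and the safety mechanisms the paper mentions are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; not the paper's API. */
typedef struct chunk {
    struct chunk *next;  /* previously filled chunk, if any */
    size_t used;         /* bytes already handed out from data[] */
    size_t cap;          /* usable bytes in data[] */
    max_align_t data[];  /* payload, aligned for any object type */
} chunk;

typedef struct region {
    chunk *head;         /* chunk currently being filled */
} region;

enum { REGION_CHUNK = 4096 };  /* assumed default chunk size */

/* Round a request up to the strictest fundamental alignment. */
static size_t round_up(size_t n) {
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

region *region_create(void) {
    region *r = malloc(sizeof *r);
    if (r) r->head = NULL;
    return r;
}

/* Bump-pointer allocation: there is no per-object free. */
void *region_alloc(region *r, size_t n) {
    n = round_up(n ? n : 1);
    if (!r->head || r->head->cap - r->head->used < n) {
        size_t cap = n > REGION_CHUNK ? n : REGION_CHUNK;
        chunk *c = malloc(sizeof *c + cap);
        if (!c) return NULL;
        c->next = r->head;
        c->used = 0;
        c->cap = cap;
        r->head = c;
    }
    void *p = (unsigned char *)r->head->data + r->head->used;
    r->head->used += n;
    return p;
}

/* Destroying the region releases every object allocated in it. */
void region_destroy(region *r) {
    chunk *c = r->head;
    while (c) {
        chunk *next = c->next;
        free(c);
        c = next;
    }
    free(r);
}
```

A caller would create one region per phase of work (for example, per request), call `region_alloc` for every object that phase needs, and call `region_destroy` once at the end instead of freeing objects individually.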
DieHard: probabilistic memory safety for unsafe languages
In PLDI ’06, 2006
Cited by 93 (13 self)
Applications written in unsafe languages like C and C++ are vulnerable to memory errors such as buffer overflows, dangling pointers, and reads of uninitialized data. Such errors can lead to program crashes, security vulnerabilities, and unpredictable behavior. We present DieHard, a runtime system that tolerates these errors while probabilistically maintaining soundness. DieHard uses randomization and replication to achieve probabilistic memory safety by approximating an infinite-sized heap. DieHard’s memory manager randomizes the location of objects in a heap that is at least twice as large as required. This algorithm prevents heap corruption and provides a probabilistic guarantee of avoiding memory errors. For additional safety, DieHard can operate in a replicated mode where multiple replicas of the same application are run simultaneously. By initializing each replica with a different random seed and requiring agreement on output, the replicated version of DieHard increases the likelihood of correct execution because errors are unlikely to have the same effect across all replicas. We present analytical and experimental results that show DieHard’s resilience to a wide range of memory errors, including a heap-based buffer overflow in an actual application.
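The randomized placement described in this abstract can be illustrated with a toy allocator. The sketch below is not DieHard's implementation: it assumes a single size class, a fixed pool of 64 slots, and the C library's `rand()` as the random source, and all names (`rand_heap`, `rand_heap_alloc`, and so on) are hypothetical. It keeps at most half of the slots live, so the pool is always at least twice as large as the live data, and it places each object in a randomly chosen free slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed parameters for this sketch only. */
enum { SLOTS = 64, SLOT_SIZE = 32, MAX_LIVE = SLOTS / 2 };

typedef struct {
    _Alignas(max_align_t) unsigned char mem[SLOTS * SLOT_SIZE];
    bool in_use[SLOTS];  /* one flag per slot */
    size_t live;         /* number of slots currently allocated */
} rand_heap;

void rand_heap_init(rand_heap *h, unsigned seed) {
    for (size_t i = 0; i < SLOTS; i++) h->in_use[i] = false;
    h->live = 0;
    srand(seed);  /* a different seed gives a different layout */
}

/* Probe random slots until a free one is found. Because at most half
   of the slots are ever live, each probe succeeds with probability
   of at least 1/2. */
void *rand_heap_alloc(rand_heap *h, size_t n) {
    if (n > SLOT_SIZE || h->live >= MAX_LIVE) return NULL;
    for (;;) {
        size_t i = (size_t)rand() % SLOTS;
        if (!h->in_use[i]) {
            h->in_use[i] = true;
            h->live++;
            return h->mem + i * SLOT_SIZE;
        }
    }
}

/* Validate the pointer before freeing; in this sketch, invalid and
   double frees are silently ignored rather than corrupting state. */
void rand_heap_free(rand_heap *h, void *p) {
    uintptr_t base = (uintptr_t)h->mem;
    uintptr_t q = (uintptr_t)p;
    if (p == NULL || q < base || q >= base + sizeof h->mem) return;
    size_t off = (size_t)(q - base);
    if (off % SLOT_SIZE != 0) return;
    size_t i = off / SLOT_SIZE;
    if (!h->in_use[i]) return;
    h->in_use[i] = false;
    h->live--;
}
```

Because objects are scattered over a half-empty pool, a small overflow past one object is likely to land in an unused slot, and a freed slot is unlikely to be reused immediately. That is the intuition behind the probabilistic guarantee; the replicated mode from the abstract is not modeled here.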
Hoard: A Scalable Memory Allocator for Multithreaded Applications, 2000
Cited by 93 (14 self)
Parallel, multithreaded C and C++ programs such as web servers, database managers, news servers, and scientific applications are becoming increasingly prevalent. For these applications, the memory allocator is often a bottleneck that severely limits program performance and scalability on multiprocessor systems. Previous allocators suffer from problems that include poor performance and scalability, and heap organizations that introduce false sharing. Worse, many allocators exhibit a dramatic increase in memory consumption when confronted with a producer-consumer pattern of object allocation and freeing. This increase in memory consumption can range from a factor of P (the number of processors) to unbounded memory consumption.
Segregating Heap Objects by Reference Behavior and Lifetime
In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), 1998
Cited by 72 (5 self)
Dynamic storage allocation has become increasingly important in many applications, in part due to the use of the object-oriented paradigm. At the same time, processor speeds are increasing faster than memory speeds and programs are increasing in size faster than memories. In this paper, we investigate efforts to predict heap object reference and lifetime behavior at the time objects are allocated. Our approach uses profile-based optimization, and considers a variety of different information sources present at the time of object allocation to predict the object's reference frequency and lifetime. Our results, based on measurements of six allocation-intensive programs, show that program references to heap objects are highly predictable and that our prediction methods can successfully predict the behavior of these heap objects. We show that our methods can decrease the page fault rate of the programs measured, sometimes dramatically, in cases where the physical memory available to the progr...
Parallel data mining for association rules on shared-memory multiprocessors
In Proc. Supercomputing’96, 1996
Cited by 62 (19 self)
In this paper we present a new parallel algorithm for data mining of association rules on shared-memory multiprocessors. We study the degree of parallelism, synchronization, and data locality issues, and present optimizations for fast frequency computation. Experiments show that a significant improvement of performance is achieved using our proposed optimizations. We also achieved good speed-up for the parallel algorithm. Many data-mining tasks (e.g. association rules, sequential patterns) use complex pointer-based data structures (e.g. hash trees) that typically suffer from suboptimal data locality. In the multiprocessor case, shared access to these data structures may also result in false sharing. For these tasks it is commonly observed that the recursive data structure is built once and accessed multiple times during each iteration. Furthermore, the access patterns after the build phase are highly ordered. In such cases, memory placement of these structures that is sensitive to locality and false sharing can enhance performance significantly. We evaluate a set of placement policies for parallel association discovery, and show that simple placement schemes can improve execution time by more than a factor of two. More complex schemes yield additional gains.
The Slab Allocator: An Object-Caching Kernel Memory Allocator
USENIX Summer Technical Conference, 1994
Cited by 54 (3 self)
This paper presents a comprehensive design overview of the SunOS 5.4 kernel memory allocator. This allocator is based on a set of object-caching primitives that reduce the cost of allocating complex objects by retaining their state between uses. These same primitives prove equally effective for managing stateless memory (e.g. data pages and temporary buffers) because they are space-efficient and fast. The allocator’s object caches respond dynamically to global memory pressure, and employ an object-coloring scheme that improves the system’s overall cache utilization and bus balance. The allocator also has several statistical and debugging features that can detect a wide range of problems throughout the system.
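Object caching, as described in this abstract, means that an object's constructed state survives between uses, so the constructor cost is paid once rather than on every allocation. The C sketch below illustrates only that one idea, under stated assumptions: it uses `malloc` as the backing store, a fixed-depth stack of idle objects, and hypothetical names (`obj_cache`, `cache_alloc`, `cache_free`, and the example `conn` type). It does not model slabs, object coloring, or the response to memory pressure, and it is not the SunOS kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { CACHE_DEPTH = 16 };  /* assumed limit on idle objects */

/* Hypothetical user-level object cache. */
typedef struct {
    size_t size;               /* size of each object */
    void (*ctor)(void *);      /* runs once, when memory is first obtained */
    void (*dtor)(void *);      /* runs only when memory is really released */
    void *idle[CACHE_DEPTH];   /* constructed objects waiting for reuse */
    size_t nidle;
} obj_cache;

void cache_init(obj_cache *c, size_t size,
                void (*ctor)(void *), void (*dtor)(void *)) {
    c->size = size;
    c->ctor = ctor;
    c->dtor = dtor;
    c->nidle = 0;
}

/* Fast path: hand back an idle object that is already constructed. */
void *cache_alloc(obj_cache *c) {
    if (c->nidle > 0) return c->idle[--c->nidle];
    void *p = malloc(c->size);
    if (p && c->ctor) c->ctor(p);
    return p;
}

/* The caller must return the object in its constructed state. */
void cache_free(obj_cache *c, void *p) {
    if (c->nidle < CACHE_DEPTH) {
        c->idle[c->nidle++] = p;
        return;
    }
    if (c->dtor) c->dtor(p);
    free(p);
}

/* Destruct and release every idle object. */
void cache_destroy(obj_cache *c) {
    while (c->nidle > 0) {
        void *p = c->idle[--c->nidle];
        if (c->dtor) c->dtor(p);
        free(p);
    }
}

/* Example object with setup work worth doing only once. */
typedef struct {
    int lock_initialized;
    int refs;
} conn;

static int conn_ctor_calls;  /* counts constructor runs, for illustration */

static void conn_ctor(void *p) {
    conn *c = p;
    c->lock_initialized = 1;
    c->refs = 0;
    conn_ctor_calls++;
}
```

Allocating a `conn`, freeing it, and allocating again returns the same constructed object without running `conn_ctor` a second time, which is the saving the abstract refers to.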
Improving Cache Behavior of Dynamically Allocated Data Structures
In International Conference on Parallel Architectures and Compilation Techniques, 1998
Cited by 50 (0 self)
Poor data layout in memory may generate weak data locality and poor performance. Code transformations such as loop blocking or interchanging and array padding have addressed this issue for scientific applications. However, many generalist applications do not use data arrays, but dynamically allocated heterogeneous data structures. In this paper, we explore two data layout techniques for dynamically allocated data structures: field reorganization and instance interleaving. The application of these techniques may be guided by program profiling. This allows significant cache behavior improvements on some applications. To support instance interleaving, we developed a specific memory allocation library called ialloc. An ialloc-like library may be of great help in a toolbox for performance tuning of general-purpose applications.
Design and Implementation of Code Optimizations for a Type-Directed Compiler for Standard ML, 1996
Cited by 47 (2 self)
The trends in software development are towards larger programs, more complex programs, and more use of programs as "component software." These trends mean that the features of modern programming languages are becoming more important than ever before. Programming languages need to have features such as strong typing, a module system, polymorphism, automatic storage management, and higher-order functions. In short, modern programming languages are becoming more important than ever before.

