Dynamic storage allocation: A survey and critical review
, 1995
Cited by 187 (6 self)
Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocators are to perform well in practice.
Rewriting Executable Files to Measure Program Behavior
- SOFTWARE PRACTICE & EXPERIENCE
, 1994
Using Branch Handling Hardware to Support Profile-Driven Optimization
- In Proceedings of the 27th Annual International Symposium on Microarchitecture
, 1994
Cited by 29 (5 self)
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run 2-30 times slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper proposes using existing branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%-4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. This practically removes the inconvenience of profiling. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance sy...
Visualizing program slices
- In IEEE/CS Symposium on Visual Languages
, 1994
Cited by 28 (1 self)
An Evaluation of Software-Based Release Consistent Protocols
- Journal of Parallel and Distributed Computing
, 1995
Cited by 27 (5 self)
This paper presents an evaluation of three software implementations of release consistency. Release consistent protocols allow data communication to be aggregated, and multiple writers to simultaneously modify a single page. We evaluated an eager invalidate protocol that enforces consistency when synchronization variables are released, a lazy invalidate protocol that enforces consistency when synchronization variables are acquired, and a lazy hybrid protocol that selectively uses update to reduce access misses. Our evaluation is based on implementations running on DECstation-5000/240s connected by an ATM LAN, and an execution-driven simulator that allows us to vary network parameters. Our results show that the lazy protocols consistently outperform the eager protocol for all but one application, and that the lazy hybrid performs the best overall. However, the relative performance of the implementations is highly dependent on the relative speeds of the network, processor, and communicat...
SALTO: System for Assembly-Language Transformation and Optimization
, 1996
Cited by 16 (6 self)
For critical applications, particularly embedded systems, performance tuning requires multiple passes. Salto (System for Assembly Language Transformation and Optimization) is a retargetable framework for developing the full spectrum of tools needed for performance tuning of low-level (assembly-language) code on uniprocessors. Salto enables the building of profiling, tracing and optimization tools. The user is responsible for giving a machine description of the target architecture, which includes the instruction set of the processor, the precise hardware configuration and reservation tables for all instructions, while high-level functions are provided for writing any tool the user needs. Moreover, Salto will be part of a global solution for manipulating assembly code, both to implement low-level code restructuring and to provide a high-level code restructurer with useful information collected from the assembly code and from instruction profiling. Salto ha...
Efficient Coverage Testing Using Global Dominator Graphs
, 1999
Cited by 15 (0 self)
Coverage testing techniques, such as statement and decision coverage, play a significant role in improving the quality of software systems. Constructing a thorough set of tests that yield high coverage, however, is often a very tedious, time-consuming task. In this paper we present a technique to find a small subset of a program's statements and decisions with the property that covering the subset implies covering the rest. We introduce the notion of a mega block, which is a set of basic blocks spanning multiple procedures with the property that one basic block in it is executed iff every basic block in it is executed. We also present an algorithm to construct a data structure called the global dominator graph showing dominator relationships among mega blocks. A tester only needs to create test cases that are aimed at executing one basic block from each of the leaf nodes in this directed acyclic graph. Every other basic block in the program will automatically be covered by the same test set.
Hardware-Based Profiling: An Effective Technique for Profile-Driven Optimization
, 1996
Cited by 14 (1 self)
Profile-based optimizations can be used for instruction scheduling, loop scheduling, data preloading, function in-lining, and instruction cache performance enhancement. However, these techniques have not been embraced by software vendors because programs instrumented for profiling run significantly slower, an awkward compile-run-recompile sequence is required, and a test input suite must be collected and validated for each program. This paper introduces hardware-based profiling that uses traditional branch handling hardware to generate profile information in real time. Techniques are presented for both one-level and two-level branch hardware organizations. The approach produces high accuracy with small slowdown in execution (0.4%-4.6%). This allows a program to be profiled while it is used, eliminating the need for a test input suite. With contemporary processors driven increasingly by compiler support, hardware-based profiling is important for high-performance systems. Keywords: Bran...
Dominators, Super Blocks, and Program Coverage
, 1994
Cited by 9 (0 self)
In this paper we present techniques to find subsets of nodes of a flowgraph that satisfy the following property: A test set that exercises all nodes in a subset exercises all nodes in the flowgraph. Analogous techniques to find subsets of edges are also proposed. These techniques may be used to significantly reduce the cost of coverage testing of programs. A notion of a super block consisting of one or more basic blocks is developed. If any basic block in a super block is exercised by an input then all basic blocks in that super block must be exercised by the same input. Dominator relationships among super blocks are used to identify a subset of the super blocks whose coverage implies that of all super blocks and, in turn, that of all basic blocks. Experiments with eight systems in the range of 1-75K lines of code show that, on the average, test cases targeted to cover just 29% of the basic blocks and 32% of the branches ensure 100% block and branch coverage, respectively.
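The core idea in the abstract above can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it uses only pre-dominators on a single flowgraph, does not merge basic blocks into super blocks, and assumes every node is reachable from the entry and appears as a key in the graph. The function names and the example graph are hypothetical. Any execution that reaches a node must pass through all of its dominators, so covering the nodes that dominate no other node covers the whole graph.

```python
def dominators(cfg, entry):
    """Iteratively compute the set of dominators of every node.

    cfg maps each node to a list of its successors; every node must be a key.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def coverage_targets(cfg, entry):
    """Return the nodes that strictly dominate no other node.

    Executing each of these nodes necessarily executes all of their
    dominators, and therefore every node in the graph.
    """
    dom = dominators(cfg, entry)
    strict_dominators = set()
    for n, ds in dom.items():
        strict_dominators |= ds - {n}
    return set(cfg) - strict_dominators


# Hypothetical if/else flowgraph used only for illustration.
cfg = {
    "entry": ["cond"],
    "cond": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": [],
}
targets = coverage_targets(cfg, "entry")
```

In this example `entry` and `cond` need no dedicated test, because any test reaching `then`, `else`, or `join` executes them. A fuller treatment would presumably also use post-dominator information, which would make `join` implied as well.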
Generic Program Monitoring by Trace Analysis
, 2001
Cited by 6 (2 self)
Program execution monitoring consists of checking whole executions for given properties in order to collect global run-time information. Monitoring is very useful to maintain programs. However, application developers face the following dilemma: either they use existing tools which never exactly fit their needs, or they invest a lot of effort to implement monitoring code. In this report we argue that, when an event-oriented tracer exists, the compiler developers can enable the application developers to easily code their own, relevant, monitors. We propose a high-level operator, called foldt, which operates on execution traces. One of the key advantages of our approach is that it allows a clean separation of concerns; the definition of monitors is totally distinct from both the user source code and the language compiler. We give a number of applications of the foldt operator to compute monitors for Mercury program executions: execution profiles, graphical abstract views, and two test coverage measurements. Each example is implemented by a few simple lines of Mercury.
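As a rough illustration of the idea of a fold over an execution trace, here is a minimal Python sketch. It is not the Mercury `foldt` operator described in the report: the signature, the `(port, procedure)` event format, and the two monitors are assumptions made for this example only. Each monitor is a small step function, kept separate from both the traced program and the tracer.

```python
from collections import Counter


def foldt(step, initial, trace):
    """Fold a step function over a sequence of trace events (hypothetical API)."""
    acc = initial
    for event in trace:
        acc = step(event, acc)
    return acc


def count_calls(event, profile):
    """Monitor 1: an execution profile counting calls per procedure."""
    port, proc = event
    if port == "call":
        profile[proc] += 1
    return profile


def record_covered(event, covered):
    """Monitor 2: a coverage measure recording every procedure seen."""
    _, proc = event
    return covered | {proc}


# Hypothetical trace: each event is a (port, procedure name) pair.
trace = [
    ("call", "main"),
    ("call", "append"),
    ("exit", "append"),
    ("call", "append"),
    ("exit", "append"),
    ("exit", "main"),
]

profile = foldt(count_calls, Counter(), trace)
covered = foldt(record_covered, frozenset(), trace)
```

Both monitors reuse the same traversal, so adding another one only requires writing a new step function and an initial accumulator.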

