Experience with a Software-Defined Machine Architecture
- WRL Research Report 91/10
, 1991
Cited by 53 (7 self)
We built a system in which the compiler back end and the linker work together to present an abstract machine at a considerably higher level than the actual machine. The intermediate language translated by the back end is the target language of all high-level compilers and is also the only assembly language generally available. This lets us do intermodule register allocation, which would be harder if some of the code in the program had come from a traditional assembler, out of sight of the optimizer. We do intermodule register allocation and pipeline instruction scheduling at link time, using information gathered by the compiler back end. The mechanism for analyzing and modifying the program at link time was also useful in a wide array of instrumentation tools. 1. Introduction. When our lab built its experimental RISC workstation, the Titan, we defined a high-level assembly language as the official interface to the machine. This high-level assembly language, called Mahler,...
Procedure Merging with Instruction Caches
- Proceedings of the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation
, 1991
Cited by 49 (0 self)
This paper describes a method of determining which procedures to merge for machines with instruction caches. The method uses profile information, the structure of the program, the cache size, and the cache miss penalty to guide the choice. Optimization for the cache is assumed to follow procedure merging. The method weighs the benefit of removing calls against the increase in the instruction cache miss rate. It achieves better performance than previous schemes that do not consider the cache. Merging always results in a savings, unlike simpler schemes that can make programs slower once cache effects are considered. The new method also has better performance even when parameters to simpler algorithms are varied to get the best performance. This report is a preprint of a paper that will be presented at the ACM SIGPLAN '91 Conference on Programming Language Design and Implementation, Toronto, Ontario, Canada, June 26-28, 1991. Copyright 1990 ACM. 1 Introduction. This paper presents a ...
Design and Implementation of Code Optimizations for a Type-Directed Compiler for Standard ML
, 1996
Cited by 47 (2 self)
The trends in software development are towards larger programs, more complex programs, and more use of programs as "component software." These trends mean that the features of modern programming languages are becoming more important than ever before. Programming languages need to have features such as strong typing, a module system, polymorphism, automatic storage management, and higher-order functions. In short, modern programming languages are becoming more important than ever before.
The Named-State Register File: Implementation and Performance
, 1995
Cited by 14 (1 self)
Context switches are slow in conventional processors because the entire processor state must be saved and restored, even if much of the state is not used before the next context switch. This paper introduces the Named-State Register File, a fine-grain associative register file. The NSF uses hardware and software techniques to efficiently manage registers among sequential or parallel procedure activations. The NSF holds more live data per register than conventional register files, and requires much less spill and reload traffic to switch between concurrent contexts. The NSF speeds execution of some sequential and parallel programs by 9% to 17% over alternative register file organizations. The NSF has access time comparable to a conventional register file and only adds 5% to the area of a typical processor chip. Keywords: multithreaded, processor, register, context switch. NOTE: This is a draft copy of a paper that has been submitted for publication. Please do not reference or redistrib...
Analysis of Recursive Types in an Imperative Language
, 1994
Cited by 13 (0 self)
Analysis of Recursive Types in an Imperative Language, by Edward Yan-Bing Wang. Doctor of Philosophy in Computer Science, University of California at Berkeley. Professor Paul N. Hilfinger, Chair.
I introduce Algorithm P and p-set, a new type-analysis algorithm and its associated type description, capable of deducing and accurately representing recursive types in programs in an imperative language, using type information present in both object creation and object use. When applied to Lisp and used to discover unnecessary type checks, Algorithm P is capable of removing close to all of the type checks on structured objects in most programs.
To my parents.
Acknowledgements: I thank my advisor, Paul Hilfinger, and the other members of the committee, Alex Aiken and Phil Colella. Marcia Feitel helped in the writing of this report, in both content and style, and in English and Latin. The research documented here was in part support...
Measuring the Cost of Storage Management
- Lisp and Symbolic Computation
, 1994
Cited by 7 (2 self)
We study the cost of storage management for garbage-collected programs compiled with the Standard ML of New Jersey compiler. We show that the cost of storage management is not the same as the time spent garbage collecting. For many of the programs, the time spent garbage collecting is less than the time spent doing other storage-management tasks.
Authors' addresses: David Tarditi, Computer Science Department, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213-3891. e-mail: dtarditi@cs.cmu.edu. Amer Diwan, Department of Computer Science, University of Massachusetts, Amherst, MA 01003-4610. e-mail: diwan@cs.umass.edu. This research is sponsored by the Defense Advanced Research Projects Agency, DoD, through ARPA Order 8313, and monitored by ESD/AVS under contract F19628-91-C-0168. David Tarditi is also supported by an AT&T PhD Scholarship. Views and conclusions contained in this document are those of the authors and should not be interpreted as representing the offic...
Program Analysis And Optimization For Machines With Instruction Cache
, 1991
Cited by 6 (0 self)
In modern processors, the performance of the memory hierarchy is crucial in determining the overall performance of a CPU. Among the most important factors in deciding the performance of a CPU is the cache performance, and particularly, the cache miss rate. Determining and improving the hit rate of a cache is one of the most important tasks undertaken by a computer designer. Exploring the set of alternative cache organizations is quite expensive, since the designer has a large number of cache design parameters at his disposal, and because the primary technique used for evaluating caches, trace-driven simulation, is very expensive. Additionally, because of tradeoffs in a cache design, the maximum performance achievable with a pure hardware solution is limited. This thesis presents a model of instruction caches that accurately predicts instruction cache behavior using easily obtained profile information. The model provides information useful to memory hierarchy designers and to programmers interested in developing algorithms with improved instruction cache performance. In addition, an automatic optimizer is described that reduces the instruction cache miss rate by 80% for a set of 10 large Pascal programs with a 32KB direct-mapped instruction cache. The model can also be used by other compiler optimizations that affect instruction cache performance. In particular, a new method of procedure merging is described that attempts to find the best procedures to inline when instruction cache effects are included.
Acknowledgments: Thanks to my advisor John Hennessy for providing such an interesting and challenging environment and numerous suggestions for improving this thesis. Thanks to Steve Richardson, Steve Tjiang, C. Y. Chu, Paul Chow, Malcolm Wing, Mark Horowitz, Arturo Sal...
A Simple Approach To Supporting Untagged Objects In Dynamically Typed Languages
Cited by 6 (3 self)
This paper discusses a straightforward approach to using untagged and unboxed values in dynamically typed languages. An implementation of our algorithms allows a dynamically typed language to attain performance close to that of highly optimized C code on a variety of benchmarks (including many floating-point intensive computations) and dramatically reduces heap usage.
The Named-State Register File
- AI-TR 1459, MIT Artificial Intelligence Laboratory
, 1993
Cited by 5 (1 self)
A register file is a critical resource of modern processors. Most hardware and software mechanisms to manage registers across procedure calls do not efficiently support multithreaded programs. To switch between parallel threads, a conventional processor must spill and reload thread contexts from registers to memory. If context switches are frequent and unpredictable, a large fraction of execution time is spent saving and restoring registers. This thesis introduces the Named-State Register File, a fine-grain, fully-associative register organization. The NSF uses hardware and software mechanisms to manage registers among many concurrent activations. The NSF enables both fast context switching and efficient sequential program performance. The NSF holds more live data than conventional register files, and requires much less spill and reload traffic to switch between concurrent active contexts. The NSF speeds execution of some sequential and parallel programs by 9% to 17% over alternative r...
Garbage Collecting The Internet
Cited by 1 (0 self)
Distributed systems present a new challenge to garbage collection design. Garbage collection schemes for linked, heterogeneous data-structures distributed over a network are reviewed for the first time. As distributed garbage collectors evolved from single address space collectors, these are classified first. The classification is extended to distributed collectors taking into account the additional issues of distribution: locality, latency and synchronisation.
Categories and Subject Descriptors: C.2.4 [Computer-Communications Networks]: Distributed Systems; D.1.3 [Programming Techniques]: Concurrent Programming, Distributed programming, parallel programming; D.4.2 [Operating Systems]: Storage management; D.4.3: File systems management.
Additional key words and phrases: memory management, automatic storage reclamation, garbage collection, reference counting, distributed systems, distributed memories, distributed file systems, network communication.

