Results 1 -
6 of
6
Customization: optimizing compiler technology for SELF, a dynamically-typed object-oriented programming language
, 1989
"... Dynamically-typed object-oriented languages please programmers, but their lack of static type information penalizes performance. Our new implementation tech-niques extract static type information from declaration-free programs. Our system compiles several copies of a given procedure, each customized ..."
Abstract
-
Cited by 185 (16 self)
- Add to MetaCart
Dynamically-typed object-oriented languages please programmers, but their lack of static type information penalizes performance. Our new implementation tech-niques extract static type information from declaration-free programs. Our system compiles several copies of a given procedure, each customized for one receiver type, so that the type of the receiver is bound at compile time. The compiler predicts types that are statically unknown but likely, and inserts run-time type tests to verify its predictions. It splits calls, compiling a copy on each control path, optimized to the specific types on that path. Coupling these new techniques with compile-time message lookup, aggressive procedure inlining, and traditional optimizations has doubled the performance of dynamically-typed object-oriented languages. 1.
Typed closure conversion
- In Proceedings of the 23th Symposium on Principles of Programming Languages (POPL
, 1996
"... The views and conclusions contained in this document are those of the authors and should not be interpreted as representing o cial policies, either expressed or implied, of the Advanced Research Projects Agency or the U.S. Government. Any opinions, ndings, and conclusions or recommendations expresse ..."
Abstract
-
Cited by 146 (22 self)
- Add to MetaCart
The views and conclusions contained in this document are those of the authors and should not be interpreted as representing o cial policies, either expressed or implied, of the Advanced Research Projects Agency or the U.S. Government. Any opinions, ndings, and conclusions or recommendations expressed in this material are those of the We study the typing properties of closure conversion for simply-typed and polymorphic-calculi. Unlike most accounts of closure conversion, which only treat the untyped-calculus, we translate well-typed source programs to well-typed target programs. This allows later compiler phases to take advantage of types for representation analysis and tag-free garbage collection, and it facilitates correctness proofs. Our account of closure conversion for the simply-typed language takes advantage of a simple model of objects by mapping closures to existentials. Closure conversion for the polymorphic language requires additional type machinery, namely translucency in the style of Harper and Lillibridge's module calculus, to express the type of a closure.
The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor
- In Proceedings of Workshop on Scalable Shared Memory Multiprocessors
, 1991
"... The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimensional direct interconnection network to provide scalable communication bandwidth, while allowing the exploitation of locality. Despite its distributed-memory arch ..."
Abstract
-
Cited by 138 (22 self)
- Add to MetaCart
The Alewife multiprocessor project focuses on the architecture and design of a large-scale parallel machine. The machine uses a low-dimensional direct interconnection network to provide scalable communication bandwidth, while allowing the exploitation of locality. Despite its distributed-memory architecture, Alewife allows efficient shared-memory programming through a multilayered approach to locality management. A new scalable cache-coherence scheme called LimitLESS directories allows the use of caches for reducing communication latency and network bandwidth requirements. Alewife also employs run-time and compile-time methods for partitioning and placement of data and processes to enhance communication locality. While the above methods attempt to minimize communication latency, communication with distant processors cannot be completely avoided. Alewife's processor, Sparcle, is designed to tolerate these latencies by rapidly switching between threads of computation. This paper describe...
Compiling with Types
, 1995
"... Compilers for monomorphic languages, such as C and Pascal, take advantage of types to determine data representations, alignment, calling conventions, and register selection. However, these languages lack important features including polymorphism, abstract datatypes, and garbage collection. In contr ..."
Abstract
-
Cited by 97 (14 self)
- Add to MetaCart
Compilers for monomorphic languages, such as C and Pascal, take advantage of types to determine data representations, alignment, calling conventions, and register selection. However, these languages lack important features including polymorphism, abstract datatypes, and garbage collection. In contrast, modern programming languages such as Standard ML (SML), provide all of these features, but existing implementations fail to take full advantage of types. The result is that performance of SML code is quite bad when compared to C. In this thesis, I provide a general framework, called type-directed compilation, that allows compiler writers to take advantage of types at all stages in compilation. In the framework, types are used not only to determine efficient representations and calling conventions, but also to prove the correctness of the compiler. A key property of typedirected compilation is that all but the lowest levels of the compiler use typed intermediate languages. An advantage of this approach is that it provides a means for automatically checking the integrity of the resulting code. An important
Reactive Synchronization Algorithms for Multiprocessors
"... Synchronization algorithms that are efficient across a wide range of applications and operating conditions are hard to design because their performance depends on unpredictable run-time factors. The designer of a synchronization algorithm has a choice of protocols to use for implementing the synchro ..."
Abstract
-
Cited by 49 (2 self)
- Add to MetaCart
Synchronization algorithms that are efficient across a wide range of applications and operating conditions are hard to design because their performance depends on unpredictable run-time factors. The designer of a synchronization algorithm has a choice of protocols to use for implementing the synchronization operation. For example, candidate protocols for locks include test-and-set protocols and queueing protocols. Frequently, the best choice of protocols depends on the level of contention: previous research has shown that test-and-set protocols for locks outperform queueing protocols at low contention, while the opposite is true at high contention. This paper investigates reactive synchronization algorithms that dynamically choose protocols in response to the level of contention. We describe reactive algorithms for spin locks and fetch-and-op that choose among several shared-memory and message-passing protocols. Dynamically choosing protocols presents a challenge: a reactive algorithm needs to select and change protocols efficiently, and has to allow for the possibility that multiple processes may be executing different protocols at the same time. We describe the notion of consensus objects that the reactive algorithms use to preserve correctness in the face of dynamic protocol changes. Experimental measurements demonstrate that reactive algorithms perform close to the best static choice of protocols at all levels of contention. Furthermore, with mixed levels of contention, reactive algorithms outperform passive algorithms with fixed protocols, provided that contention levels do not change too frequently. Measurements of several parallel applications show that reactive algorithms result in modest performance gains for spin locks and significant gains for fetch-and-op.
A Parallel Virtual Machine for Efficient Scheme Compilation
- In Lisp and functional programming
, 1990
"... Programs compiled by Gambit, our Scheme compiler, achieve performance as much as twice that of the fastest available Scheme compilers. Gambit is easily ported, while retaining its high performance, through the use of a simple virtual machine (PVM). PVM allows a wide variety of machineindependent opt ..."
Abstract
-
Cited by 15 (8 self)
- Add to MetaCart
Programs compiled by Gambit, our Scheme compiler, achieve performance as much as twice that of the fastest available Scheme compilers. Gambit is easily ported, while retaining its high performance, through the use of a simple virtual machine (PVM). PVM allows a wide variety of machineindependent optimizations and it supports parallel computation based on the future construct. PVM conveys high-level information bidirectionally between the machine-independent front end of the compiler and the machine-dependent back end, making it easy to implement a number of common back end optimizations that are difficult to achieve for other virtual machines. PVM is similar to many real computer architectures and has an option to efficiently gather dynamic measurements of virtual machine usage. These measurements can be used in performance prediction for ports to other architectures as well as design decisions related to proposed optimizations and object representations.

