The Implementation of the Cilk-5 Multithreaded Language
- In Proceedings of the SIGPLAN '98 Conference on Programming Language Design and Implementation
, 1998
Cited by 248 (20 self)
The fifth release of the multithreaded language Cilk uses a provably good "work-stealing" scheduling algorithm similar to the first system, but the language has been completely redesigned and the runtime system completely reengineered. The efficiency of the new implementation was aided by a clear strategy that arose from a theoretical analysis of the scheduling algorithm: concentrate on minimizing overheads that contribute to the work, even at the expense of overheads that contribute to the critical path. Although it may seem counterintuitive to move overheads onto the critical path, this "work-first" principle has led to a portable Cilk-5 implementation in which the typical cost of spawning a parallel thread is only between 2 and 6 times the cost of a C function call on a variety of contemporary machines. Many Cilk programs run on one processor with virtually no degradation compared to equivalent C programs. This paper describes how the work-first principle was exploited in the design...
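The work-stealing discipline mentioned in this abstract can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions: `WorkDeque`, `run`, and `make_task` are hypothetical names, victims are scanned in a fixed order where a real scheduler would choose randomly, and nothing here models the Cilk-5 runtime, its synchronization, or its compiler support.

```python
from collections import deque


class WorkDeque:
    """Hypothetical per-worker deque: the owner works at the bottom, thieves at the top."""

    def __init__(self):
        self._tasks = deque()

    def push_bottom(self, task):
        self._tasks.append(task)

    def pop_bottom(self):
        # Owner path: newest task first, mirroring a serial depth-first execution.
        return self._tasks.pop() if self._tasks else None

    def steal_top(self):
        # Thief path: oldest task first, taken from the opposite end.
        return self._tasks.popleft() if self._tasks else None


def run(workers, root_task):
    """Simulate workers in round-robin order; a task returns the list of tasks it spawns."""
    deques = [WorkDeque() for _ in range(workers)]
    deques[0].push_bottom(root_task)
    executed = steals = 0
    while True:
        progressed = False
        for me in range(workers):
            task = deques[me].pop_bottom()
            if task is None:
                # Assumption: fixed victim order keeps the simulation deterministic.
                for victim in range(workers):
                    if victim == me:
                        continue
                    task = deques[victim].steal_top()
                    if task is not None:
                        steals += 1
                        break
            if task is not None:
                for child in task():
                    deques[me].push_bottom(child)
                executed += 1
                progressed = True
        if not progressed:
            # No worker found a task anywhere, so every deque is empty.
            return executed, steals


def make_task(n):
    """Task shaped like the call tree of a naive Fibonacci computation."""
    def task():
        return [make_task(n - 1), make_task(n - 2)] if n >= 2 else []
    return task
```

With one worker the simulation never steals, so all scheduling cost stays on the owner's push and pop path; with several workers the same tasks run, and the extra bookkeeping appears only when a worker runs dry and steals.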
No-Longer-Foreign: Teaching an ML compiler to speak C "natively"
, 2001
Cited by 49 (0 self)
We present a new foreign-function interface for SML/NJ. It is based on the idea of data-level interoperability: the ability of ML programs to inspect as well as manipulate C data structures directly. The core component of this work is an encoding of the almost complete C type system in ML types. [Variable-argument functions are the only feature of the C type system that we do not handle very well yet.] The encoding makes extensive use of a folklore typing trick, taking advantage of ML's polymorphism, its type constructors, its abstraction mechanisms, and even functors. A small low-level component, which deals with C struct and union declarations as well as program linkage, is hidden from the programmer's eye by a simple program-generator tool that translates C declarations to corresponding ML glue code.
Type-Preserving Garbage Collectors
, 2001
Cited by 46 (4 self)
By combining existing type systems with standard type-based compilation techniques, we describe how to write strongly typed programs that include a function that acts as a tracing garbage collector for the program. Since the garbage collector is an explicit function, we do not need to provide a trusted garbage collector as a runtime service to manage memory. Since our language is strongly typed, the standard type soundness guarantee "Well typed programs do not go wrong" is extended to include the collector. Our type safety guarantee is non-trivial: it guarantees not only the type safety of the garbage collector, but also that the collector preserves the type safety of the program being garbage collected. We describe the technique in detail, report performance measurements for a few microbenchmarks, and sketch the proofs of type soundness for our system. 1 Introduction We outline an approach, based on ideas from existing type systems, to build a type-preser...
Contaminated Garbage Collection
, 2000
Cited by 23 (3 self)
We describe a new method for determining when an object can be garbage collected. The method does not require marking live objects. Instead, each object X is dynamically associated with a stack frame M, such that X is collectable when M pops. Because X could have been dead earlier, our method is conservative. Our results demonstrate that the method nonetheless identifies a large percentage of collectable objects. The method has been implemented in Sun's Java™ Virtual Machine interpreter, and results are presented based on this implementation.
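The idea of tying each object to a stack frame can be sketched as a toy model. The abstract does not say how the association changes over time, so the rule used here is an assumption made for illustration: when two objects become linked by a reference, both are tied to the older of their two frames. `ContaminatedHeap` and its methods are hypothetical names, not the JVM implementation.

```python
class ContaminatedHeap:
    """Toy model: groups of linked objects die when their associated frame pops."""

    def __init__(self):
        self.depth = 0       # current stack depth; 0 is the outermost frame
        self.parent = {}     # union-find parent pointer for every live object
        self.frame_of = {}   # group root -> depth of the frame the group is tied to

    def push_frame(self):
        self.depth += 1

    def new(self, name):
        # A fresh object starts out tied to the frame that allocated it.
        self.parent[name] = name
        self.frame_of[name] = self.depth
        return name

    def _find(self, name):
        while self.parent[name] != name:
            self.parent[name] = self.parent[self.parent[name]]  # path halving
            name = self.parent[name]
        return name

    def link(self, a, b):
        # Assumed rule: a reference between two objects merges their groups and
        # ties the merged group to the older (shallower) frame.
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return
        older = min(self.frame_of[ra], self.frame_of[rb])
        self.parent[rb] = ra
        self.frame_of[ra] = older
        del self.frame_of[rb]

    def pop_frame(self):
        # Every object whose group is tied to the popping frame is collected.
        dead = sorted(n for n in list(self.parent)
                      if self.frame_of[self._find(n)] == self.depth)
        for n in dead:
            del self.parent[n]
            self.frame_of.pop(n, None)
        self.depth -= 1
        return dead

    def live(self):
        return sorted(self.parent)
```

In this model no marking pass is needed: popping a frame collects exactly the groups tied to it. It is also conservative in the way the abstract describes, because an object linked to an older frame stays alive until that frame pops, even if the reference is dropped sooner.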
Portable High-Performance Programs
, 1999
Cited by 16 (0 self)
The implementation of Lua 5.0
- Journal of Universal Computer Science
Cited by 8 (4 self)
We discuss the main novelties of the implementation of Lua 5.0: its register-based virtual machine, the new algorithm for optimizing tables used as arrays, the implementation of closures, and the addition of coroutines.
SUDS: Automatic Parallelization for Raw Processors
, 2003
Cited by 4 (0 self)
A computer can never be too fast or too cheap. Computer systems pervade nearly every aspect of science, engineering, communications and commerce because they perform certain tasks at rates unachievable by any other kind of system built by humans. A computer system's throughput, however, is constrained by that system's ability to find concurrency. Given a particular target workload, the computer architect's role is to design mechanisms to find and exploit the available concurrency in that workload.
Efficient and safe-for-space closure conversion
- ACM TOPLAS
Cited by 1 (0 self)
Modern compilers often implement function calls (or returns) in two steps: first, a “closure” environment is properly installed to provide access for free variables in the target program fragment; second, the control is transferred to the target by a “jump with arguments (or results).” Closure conversion, which decides where and how to represent closures at runtime, is a crucial step in the compilation of functional languages. This paper presents a new algorithm that exploits the use of compile-time control and data flow information to optimize function calls. By extensive closure sharing and allocating as many closures in registers as possible, our new closure-conversion algorithm reduces heap allocation by 36% and memory fetches for local and global variables by 43%; and improves the already efficient code generated by an earlier version of the Standard ML of New Jersey compiler by about 17% on a DECstation 5000. Moreover, unlike most other approaches, our new closure-allocation scheme satisfies the strong safe-for-space-complexity rule, thus achieving good asymptotic space usage.
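The two-step view of a call described in this abstract (install an environment, then jump with arguments) can be shown with a hand-converted example. This is a minimal sketch of plain flat closure conversion, not the paper's algorithm: it does no closure sharing or register allocation, and `Closure`, `call`, and the `make_scaler_*` functions are hypothetical names.

```python
from typing import Callable, NamedTuple, Tuple


class Closure(NamedTuple):
    code: Callable  # closed function taking the environment as its first argument
    env: Tuple      # values of exactly the free variables the code uses


def call(closure, *args):
    # Step 1: hand the environment to the code; step 2: "jump" with the arguments.
    return closure.code(closure.env, *args)


# Source form: the lambda refers to `factor` and `total` from the enclosing scope.
def make_scaler_source(factor, big_table):
    total = sum(big_table)
    return lambda x: x * factor + total


# Converted form: the body becomes a top-level function with no free variables.
def _scaler_code(env, x):
    factor, total = env
    return x * factor + total


def make_scaler_converted(factor, big_table):
    total = sum(big_table)
    # The environment copies only `factor` and `total`. It does not keep
    # `big_table` reachable, which is the concern behind space-safety rules.
    return Closure(_scaler_code, (factor, total))
```

Both forms compute the same result; the converted closure simply makes the environment explicit, so it is easy to see which values it keeps alive.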

