ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING
, 1994
Cited by 95 (6 self)
Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide
the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries
often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction,
while desirable from a design standpoint, may be impractical when it leads to inefficient programs.
Aggressive compiler optimizations can reduce the overhead of abstraction. However, the long compilation times
introduced by optimizing compilers delay the programming environment's responses to changes in the program.
Furthermore, optimization also conflicts with source-level debugging. Thus, programmers are caught on the horns of
two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments
and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals by performing
optimizations lazily.
Four new techniques work together to achieve high performance and high responsiveness:
• Type feedback achieves high performance by allowing the compiler to inline message sends based on information
extracted from the runtime system. On average, programs run 1.5 times faster than the previous SELF system;
compared to a commercial Smalltalk implementation, two medium-sized benchmarks run about three times faster.
This level of performance is obtained with a compiler that is both simpler and faster than previous SELF compilers.
• Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast nonoptimizing
compiler to generate initial code while automatically recompiling heavily used parts of the program
with an optimizing compiler. On a previous-generation workstation like the SPARCstation-2, fewer than 200
pauses exceeded 200 ms during a 50-minute interaction, and 21 pauses exceeded one second. On a current-generation
workstation, only 13 pauses exceed 400 ms.
• Dynamic deoptimization shields the programmer from the complexity of debugging optimized code by
transparently recreating non-optimized code as needed. No matter whether a program is optimized or not, it can
always be stopped, inspected, and single-stepped. Compared to previous approaches, deoptimization allows more
debugging while placing fewer restrictions on the optimizations that can be performed.
• Polymorphic inline caching generates type-case sequences on-the-fly to speed up messages sent from the same
call site to several different types of object. More significantly, they collect concrete type information for the
optimizing compiler.
With better performance yet good interactive behavior, these techniques make exploratory programming possible
both for pure object-oriented languages and for application domains requiring higher ultimate performance, reconciling
exploratory programming, ubiquitous abstraction, and high performance.
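The polymorphic inline caching idea in the abstract above can be illustrated with a minimal Python sketch. It is not the SELF implementation: the class name `PolymorphicInlineCache`, the `max_entries` limit, and the `observed_types` helper are assumptions made for illustration. The per-call-site list of (class, method) pairs stands in for the generated type-case sequence, and the recorded classes stand in for the concrete type information handed to an optimizing compiler.

```python
class PolymorphicInlineCache:
    """Hypothetical per-call-site cache: receiver class -> resolved method."""

    def __init__(self, selector, max_entries=4):
        self.selector = selector        # message name sent at this call site
        self.max_entries = max_entries  # assumed cap on cached receiver types
        self.entries = []               # (class, function) pairs, checked in order
        self.misses = 0                 # number of full lookups performed

    def send(self, receiver, *args):
        cls = type(receiver)
        # The "type-case" sequence: compare the receiver class with each cached class.
        for cached_cls, method in self.entries:
            if cached_cls is cls:
                return method(receiver, *args)
        # Miss: do the full lookup once and extend the cache if there is room.
        self.misses += 1
        method = self._full_lookup(cls)
        if len(self.entries) < self.max_entries:
            self.entries.append((cls, method))
        return method(receiver, *args)

    def _full_lookup(self, cls):
        # Walk the class hierarchy, as an ordinary dynamic lookup would.
        for klass in cls.__mro__:
            if self.selector in vars(klass):
                return vars(klass)[self.selector]
        raise AttributeError(f"{cls.__name__} does not understand {self.selector!r}")

    def observed_types(self):
        # Receiver classes seen so far: the kind of feedback a compiler could use.
        return [cls for cls, _ in self.entries]
```

A call site that sees two receiver classes performs two full lookups; later sends from that site to either class are served from the cached entries, and `observed_types()` reports both classes.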
A Parallel, Real-Time Garbage Collector
, 2001
Cited by 80 (11 self)
... results that demonstrate good scalability and good real-time bounds. The collector is designed for shared-memory multiprocessors and is based on an earlier collector algorithm [2], which provided fixed bounds on the time any thread must pause for collection. However, since our earlier algorithm was designed for simple analysis, it had some impractical features. This paper presents the extensions necessary for a practical implementation: reducing excessive interleaving, handling stacks and global variables, reducing double allocation, and special treatment of large and small objects. An implementation based on the modified algorithm is evaluated on a set of 15 SML benchmarks on a Sun Enterprise 10000, a 64-way UltraSparc-II multiprocessor. To the best of our knowledge, this is the first implementation of a parallel, real-time garbage collector. The average collector speedup is 7.5 at 8 processors and 17.7 at 32 processors. Maximum pause times range from 3 ms to 5 ms. In contrast, a non-incremental collector (whether generational or not) ...
Message dispatch on pipelined processors
- In ECOOP'95 Conference Proceedings
, 1995
Cited by 22 (1 self)
Abstract. Object-oriented systems must implement message dispatch efficiently in order not to penalize the object-oriented programming style. We characterize the performance of most previously published dispatch techniques for both statically- and dynamically-typed languages with both single and multiple inheritance. Hardware organization (in particular, branch latency and superscalar instruction issue) significantly impacts dispatch performance. For example, inline caching may outperform C++-style “vtables” on deeply pipelined processors even though it executes more instructions per dispatch. We also show that adding support for dynamic typing or multiple inheritance does not significantly impact dispatch speed for most techniques, especially on superscalar machines. Instruction space overhead (calling sequences) can exceed the space cost of data structures (dispatch tables), so that minimal table size may not imply minimal run-time space usage.
A high-performance distributed object-oriented system: The MUSHROOM Project
, 1991
Instruction Set. An assembler for this instruction set was built, and a number of programs written to evaluate the architecture and to test ideas. Ideas and results from this abstract machine were used in the design of the real hardware, and the implementation of the simulator assisted in the design of the later simulator and other software tools.
2.5.7 Other Issues
Both technological and architectural improvements to conventional RISC processors have been progressing at a phenomenal rate. The resulting performance is considerable and still improving rapidly, with techniques such as "superscalar" architectures and instruction-level parallelism being introduced into production workstation systems. This has allowed VM-based implementations of Smalltalk, such as PS, to have gained in performance during the period of the project. Raw performance cannot be the aim of a single prototype machine; instead, architectural effectiveness, or the performance to clock period ratio, is the impor...
.1.1 Method Tables
m in c can use the stored method. This technique is called lookup cache and is reported to improve the overall performance of an implementation of the pure object-oriented language Smalltalk by as much as 37% [UP83]. To implement the technique, hash tables are used for method tables. The advantage of lookup caches over method tables is that only those methods are recorded that are actually needed at runtime. The disadvantage is that the lookup needs to be performed the first time a method is used in a given class. This may lead to less reliable runtimes in runtime critical applications.
2.1.3 Inline Caches
The key to this optimization is the observation that for a particular instruction for object/method application in the code the object to which the instruction is applied may change frequently, but the class of these objects changes much less frequently. It is this class th
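The lookup cache described in this excerpt can be sketched in a few lines of Python. This is a hedged illustration, not the scheme from the cited text: the names `LookupCache` and `slow_lookup` and the hit/miss counters are assumptions, and a Python dictionary stands in for the hash table. The cache stores only the (class, selector) pairs that are actually used, and the first use of a method in a given class pays for the full hierarchy walk.

```python
def slow_lookup(cls, selector):
    """Full lookup: walk the class hierarchy until the selector is found."""
    for klass in cls.__mro__:
        if selector in vars(klass):
            return vars(klass)[selector]
    raise AttributeError(f"{cls.__name__} does not understand {selector!r}")


class LookupCache:
    """Hypothetical global cache keyed by (class, selector)."""

    def __init__(self):
        self.table = {}   # hash table: (class, selector) -> method
        self.hits = 0
        self.misses = 0

    def lookup(self, cls, selector):
        key = (cls, selector)
        method = self.table.get(key)
        if method is not None:
            self.hits += 1          # later uses skip the hierarchy walk
            return method
        self.misses += 1            # first use in this class: full lookup
        method = slow_lookup(cls, selector)
        self.table[key] = method    # record only methods needed at runtime
        return method
```

Looking up an inherited method twice for the same class gives one miss followed by one hit; the unpredictable cost of that first miss is the drawback the excerpt notes for runtime-critical applications.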
Dynamic Grouping in an Object Oriented Virtual Memory Hierarchy
- Proceedings of the 1987 European Conference on Object-Oriented Programming, Lecture Notes in Computer Science
, 1987
Object oriented programming environments frequently suffer serious performance degradation because of a high level of paging activity when implemented using a conventional virtual memory system. Although the fine-grained, persistent nature of objects in such environments is not conducive to efficient paging, the performance degradation can be limited by careful grouping of objects within pages. Such object placement schemes can be classified into four categories: the grouping mechanism may be either static or dynamic and may use information acquired from static or dynamic properties. This paper investigates the effectiveness of a simple dynamic grouping strategy based on dynamic behaviour and compares it with a static grouping scheme based on static properties. These schemes are also compared with near-optimal and random cases.
1 Introduction
Virtual memory enables programs much larger than primary memory to be implemented without the need for explicit management of primary memory...

