Results 1 - 10
of
64
Dynamo: A Transparent Dynamic Optimization System
- ACM SIGPLAN Notices
, 2000
"... We describe the design and implementation of Dynamo, a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on the processor. The input native instruction stream to Dynamo can be dynamically generated (by a JIT ..."
Abstract
-
Cited by 347 (1 self)
- Add to MetaCart
We describe the design and implementation of Dynamo, a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on the processor. The input native instruction stream to Dynamo can be dynamically generated (by a JIT for example), or it can come from the execution of a statically compiled native binary. This paper evaluates the Dynamo system in the latter, more challenging situation, in order to emphasize the limits, rather than the potential, of the system. Our experiments demonstrate that even statically optimized native binaries can be accelerated Dynamo, and often by a significant degree. For example, the average performance of --O optimized SpecInt95 benchmark binaries created by the HP product C compiler is improved to a level comparable to their --O4 optimized version running without Dynamo. Dynamo achieves this by focusing its efforts on optimization opportunities that tend to manifest only at runtime, and hence opportunities that might be difficult for a static compiler to exploit. Dynamo's operation is transparent in the sense that it does not depend on any user annotations or binary instrumentation, and does not require multiple runs, or any special compiler, operating system or hardware support. The Dynamo prototype presented here is a realistic implementation running on an HP PA-8000 workstation under the HPUX 10.20 operating system.
The Cartesian Product Algorithm - Simple and Precise Type Inference of Parametric Polymorphism
, 1995
"... Concrete types and abstract types are different and serve different purposes. Concrete types, the focus of this paper, are essential to support compilation, application delivery, and debugging in object-oriented environments. Concrete types should not be obtained from explicit type declarations beca ..."
Abstract
-
Cited by 96 (3 self)
- Add to MetaCart
Concrete types and abstract types are different and serve different purposes. Concrete types, the focus of this paper, are essential to support compilation, application delivery, and debugging in object-oriented environments. Concrete types should not be obtained from explicit type declarations because their presence limits polymorphism unacceptably. This leaves us with type inference. Unfortunately, while polymorphism demands the use of type inference, it has also been the hardest challenge for type inference. We review previous type inference algorithms that analyze code with parametric polymorphism and then present a new one: the cartesian product algorithm. It improves precision and efficiency over previous algorithms and deals directly with inheritance, rather than relying on a preprocessor to expand it away. Last, but not least, it is conceptually simple. The cartesian product algorithm has been used in the Self system since late 1993. We present measurements to document its pe...
ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING
, 1994
"... Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide
the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries
often incurs a substantial run-time overhead in the form of frequent p ..."
Abstract
-
Cited by 95 (6 self)
- Add to MetaCart
Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide
the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries
often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction,
while desirable from a design standpoint, may be impractical when it leads to inefficient programs.
Aggressive compiler optimizations can reduce the overhead of abstraction. However, the long compilation times
introduced by optimizing compilers delay the programming environment‘s responses to changes in the program.
Furthermore, optimization also conflicts with source-level debugging. Thus, programmers are caught on the horns of
two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments
and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals by performing
optimizations lazily.
Four new techniques work together to achieve high performance and high responsiveness:
• Type feedback achieves high performance by allowing the compiler to inline message sends based on information
extracted from the runtime system. On average, programs run 1.5 times faster than the previous SELF system;
compared to a commercial Smalltalk implementation, two medium-sized benchmarks run about three times faster.
This level of performance is obtained with a compiler that is both simpler and faster than previous SELF compilers.
• Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast nonoptimizing
compiler to generate initial code while automatically recompiling heavily used parts of the program
with an optimizing compiler. On a previous-generation workstation like the SPARCstation-2, fewer than 200
pauses exceeded 200 ms during a 50-minute interaction, and 21 pauses exceeded one second. On a currentgeneration
workstation, only 13 pauses exceed 400 ms.
• Dynamic deoptimization shields the programmer from the complexity of debugging optimized code by
transparently recreating non-optimized code as needed. No matter whether a program is optimized or not, it can
always be stopped, inspected, and single-stepped. Compared to previous approaches, deoptimization allows more
debugging while placing fewer restrictions on the optimizations that can be performed.
• Polymorphic inline caching generates type-case sequences on-the-fly to speed up messages sent from the same
call site to several different types of object. More significantly, they collect concrete type information for the
optimizing compiler.
With better performance yet good interactive behavior, these techniques make exploratory programming possible
both for pure object-oriented languages and for application domains requiring higher ultimate performance, reconciling
exploratory programming, ubiquitous abstraction, and high performance.
The Cassowary Linear Arithmetic Constraint Solving Algorithm: Interface and Implementation
- ACM TRANSACTIONS ON COMPUTER HUMAN INTERACTION
, 1998
"... Linear equality and inequality constraints arise naturally in specifying many aspects of user interfaces, such as requiring that one window be to the left of another, requiring that a pane occupy the leftmost 1/3 of a window, or preferring that an object be contained within a rectangle if possibl ..."
Abstract
-
Cited by 72 (9 self)
- Add to MetaCart
Linear equality and inequality constraints arise naturally in specifying many aspects of user interfaces, such as requiring that one window be to the left of another, requiring that a pane occupy the leftmost 1/3 of a window, or preferring that an object be contained within a rectangle if possible. Current constraint solvers designed for UI applications cannot e#ciently handle simultaneous linear equations and inequalities. This is a major limitation. We describe Cassowary---an incremental algorithm based on the dual simplex method that can solve such systems of constraints e#ciently. This informal technical report describes the latest version of the Cassowary algorithm. It is derived from the paper "Solving Linear Arithmetic Constraints for User Interface Applications" by Alan Borning, Kim Marriott, Peter Stuckey, and Yi Xiao [7], published in the UIST'97 Proceedings. The UIST paper also contains a description of QOCA, a closely related solver that finds least-squares solut...
Hierarchical Constraint Logic Programming
, 1993
"... A constraint describes a relation to be maintained ..."
Abstract
-
Cited by 67 (3 self)
- Add to MetaCart
A constraint describes a relation to be maintained
Software profiling for hot path prediction: less is more
- SIGPLAN Not
"... Recently, there has been a growing interest in exploiting profile information in adaptive systems such as just-in-time compilers, dynamic optimizers and, binary translators. In this paper, we show that sophisticated software profiling schemes that provide highly accurate information in an offline se ..."
Abstract
-
Cited by 61 (0 self)
- Add to MetaCart
Recently, there has been a growing interest in exploiting profile information in adaptive systems such as just-in-time compilers, dynamic optimizers and, binary translators. In this paper, we show that sophisticated software profiling schemes that provide highly accurate information in an offline setting are ill-suited for these dynamic code generation systems. We experimentally demonstrate that hot path predictions must be made early in order to control the rising cost of missed opportunity that result from the prediction delay. We also show that existing sophisticated path profiling schemes, if used in an online setting, offer no prediction advantages over simpler schemes that exhibit much lower runtime overheads. Based on these observation we developed a new low-overhead software profiling scheme for hot path prediction. Using an abstract metric we compare our scheme to path profile based prediction and show that our scheme achieves comparable prediction quality. In our second set of experiments we include runtime overhead and evaluate the performance of our scheme in a realistic application: Dynamo, a dynamic optimization system. The results show that our prediction scheme clearly outperforms path profile based prediction and thus confirm that less profiling as exhibited in our scheme will actually lead to more effective hot path prediction. 1.
Solving Linear Arithmetic Constraints for User Interface Applications: Algorithm Details
, 1997
"... Linear equality and inequality constraints arise naturally in specifying many aspects of user interfaces, such as requiring that one window be to the left of another, requiring that a pane occupy the leftmost 1/3 of a window, or preferring that an object be contained within a rectangle if possible ..."
Abstract
-
Cited by 59 (17 self)
- Add to MetaCart
Linear equality and inequality constraints arise naturally in specifying many aspects of user interfaces, such as requiring that one window be to the left of another, requiring that a pane occupy the leftmost 1/3 of a window, or preferring that an object be contained within a rectangle if possible. Current constraint solvers designed for UI applications cannot efficiently handle simultaneous linear equations and inequalities. This is a major limitation. We describe incremental algorithms based on the dual simplex and active set methods that can solve such systems of constraints efficiently. Both algorithms have been implemented and tested. This informal technical report is adapted from the paper "Solving Linear Arithmetic Constraints for User Interface Applications," which will appear in the Proceedings of UIST'97 (The ACM User Interface and Software Technology Symposium). It contains additional details, in particular of the Cassowary and QOCA algorithms.
Reconciling responsiveness with performance in pure object-oriented languages
- ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS
, 1996
"... Dynamically-dispatched calls often limit the performance of object-oriented programs since object-oriented programming encourages factoring code into small, reusable units, thereby increasing the frequency of these expensive operations. Frequent calls not only slow down execution with the dispatch o ..."
Abstract
-
Cited by 55 (0 self)
- Add to MetaCart
Dynamically-dispatched calls often limit the performance of object-oriented programs since object-oriented programming encourages factoring code into small, reusable units, thereby increasing the frequency of these expensive operations. Frequent calls not only slow down execution with the dispatch overhead per se, but more importantly they hinder optimization by limiting the range and effectiveness of standard global optimizations. In particular, dynamicallydispatched calls prevent standard interprocedural optimizations that depend on the availability of a static call graph. The SELF implementation described here offers two novel approaches to optimization. Type feedback speculatively inlines dynamically-dispatched calls based on profile information that predicts likely receiver classes. Adaptive optimization reconciles optimizing compilation with interactive performance by incrementally optimizing only the frequently-executed parts of a program. When combined, these two techniques result in a system that can execute programs significantly faster than previous systems while retaining much of the interactiveness of an interpreted system.
A third-generation Self implementation: Reconciling responsiveness with performance
, 1994
"... Abstract: Programming systems should be both responsive (to support rapid development) and efficient (to complete computations quickly). Pure object-oriented languages are harder to implement efficiently since they need optimization to achieve good performance. Unfortunately, optimization conflicts ..."
Abstract
-
Cited by 50 (4 self)
- Add to MetaCart
Abstract: Programming systems should be both responsive (to support rapid development) and efficient (to complete computations quickly). Pure object-oriented languages are harder to implement efficiently since they need optimization to achieve good performance. Unfortunately, optimization conflicts with interactive responsiveness because it tends to produce long compilation pauses, leading to unresponsive programming environments. Therefore, to achieve good responsiveness, existing exploratory programming environments such as the Smalltalk-80 environment rely on interpretation or non-optimizing dynamic compilation. But such systems pay a price for their interactiveness, since they may execute programs several times slower than an optimizing system. SELF-93 reconciles high performance with responsiveness by combining a fast, non-optimizing compiler with a slower, optimizing compiler. The resulting system achieves both excellent performance (two or three times faster than existing Smalltalk systems) and good responsiveness. Except for situations requiring large applications to be (re)compiled from scratch, the system allows for pleasant interactive use with few perceptible compilation pauses. To our knowledge, SELF-93 is the first implementation of a pure object-oriented language achieving both good performance and good responsiveness. When measuring interactive pauses, it is imperative to treat multiple short pauses as one longer pause if the pauses occur in short succession, since they are perceived as one pause by the user. We propose a definition of pause clustering and show that clustering can make an order-of-magnitude difference in the pause time distribution.
Concrete Type Inference: Delivering Object-Oriented Applications
, 1995
"... Unlimited copying without fee is permitted provided that the copies are not made nor distributed for direct commercial advantage, and credit to the source is given. Otherwise, no part of this work covered by copyright hereon may be reproduced in any form or by any means graphic, electronic, or mecha ..."
Abstract
-
Cited by 49 (0 self)
- Add to MetaCart
Unlimited copying without fee is permitted provided that the copies are not made nor distributed for direct commercial advantage, and credit to the source is given. Otherwise, no part of this work covered by copyright hereon may be reproduced in any form or by any means graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an information retrieval system, without the prior written permission of the copyright owner. TRADEMARKS Sun, Sun Microsystems, and the Sun logo are trademarks or registered trademarks of Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. All SPARC trademarks, including the SCD Compliant Logo, are trademarks or registered trademarks of SPARC International, Inc. SPARCstation, SPARCserver, SPARCengine, SPARCworks, and SPARCompiler are licensed exclusively to Sun Microsystems, Inc. All other product names mentioned herein are the trademarks of their respective owners.

