• Documents
  • Authors
  • Tables
  • Other Seers ▼
    RefSeer AckSeer CollabSeer SeerSeer
  • Log in
  • Sign up
  • MetaCart

CiteSeerX logo

Advanced Search Include Citations
Advanced Search Include Citations | Disambiguate

The concert system - compiler and runtime support for e cient, ne-grained concurrent objectoriented programs (1993)

by Andrew A Chien, Vijay Karamcheti, John Plevyak
Add To MetaCart

Tools

Sorted by:
Results 1 - 10 of 35
Next 10 →

Supporting Dynamic Data Structures on Distributed-Memory Machines

by Anne Rogers, Martin C. Carlisle, John H. Reppy, Laurie J. Hendren , 1995
"... this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy ..."
Abstract - Cited by 143 (8 self) - Add to MetaCart
this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.

ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING

by Urs Hölzle , 1994
"... Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries often incurs a substantial run-time overhead in the form of frequent p ..."
Abstract - Cited by 95 (6 self) - Add to MetaCart
Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction, while desirable from a design standpoint, may be impractical when it leads to inefficient programs. Aggressive compiler optimizations can reduce the overhead of abstraction. However, the long compilation times introduced by optimizing compilers delay the programming environment‘s responses to changes in the program. Furthermore, optimization also conflicts with source-level debugging. Thus, programmers are caught on the horns of two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals by performing optimizations lazily. Four new techniques work together to achieve high performance and high responsiveness: • Type feedback achieves high performance by allowing the compiler to inline message sends based on information extracted from the runtime system. On average, programs run 1.5 times faster than the previous SELF system; compared to a commercial Smalltalk implementation, two medium-sized benchmarks run about three times faster. This level of performance is obtained with a compiler that is both simpler and faster than previous SELF compilers. • Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast nonoptimizing compiler to generate initial code while automatically recompiling heavily used parts of the program with an optimizing compiler. On a previous-generation workstation like the SPARCstation-2, fewer than 200 pauses exceeded 200 ms during a 50-minute interaction, and 21 pauses exceeded one second. On a currentgeneration workstation, only 13 pauses exceed 400 ms. • Dynamic deoptimization shields the programmer from the complexity of debugging optimized code by transparently recreating non-optimized code as needed. No matter whether a program is optimized or not, it can always be stopped, inspected, and single-stepped. Compared to previous approaches, deoptimization allows more debugging while placing fewer restrictions on the optimizations that can be performed. • Polymorphic inline caching generates type-case sequences on-the-fly to speed up messages sent from the same call site to several different types of object. More significantly, they collect concrete type information for the optimizing compiler. With better performance yet good interactive behavior, these techniques make exploratory programming possible both for pure object-oriented languages and for application domains requiring higher ultimate performance, reconciling exploratory programming, ubiquitous abstraction, and high performance.

Software Caching and Computation Migration in Olden

by Martin C. Carlisle, Anne Rogers , 1995
"... The goal of the Olden project is to build a system that provides parallelism for general purpose C programs with minimal programmer annotations. We focus on programs using dynamic structures such as trees, lists, and DAGs. We demonstrate that providing both software caching and computation migratio ..."
Abstract - Cited by 84 (0 self) - Add to MetaCart
The goal of the Olden project is to build a system that provides parallelism for general purpose C programs with minimal programmer annotations. We focus on programs using dynamic structures such as trees, lists, and DAGs. We demonstrate that providing both software caching and computation migration can improve the performance of these programs, and provide a compile-time heuristic that selects between them for each pointer dereference. We have implemented a prototype system on the Thinking Machines CM-5. We describe our implementation and report on experiments with ten benchmarks.

Concert -- Efficient Runtime Support for Concurrent Object-Oriented Programming Languages on Stock Hardware

by Vijay Karamcheti , Andrew A. Chien , 1993
"... Inefficient implementations of global namespaces, message passing, and thread scheduling on stock multicomputers have prevented concurrent object-oriented programming (COOP) languages from gaining widespread acceptance. Recognizing that the architectures of stock multicomputers impose a hierarchy of ..."
Abstract - Cited by 58 (11 self) - Add to MetaCart
Inefficient implementations of global namespaces, message passing, and thread scheduling on stock multicomputers have prevented concurrent object-oriented programming (COOP) languages from gaining widespread acceptance. Recognizing that the architectures of stock multicomputers impose a hierarchy of costs for these operations, we have described a runtime system which provides different versions of each primitive, exposing performance distinctions for optimization. We confirm the advantages of a cost-hierarchy based runtime system organization by showing a variation of two orders of magnitude in version costs for a CM5 implementation. Frequency measurements based on COOP application programs demonstrate that a 39 % invocation cost reduction is feasible by simply selecting cheaper versions of runtime operations.

A Comparison of Architectural Support for Messaging in the TMC CM-5 and the Cray T3D

by Vijay Karamcheti, Andrew A. Chien - In Proceedings of the International Symposium on Computer Architecture , 1995
"... Programming models based on messaging continue to be an important programming model for parallel machines. Messaging costs are strongly influenced by a machine's network interface architecture. We examine the impact of architectural support for messaging in two machines -- the TMC CM-5 and the Cray ..."
Abstract - Cited by 55 (15 self) - Add to MetaCart
Programming models based on messaging continue to be an important programming model for parallel machines. Messaging costs are strongly influenced by a machine's network interface architecture. We examine the impact of architectural support for messaging in two machines -- the TMC CM-5 and the Cray T3D -- by exploring the design and performance of several messaging implementations. The additional features in the T3D support remote operations: memory access, fetch-and-increment, atomic swaps, and prefetch. Experiments on the CM-5 show that requiring processor involvement for message reception can increase the communication overheads from 60% to 300% for moderate variations in computation grain size at the destination. In contrast, the T3D hardware for remote operations decouples message reception from processor activity, producing high-performance messaging independent of computation grain size or variability. In addition, hardware support for a shared address space in the T3D can be us...

ICC++ -- A C++ Dialect for High Performance Parallel Computing

by A. A. Chien, U. S. Reddy, Chien Reddy Plevyak, J. Dolby - In Proceedings of the 2nd International Symposium on Object Technologies for Advanced Software , 1996
"... ICC++ is a new C++ concurrent dialect which allows sequential/parallel program versions to be maintained with single source, the construction of concurrent data abstractions, convenient expression of irregular and fine-grained concurrency, and supports high performance implementations. ICC++ prov ..."
Abstract - Cited by 55 (10 self) - Add to MetaCart
ICC++ is a new C++ concurrent dialect which allows sequential/parallel program versions to be maintained with single source, the construction of concurrent data abstractions, convenient expression of irregular and fine-grained concurrency, and supports high performance implementations. ICC++ provides annotations for potential concurrency, facilitating both sharing source with sequential programs and grain size tuning for efficient execution. ICC++ has a notion of object consistency which can be extended structurally and procedurally to implement larger data abstractions. Finally, ICC++ integrates arrays into the object system and hence the concurrency model. In short, ICC++ addresses concurrency and its relation to abstractions -- whether they are implemented by single objects, several objects, or object collections. The design of the language, its rationale, and current status are all described. Keywords concurrent object-oriented programming, concurrent languages, parallel...

Towards Better Inlining Decisions Using Inlining Trials

by Jeffrey Dean, Craig Chambers - In Proceedings of the 1994 ACM Conference on LISP and Functional Programming , 1994
"... Inlining trials are a general mechanism for making better automatic decisions about whether a routine is profitable to inline. Unlike standard source-level inlining heuristics, an inlining trial captures the effects of optimizations applied to the body of the inlined routine when calculating the cos ..."
Abstract - Cited by 53 (6 self) - Add to MetaCart
Inlining trials are a general mechanism for making better automatic decisions about whether a routine is profitable to inline. Unlike standard source-level inlining heuristics, an inlining trial captures the effects of optimizations applied to the body of the inlined routine when calculating the costs and benefits of inlining. The results of inlining trials are stored in a persistent database to be reused when making future inlining decisions at similar call sites. Type group analysis can determine the amount of available static information exploited during compilation, and the results of analyzing the compilation of an inlined routine help decide when a future call site would lead to substantially the same generated code as a given inlining trial. We have implemented inlining trials and type group analysis in an optimizing compiler for SELF, and by making wiser inlining decisions we were able to cut compilation time and compiled code space with virtually no loss of execution speed. We...

Lightweight Run-Time Code Generation

by Mark Leone, Peter Lee - Department of Computer Science, University of Melbourne , 1994
"... Run-time code generation is an alternative and complement to compile-time program analysis and optimization. Static analyses are inherently imprecise because most interesting aspects of run-time behavior are uncomputable. By deferring aspects of compilation to run time, more precise information abou ..."
Abstract - Cited by 52 (5 self) - Add to MetaCart
Run-time code generation is an alternative and complement to compile-time program analysis and optimization. Static analyses are inherently imprecise because most interesting aspects of run-time behavior are uncomputable. By deferring aspects of compilation to run time, more precise information about program behavior can be exploited, leading to greater opportunities for code improvement. The cost of performing optimization at run time is of paramount importance, since it must be repaid by improved performance in order to obtain an overall speedup. This paper describes a lightweight approach to run-time code generation, called deferred compilation, in which compile-time specialization is employed to reduce the cost of optimizing and generating code at run time. Implementation strategies developed for a prototype compiler are discussed, and the results of preliminary experiments demonstrating significant overall speedup are presented. 1 Introduction Many compiler optimizations depend ...

Concrete Type Inference: Delivering Object-Oriented Applications

by Ole Agesen , 1995
"... Unlimited copying without fee is permitted provided that the copies are not made nor distributed for direct commercial advantage, and credit to the source is given. Otherwise, no part of this work covered by copyright hereon may be reproduced in any form or by any means graphic, electronic, or mecha ..."
Abstract - Cited by 49 (0 self) - Add to MetaCart
Unlimited copying without fee is permitted provided that the copies are not made nor distributed for direct commercial advantage, and credit to the source is given. Otherwise, no part of this work covered by copyright hereon may be reproduced in any form or by any means graphic, electronic, or mechanical, including photocopying, recording, taping, or storage in an information retrieval system, without the prior written permission of the copyright owner. TRADEMARKS Sun, Sun Microsystems, and the Sun logo are trademarks or registered trademarks of Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. All SPARC trademarks, including the SCD Compliant Logo, are trademarks or registered trademarks of SPARC International, Inc. SPARCstation, SPARCserver, SPARCengine, SPARCworks, and SPARCompiler are licensed exclusively to Sun Microsystems, Inc. All other product names mentioned herein are the trademarks of their respective owners.

Obtaining Sequential Efficiency for Concurrent Object-Oriented Languages

by John Plevyak, Xingbin Zhang, Andrew A. Chien - In Proceedings of the ACM Symposium on the Principles of Programming Languages , 1995
"... Concurrent object-oriented programming (COOP) languages focus the abstraction and encapsulation power of abstract data types on the problem of concurrency control. In particular, pure fine-grained concurrent object-oriented languages (as opposed to hybrid or data parallel) provides the programmer wi ..."
Abstract - Cited by 47 (15 self) - Add to MetaCart
Concurrent object-oriented programming (COOP) languages focus the abstraction and encapsulation power of abstract data types on the problem of concurrency control. In particular, pure fine-grained concurrent object-oriented languages (as opposed to hybrid or data parallel) provides the programmer with a simple, uniform, and flexible model while exposing maximum concurrency. While such languages promise to greatly reduce the complexity of large-scale concurrent programming, the popularity of these languages has been hampered by efficiency which is often many orders of magnitude less than that of comparable sequential code. We present a sufficient set of techniques which enables the efficiency of fine-grained concurrent object-oriented languages to equal that of traditional sequential languages (like C) when the required data is available. These techniques are empirically validated by the application to a COOP implementation of the Livermore Loops. 1 Introduction The increasing use of ...
The National Science Foundation
  • About CiteSeerX
  • Submit Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2010 The Pennsylvania State University