Results 1 - 10 of 17
Shasta: A Low Overhead, Software-Only Approach . . . .
- In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems
, 1996
Cited by 207 (5 self)
This paper describes Shasta, a system that supports a shared address space in software on clusters of computers with physically distributed memory. A unique aspect of Shasta compared to most other software distributed shared memory systems is that shared data can be kept coherent at a fine granularity. In addition, the system allows the coherence granularity to vary across different shared data structures in a single application. Shasta implements the shared address space by transparently rewriting the application executable to intercept loads and stores. For each shared load or store, the inserted code checks to see if the data is available locally and communicates with other processors if necessary. The system uses numerous techniques to reduce the run-time overhead of these checks. Since Shasta is implemented entirely in software, it also provides tremendous flexibility in supporting different types of cache coherence protocols. We have implemented an efficient cache co...
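The inserted access checks described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Shasta's actual code: the `Node` class, the three block states, and the `home` dictionary standing in for remote memory are all hypothetical names chosen for illustration, and the "communication" is reduced to a local copy.

```python
# Hypothetical sketch of a software access check; not Shasta's real implementation.
INVALID, SHARED, EXCLUSIVE = range(3)


class Node:
    def __init__(self, home, block_size):
        self.home = home              # dict standing in for memory on another processor
        self.block_size = block_size  # coherence granularity, in words (assumed unit)
        self.state = {}               # block number -> INVALID / SHARED / EXCLUSIVE
        self.local = {}               # address -> locally cached value
        self.misses = 0               # how often the slow path ran

    def _fetch(self, block, new_state):
        # Slow path: stands in for a message exchange with another processor.
        self.misses += 1
        base = block * self.block_size
        for addr in range(base, base + self.block_size):
            self.local[addr] = self.home.get(addr, 0)
        self.state[block] = new_state

    def load(self, addr):
        block = addr // self.block_size
        # Check inserted before the load: is the block available locally?
        if self.state.get(block, INVALID) == INVALID:
            self._fetch(block, SHARED)
        return self.local[addr]

    def store(self, addr, value):
        block = addr // self.block_size
        # Check inserted before the store: do we hold the block exclusively?
        if self.state.get(block, INVALID) != EXCLUSIVE:
            self._fetch(block, EXCLUSIVE)
        self.local[addr] = value
```

Constructing two `Node` objects with different `block_size` values gives a rough feel for how the coherence granularity could vary between data structures; a larger block means fewer slow-path calls for sequential access but more data moved per miss.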
ADAPTIVE OPTIMIZATION FOR SELF: RECONCILING HIGH PERFORMANCE WITH EXPLORATORY PROGRAMMING
, 1994
Cited by 95 (6 self)
Object-oriented programming languages confer many benefits, including abstraction, which lets the programmer hide
the details of an object’s implementation from the object’s clients. Unfortunately, crossing abstraction boundaries
often incurs a substantial run-time overhead in the form of frequent procedure calls. Thus, pervasive use of abstraction,
while desirable from a design standpoint, may be impractical when it leads to inefficient programs.
Aggressive compiler optimizations can reduce the overhead of abstraction. However, the long compilation times
introduced by optimizing compilers delay the programming environment’s responses to changes in the program.
Furthermore, optimization also conflicts with source-level debugging. Thus, programmers are caught on the horns of
two dilemmas: they have to choose between abstraction and efficiency, and between responsive programming environments
and efficiency. This dissertation shows how to reconcile these seemingly contradictory goals by performing
optimizations lazily.
Four new techniques work together to achieve high performance and high responsiveness:
• Type feedback achieves high performance by allowing the compiler to inline message sends based on information
extracted from the runtime system. On average, programs run 1.5 times faster than the previous SELF system;
compared to a commercial Smalltalk implementation, two medium-sized benchmarks run about three times faster.
This level of performance is obtained with a compiler that is both simpler and faster than previous SELF compilers.
• Adaptive optimization achieves high responsiveness without sacrificing performance by using a fast nonoptimizing
compiler to generate initial code while automatically recompiling heavily used parts of the program
with an optimizing compiler. On a previous-generation workstation like the SPARCstation-2, fewer than 200
pauses exceeded 200 ms during a 50-minute interaction, and 21 pauses exceeded one second. On a current-generation
workstation, only 13 pauses exceed 400 ms.
• Dynamic deoptimization shields the programmer from the complexity of debugging optimized code by
transparently recreating non-optimized code as needed. No matter whether a program is optimized or not, it can
always be stopped, inspected, and single-stepped. Compared to previous approaches, deoptimization allows more
debugging while placing fewer restrictions on the optimizations that can be performed.
• Polymorphic inline caching generates type-case sequences on-the-fly to speed up messages sent from the same
call site to several different types of object. More significantly, polymorphic inline caches collect concrete type
information for the optimizing compiler.
With better performance yet good interactive behavior, these techniques make exploratory programming possible
both for pure object-oriented languages and for application domains requiring higher ultimate performance, reconciling
exploratory programming, ubiquitous abstraction, and high performance.
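The idea behind polymorphic inline caching can be sketched in a few lines. This is a simplified illustration, not the SELF implementation: the `CallSite` class and its method names are hypothetical, and a dictionary stands in for the generated type-case sequence of machine code. It shows the two roles named above: speeding up repeated sends from one call site, and recording the receiver types seen there so an optimizing compiler could use them.

```python
# Hypothetical sketch of a per-call-site cache; a dict replaces the real type-case code.
class CallSite:
    def __init__(self, selector):
        self.selector = selector  # message name sent from this call site
        self.cache = {}           # receiver type -> resolved method
        self.lookups = 0          # number of full (slow) lookups performed

    def send(self, receiver, *args):
        cls = type(receiver)
        method = self.cache.get(cls)
        if method is None:
            # Cache miss: do the full lookup once, then extend the cache.
            self.lookups += 1
            method = getattr(cls, self.selector)
            self.cache[cls] = method
        return method(receiver, *args)

    def observed_types(self):
        # Concrete type information that could be handed to an optimizing compiler.
        return list(self.cache)


class Circle:
    def area(self):
        return 3


class Square:
    def area(self):
        return 4
```

For example, sending `area` to two circles and one square through the same `CallSite("area")` performs two full lookups, and `observed_types()` then reports `Circle` and `Square`, which is the kind of feedback an inliner would need.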
Beltway: Getting Around Garbage Collection Gridlock
- PLDI'02
, 2002
Cited by 59 (16 self)
We present the design and implementation of a new garbage collection framework that significantly generalizes existing copying collectors. The Beltway framework exploits and separates object age and incrementality. It groups objects in one or more increments on queues called belts, collects belts independently, and collects increments on a belt in first-in-first-out order. We show that Beltway configurations, selected by command line options, act and perform the same as semi-space, generational, and older-first collectors, and encompass all previous copying collectors of which we are aware.
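The belt and increment structure can be modeled with queues of lists. This is a minimal sketch, assuming a simple promotion policy that the abstract does not specify: survivors of a collected increment are copied to the youngest increment of the next belt (or of the same belt if it is the last one). The class name `Beltway`, the `is_live` callback, and the fixed `increment_size` are hypothetical choices for illustration.

```python
from collections import deque


# Hypothetical model: each belt is a FIFO queue of increments (lists of objects).
class Beltway:
    def __init__(self, num_belts, increment_size):
        self.belts = [deque([[]]) for _ in range(num_belts)]
        self.increment_size = increment_size

    def _append(self, belt_index, obj):
        belt = self.belts[belt_index]
        # Open a new increment when the belt is empty or its youngest one is full.
        if not belt or len(belt[-1]) >= self.increment_size:
            belt.append([])
        belt[-1].append(obj)

    def allocate(self, obj):
        # New objects always enter the youngest increment of belt 0.
        self._append(0, obj)

    def collect(self, belt_index, is_live):
        belt = self.belts[belt_index]
        if not belt:
            return []
        oldest = belt.popleft()  # increments on a belt are collected first-in-first-out
        # Assumed policy: survivors move to the next belt, or stay on the last belt.
        target = min(belt_index + 1, len(self.belts) - 1)
        survivors = [obj for obj in oldest if is_live(obj)]
        for obj in survivors:
            self._append(target, obj)
        return survivors
```

With a single belt and one large increment this degenerates to something resembling a semi-space copy; with two belts it resembles a generational arrangement, which loosely mirrors how the abstract describes configurations covering existing copying collectors.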
Age-Based Garbage Collection
- In Proceedings of SIGPLAN 1999 Conference on Object-Oriented Programming, Languages, & Applications
, 1999
Cited by 45 (13 self)
Modern generational garbage collectors look for garbage among the young objects, because they have high mortality; however, these objects include the very youngest objects, which clearly are still live. We introduce new garbage collection algorithms, called age-based, some of which postpone consideration of the youngest objects. Collecting less than the whole heap requires write barrier mechanisms to track pointers into the collected region. We describe here a new, efficient write barrier implementation that works for age-based and traditional generational collectors. To compare several collectors, their configurations, and program behavior, we use an accurate simulator that models all heap objects and the pointers among them, but does not model cache or other memory effects. For object-oriented languages, our results demonstrate that an older-first collector, which collects older objects before the youngest ones, copies on average much less data than generational collectors. Our resul...
Barriers: Friend or Foe?
, 2004
Cited by 32 (5 self)
Modern garbage collectors rely on read and write barriers imposed on heap accesses by the mutator, to keep track of references between different regions of the garbage collected heap, and to synchronize actions of the mutator with those of the collector. It has been a long-standing untested assumption that barriers impose significant overhead to garbage-collected applications. As a result, researchers have devoted effort to development of optimization approaches for elimination of unnecessary barriers, or proposed new algorithms for garbage collection that avoid the need for barriers while retaining the capability for independent collection of heap partitions. On the basis of the results presented here, we dispel the assumption that barrier overhead should be a primary motivator for such efforts. We present a
The Case for Profile-Directed Selection of Garbage Collectors
, 2000
Cited by 26 (3 self)
Many garbage-collected systems use a single garbage collection algorithm across all applications. It has long been known that this can produce poor performance on applications for which that collector is not well suited. In some systems, such as those that execute stand-alone executables, an appropriate collector for each application can be selected from a pool of available collectors and tuned by using profile information. In a study of 20 benchmarks and several collectors operating with the Marmot optimizing Java-to-native compiler, for every collector there was at least one benchmark that would have been at least 15% faster with a more appropriate collector. The collectors are a copying collector, a generational copying collector which is combined with each of 4 different write barriers, and the null collector which allocates but never collects. A detailed analysis of storage management costs shows how they vary by application and collector. 1. INTRODUCTION Automatic storage management eliminates a significant so...
Lightweight Support for Fine-Grained Persistence on Stock Hardware
, 1995
Cited by 11 (7 self)
LIGHTWEIGHT SUPPORT FOR FINE-GRAINED PERSISTENCE ON STOCK HARDWARE
FEBRUARY 1995
ANTONY LLOYD HOSKING
B.Sc., UNIVERSITY OF ADELAIDE
M.Sc., UNIVERSITY OF WAIKATO
Ph.D., UNIVERSITY OF MASSACHUSETTS AMHERST
Directed by: Professor J. Eliot B. Moss
Persistent programming languages combine the features of database systems and programming languages to allow the seamless manipulation of both short- and long-term data, thus relieving programmers of the burden of distinguishing between data that is transient (temporarily allocated in main memory) or persistent (residing permanently on disk). Secondary storage concerns, including the representation and management of persistent data, are directly handled by the programming language implementation, rather than the programmer. Moreover, unlike traditional database systems, persistent programming languages extend to persistent data all the data structuring features supported by the language, not just those imposed by the underlying database system. P...
In or out? Putting write barriers in their place
- In ACM SIGPLAN International Symposium on Memory Management (ISMM)
, 2002
Cited by 8 (1 self)
In many garbage collected systems, the mutator performs a write barrier for every pointer update. Using generational garbage collectors, we study in depth three code placement options for remembered-set write barriers: inlined, out-of-line, and partially inlined (fast path inlined, slow path out-of-line). The fast path determines if the collector needs to remember the pointer update. The slow path records the pointer in a list when necessary. Efficient implementations minimize the instructions on the fast path, and record few pointers (from 0.16 to 3% of pointer stores in our benchmarks). We find the mutator performs best with a partially inlined barrier, by a modest 1.5% on average over full inlining. We also study the compilation cost of write-barrier code placement. We find that partial inlining reduces the compilation cost by 20 to 25% compared to full inlining. In the context of just-in-time compilation, the application is exposed to compiler activity. Regardless of the level of compiler activity, partial inlining consistently gives a total running time performance advantage over full inlining on the SPEC JVM98 benchmarks. When the compiler optimizes all application methods on demand and compiler load is highest, partial inlining improves total performance on average by 10.2%, and up to 18.5%.
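The fast-path and slow-path split of a remembered-set write barrier can be sketched as follows. This is an illustrative model, not the code studied in the paper: the `Obj` and `Mutator` classes, the integer `generation` field (0 for young, 1 for old), and the old-to-young filtering condition are assumptions typical of generational collectors. In a compiled system the fast-path test would be inlined at each pointer store and the slow path would remain an out-of-line call.

```python
# Hypothetical sketch of a partially inlined remembered-set write barrier.
class Obj:
    def __init__(self, generation):
        self.generation = generation  # assumed encoding: 0 = young, 1 = old
        self.fields = {}


class Mutator:
    def __init__(self):
        self.remembered = []  # list of (object, field) slots the collector must scan

    def _slow_path(self, source, field):
        # Out-of-line part: record the updated slot in the remembered set.
        self.remembered.append((source, field))

    def write(self, source, field, target):
        # Fast path: decide whether the collector needs to remember this store.
        # Only pointers from an older object into a younger one are recorded.
        if target is not None and source.generation > target.generation:
            self._slow_path(source, field)
        source.fields[field] = target
```

Keeping the test short and the recording code out of line reflects the trade-off the abstract describes: few stores take the slow path, so inlining only the filter keeps the common case cheap without inflating every store site.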
Lightweight Write Detection and Checkpointing for Fine-Grained Persistence
, 1995
Cited by 6 (4 self)
INTRODUCTION A persistent system (Atkinson et al. 1982; Atkinson et al. 1983; Atkinson et al. 1983; Atkinson and Buneman 1987) maintains data independently of the transitory programs that create and manipulate that data: data may outlive their creators, and be manipulated by yet other programs. To achieve this, persistent systems provide an abstraction of persistent storage, which programmers view as a stable
This work has been supported by the National Science Foundation under grants CCR-9211272, CCR-8658074 and DCR-8500332, and by the following companies and corporations: Sun Microsystems, Digital Equipment, Apple Computer, GTE Laboratories, Eastman Kodak, General Electric, ParcPlace Systems, Xerox, and Tektronix.
Name: Antony L. Hosking Affiliation: Purdue University Address: Department of Computer Sciences, Purdue University, West Lafayette, IN 47907-1398, hosking@cs.purdue.edu
Name: J. Eliot B. Moss Affiliation: University of Massachuset
XMem: Type-Safe, Transparent, Shared Memory for Cross-Runtime Communication and Coordination
Cited by 5 (2 self)
Developers commonly build contemporary enterprise applications using type-safe, component-based platforms, such as J2EE, and architect them to comprise multiple tiers, such as a web container, application server, and database engine. Administrators increasingly execute each tier in its own managed runtime environment (MRE) to improve reliability and to manage system complexity through the fault containment and modularity offered by isolated MRE instances. Such isolation, however, necessitates expensive cross-tier communication based on protocols such as object serialization and remote procedure calls. Administrators commonly co-locate communicating MREs on a single host to reduce communication overhead and to better exploit increasing numbers of available processing cores. However, state-of-the-art MREs offer no support for more efficient communication between co-located

