Avoiding Conflict Misses Dynamically in Large Direct-Mapped Caches
In Proceedings of the 6th International Conference on Architectural Support for Programming Languages and Operating Systems, 1994
Cited by 96 (4 self)
This paper describes a method for improving the performance of a large direct-mapped cache by reducing the number of conflict misses. Our solution consists of two components: an inexpensive hardware device called a Cache Miss Lookaside (CML) buffer that detects conflicts by recording and summarizing a history of cache misses, and a software policy within the operating system's virtual memory system that removes conflicts by dynamically remapping pages whenever large numbers of conflict misses are detected. Using trace-driven simulation of applications and the operating system, we show that a CML buffer enables a large direct-mapped cache to perform nearly as well as a two-way set associative cache of equivalent size and speed, although with lower hardware cost and complexity.
1 Introduction
In this paper we describe a dynamic method to eliminate conflict misses in large direct-mapped physically indexed caches. Conflicts are caused by interleaved references to words in memory that are...
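As a rough illustration of the idea in this abstract, the sketch below simulates a direct-mapped, physically indexed cache, tallies misses per physical page, and shows the effect of moving one page to a different cache position. The cache geometry, the `simulate` function, and the per-page counter are assumptions made for illustration only; this is not the CML buffer design or the remapping policy from the paper.

```python
from collections import Counter

LINE_SIZE = 32      # bytes per cache line (assumed)
NUM_LINES = 1024    # lines in the direct-mapped cache (assumed)
PAGE_SIZE = 4096    # bytes per page (assumed)
CACHE_BYTES = LINE_SIZE * NUM_LINES


def simulate(addresses):
    """Run physical addresses through a direct-mapped cache.

    Returns (total_misses, Counter mapping page number -> miss count).
    """
    tags = [None] * NUM_LINES          # one tag per cache line, initially empty
    misses_per_page = Counter()
    total = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        index = line % NUM_LINES       # which cache line the address maps to
        tag = line // NUM_LINES        # distinguishes addresses sharing an index
        if tags[index] != tag:         # miss: replace the resident line
            tags[index] = tag
            total += 1
            misses_per_page[addr // PAGE_SIZE] += 1
    return total, misses_per_page


# Two addresses exactly one cache size apart share an index and evict each other.
a = 0
b = CACHE_BYTES
misses, per_page = simulate([a, b] * 50)

# Hypothetical remapping: place b's data one page further on, so it no longer
# shares a cache index with a.
misses_after, _ = simulate([a, b + PAGE_SIZE] * 50)

print(misses, misses_after)
```

In this toy trace every interleaved reference misses before the move, while only the two cold misses remain afterwards.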
Software Write Detection for a Distributed Shared Memory
In Proceedings of the First USENIX Symposium on Operating Systems Design and Implementation, 1994
Cited by 93 (0 self)
Most software-based distributed shared memory (DSM) systems rely on the operating system's virtual memory interface to detect writes to shared data. Strategies based on virtual memory page protection create two problems for a DSM system. First, writes can have high overhead since they are detected with a page fault. As a result, a page must be written many times to amortize the cost of that fault. Second, the size of a virtual memory page is too big to serve as a unit of coherency, inducing false sharing. Mechanisms to handle false sharing can increase runtime overhead and may cause data to be unnecessarily communicated between processors. In this paper, we present a new method for write detection that solves these problems. Our method relies on the compiler and runtime system to detect writes to shared data without invoking the operating system. We measure and compare implementations of a distributed shared memory system using both strategies, virtual memory and compiler/runtime, run...
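A minimal sketch of write detection without page faults follows. Every store to shared data goes through a routine that also sets a dirty bit for a small block, so the runtime can later find modified data at a granularity finer than a page. The `SharedRegion` class, its 8-byte block size, and the explicit `write` call are hypothetical; in the abstract's scheme the compiler and runtime system supply the instrumentation, and the details here are not taken from the paper.

```python
class SharedRegion:
    """Shared memory with software dirty bits (illustrative only)."""

    def __init__(self, size, block_size=8):
        self.data = bytearray(size)
        self.block_size = block_size
        num_blocks = (size + block_size - 1) // block_size
        self.dirty = [False] * num_blocks

    def write(self, offset, value):
        # The store itself, followed by the bookkeeping an instrumented
        # store would perform. No operating system involvement is needed.
        self.data[offset] = value
        self.dirty[offset // self.block_size] = True

    def collect_dirty(self):
        """Return the indices of modified blocks and clear the dirty bits."""
        blocks = [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
        self.dirty = [False] * len(self.dirty)
        return blocks


region = SharedRegion(64)
region.write(3, 0xFF)
region.write(5, 0x01)    # same 8-byte block as offset 3
region.write(40, 0x7F)
dirty_blocks = region.collect_dirty()
print(dirty_blocks)
```

Only the blocks that were actually written are reported, which is the property that lets a coherence unit be smaller than a virtual memory page.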
Hardware and Software Support for Efficient Exception Handling
1994
Cited by 73 (1 self)
Program-synchronous exceptions, for example, breakpoints, watchpoints, illegal opcodes, and memory access violations, provide information about exceptional conditions, interrupting the program and vectoring to an operating system handler. Over the last decade, however, programs and run-time systems have increasingly employed these mechanisms as a performance optimization to detect normal and expected conditions. Unfortunately, current architecture and operating system structures are designed for exceptional or erroneous conditions, where performance is of secondary importance, rather than normal conditions. Consequently, this has limited the practicality of such hardware-based detection mechanisms. We propose both hardware and software structures that permit efficient handling of synchronous exceptions by user-level code. We demonstrate a software implementation that reduces exception delivery cost by an order of magnitude on current RISC processors, and show the performance benefits o...
A Generational Mostly-concurrent Garbage Collector
In Proceedings of the International Symposium on Memory Management, 2000
Cited by 43 (6 self)
This paper reports our experiences with a mostly-concurrent incremental garbage collector, implemented in the context of a high performance virtual machine for the Java™ programming language. The garbage collector is based on the "mostly parallel" collection algorithm of Boehm et al., and can be used as the old generation of a generational memory system. It overloads efficient write-barrier code already generated to support generational garbage collection to also identify objects that were modified during concurrent marking. These objects must be rescanned to ensure that the concurrent marking phase marks all live objects. This algorithm minimises maximum garbage collection pause times, while having only a small impact on the average garbage collection pause time and overall execution time. We support our claims with experimental results, for both a synthetic benchmark and real programs.
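The rescanning step described in this abstract can be sketched in a few lines. A write barrier records every object whose references change while marking is in progress, and a final pass rescans the recorded objects that were already marked. The `Obj` class, the global `dirty` set, and the function names are assumptions for illustration; the sketch runs sequentially, ignores changes to the root set, and is not the collector from the paper.

```python
class Obj:
    """A heap object with outgoing references and a mark bit."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False


def mark_from(roots):
    """Mark everything reachable from roots (stands in for concurrent marking)."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


dirty = set()  # objects modified since marking began


def write_barrier(obj, new_ref):
    """Store a reference and remember that obj was modified."""
    obj.refs.append(new_ref)
    dirty.add(obj)


def rescan_dirty():
    """Final phase: re-trace from marked objects that the mutator modified."""
    for obj in list(dirty):
        if obj.marked:
            mark_from(obj.refs)
    dirty.clear()


root, a, b = Obj("root"), Obj("a"), Obj("b")
root.refs.append(a)

mark_from([root])                  # marking visits root and a
write_barrier(a, b)                # mutator links b into an already-marked object
missed_before_rescan = not b.marked
rescan_dirty()                     # rescanning a finds and marks b
print(missed_before_rescan, b.marked)
```

Without the rescan, `b` would be live but unmarked, which is the case the abstract says the rescanning step exists to prevent.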
Barriers: Friend or Foe?
2004
Cited by 32 (5 self)
Modern garbage collectors rely on read and write barriers imposed on heap accesses by the mutator, to keep track of references between different regions of the garbage collected heap, and to synchronize actions of the mutator with those of the collector. It has been a long-standing untested assumption that barriers impose significant overhead to garbage-collected applications. As a result, researchers have devoted effort to development of optimization approaches for elimination of unnecessary barriers, or proposed new algorithms for garbage collection that avoid the need for barriers while retaining the capability for independent collection of heap partitions. On the basis of the results presented here, we dispel the assumption that barrier overhead should be a primary motivator for such efforts. We present a
Approaches to Adding Persistence to Java
1996
Cited by 20 (2 self)
We describe and name a range of approaches to adding persistence to the Java programming language. Java is interesting in this regard not only because of the current excitement over it. Some relevant properties of Java include: its blend of static and dynamic features, its incorporation of object code into the environment, its offering of automatic storage management, its standardization of the object code format, its broad (but not exclusive) use of object orientation, and its use of a standard library. In considering approaches to adding persistence to Java, we offer a preliminary evaluation of the advantages and disadvantages of the approaches, and describe some directions we are pursuing in our own developments. We hope our descriptions and evaluations will be useful to others in understanding the attributes of systems and designs to be discussed at the workshop, or considered thereafter.
Client Cache Management in a Distributed Object Database
1995
Cited by 18 (3 self)
A distributed object database stores objects persistently at servers. Applications run on client machines, fetching objects into a client-side cache of objects. If fetching and cache management are done in terms of objects, rather than fixed-size units such as pages, three problems must be solved: 1. which objects to prefetch, 2. how to translate, or swizzle, inter-object references when they are fetched from server to client, and 3. which objects to displace from the cache. This thesis reports the results of experiments to test various solutions to these problems. The experiments use the runtime system of the Thor distributed object database and benchmarks adapted from the Wisconsin OO7 benchmark suite. The thesis establishes the following points: 1. For plausible workloads involving some amount of object fetching, the prefetching policy is likely to have more impact on performance than swizzling policy or cache management policy. 2. A simple breadth-first prefetcher can have performa...
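To make the prefetching question concrete, here is a minimal sketch of a breadth-first prefetcher. When a client requests one object, the server also returns objects reachable from it, nearest first, up to a batch limit. The dictionary-based object graph, the `bfs_prefetch` function, and the `limit` parameter (assumed to be at least 1) are hypothetical and are not taken from Thor or from the thesis.

```python
from collections import deque


def bfs_prefetch(graph, root, limit):
    """Return up to `limit` object ids, starting at root, in breadth-first order.

    `graph` maps an object id to the list of ids it references.
    """
    seen = {root}
    batch = [root]
    queue = deque([root])
    while queue and len(batch) < limit:
        oid = queue.popleft()
        for ref in graph.get(oid, []):
            if ref not in seen:
                seen.add(ref)
                batch.append(ref)
                queue.append(ref)
                if len(batch) == limit:
                    break
    return batch


# A small object graph: a references b and c, which both lead to d.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"], "d": [], "e": []}
batch = bfs_prefetch(graph, "a", 4)
print(batch)
```

The batch limit bounds how much is sent per fetch; the other two problems listed in the abstract, swizzling and cache displacement, are outside this sketch.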
The Measured Cost of Copying Garbage Collection Mechanisms
In Functional Programming, 1997
Cited by 17 (1 self)
We examine the costs and benefits of a variety of copying garbage collection (GC) mechanisms across multiple architectures and programming languages. Our study covers both low-level object representation and copying issues as well as the mechanisms needed to support more advanced techniques such as generational collection, large object spaces, and type-segregated areas. Our experiments are made possible by a novel performance analysis tool, Oscar. Oscar allows us to capture snapshots of programming language heaps that may then be used to replay garbage collections. The replay program is self-contained and written in C, which makes it easy to port to other architectures and to analyze with standard performance analysis tools. Furthermore, it is possible to study additional programming languages simply by instrumenting existing implementations to capture heap snapshots. In general, we found that careful implementation of GC mechanisms can have a significant benefit. For a simple collecto...
Address Space Sparsity and Fine Granularity
ACM Operating Systems Review, 1995
Cited by 16 (3 self)
Operating Systems Review, 29(1):87-90, January 1995. Modified version of a paper presented at the 6th SIGOPS European Workshop, 12th-14th September 1994, Schloss Dagstuhl, Germany.
To fully exploit the potential of large address spaces, e.g. 2^64-byte, the sparsity problem has to be solved in an efficient manner. Current address translation schemes either cause enormous space overhead (page table trees) or do not support address space structuring, object grouping and mixed page sizes (inverted page tables). Furthermore, an essential handicap of current virtual address spaces is their coarse granularity. It restricts the concept's relevance to low-level OS technology. Without this constraint, mapping could be a vertically integrating paradigm, useful on all levels from hardware up to application programming. Guarded page tables help solve both problems. They permit significant extensions of the current programming model without performance degradation: sparse occupation and coarse-grain...
A Security Model for Cooperative Work
ACM SIGOPS Workshop, ACM, Dagstuhl, 1994
Cited by 12 (6 self)
This report proposes a security model designed to support cooperative tasks in which the security of the information used and produced is critical, and where the participants in a task are not equally trusted. This approach will support a range of security policies, including those in which the rights of participants in cooperative tasks are restricted to just those that they need in order to perform their roles - so-called `minimum privilege' policies. The model is designed to be implemented in a variety of distributed system environments, assuming a minimum of trusted system components. We describe an approach to the implementation of the security model in the context of a shared distributed object system and we outline an implementation architecture for an open distributed security system that will allow several security models to coexist in a single distributed system. The model has two levels at which access control is represented -- user level and programming level. Security poli...

