Results 1 - 10
of
39
Shared memory consistency models: A tutorial
- IEEE Computer
, 1996
"... Parallel systems that support the shared memory abstraction are becoming widely accepted in many areas of computing. Writing correct and efficient programs for such systems requires a formal specification of memory semantics, called a memory consistency model. The most intuitive model—sequential con ..."
Abstract
-
Cited by 297 (8 self)
- Add to MetaCart
Parallel systems that support the shared memory abstraction are becoming widely accepted in many areas of computing. Writing correct and efficient programs for such systems requires a formal specification of memory semantics, called a memory consistency model. The most intuitive model—sequential consistency—greatly restricts the use of many performance optimizations commonly used by uniprocessor hardware and compiler designers, thereby reducing the benefit of using a multiprocessor. To alleviate this problem, many current multiprocessors support more relaxed consistency models. Unfortunately, the models supported by various systems differ from each other in subtle yet important ways. Furthermore, precisely defining the semantics of each model often leads to complex specifications that are difficult to understand for typical users and builders of computer systems. The purpose of this tutorial paper is to describe issues related to memory consistency models in a way that would be understandable to most computer professionals. We focus on consistency models proposed for hardware-based shared-memory systems. Many of these models are originally specified with an emphasis on the system optimizations they allow. We retain the system-centric emphasis, but use uniform and simple terminology to describe the different models. We also briefly discuss an alternate programmer-centric view that describes the models in terms of program behavior rather than specific system optimizations. 1
Fixing the Java memory model
- In ACM Java Grande Conference
, 1999
"... This paper describes the new Java memory model, which has been revised as part of Java 5.0. The model specifies the legal behaviors for a multithreaded program; it defines the semantics of multithreaded Java programs and partially determines legal implementations of Java virtual machines and compile ..."
Abstract
-
Cited by 247 (7 self)
- Add to MetaCart
This paper describes the new Java memory model, which has been revised as part of Java 5.0. The model specifies the legal behaviors for a multithreaded program; it defines the semantics of multithreaded Java programs and partially determines legal implementations of Java virtual machines and compilers. The new Java model provides a simple interface for correctly synchronized programs – it guarantees sequential consistency to data-race-free programs. Its novel contribution is requiring that the behavior of incorrectly synchronized programs be bounded by a well defined notion of causality. The causality requirement is strong enough to respect the safety and security properties of Java and weak enough to allow standard compiler and hardware optimizations. To our knowledge, other models are either too weak because they do not provide for sufficient safety/security, or are too strong because they rely on a strong notion of data and control dependences that precludes some standard compiler transformations. Although the majority of what is currently done in compilers is legal, the new model introduces significant differences, and clearly defines the boundaries of legal transformations. For example, the commonly accepted definition for control dependence is incorrect for Java, and transformations based on it may be invalid. In addition to providing the official memory model for Java, we believe the model described here could prove to be a useful basis for other programming languages that currently lack well-defined models, such as C++ and C#.
Multiprocessors Should Support Simple Memory Consistency Models
, 1998
"... Many future computers will be shared-memory multiprocessors. These hardware systems must define for software the allowable behavior of memory. A reasonable model is sequential consistency (SC), which makes a shared memory multiprocessor behave like a multiprogrammed uniprocessor. Since SC appears to ..."
Abstract
-
Cited by 75 (1 self)
- Add to MetaCart
Many future computers will be shared-memory multiprocessors. These hardware systems must define for software the allowable behavior of memory. A reasonable model is sequential consistency (SC), which makes a shared memory multiprocessor behave like a multiprogrammed uniprocessor. Since SC appears to limit some of the optimizations useful for aggressive hardware implementations, researchers and practitioners have defined several relaxed consistency models. Some of these models just relax the ordering from writes to reads (processor consistency, IBM 370, Intel Pentium Pro, and Sun TSO), while others aggressively relax the order among all normal reads and writes (weak ordering, release consistency, DEC Alpha, IBM PowerPC, and Sun RMO). This paper argues that multiprocessors should implement SC or, in some cases, a model that just relaxes the ordering from writes to reads. I argue against using aggressively relaxed models because, with the advent of speculative execution, these models do ...
Velodrome: A Sound and Complete Dynamic Atomicity Checker for Multithreaded Programs
"... Atomicity is a fundamental correctness property in multithreaded programs, both because atomic code blocks are amenable to sequential reasoning (which significantly simplifies correctness arguments), and because atomicity violations often reveal defects in a program’s synchronization structure. Unfo ..."
Abstract
-
Cited by 40 (10 self)
- Add to MetaCart
Atomicity is a fundamental correctness property in multithreaded programs, both because atomic code blocks are amenable to sequential reasoning (which significantly simplifies correctness arguments), and because atomicity violations often reveal defects in a program’s synchronization structure. Unfortunately, all atomicity analyses developed to date are incomplete in that they may yield false alarms on correctly synchronized programs, which limits their usefulness. We present the first dynamic analysis for atomicity that is both sound and complete. The analysis reasons about the exact dependencies between operations in the observed trace of the target program, and it reports error messages if and only if the observed trace is not conflict-serializable. Despite this significant increase in precision, the performance and coverage of our analysis is competitive with earlier incomplete dynamic analyses for atomicity.
READ-COPY UPDATE: USING EXECUTION HISTORY TO SOLVE CONCURRENCY PROBLEMS
"... Synchronization overhead, contention, and deadlock can pose severe challenges to designers and implementers of parallel programs. Therefore, many researchers have proposed update disciplines that solve these problems in restricted but commonly occurring situations. However, these proposals rely eith ..."
Abstract
-
Cited by 38 (17 self)
- Add to MetaCart
Synchronization overhead, contention, and deadlock can pose severe challenges to designers and implementers of parallel programs. Therefore, many researchers have proposed update disciplines that solve these problems in restricted but commonly occurring situations. However, these proposals rely either on garbage collectors [7, 8], termination of all processes currently using the data structure [10], or expensive explicit tracking of all processes accessing the data structure [5, 15]. These mechanisms are inappropriate in many cases, such as within many operating-system kernels and server applications. This paper proposes a novel and extremely efficient mechanism, called read-copy update, and compares its performance to that of conventional locking primitives.
Commit-Reconcile Fences (CRF): A New Memory Model for Architects and Compiler Writers
- In Proceedings of the 26th International Symposium on Computer Architecture
, 1999
"... We present a new mechanism-oriented memory model called Commit-Reconcile & Fences (CRF) and define it using algebraic rules. Many existing memory models can be described as restricted versions of CRF. The model has been designed so that it is both easy for architects to implement, and stable enough ..."
Abstract
-
Cited by 36 (6 self)
- Add to MetaCart
We present a new mechanism-oriented memory model called Commit-Reconcile & Fences (CRF) and define it using algebraic rules. Many existing memory models can be described as restricted versions of CRF. The model has been designed so that it is both easy for architects to implement, and stable enough to serve as a target machine interface for compilers of high-level languages. The CRF model exposes a semantic notion of caches (saches), and decomposes load and store instructions into finer-grain operations. We sketch how to integrate CRF into modern microprocessors and outline an adaptive coherence protocol to implement CRF in distributed shared-memory systems. CRF offers an upward compatible way to design next generation computer systems. 1. Loads and Stores: The CISC of Nineties Caching and instruction reordering are ubiquitous features of modern computer systems and are necessary to achieve higher performance. For uniprocessor configurations, these features are mostly transparent and...
X.: Improving the Java Memory Model Using CRF
- Proc. OOPSLA (2000) 1–12
"... This paper describes alternative memory semantics for Java programs using an enriched version of the Commit/Reconcile/Fence (CRF) memory model [16]. It outlines a set of reasonable practices for safe multithreaded programming in Java. Our semantics allow a number of optimizations such as load reorde ..."
Abstract
-
Cited by 31 (1 self)
- Add to MetaCart
This paper describes alternative memory semantics for Java programs using an enriched version of the Commit/Reconcile/Fence (CRF) memory model [16]. It outlines a set of reasonable practices for safe multithreaded programming in Java. Our semantics allow a number of optimizations such as load reordering that are currently prohibited. Simple thread-local algebraic rules express the effects of optimizations at the source or bytecode level. The rules focus on reordering source-level operations; they yield a simple dependency analysis algorithm for Java. An instruction-by-instruction translation of Java memory operations into CRF operations captures thread interactions precisely. The fine-grained synchronization of CRF means the algebraic rules are easily derived from the translation. CRF can be mapped directly to a modern architecture, and is thus a suitable target for optimizing memory coherence during code generation. Categories and Subject Descriptors
Specifying and Verifying a Broadcast and a Multicast Snooping Cache Coherence Protocol
- IEEE Transactions on Parallel and Distributed Systems
, 2000
"... In this paper, we develop a specification methodology that documents and specifies a cache coherence protocol in eight tables: the states, events, actions, and transitions of the cache and memory controllers, We then use this methodology to specify a detailed, modern three-state broadcast snooping ..."
Abstract
-
Cited by 26 (10 self)
- Add to MetaCart
In this paper, we develop a specification methodology that documents and specifies a cache coherence protocol in eight tables: the states, events, actions, and transitions of the cache and memory controllers, We then use this methodology to specify a detailed, modern three-state broadcast snooping protocol with an unordered data network and an ordered address network that allows arbitrary skew, We also present a detailed specification of a new protocol called Multicast Snooping [6] and, in doing so, we better illustrate the utility of the table-based specification methodology, Finally, we demonstrate a technique for verification of the Multicast Snooping protocol, through the sketch of a manual proof that the specification satisfies a sequentially consistent memory model, Index Terms--Cache coherence, protocol specification, protocol verification, memory consistency, multicast snooping.
Precedence-based memory models
- In Eleventh International Workshop on Distributed Algorithms, number 1320 in Lecture Notes in Computer Science
, 1997
"... We present a computation-centric theory of memory models. Unlike traditional processor-centric models, computation-centric models focus on the logical dependencies among instructions rather than the processor that happens to execute them. This theory allows us to define what a memory model is, and t ..."
Abstract
-
Cited by 25 (1 self)
- Add to MetaCart
We present a computation-centric theory of memory models. Unlike traditional processor-centric models, computation-centric models focus on the logical dependencies among instructions rather than the processor that happens to execute them. This theory allows us to define what a memory model is, and to investigate abstract properties of memory models. In particular, we focus on constructibility, which is a necessary property of those models that can be implemented exactly by an online algorithm. For a nonconstructible model, we show that there is a natural way to define the constructible version of that model. We explore the implications of constructibility in the context of dag-consistent memory models, which do not require that memory locations be serialized. The strongest dag-consistent model, called NN-dag consistency, is not constructible. However, its constructible version is equivalent to a model that we call location consistency, in which each location is serialized independently. 1
Nemos: A Framework for Axiomatic and Executable Specifications of Memory Consistency Models
- In International Parallel and Distributed Processing Symposium (IPDPS
, 2003
"... Conforming to the underlying memory consistency rules is a fundamental requirement for implementing shared memory systems and writing multiprocessor programs. ..."
Abstract
-
Cited by 25 (5 self)
- Add to MetaCart
Conforming to the underlying memory consistency rules is a fundamental requirement for implementing shared memory systems and writing multiprocessor programs.

