Results 1 - 10
of
26
Logtm: Log-based transactional memory
- in HPCA
, 2006
"... Transactional memory (TM) simplifies parallel programming by guaranteeing that transactions appear to execute atomically and in isolation. Implementing these properties includes providing data version management for the simultaneous storage of both new (visible if the transaction commits) and old (r ..."
Abstract
-
Cited by 173 (8 self)
- Add to MetaCart
Transactional memory (TM) simplifies parallel programming by guaranteeing that transactions appear to execute atomically and in isolation. Implementing these properties includes providing data version management for the simultaneous storage of both new (visible if the transaction commits) and old (retained if the transaction aborts) values. Most (hardware) TM systems leave old values “in place” (the target memory address) and buffer new values elsewhere until commit. This makes aborts fast, but penalizes (the much more frequent) commits. In this paper, we present a new implementation of transactional memory, Log-based Transactional Memory (LogTM), that makes commits fast by storing old values to a per-thread log in cacheable virtual memory and storing new values in place. LogTM makes two additional contributions. First, LogTM extends a MOESI directory protocol to enable both fast conflict detection on evicted blocks and fast commit (using lazy cleanup). Second, LogTM handles aborts in (library) software with little performance penalty. Evaluations running micro- and SPLASH-2 benchmarks on a 32way multiprocessor support our decision to optimize for commit by showing that only 1-2 % of transactions abort. 1.
What are race conditions? some issues and formalizations
- LOPLAS
, 1992
"... In shared-memory parallel programs that use explicit synchronization, race conditions result when accesses to shared memory are not properly synchronized. Race conditions are often considered to be manifestations of bugs, since their presence can cause the program to behave unexpectedly, Unfortunate ..."
Abstract
-
Cited by 77 (0 self)
- Add to MetaCart
In shared-memory parallel programs that use explicit synchronization, race conditions result when accesses to shared memory are not properly synchronized. Race conditions are often considered to be manifestations of bugs, since their presence can cause the program to behave unexpectedly, Unfortunately, there has been little agreement in the literature as to precisely what constitutes a race condition. Two different notions have been implicitly considered: one pertaining to programs intended to be deterministic (which we call general races) and the other to nondeterministic programs containing critical sections (which we call data races). However, the differences between general races and data races have not yet been recognized. This paper examines these differences by characterizing races using a formal model and exploring their properties. We show that two variations of each type of race exist: feasible general races and data races capture the intuitive notions desired for debugging and apparent races capture less accurate notions implicitly assumed by most dynamic race detection methods. We also show that locating feasible races is an NP-hard problem, implying that only the apparent races, which are approximations to feasible races, can be detected in practice. The complexity of dynamically locating apparent races depends on the type of synchronization used by the program, Apparent
Improving the Accuracy of Data Race Detection
- In Proceedings of the 1991 Conference on the Principles and Practice of Parallel Programming
, 1991
"... For shared-memory parallel programs that use explicit synchronization, data race detection is an important part of debugging. A data race exists when concurrently executing sections of code access common shared variables. In programs intended to be data race free, they are sources of nondeterminism ..."
Abstract
-
Cited by 65 (6 self)
- Add to MetaCart
For shared-memory parallel programs that use explicit synchronization, data race detection is an important part of debugging. A data race exists when concurrently executing sections of code access common shared variables. In programs intended to be data race free, they are sources of nondeterminism usually considered bugs. Previous methods for detecting data races in executions of parallel programs can determine when races occurred, but can report many data races that are artifacts of others and not direct manifestations of program bugs. Artifacts exist because some races can cause others and can also make false races appear real. Such artifacts can overwhelm the programmer with information irrelevant for debugging. This paper presents results showing how to identify nonartifact data races by validation and ordering. Data race validation attempts to determine which races involve events that either did execute concurrently or could have (called feasible data races). We show how each de...
The Mutual Exclusion Problem - Part II: Statement and Solutions
, 2000
"... The theory developed in Part I is used to state the mutual exclusion problem and several additional fairness and failure-tolerance requirements. Four "distributed " N-process solutions are given, ranging from a solution requiring only one communication bit per process that permits individual starvat ..."
Abstract
-
Cited by 53 (3 self)
- Add to MetaCart
The theory developed in Part I is used to state the mutual exclusion problem and several additional fairness and failure-tolerance requirements. Four "distributed " N-process solutions are given, ranging from a solution requiring only one communication bit per process that permits individual starvation, to one requiring about N ! communication bits per process that satisfies every reasonable fairness and failure-tolerance requirement that we can conceive of. Contents 1 Introduction 3 2 The Problem 4 2.1 Basic Requirements . . . . . . . . . . . . . . . . . . . . . . . . 4 2.2 Fairness Requirements . . . . . . . . . . . . . . . . . . . . . . 6 2.3 Premature Termination . . . . . . . . . . . . . . . . . . . . . 8 2.4 Failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 3 The Solutions 14 3.1 The Mutual Exclusion Protocol . . . . . . . . . . . . . . . . . 15 3.2 The One-Bit Solution . . . . . . . . . . . . . . . . . . . . . . 17 3.3 A Digression . . . . . . . . . . . ...
On the Complexity of Event Ordering for Shared-Memory Parallel Program Executions
- In Proceedings of the 1990 International Conference on Parallel Processing
, 1990
"... This paper presents results on the complexity of computing event orderings for sharedmemory parallel program executions. Given a program execution, we formally define the problem of computing orderings that the execution must have exhibited or could have exhibited, and prove that computing such orde ..."
Abstract
-
Cited by 48 (6 self)
- Add to MetaCart
This paper presents results on the complexity of computing event orderings for sharedmemory parallel program executions. Given a program execution, we formally define the problem of computing orderings that the execution must have exhibited or could have exhibited, and prove that computing such orderings is an intractable problem. We present a formal model of a shared-memory parallel program execution on a sequentially consistent processor, and discuss event orderings in terms of this model. Programs are considered that use fork/join and either counting semaphores or event style synchronization. We define a feasible program execution to be an execution of the program that performs the same events as an observed execution, but which may exhibit different orderings among those events. Any program execution exhibiting the same data dependences among the shared data as the observed execution is feasible. We define several relations that capture the orderings present in all (or some) of the...
Detecting Data Races in Parallel Program Executions
- In Advances in Languages and Compilers for Parallel Computing, 1990 Workshop
, 1989
"... Several methods currently exist for detecting data races in an execution of a shared-memory parallel program. Although these methods address an important aspect of parallel program debugging, they do not precisely define the notion of a data race. As a result, is it not possible to precisely state w ..."
Abstract
-
Cited by 46 (6 self)
- Add to MetaCart
Several methods currently exist for detecting data races in an execution of a shared-memory parallel program. Although these methods address an important aspect of parallel program debugging, they do not precisely define the notion of a data race. As a result, is it not possible to precisely state which data races are detected, nor is the meaning of the reported data races always clear. Furthermore, these methods can sometimes generate false data race reports. They can determine whether a data race was exhibited during an execution, but when more than one data race is reported, only limited indication is given as to which ones are real. This paper addresses these two issues. We first present a model for reasoning about data races, and then present a two-phase approach to data race detection that attempts to validate the accuracy of each detected data race. Our model of data races distinguishes among those data races that actually occurred during an execution (actual data races), those...
Shared-memory mutual exclusion: Major research trends since
- Distributed Computing
, 1986
"... * Exclusion: At most one process executes its critical section at any time. ..."
Abstract
-
Cited by 38 (7 self)
- Add to MetaCart
* Exclusion: At most one process executes its critical section at any time.
How to make a correct multiprocess program execute correctly on a multiprocessor
- IEEE Transactions on Computers
, 1997
"... A multiprocess program executing on a modern multiprocessor must issue explicit commands to synchronize memory accesses. A method is proposed for deriving the necessary commands from a correctness proof of the underlying algorithm in a formalism based on temporal relations among operation executions ..."
Abstract
-
Cited by 33 (3 self)
- Add to MetaCart
A multiprocess program executing on a modern multiprocessor must issue explicit commands to synchronize memory accesses. A method is proposed for deriving the necessary commands from a correctness proof of the underlying algorithm in a formalism based on temporal relations among operation executions. index terms concurrency, memory consistency, multiprocessor, synchronization, verification
The Logic of Time Representation
, 1987
"... This investigation concerns representations of time by means of intervals, stemming from work of Allen [All83] and van Benthem [vBen83]. Allen described an Interval Calculus of thirteen binary relations on convex intervals over a linear order (the real numbers). He gave a practical algorithm for che ..."
Abstract
-
Cited by 28 (1 self)
- Add to MetaCart
This investigation concerns representations of time by means of intervals, stemming from work of Allen [All83] and van Benthem [vBen83]. Allen described an Interval Calculus of thirteen binary relations on convex intervals over a linear order (the real numbers). He gave a practical algorithm for checking the consistency of a subclass of Boolean constraints. First, we describe a completeness theorem for Allen's calculus, in its corresponding formulation as a first-order theory LM . LM is countably categorical, and axiomatises the complete theory of intervals over a dense unbounded linear order. Its only countable model up to isomorphism is the non-trivial intervals over the rational numbers. Algorithms are given for quantifer-elimination, consistency checking, and satisfaction of arbitrary first-order formulas in the Interval Calculus. A natural countable model of the calculus is presented, the TUS , in which clock- and calendar-time may be represented in a straightforward way. Allen an...
Time-Lapse Snapshots
- Proceedings of Israel Symposium on the Theory of Computing and Systems
, 1994
"... A snapshot scan algorithm takes an "instantaneous" picture of a region of shared memory that may be updated by concurrent processes. Many complex shared memory algorithms can be greatly simplified by structuring them around the snapshot scan abstraction. Unfortunately, the substantial decrease in ..."
Abstract
-
Cited by 26 (8 self)
- Add to MetaCart
A snapshot scan algorithm takes an "instantaneous" picture of a region of shared memory that may be updated by concurrent processes. Many complex shared memory algorithms can be greatly simplified by structuring them around the snapshot scan abstraction. Unfortunately, the substantial decrease in conceptual complexity is quite often counterbalanced by an increase in computational complexity. In this paper, we introduce the notion of a weak snapshot scan, a slightly weaker primitive that has a more efficient implementation. We propose the following methodology for using this abstraction: first, design and verify an algorithm using the more powerful snapshot scan, and second, replace the more powerful but less efficient snapshot with the weaker but more efficient snapshot, and show that the weaker abstraction nevertheless suffices to ensure the correctness of the enclosing algorithm. We give two examples of algorithms whose performance can be enhanced while retaining a simple m...

