Results 1-10 of 39
Atomic Snapshots of Shared Memory
, 1993
Abstract

Cited by 169 (44 self)
This paper introduces a general formulation of atomic snapshot memory, a shared memory partitioned into words written (updated) by individual processes, or instantaneously read (scanned) in its entirety. This paper presents three wait-free implementations of atomic snapshot memory. The first implementation in this paper uses unbounded (integer) fields in these registers, and is particularly easy to understand. The second implementation uses bounded registers. Its correctness proof follows the ideas of the unbounded implementation. Both constructions implement a single-writer snapshot memory, in which each word may be updated by only one process, from single-writer, n-reader registers. The third algorithm implements a multi-writer snapshot memory from atomic n-writer, n-reader registers, again echoing key ideas from the earlier constructions. All operations require Θ(n^2) reads and writes to the component shared registers in the worst case. Categories and Subject Descriptors:...
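The scan/update interface described in this abstract can be illustrated with the classic double-collect idea that underlies the unbounded construction. The sketch below is a minimal Python illustration with invented names, and it is only lock-free rather than wait-free (a scanner can starve if writers keep moving); the paper's constructions add handshaking and helping to bound retries.

```python
class SnapshotMemory:
    """Single-writer snapshot sketch: slot i is written only by process i.
    scan() uses the double-collect idea: repeat until two consecutive
    collects agree on every (value, sequence-number) pair, at which point
    no update was concurrent and the collect is a valid snapshot."""

    def __init__(self, n):
        self.regs = [(None, 0)] * n  # (value, sequence number) per slot

    def update(self, i, value):
        _, seq = self.regs[i]
        self.regs[i] = (value, seq + 1)  # one atomic slot assignment

    def scan(self):
        while True:
            a = list(self.regs)  # first collect
            b = list(self.regs)  # second collect
            if a == b:           # no writer moved between collects
                return [v for v, _ in a]
```

The unbounded sequence numbers correspond to the paper's first (integer-field) implementation; the bounded-register versions replace them with handshake bits.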
Composite Registers
 Distributed Computing
, 1993
Abstract

Cited by 108 (7 self)
We introduce a shared data object, called a composite register, that generalizes the notion of an atomic register. A composite register is an array-like shared data object that is partitioned into a number of components. An operation of a composite register either writes a value to a single component or reads the values of all components. A composite register reduces to an ordinary atomic register when there is only one component. In this paper, we show that multi-reader, single-writer atomic registers can be used to implement a composite register in which there is only one writer per component. In a related paper, we show how to use the composite register construction of this paper to implement a composite register with multiple writers per component. These two constructions show that it is possible to implement a shared memory that can be read in its entirety in a single snapshot operation, without using mutual exclusion. Keywords: atomicity, atomic register, composite register, conc...
Wait-Free Parallel Algorithms for the Union-Find Problem
 In Proc. 23rd ACM Symposium on Theory of Computing
, 1994
Abstract

Cited by 50 (0 self)
We are interested in designing efficient data structures for a shared memory multiprocessor. In this paper we focus on the Union-Find data structure. We consider a fully asynchronous model of computation where arbitrary delays are possible. Thus we require our solutions to the data structure problem to have the wait-free property, meaning that each thread continues to make progress on its operations, independent of the speeds of the other threads. In this model efficiency is best measured in terms of the total number of instructions used to perform a sequence of data structure operations, the work performed by the processors. We give a wait-free implementation of an efficient algorithm for Union-Find. In addition we show that the worst case performance of the algorithm can be improved by simulating a synchronized algorithm, or by simulating a larger machine if the data structure requests support sufficient parallelism. Our solutions apply to a much more general adversary model than has be...
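As a point of reference, here is a sequential Python sketch of the underlying Union-Find structure (union by rank plus path halving). This is not the paper's algorithm: the paper's contribution is making such operations wait-free against an asynchronous adversary, which would replace these plain writes with atomic compare-and-swap steps plus helping.

```python
class UnionFind:
    """Sequential Union-Find with union by rank and path halving."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n

    def find(self, x):
        while self.parent[x] != x:
            # Path halving: point x at its grandparent as we walk up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx  # attach the shorter tree under the taller
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
```

Path halving (rather than full recursive compression) is the variant usually favored in concurrent settings because each step touches only two cells.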
Hundreds of Impossibility Results for Distributed Computing
 Distributed Computing
, 2003
Abstract

Cited by 44 (4 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault tolerance, different communication media, and randomization. The resource bounds refer to time, space and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
Parallel Algorithms with Processor Failures and Delays
, 1995
Abstract

Cited by 41 (7 self)
We study efficient deterministic parallel algorithms on two models: restartable fail-stop CRCW PRAMs and asynchronous PRAMs. In the first model, synchronous processors are subject to arbitrary stop failures and restarts determined by an online adversary and involving loss of private but not shared memory; the complexity measures are completed work (where processors are charged for completed fixed-size update cycles) and overhead ratio (completed work amortized over necessary work and failures). In the second model, the result of the computation is a serialization of the actions of the processors determined by an online adversary; the complexity measure is total work (number of steps taken by all processors). Despite their differences, the two models share key algorithmic techniques. We present new algorithms for the Write-All problem (in which P processors write ones into an array of size N) for the two models. These algorithms can be used to implement a simulation strategy for any N ...
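To make the Write-All problem concrete, here is a deliberately naive Python sketch (all names invented, and not the paper's algorithm): each of P workers sweeps the whole array from its own starting offset, so the array is filled even if some workers stall or die after starting. Its total work is O(P*N); the algorithms surveyed in the paper do substantially better against an adversarial scheduler.

```python
import threading

def write_all(array, p):
    """Naive Write-All: p workers each sweep the entire array of size N
    from staggered starting offsets. Correct under arbitrary delays
    (any single surviving worker completes the job) but work-inefficient."""
    n = len(array)

    def worker(start):
        for k in range(n):
            array[(start + k) % n] = 1  # idempotent write of a one

    threads = [threading.Thread(target=worker, args=(i * n // p,))
               for i in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The staggered offsets are a small optimization: when no failures occur, workers initially touch disjoint regions before their sweeps overlap.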
Time-Lapse Snapshots
 Proceedings of Israel Symposium on the Theory of Computing and Systems
, 1994
Abstract

Cited by 28 (8 self)
A snapshot scan algorithm takes an "instantaneous" picture of a region of shared memory that may be updated by concurrent processes. Many complex shared memory algorithms can be greatly simplified by structuring them around the snapshot scan abstraction. Unfortunately, the substantial decrease in conceptual complexity is quite often counterbalanced by an increase in computational complexity. In this paper, we introduce the notion of a weak snapshot scan, a slightly weaker primitive that has a more efficient implementation. We propose the following methodology for using this abstraction: first, design and verify an algorithm using the more powerful snapshot scan, and second, replace the more powerful but less efficient snapshot with the weaker but more efficient snapshot, and show that the weaker abstraction nevertheless suffices to ensure the correctness of the enclosing algorithm. We give two examples of algorithms whose performance can be enhanced while retaining a simple m...
Reading Many Variables in One Atomic Operation: Solutions With Linear or Sublinear Complexity
 IN PROCEEDINGS OF THE 5TH INTERNATIONAL WORKSHOP ON DISTRIBUTED ALGORITHMS
, 1991
Abstract

Cited by 25 (1 self)
We address the problem of reading more than one variable (components X_1, ..., X_c), all in one atomic operation, by only one process called the reader, while each of these variables is being written by a set of writers. All operations (i.e., both reads and writes) are assumed to be totally asynchronous and wait-free. For this problem, only algorithms that require at best quadratic time and space complexity can be derived from the existing literature (the time complexity of a construction is the number of sub-operations of a high-level operation and its space complexity is the number of atomic shared variables it needs). In this paper, we provide a deterministic protocol which has linear (in the number of processes) space complexity, linear time complexity for a read operation, and constant time complexity for a write. Our solution does not make use of timestamps. Rather, it is the memory location where a write writes that differentiates it from the other writes. Also, ...
Determining Consensus Numbers
 SIAM JOURNAL OF COMPUTING
, 2000
Abstract

Cited by 20 (5 self)
Conditions on a shared object type T are given that are both necessary and sufficient for wait-free n-process consensus to be solvable using objects of type T and registers. The conditions apply to two large classes of deterministic shared objects: read-modify-write objects [C. P. Kruskal, L. Rudolph, and M. Snir, ACM Trans. Prog. Lang. Syst., 10 (1988), pp. 579-601] and readable objects, which have operations that allow processes to read the state of the object. These classes include most objects that are used as the primitives of distributed systems. When the sequential specification of T is finite, the conditions may be checked in a finite amount of time to decide the question "Is the consensus number of T at least n?" The conditions are also used to provide a clear proof of the robustness of the consensus hierarchy for read-modify-write and readable objects.
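For intuition about consensus numbers, a compare-and-swap object solves wait-free consensus for any number of processes: the first process to install its proposal wins, and every later process adopts that value. A hypothetical Python sketch follows (a lock stands in for the atomicity of a single hardware CAS step; it is not used for mutual exclusion around the algorithm itself).

```python
import threading

class CASConsensus:
    """Wait-free n-process consensus from one compare-and-swap object.
    decide() performs exactly one CAS and one read, so every process
    finishes in a bounded number of its own steps."""
    _UNSET = object()  # sentinel: no value decided yet

    def __init__(self):
        self._val = self._UNSET
        self._lock = threading.Lock()  # models atomicity of one CAS step

    def _cas(self, expect, new):
        with self._lock:
            if self._val is expect:
                self._val = new
                return True
            return False

    def decide(self, proposal):
        self._cas(self._UNSET, proposal)  # first successful CAS wins
        return self._val                  # everyone returns the winner
```

Read/write registers alone, by contrast, have consensus number 1, which is the kind of separation the conditions in the paper characterize exactly.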
A Lock-Free Approach to Object Sharing in Real-Time Systems
, 1997
Abstract

Cited by 16 (0 self)
This work aims to establish the viability of lock-free object sharing in uniprocessor real-time systems. Naive usage of conventional lock-based object-sharing schemes in real-time systems leads to unbounded priority inversion. A priority inversion occurs when a task is blocked by a lower-priority task that is inside a critical section. Mechanisms that bound priority inversion usually entail kernel overhead that is sometimes excessive. We propose that lock-free objects offer an attractive alternative to lock-based schemes because they eliminate priority inversion and its associated problems. On the surface, lock-free objects may seem to be unsuitable for hard real-time systems because accesses to such objects are not guaranteed to complete in bounded time. Nonetheless, we present scheduling conditions that demonstrate the applicability of lock-free objects in hard real-time systems. Our scheduling conditions are applicable to schemes such as rate-monotonic scheduling and earliest-deadline...
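The retry-loop pattern typical of lock-free objects can be sketched as follows (a minimal Python illustration with invented names; the lock below only models a hardware compare-and-swap instruction). No task ever blocks inside a critical section, so a preempted low-priority task cannot delay a high-priority one, which is exactly why priority inversion disappears.

```python
import threading

class LockFreeCounter:
    """Lock-free retry loop: read the current state, compute a new state,
    and try to install it with compare-and-swap; on failure (another
    task interfered), retry from the fresh state."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()  # stands in for one atomic CAS

    def _cas(self, expect, new):
        with self._lock:
            if self._value == expect:
                self._value = new
                return True
            return False

    def increment(self):
        while True:
            v = self._value          # read current state
            if self._cas(v, v + 1):  # attempt to install the new state
                return v + 1         # success: return the value we wrote
```

On a uniprocessor, each retry is caused by a preempting task's successful operation, which is the fact the paper's scheduling conditions exploit to bound total retry cost.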
Spreading Rumors Rapidly Despite an Adversary
 J. ALGORITHMS
, 1998
Abstract

Cited by 15 (4 self)
In the collect problem [32], n processors in a shared-memory system must each learn the values of n registers. We give a randomized algorithm that solves the collect problem in O(n log^3 n) total read and write operations with high probability, even if timing is under the control of a content-oblivious adversary (a slight weakening of the usual adaptive adversary). This improves on both the trivial upper bound of O(n^2) steps and the best previously known bound of O(n^{3/2} log n) steps, and is close to the lower bound of Ω(n log n) steps. Furthermore, we show how this algorithm can be used to obtain a multi-use cooperative collect protocol that is O(log^3 n)-competitive in the latency model of Ajtai et al. [3] and O(n^{1/2} log^{3/2} n)-competitive in the throughput model of Aspnes and Waarts [10]; in both cases the competitive ratios are within a polylogarithmic factor of optimal.