Results 1–10 of 35
A methodology for implementing highly concurrent data structures
 In 2nd Symp. Principles & Practice of Parallel Programming
, 1990
Abstract

Cited by 320 (12 self)
A concurrent object is a data structure shared by concurrent processes. Conventional techniques for implementing concurrent objects typically rely on critical sections: ensuring that only one process at a time can operate on the object. Nevertheless, critical sections are poorly suited for asynchronous systems: if one process is halted or delayed in a critical section, other, non-faulty processes will be unable to progress. By contrast, a concurrent object implementation is nonblocking if it always guarantees that some process will complete an operation in a finite number of steps, and it is wait-free if it guarantees that each process will complete an operation in a finite number of steps. This paper proposes a new methodology for constructing nonblocking and wait-free implementations of concurrent objects. The object's representation and operations are written as stylized sequential programs, with no explicit synchronization. Each sequential operation is automatically transformed into a nonblocking or wait-free operation using novel synchronization and memory management algorithms. These algorithms are presented for a multiple instruction/multiple data (MIMD) architecture in which n processes communicate by applying read, write, and compare&swap operations to a shared memory.
Lock-Free Linked Lists Using Compare-and-Swap
 In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing
, 1995
Abstract

Cited by 95 (1 self)
Lock-free data structures implement concurrent objects without the use of mutual exclusion. This approach can avoid performance problems due to unpredictable delays while processes are within critical sections. Although universal methods are known that give lock-free data structures for any abstract data type, the overhead of these methods makes them inefficient when compared to conventional techniques using mutual exclusion, such as spin locks. We give lock-free data structures and algorithms for implementing a shared singly-linked list, allowing concurrent traversal, insertion, and deletion by any number of processes. We also show how the basic data structure can be used as a building block for other lock-free data structures. Our algorithms use the single-word Compare-and-Swap synchronization primitive to implement the linked list directly, avoiding the overhead of universal methods, and are thus a practical alternative to using spin locks.
Nonblocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors
 JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1998
Abstract

Cited by 72 (1 self)
Most multiprocessors are multiprogrammed in order to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) nonblocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Nonblocking algorithms generally require a universal atomic primitive such as compare-and-swap or load-linked/store-conditional, and are widely regarded as inefficient. We evaluate the performance of preemption-safe lock-based and nonblocking implementations of important data structures—queues, stacks, heaps, and counters—including nonblocking and lock-based queue algorithms of our own, in microbenchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that our nonblocking queue consistently outperforms the best known alternatives, and that data-structure-specific nonblocking algorithms, which exist for queues, stacks, and counters, can work extremely well. Not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose nonblocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform ...
Implementing Lock-Free Queues
 In Proceedings of the Seventh International Conference on Parallel and Distributed Computing Systems, Las Vegas, NV
, 1994
Abstract

Cited by 59 (1 self)
We study practical techniques for implementing the FIFO queue abstract data type using lock-free data structures, which synchronize the operations of concurrent processes without the use of mutual exclusion. Two new algorithms based on linked lists and arrays are presented. We also propose a new solution to the ABA problem associated with the Compare&Swap instruction. The performance of our linked-list algorithm is compared with several other lock-free queue implementations, as well as with more conventional locking techniques.
Parallel Algorithms with Processor Failures and Delays
, 1995
Abstract

Cited by 42 (7 self)
We study efficient deterministic parallel algorithms on two models: restartable fail-stop CRCW PRAMs and asynchronous PRAMs. In the first model, synchronous processors are subject to arbitrary stop failures and restarts determined by an online adversary and involving loss of private but not shared memory; the complexity measures are completed work (where processors are charged for completed fixed-size update cycles) and overhead ratio (completed work amortized over necessary work and failures). In the second model, the result of the computation is a serialization of the actions of the processors determined by an online adversary; the complexity measure is total work (number of steps taken by all processors). Despite their differences, the two models share key algorithmic techniques. We present new algorithms for the Write-All problem (in which P processors write ones into an array of size N) for the two models. These algorithms can be used to implement a simulation strategy for any N ...
Hundreds of Impossibility Results for Distributed Computing
 Distributed Computing
, 2003
Abstract

Cited by 40 (4 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault-tolerance, different communication media, and randomization. The resource bounds refer to time, space, and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
Parallelizing union-find in Constraint Handling Rules using confluence analysis
 LOGIC PROGRAMMING: 21ST INTERNATIONAL CONFERENCE, ICLP 2005, VOLUME 3668 OF LECTURE NOTES IN COMPUTER SCIENCE
, 2005
Abstract

Cited by 29 (15 self)
Constraint Handling Rules (CHR) is a logical concurrent committed-choice rule-based language. Recently it was shown that the classical union-find algorithm can be implemented in CHR with optimal time complexity. Here we investigate whether a parallel implementation of this algorithm is also possible in CHR. The problem is hard for several reasons: up to now, no parallel computation model for CHR has been defined; Tarjan's optimal union-find is known to be hard to parallelize; and the parallel code should be as close as possible to the sequential one. It turns out that confluence analysis of the sequential implementation gives almost all the information needed to parallelize the union-find algorithm under a rather general parallel computation model for CHR.
A Theory of Competitive Analysis for Distributed Algorithms
 Proc. 35th Annual Symp. on Foundations of Computer Science
, 1994
Abstract

Cited by 29 (5 self)
We introduce a theory of competitive analysis for distributed algorithms. The first steps in this direction were made in the seminal papers of Bartal, Fiat, and Rabani [18], and of Awerbuch, Kutten, and Peleg [16], in the context of data management and job scheduling. In these papers, as well...
Universal Operations: Unary Versus Binary
, 1996
Abstract

Cited by 27 (2 self)
Table of contents:
1 Introduction
2 Related Work
3 Preliminaries
3.1 The Asynchronous Shared-Memory Model
3.2 Sensitivity
4 The Left/Right Algorithm
4.1 The General Scheme
4.2 The Left/Right Algorithm
4.2.1 Overview
4.2.2 The Code
4.2.3 Correctness of the Algorithm
4.2.4 Analysis of the Algorithm
4.3 Inherently Asymmetric Data Structures
5 The Decision Algorithm
5.1 Monotone Paths
5.1.1 One Phase ...
A performance evaluation of lock-free synchronization protocols
 In Proceedings of the 13th Annual ACM Symposium on Principles of Distributed Computing (PODC)
, 1994
Abstract

Cited by 25 (1 self)
In this paper, we investigate the practical performance of lock-free techniques that provide synchronization on shared-memory multiprocessors. Our goal is to provide a technique to allow designers of new protocols to quickly determine an algorithm's performance characteristics. We develop a simple analytical performance model based on the architectural observations that memory accesses are expensive, synchronization instructions are more expensive, and that optimistic synchronization policies result in wasted communication bandwidth which can slow the system as a whole. Using our model, we evaluate the performance of five existing lock-free synchronization protocols. We validate our analysis by comparing our results with simulations of a parallel machine. Given this analysis, we identify those protocols which show promise of good performance in practice. In addition, we note that no existing protocols provide insensitivity to common delays while still offering performance equivalent to locks. Accordingly, we introduce a protocol, based on a combination of existing lock-free techniques, which satisfies these criteria.