Results 1-10 of 54
A Methodology for Implementing Highly Concurrent Data Objects
, 1993
Abstract

Cited by 357 (11 self)
A concurrent object is a data structure shared by concurrent processes. Conventional techniques for implementing concurrent objects typically rely on critical sections: ensuring that only one process at a time can operate on the object. Nevertheless, critical sections are poorly suited for asynchronous systems: if one process is halted or delayed in a critical section, other, non-faulty processes will be unable to progress. By contrast, a concurrent object implementation is lock-free if it always guarantees that some process will complete an operation in a finite number of steps, and it is wait-free if it guarantees that each process will complete an operation in a finite number of steps. This paper proposes a new methodology for constructing lock-free and wait-free implementations of concurrent objects. The object’s representation and operations are written as stylized sequential programs, with no explicit synchronization. Each sequential operation is automatically transformed into a lock-free or wait-free operation using novel synchronization and memory management algorithms. These algorithms are presented for a multiple instruction/multiple data (MIMD) architecture in which n processes communicate by applying atomic read, write, load_linked, and store_conditional operations to a shared memory.
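The copy-and-install transformation this abstract describes can be sketched with a single-word compare-and-swap standing in for load_linked/store_conditional (an assumption made here for brevity; the paper's protocol also includes memory-management algorithms that this sketch omits). The names `Counter`, `shared`, and `increment` are illustrative, not the paper's:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sequential object: a counter with one operation.
struct Counter { int value; };

std::atomic<Counter*> shared{new Counter{0}};

// Lock-free transformation sketch: copy the object, apply the sequential
// operation to the private copy, then install the new version with one CAS.
// If the CAS fails, some other process made progress, so we retry: this
// gives lock freedom, not wait freedom.
void increment() {
    for (;;) {
        Counter* old_ver = shared.load();
        Counter* new_ver = new Counter{old_ver->value + 1};  // private copy + op
        if (shared.compare_exchange_weak(old_ver, new_ver)) {
            // NOTE: old_ver is leaked here; safe reclamation is exactly what
            // the paper's memory-management algorithms address.
            return;
        }
        delete new_ver;  // another process won; discard our copy and retry
    }
}
```

Wait-free versions additionally require helping (slow processes' announced operations are completed by others), which this sketch does not show.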
Lock-Free Linked Lists Using Compare-and-Swap
 In Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing
, 1995
Abstract

Cited by 111 (1 self)
Lock-free data structures implement concurrent objects without the use of mutual exclusion. This approach can avoid performance problems due to unpredictable delays while processes are within critical sections. Although universal methods are known that give lock-free data structures for any abstract data type, the overhead of these methods makes them inefficient when compared to conventional techniques using mutual exclusion, such as spin locks. We give lock-free data structures and algorithms for implementing a shared singly-linked list, allowing concurrent traversal, insertion, and deletion by any number of processes. We also show how the basic data structure can be used as a building block for other lock-free data structures. Our algorithms use the single-word Compare-and-Swap synchronization primitive to implement the linked list directly, avoiding the overhead of universal methods, and are thus a practical alternative to using spin locks.
1 Introduction
A concurrent object is an ...
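The insertion half of such a list can be sketched with a single-word CAS as follows (a simplified sketch: the paper's full algorithm also supports concurrent deletion via auxiliary nodes, which this omits, and all names here are illustrative):

```cpp
#include <atomic>
#include <cassert>

// Sorted singly-linked list supporting lock-free concurrent insertion only.
struct Node {
    int key;
    std::atomic<Node*> next;
    Node(int k, Node* n) : key(k), next(n) {}
};

std::atomic<Node*> head{nullptr};

void insert(int key) {
    Node* node = new Node(key, nullptr);
    for (;;) {
        // Find pred/succ such that pred's key < key <= succ's key.
        std::atomic<Node*>* pred = &head;
        Node* succ = pred->load();
        while (succ != nullptr && succ->key < key) {
            pred = &succ->next;
            succ = pred->load();
        }
        node->next.store(succ);
        // Splice in atomically; if pred's link changed meanwhile, the CAS
        // fails and we restart the traversal.
        if (pred->compare_exchange_weak(succ, node)) return;
    }
}
```

Deletion is the hard part: a node being unlinked can have a new node CASed onto it concurrently, which is why the paper introduces extra machinery rather than a symmetric CAS on the predecessor alone.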
Non-Blocking Algorithms and Preemption-Safe Locking on Multiprogrammed Shared Memory Multiprocessors
 JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1998
Abstract

Cited by 99 (1 self)
Most multiprocessors are multiprogrammed in order to achieve acceptable response time and to increase their utilization. Unfortunately, inopportune preemption may significantly degrade the performance of synchronized parallel applications. To address this problem, researchers have developed two principal strategies for concurrent, atomic update of shared data structures: (1) preemption-safe locking and (2) non-blocking (lock-free) algorithms. Preemption-safe locking requires kernel support. Non-blocking algorithms generally require a universal atomic primitive such as compare-and-swap or load-linked/store-conditional, and are widely regarded as inefficient. We evaluate the performance of preemption-safe lock-based and non-blocking implementations of important data structures (queues, stacks, heaps, and counters), including non-blocking and lock-based queue algorithms of our own, in microbenchmarks and real applications on a 12-processor SGI Challenge multiprocessor. Our results indicate that our non-blocking queue consistently outperforms the best known alternatives, and that data-structure-specific non-blocking algorithms, which exist for queues, stacks, and counters, can work extremely well. Not only do they outperform preemption-safe lock-based algorithms on multiprogrammed machines, they also outperform ordinary locks on dedicated machines. At the same time, since general-purpose non-blocking techniques do not yet appear to be practical, preemption-safe locks remain the preferred alternative for complex data structures: they outperform ...
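One of the simplest data-structure-specific non-blocking algorithms of the kind evaluated above is a Treiber-style lock-free stack (a standard textbook sketch, not the paper's own code; it deliberately ignores the ABA problem and safe memory reclamation for brevity):

```cpp
#include <atomic>
#include <cassert>
#include <optional>

struct StackNode {
    int value;
    StackNode* next;
};

std::atomic<StackNode*> top{nullptr};

// Push: link the new node in with one CAS on the top-of-stack pointer.
// On failure, compare_exchange_weak refreshes n->next with the current
// top, so the loop body is just the retry.
void push(int v) {
    StackNode* n = new StackNode{v, top.load()};
    while (!top.compare_exchange_weak(n->next, n)) {}
}

// Pop: unlink the current top with one CAS. A preempted process holds no
// lock, so other processes are never blocked behind it.
std::optional<int> pop() {
    StackNode* n = top.load();
    while (n != nullptr && !top.compare_exchange_weak(n, n->next)) {}
    if (n == nullptr) return std::nullopt;
    return n->value;  // n is leaked; reclamation is a separate problem
}
```

The appeal on multiprogrammed machines is visible in the code: an inopportunely preempted process leaves no critical section locked, so the structure stays available to everyone else.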
Implementing Lock-Free Queues
 In Proceedings of the Seventh International Conference on Parallel and Distributed Computing Systems, Las Vegas, NV
, 1994
Abstract

Cited by 70 (1 self)
We study practical techniques for implementing the FIFO queue abstract data type using lock-free data structures, which synchronize the operations of concurrent processes without the use of mutual exclusion. Two new algorithms based on linked lists and arrays are presented. We also propose a new solution to the ABA problem associated with the Compare&Swap instruction. The performance of our linked-list algorithm is compared with several other lock-free queue implementations, as well as more conventional locking techniques.
1 Introduction
Concurrent access to a data structure shared among several processes must be synchronized in order to avoid conflicting updates. Conventionally this is done using mutual exclusion; processes modify the data structure only inside a critical section of code, within which the process is guaranteed exclusive access to the data structure. Typically, on a multiprocessor, critical sections are guarded with a spin lock. We will refer to all methods using mutual ...
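The ABA problem mentioned above arises when a location changes from value A to B and back to A between a process's read and its Compare&Swap: the CAS succeeds even though the location was modified. A common family of remedies (and the spirit of the solution the paper proposes; the exact scheme below, a modification counter packed next to the value, is an illustrative assumption, with real queue implementations typically tagging pointers) makes the stale CAS fail:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Pack a 32-bit value with a 32-bit modification counter into one 64-bit
// word. A value that returns to 'A' still fails a stale CAS, because the
// counter has advanced in the meantime.
std::atomic<uint64_t> cell{0};

uint64_t pack(uint32_t value, uint32_t tag) {
    return (uint64_t(tag) << 32) | value;
}
uint32_t value_of(uint64_t w) { return uint32_t(w); }
uint32_t tag_of(uint64_t w)   { return uint32_t(w >> 32); }

// Store a new value, always bumping the counter.
void store_tagged(uint32_t v) {
    uint64_t old = cell.load();
    while (!cell.compare_exchange_weak(old, pack(v, tag_of(old) + 1))) {}
}

// The CAS a stale reader would attempt: it fails whenever the tag has
// moved on, even if the visible value matches the expectation.
bool cas_tagged(uint64_t expected_word, uint32_t desired) {
    return cell.compare_exchange_strong(
        expected_word, pack(desired, tag_of(expected_word) + 1));
}
```

The cost is that value and counter must fit in one atomically comparable word, which is why double-width CAS support matters for such schemes.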
A method for implementing lock-free shared data structures
 In Proceedings of the 5th ACM Symposium on Parallel Algorithms and Architectures
, 1993
"... barnes@mpi-sb.mpg.de ..."
Parallel Algorithms with Processor Failures and Delays
, 1995
Abstract

Cited by 54 (12 self)
We study efficient deterministic parallel algorithms on two models: restartable fail-stop CRCW PRAMs and asynchronous PRAMs. In the first model, synchronous processors are subject to arbitrary stop failures and restarts determined by an online adversary and involving loss of private but not shared memory; the complexity measures are completed work (where processors are charged for completed fixed-size update cycles) and overhead ratio (completed work amortized over necessary work and failures). In the second model, the result of the computation is a serialization of the actions of the processors determined by an online adversary; the complexity measure is total work (the number of steps taken by all processors). Despite their differences, the two models share key algorithmic techniques. We present new algorithms for the Write-All problem (in which P processors write ones into an array of size N) for the two models. These algorithms can be used to implement a simulation strategy for any N ...
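To make the Write-All problem concrete, here is a deliberately naive sketch (not one of the paper's algorithms): each of P asynchronous processes simply sweeps the whole array and fills in any remaining zeros. It is correct under arbitrary delays, but its worst-case total work is O(N*P), exactly the bound the paper's algorithms improve on:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// The shared array; cells start at 0 and must all end at 1.
std::vector<std::atomic<int>> make_array(std::size_t n) {
    return std::vector<std::atomic<int>>(n);
}

// Naive Write-All: a process may be delayed arbitrarily between steps;
// correctness only requires that once every process has finished its
// sweep, every cell holds 1. Redundant writes of 1 are harmless.
void write_all(std::vector<std::atomic<int>>& a) {
    for (auto& cell : a)
        if (cell.load() == 0) cell.store(1);
}
```

Efficient solutions instead coordinate through shared progress structures so that completed work stays close to N plus a small overhead per failure, rather than every survivor re-scanning everything.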
Hundreds of Impossibility Results for Distributed Computing
 Distributed Computing
, 2003
Abstract

Cited by 52 (5 self)
We survey results from distributed computing that show tasks to be impossible, either outright or within given resource bounds, in various models. The parameters of the models considered include synchrony, fault tolerance, different communication media, and randomization. The resource bounds refer to time, space, and message complexity. These results are useful in understanding the inherent difficulty of individual problems and in studying the power of different models of distributed computing.
Parallelizing union-find in Constraint Handling Rules using confluence analysis
 LOGIC PROGRAMMING: 21ST INTERNATIONAL CONFERENCE, ICLP 2005, VOLUME 3668 OF LECTURE NOTES IN COMPUTER SCIENCE
, 2005
Abstract

Cited by 33 (15 self)
Constraint Handling Rules is a logical concurrent committed-choice rule-based language. Recently it was shown that the classical union-find algorithm can be implemented in CHR with optimal time complexity. Here we investigate whether a parallel implementation of this algorithm is also possible in CHR. The problem is hard for several reasons: up to now, no parallel computation model for CHR has been defined; Tarjan’s optimal union-find is known to be hard to parallelize; and the parallel code should be as close as possible to the sequential one. It turns out that confluence analysis of the sequential implementation gives almost all the information needed to parallelize the union-find algorithm under a rather general parallel computation model for CHR.
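For reference, the sequential union-find being parallelized can be sketched as follows (union by size with full path compression, which achieves the same near-constant amortized bounds as Tarjan's union by rank; this is the textbook algorithm in plain code, not the paper's CHR program, and all names are illustrative):

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct UnionFind {
    std::vector<int> parent, size;

    explicit UnionFind(int n) : parent(n), size(n, 1) {
        for (int i = 0; i < n; ++i) parent[i] = i;  // each node is its own root
    }

    // find with path compression: every node on the traversed path is
    // re-pointed directly at the root. This mutation of shared state is
    // precisely what makes the algorithm hard to parallelize.
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    // union by size: attach the smaller tree under the larger root.
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (size[a] < size[b]) std::swap(a, b);
        parent[b] = a;
        size[a] += size[b];
    }
};
```

Confluence of the CHR rules matters because, in a parallel setting, concurrent find and unite steps may interleave; confluence guarantees that the different interleavings still converge to equivalent final states.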
A Theory of Competitive Analysis for Distributed Algorithms
, 1994
Abstract

Cited by 32 (5 self)
We introduce a theory of competitive analysis for distributed algorithms. The first steps in this direction were made in the seminal papers of Bartal, Fiat, and Rabani [17], and of Awerbuch, Kutten, and Peleg [15], in the context of data management and job scheduling. In these papers, as well as in other subsequent work [14, 4, 18], the cost of a distributed algorithm is compared to the cost of an optimal global-control algorithm. Here we introduce a more refined notion of competitiveness for distributed algorithms, one that reflects the performance of distributed algorithms more accurately. In particular, our theory allows one to compare the cost of a distributed online algorithm to the cost of an optimal distributed algorithm. We demonstrate our method by studying the ...
Universal Operations: Unary Versus Binary
, 1996
Abstract

Cited by 30 (2 self)
Contents:
1 Introduction
2 Related Work
3 Preliminaries
3.1 The Asynchronous Shared-Memory Model
3.2 Sensitivity
4 The Left/Right Algorithm
4.1 The General Scheme
4.2 The Left/Right Algorithm
4.2.1 Overview
4.2.2 The Code
4.2.3 Correctness of the Algorithm
4.2.4 Analysis of the Algorithm
4.3 Inherently Asymmetric Data Structures
5 The Decision Algorithm
5.1 Monotone Paths
5.1.1 One Phase ...