Results 1  10
of
28
A methodology for implementing highly concurrent data structures
 In 2nd Symp. Principles & Practice of Parallel Programming
, 1990
"... A con.curren.t object is a data structure shared by concurrent processes. Conventional techniques for implementing concurrent objects typically rely on criticaI sections: ensuring that only one process at a time can operate on the object. Nevertheless, critical sections are poorly suited for asynchr ..."
Abstract

Cited by 320 (12 self)
 Add to MetaCart
A con.curren.t object is a data structure shared by concurrent processes. Conventional techniques for implementing concurrent objects typically rely on criticaI sections: ensuring that only one process at a time can operate on the object. Nevertheless, critical sections are poorly suited for asynchronous systems: if one process is halted or delayed in a critical section, other, nonfaulty processes will be unable to progress. By contrast, a concurrent object implementation is nonblocking if it always guarantees that some process will complete an operation in a finite number of steps, and it is waitfree if it guarantees that each process will complete an operation in a finite number of steps. This paper proposes a new methodology for constructing nonblocking aud waitfree implementations of concurrent objects. The objectâ€™s representation and operations are written as st,ylized sequential programs, with no explicit synchronization. Each sequential operation is automatically transformed into a nonblocking or waitfree operation usiug novel synchronization and memory management algorithms. These algorithms are presented for a multiple instruction/multiple data (MIM D) architecture in which n processes communicate by applying read, write, and comparekYswa,p operations to a shared memory. 1
Multimodel parallel programming in Psyche
 In Proceedings of the 2nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
, 1990
"... Many different parallel programming models, including lightweight processes that communicate with shared memory and heavyweight processes that communicate with messages, have been used to implement parallel applications. Unfortunately, operating systems and languages designed for parallel programmin ..."
Abstract

Cited by 38 (12 self)
 Add to MetaCart
Many different parallel programming models, including lightweight processes that communicate with shared memory and heavyweight processes that communicate with messages, have been used to implement parallel applications. Unfortunately, operating systems and languages designed for parallel programming typically support only one model. Multimodel parallel programming is the simultaneous use of several different models, both across programs and within a single program. This paper describes multimodel parallel programming in the Psyche multiprocessor operating system. We explain why multimodel programming is desirable and present an operating system interface designed to support it. Through a series of three examples, we illustrate how the Psyche operating system supports different models of parallelism and how the different models are able to interact. 1.
An Efficient Algorithm for Concurrent Priority Queue Heaps
 Inf. Proc. Letters
, 1996
"... We present a new algorithm for concurrent access to arraybased priority queue heaps. Deletions proceed topdown as they do in a previous algorithm due to Rao and Kumar [6], but insertions proceed bottomup, and consecutive insertions use a bitreversal technique to scatter accesses across the fring ..."
Abstract

Cited by 22 (0 self)
 Add to MetaCart
We present a new algorithm for concurrent access to arraybased priority queue heaps. Deletions proceed topdown as they do in a previous algorithm due to Rao and Kumar [6], but insertions proceed bottomup, and consecutive insertions use a bitreversal technique to scatter accesses across the fringe of the tree, to reduce contention. Because insertions do not have to traverse the entire height of the tree (as they do in previous work), as many as O(M) operations can proceed in parallel, rather than O(log M) on a heap of size M . Experimental results on a Silicon Graphics Challenge multiprocessor demonstrate good overall performance for the new algorithm on small heaps, and significant performance improvements over known alternatives on large heaps with mixed insertion/deletion workloads. This work was supported in part by NSF grants nos. CDA8822724 and CCR9319445, and by ONR research grant no. N0001492J1801 (in conjunction with the DARPA Research in Information Science and Tech...
Lazy Queue: A new approach to implementing the Pendingevent Set
"... In discrete event simulation, very often the future event set is represented by a priority queue. The data structure used to implement the queue and the way operations are performed on it are often crucial to the execution time of a simulation. In this paper a new priority queue implementation strat ..."
Abstract

Cited by 16 (0 self)
 Add to MetaCart
In discrete event simulation, very often the future event set is represented by a priority queue. The data structure used to implement the queue and the way operations are performed on it are often crucial to the execution time of a simulation. In this paper a new priority queue implementation strategy, the Lazy Queue, is presented. It is tailored to handle operations on the pending event set efficiently. The Lazy Queue is a kind of multilist data structure that delays the sorting process until a point near the time where the elements are to be dequeued. In this way, the time needed to sort new elements in the queue is reduced. We have performed several experiments comparing queue access times with the access times of the implicit heap and the calendar queue. Our experimental results indicate that the Lazy Queue is superior to these priority queue implementations. Key words: Discrete Event Simulation, Priority Queue, Event List implementation, performance measurement. 1 Introduction...
Parallel Priority Queues
, 1991
"... This paper introduces the Parallel Priority Queue (PPQ) abstract data type. A PPQ stores a set of integervalued items and provides operations such as insertion of n new items or deletion of the n smallest ones. Algorithms for realizing PPQ operations on an nprocessor CREWPRAM are based on two new ..."
Abstract

Cited by 15 (1 self)
 Add to MetaCart
This paper introduces the Parallel Priority Queue (PPQ) abstract data type. A PPQ stores a set of integervalued items and provides operations such as insertion of n new items or deletion of the n smallest ones. Algorithms for realizing PPQ operations on an nprocessor CREWPRAM are based on two new data structures, the nBandwidthHeap (nH) and the nBandwidth LeftistHeap (nL), that are obtained as extensions of the well known sequential binaryheap and leftistheap, respectively. Using these structures, it is shown that insertion of n new items in a PPQ of m elements can be performed in parallel time O(h + log n), where h = log m n , while deletion of the n smallest items can be performed in time O(h + log log n). Keywords Data structures, parallel algorithms, analysis of algorithms, heaps, PRAM model. This work has been partly supported by the Ministero della Pubblica Istruzione of Italy and by the C.N.R. project "Sistemi Informatici e Calcolo Parallelo" y Istituto di Ela...
Concurrent Heaps on the BSP Model
, 1996
"... In this paper we present a new randomized selection algorithm on the BulkSynchronous Parallel (BSP) model of computation along with an application of this algorithm to dynamic data structures, namely Parallel Priority Queues (PPQs). We show that our algorithms improve previous results upon both the ..."
Abstract

Cited by 11 (11 self)
 Add to MetaCart
In this paper we present a new randomized selection algorithm on the BulkSynchronous Parallel (BSP) model of computation along with an application of this algorithm to dynamic data structures, namely Parallel Priority Queues (PPQs). We show that our algorithms improve previous results upon both the communication requirements and the amount of parallel slack required to achieve optimal performance. We also establish that optimality to within small multiplicative constant factors can be achieved for a wide range of parallel machines. While these algorithms are fairly simple themselves, descriptions of their performance in terms of the BSP parameters is somewhat involved. The main reward of quantifying these complications is that it allows transportable software to be written for parallel machines that fit the model. We also present experimental results for the selection algorithm that reinforce our claims.
Portable Distributed Priority Queues with MPI
, 1995
"... Part of this work has been presented in [17]. This paper analyzes the performances of portable distributed priority queues by examining the theoretical features required and by comparing various implementations. In spite of intrinsic bottlenecks and induced hotspots, we argue that tree topologies a ..."
Abstract

Cited by 9 (0 self)
 Add to MetaCart
Part of this work has been presented in [17]. This paper analyzes the performances of portable distributed priority queues by examining the theoretical features required and by comparing various implementations. In spite of intrinsic bottlenecks and induced hotspots, we argue that tree topologies are attractive to manage the natural centralized control required for the deletemin operation in order to detect the site which holds the item with the largest priority. We introduce an original perfect balancing to cope with the load variation due to the priority queue operations which continuously modify the overall number of items in the network. For comparison, we introduce the dheap and the binomial distributed priority queue. The purpose of this experiment is to convey, through executions on CrayT3D and MeikoT800, an understanding of the nature of the distributed priority queues, the range of their concurrency and a comparison of their efficiency to reduce requests latency. In particu...
Algorithms for Combinatorial Optimization in Real Time and their Automated Refinement by Genetic Programming
 University of Illinois at UrbanaChampaign
, 1994
"... The goal of this research is to develop a systematic, integrated method of designing efficient search algorithms that solve optimization problems in real time. Search algorithms studied in this thesis comprise metacontrol and primitive search. The class of optimization problems addressed are called ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
The goal of this research is to develop a systematic, integrated method of designing efficient search algorithms that solve optimization problems in real time. Search algorithms studied in this thesis comprise metacontrol and primitive search. The class of optimization problems addressed are called combinatorial optimization problems, examples of which include many NPhard scheduling and planning problems, and problems in operations research and artificialintelligence applications. The problems we have addressed have a welldefined problem objective and a finite set of welldefined problem constraints. In this research, we use statespace trees as problem representations. The approach we have undertaken in designing efficient search algorithms is an engineering approach and consists of two phases: (a) designing generic search algorithms, and (b) improving by geneticsbased machine learning methods parametric heuristics used in the search algorithms designed. Our approach is a systematic method that integrates domain knowledge, search techniques, and automated learning techniques for designing better search algorithms. Knowledge captured in designing one search algorithm can be carried over for designing new ones. iv ACKNOWLEDGEMENTS I express my sincere gratitude to all the people who have helped me in the course of my graduate study. My thesis advisor, Professor Benjamin W. Wah, was always available for discussions and encouraged me to explore new ideas. I am deeply grateful to the committee
Amortization Results for Chromatic Search Trees, with an Application to Priority Queues
, 1997
"... this paper, we prove that only an amortized constant amount of rebalancing is necessary after an update in a chromatic search tree. We also prove that the amount of rebalancing done at any particular level decreases exponentially, going from the leaves toward the root. These results imply that, in p ..."
Abstract

Cited by 7 (0 self)
 Add to MetaCart
this paper, we prove that only an amortized constant amount of rebalancing is necessary after an update in a chromatic search tree. We also prove that the amount of rebalancing done at any particular level decreases exponentially, going from the leaves toward the root. These results imply that, in principle, a linear number of processes can access the tree simultaneously. We have included one interesting application of chromatic trees. Based on these trees, a priority queue with possibilities for a greater degree of parallelism than previous proposals can be implemented. ] 1997 Academic Press 1.
WaitFree Algorithms for Heaps
, 1994
"... This paper examines algorithms to implement heaps on shared memory multiprocessors. A natural model for these machines is an asynchronous parallel machine, in which the processors are subject to arbitrary delays. On such machines, it is desirable for algorithms to be waitfree, meaning that each thr ..."
Abstract

Cited by 6 (0 self)
 Add to MetaCart
This paper examines algorithms to implement heaps on shared memory multiprocessors. A natural model for these machines is an asynchronous parallel machine, in which the processors are subject to arbitrary delays. On such machines, it is desirable for algorithms to be waitfree, meaning that each thread makes progress independent of the other threads executing on the machine. We present a waitfree algorithm to implement heaps. The algorithms are similar to the general approach given in [4], with optimizations that allow many threads to work on the heap simultaneously, while still guaranteeing a strong serializability property. 1 Introduction We are interested in designing efficient data structures and algorithms for shared memory multiprocessors. Processors on these machines may execute instructions at a varying rate (due to cache behavior, for example), and are subject to long delays (e.g. when swapped out by the scheduler, or after a page fault). Programs are executed by a collection...