Results 1–10 of 10
Randomized Priority Queues for Fast Parallel Access
 Journal of Parallel and Distributed Computing
, 1997
Abstract

Cited by 11 (1 self)
Applications like parallel search or discrete event simulation often assign priority or importance to pieces of work. An effective way to exploit this for parallelization is to use a priority queue data structure for scheduling the work; but a bottleneck-free implementation of parallel priority queue access by many processors is required to make this approach scalable. We present simple and portable randomized algorithms for parallel priority queues on distributed memory machines with fully distributed storage. Accessing O(n) out of m elements on an n-processor network with diameter d requires amortized time O(...) with high probability for many network types. On logarithmic-diameter networks, the algorithms are as fast as the best previously known EREW PRAM methods. Implementations demonstrate that the approach is already useful for medium-scale parallelism.
Very Fast Optimal Parallel Algorithms for Heap Construction
, 1994
Abstract

Cited by 5 (0 self)
We give two algorithms for permuting n items in an array into heap order on a CRCW PRAM. The first is deterministic and runs in O(log log n) time and performs O(n) operations. This runtime is the best possible for any comparison-based algorithm using n processors. The second is randomized and runs in O(log log log n) time with high probability, performing O(n) operations. No PRAM algorithm with o(log n) runtime was previously known for this problem. In order to obtain the deterministic result we study the parallel complexity of selecting the kth smallest of n elements on the CRCW PRAM, a problem that is of independent interest. We give an algorithm that is superior to existing ones when k is small compared to n. Consequently, we show that this problem can be solved in O(log log n + log k / log log n) time and O(n) operations for all 1 ≤ k ≤ n/2. A matching time lower bound is shown for all algorithms that use n or fewer processors to solve this problem. 1 Introduction A heap is a co...
Binary Tournaments and Priority Queues: PRAM and BSP
, 1997
Abstract

Cited by 3 (3 self)
We use an old idea of a tournament-based complete binary tree (CBT) to implement parallel priority queues (PQs). We show that this data structure enables a more efficient implementation of the extract-min and insert operations, in terms of communications and synchronizations among processors, than similar operations on the implicit heap. In most cases we improve the asymptotic bounds only by constant factors. However, some operations can be twice as fast using simpler parallel algorithms on the CBT. 1 Data structure and basic operations Every item stored in the PQ consists of a priority value and an identifier. We associate every leaf of the CBT with one item, and use the internal nodes to maintain a continuous binary tournament among the items. A match, at internal node n, consists of determining the item with greater priority (smaller numerical value) between the two children of n and writing the identifier of the winner in n. The tournament is made up of a set of matches played in ever...
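As an illustration of the match mechanism this abstract describes, here is a minimal sequential sketch of a tournament (winner) tree over a complete binary tree. It is only a single-processor illustration under our own naming, not the paper's parallel implementation: leaves hold items, and each internal node records the identifier of the match winner between its two children.

```python
import math

class TournamentTree:
    """Winner tree over a complete binary tree stored in an array.

    Leaves hold priority values; each internal node stores the identifier
    (here: the leaf index) of the winning child, i.e. the child whose
    subtree holds the smaller priority value.
    """
    INF = float("inf")

    def __init__(self, priorities):
        # Round the number of leaves up to a power of two, padding with +inf.
        n = 1 << max(1, math.ceil(math.log2(max(1, len(priorities)))))
        self.n = n
        self.prio = [self.INF] * n      # priority stored at each leaf
        self.tree = [0] * n             # internal nodes 1..n-1: winner's leaf index
        for i, p in enumerate(priorities):
            self.prio[i] = p
        # Play the tournament bottom-up: a match at node v compares the
        # winners of its two children and records the overall winner in v.
        for v in range(n - 1, 0, -1):
            self.tree[v] = self._match(v)

    def _winner_of(self, v):
        # Positions n..2n-1 are leaves; smaller positions are internal nodes.
        return v - self.n if v >= self.n else self.tree[v]

    def _match(self, v):
        left = self._winner_of(2 * v)
        right = self._winner_of(2 * v + 1)
        return left if self.prio[left] <= self.prio[right] else right

    def min(self):
        return self.prio[self.tree[1]]

    def extract_min(self):
        # Remove the overall winner's item, then replay only the matches
        # on its leaf-to-root path: O(log n) matches, untouched elsewhere.
        leaf = self.tree[1]
        result = self.prio[leaf]
        self.prio[leaf] = self.INF
        v = (leaf + self.n) // 2
        while v >= 1:
            self.tree[v] = self._match(v)
            v //= 2
        return result
```

Note that extract-min replays matches only along one leaf-to-root path; that locality is what makes the structure attractive for the parallel implementations discussed here.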
On the Pending Event Set and Binary Tournaments
Abstract

Cited by 3 (3 self)
In this paper we study the performance of the very first tournament-based complete binary tree. We focus on discrete-event simulation, and our results show that this unknown predecessor of heaps can be a more efficient alternative to the fastest pending event set implementations reported in the literature. We also extend the idea of binary tournaments to a (2, L)-tournament structure which exhibits the property of delaying the processing of events with larger timestamps, while keeping theoretical performance bounds similar to those of the native (2, 1)-structure, or CBT. This property can certainly be useful in systems where many pending events are expected to be deleted or rescheduled during the simulation. 2 Tournament trees
Discrete-Event Simulation on the Bulk-Synchronous Parallel Model
, 1998
Abstract

Cited by 2 (0 self)
The bulk-synchronous parallel (BSP) model of computing has been proposed to enable the development of portable software which achieves scalable performance across diverse parallel architectures. A number of applications in computing science have been demonstrated to be efficiently supported by the BSP model in practice. In this
Priority Queue Operations on EREW PRAM
, 1997
Abstract

Cited by 1 (1 self)
Using EREW PRAM algorithms on a tournament-based complete binary tree, we implement the insert and extract-min operations with p = log N processors at costs O(1) and O(log log N) respectively. Previous solutions [4, 7] under the PRAM model and identical assumptions attain O(log log N) cost for both operations. We also improve the asymptotic bound for extract-min by constant factors, since we reduce the use of communication-demanding primitives. The tournament tree enables the design of parallel algorithms that are noticeably simple. 1 Tournament trees Our data structure is a complete binary tree (CBT). Every item stored in the tree consists of a priority value and an identifier. We associate every leaf of the CBT with one item, and use the internal nodes to maintain a continuous binary tournament among the items. A match, at internal node n, consists of determining the item with higher priority (smaller numerical value) between the two children of n and writing the identifier of ...
Comparator Networks for Binary Heap Construction
 In Proc. 6th Scandinavian Workshop on Algorithm Theory
, 1998
Abstract

Cited by 1 (0 self)
Comparator networks for constructing binary heaps of size n are presented which have size O(n log log n) and depth O(log n). A lower bound of n log log n − O(n) for the size of any heap construction network is also proven, implying that the networks presented are within a constant factor of optimal. We give a tight relation between the leading constants in the size of selection networks and in the size of heap construction networks.
A Survey on Parallel Algorithms for Priority Queue Operations
Abstract
The Parallel Priority Queue (PPQ) data structure supports parallel operations for manipulating data items with keys, such as inserting n new items, deleting the n items with the smallest keys, creating a new PPQ that contains a set of items, and melding two PPQs into one. In this article, we present some recent research on PPQs that support simultaneous operations on the k smallest elements, k being a constant.
Lock Bypassing: An Efficient Algorithm For Concurrently Accessing Priority Heaps
, 1998
Abstract
Yong Yan and Xiaodong Zhang 1. INTRODUCTION The heap is an important data structure used as a priority queue in many application areas, such as operating system scheduling, discrete event simulation, graph search, and the branch-and-bound method on shared-memory multiprocessors [Biswas and Browne 1993; Cormen et al. 1990; Jones 1986; Olariu and Wen 1991; Ranade et al. 1994]. In these applications, each processor repeatedly performs an "access-think" cycle. Every processor executes its current event or subproblem, then accesses the shared priority heap to insert new events or subproblems generated by the processor. The...
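The "access-think" cycle this introduction describes can be sketched against the simplest baseline: a shared heap guarded by one global lock, the serialization bottleneck that lock-bypassing techniques aim to avoid. The names below are our own illustration, not the paper's code, and this sketch implements only the coarse-grained baseline, not the lock-bypassing algorithm itself.

```python
import heapq
import threading

class LockedHeap:
    """Coarse-grained shared priority heap: one global lock per access.

    Every insert and extract-min serializes on the same lock, so
    contention grows with the number of processors -- the motivating
    bottleneck for finer-grained concurrent heap algorithms.
    """
    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()

    def insert(self, priority, item):
        with self._lock:
            heapq.heappush(self._heap, (priority, item))

    def extract_min(self):
        with self._lock:
            return heapq.heappop(self._heap) if self._heap else None

def worker(heap, log):
    # Access-think cycle: extract the highest-priority (smallest-key)
    # event, "think" (process it, possibly inserting new events), repeat
    # until the shared heap is drained.
    while (event := heap.extract_min()) is not None:
        log.append(event)
```

With several threads running `worker` against one `LockedHeap`, throughput is bounded by the single lock regardless of processor count, which is precisely the access pattern the concurrent-heap literature above sets out to parallelize.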