Results 1 - 9 of 9
Randomized Priority Queues for Fast Parallel Access
Journal of Parallel and Distributed Computing, 1997
Cited by 8 (1 self)
Applications like parallel search or discrete event simulation often assign priority or importance to pieces of work. An effective way to exploit this for parallelization is to use a priority queue data structure for scheduling the work, but a bottleneck-free implementation of parallel priority queue access by many processors is required to make this approach scalable. We present simple and portable randomized algorithms for parallel priority queues on distributed memory machines with fully distributed storage. Accessing O(n) out of m elements on an n-processor network with diameter d requires amortized time O with high probability for many network types. On logarithmic-diameter networks, the algorithms are as fast as the best previously known EREW PRAM methods. Implementations demonstrate that the approach is already useful for medium-scale parallelism.
Very Fast Optimal Parallel Algorithms for Heap Construction
1994
Cited by 5 (0 self)
We give two algorithms for permuting n items in an array into heap order on a CRCW PRAM. The first is deterministic and runs in O(log log n) time and performs O(n) operations. This run-time is the best possible for any comparison-based algorithm using n processors. The second is randomized and runs in O(log log log n) time with high probability, performing O(n) operations. No PRAM algorithm with o(log n) run-time was previously known for this problem. In order to obtain the deterministic result we study the parallel complexity of selecting the kth smallest of n elements on the CRCW PRAM, a problem that is of independent interest. We give an algorithm that is superior to existing ones when k is small compared to n. Consequently, we show that this problem can be solved in O(log log n + log k / log log n) time and O(n) operations for all 1 ≤ k ≤ n/2. A matching time lower bound is shown for all algorithms that use n or fewer processors to solve this problem. 1 Introduction A heap is a co...
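To make "heap order" and the O(n) operation count concrete, here is a minimal sequential sketch of bottom-up heap construction. It is only a single-processor baseline and not either of the CRCW PRAM algorithms from the abstract; the function names `sift_down`, `build_heap`, and `is_min_heap` are illustrative choices, not taken from the paper.

```python
def sift_down(a, i, n):
    """Move a[i] down until neither child (within a[:n]) is smaller."""
    while True:
        smallest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest


def build_heap(a):
    """Permute list a in place into min-heap order (sequential baseline)."""
    n = len(a)
    # Sift down every internal node, starting from the last one.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    return a


def is_min_heap(a):
    """Check heap order: every parent is <= each of its children."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))
```

For example, `build_heap([9, 4, 7, 1, 8, 2, 6])` rearranges the list so that `is_min_heap` holds and the minimum sits at index 0. The sequential version performs O(n) comparisons in total, which is the work bound the parallel algorithms above match.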
On the Pending Event Set and Binary Tournaments
Cited by 3 (3 self)
In this paper we study the performance of the very first tournament-based complete binary tree. We focus on discrete-event simulation, and our results show that this unknown predecessor of heaps can be a more efficient alternative to the fastest pending event set implementations reported in the literature. We also extend the idea of binary tournaments to a (2, L)-tournament structure, which delays the processing of events with larger timestamps while keeping theoretical performance bounds similar to those of the native (2, 1)-structure, or CBT. This property can certainly be useful in systems where many pending events are expected to be deleted or rescheduled during the simulation. 2 Tournament trees
Binary Tournaments and Priority Queues: PRAM and BSP
1997
Cited by 3 (3 self)
We use the old idea of a tournament-based complete binary tree (CBT) to implement parallel priority queues (PQs). We show that this data structure enables a more efficient implementation of the operations extract-min and insert, in terms of communications and synchronizations among processors, than similar operations on the implicit heap. In most cases we improve only the constant factors of the asymptotic bounds. However, some operations can be twice as fast using simpler parallel algorithms on the CBT. 1 Data structure and basic operations Every item stored in the PQ consists of a priority value and an identifier. We associate every leaf of the CBT with one item, and use the internal nodes to maintain a continuous binary tournament among the items. A match, at internal node n, consists of determining the item with greater priority (lower numerical value) between the two children of n and writing the identifier of the winner in n. The tournament is made up of a set of matches played in ever...
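The leaf-and-match description above can be illustrated with a small sequential sketch. It is not the parallel PRAM or BSP algorithm from the paper: it assumes a power-of-two number of leaves, uses the leaf index as the item identifier, and marks empty leaves with infinity. The class name `TournamentTree` and its methods are hypothetical.

```python
import math


class TournamentTree:
    """Sequential sketch of a tournament-based complete binary tree (CBT)."""

    def __init__(self, capacity):
        # Assumption: round the leaf count up to a power of two (at least 2).
        self.leaves = 2
        while self.leaves < capacity:
            self.leaves *= 2
        # priority[i] is the priority stored at leaf i; inf marks an empty leaf.
        self.priority = [math.inf] * self.leaves
        # winner[v] is the leaf identifier that won the match at internal
        # node v, with nodes numbered 1..leaves-1 (index 0 is unused).
        self.winner = [0] * self.leaves
        self.free = list(range(self.leaves - 1, -1, -1))
        for v in range(self.leaves - 1, 0, -1):
            self._play(v)

    def _candidate(self, node):
        # A child is either a leaf (its own identifier) or an internal node
        # (the identifier of the winner recorded there).
        return node - self.leaves if node >= self.leaves else self.winner[node]

    def _play(self, v):
        # One match: the child with the lower numerical value wins.
        a = self._candidate(2 * v)
        b = self._candidate(2 * v + 1)
        self.winner[v] = a if self.priority[a] <= self.priority[b] else b

    def _replay(self, leaf):
        # Replay the matches on the path from the leaf up to the root.
        v = (leaf + self.leaves) // 2
        while v >= 1:
            self._play(v)
            v //= 2

    def insert(self, priority):
        if not self.free:
            raise IndexError("tournament tree is full")
        leaf = self.free.pop()
        self.priority[leaf] = priority
        self._replay(leaf)
        return leaf

    def extract_min(self):
        leaf = self.winner[1]  # The root holds the overall winner.
        if self.priority[leaf] == math.inf:
            raise IndexError("tournament tree is empty")
        priority = self.priority[leaf]
        self.priority[leaf] = math.inf
        self.free.append(leaf)
        self._replay(leaf)
        return priority, leaf
```

Both `insert` and `extract_min` touch only the log-many matches on one leaf-to-root path, which is the property the parallel versions exploit.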
Discrete-Event Simulation on the Bulk-Synchronous Parallel Model
1998
Cited by 2 (0 self)
The bulk-synchronous parallel (BSP) model of computing has been proposed to enable the development of portable software which achieves scalable performance across diverse parallel architectures. A number of applications of computing science have been demonstrated to be efficiently supported by the BSP model in practice.
Comparator Networks for Binary Heap Construction
In Proc. 6th Scandinavian Workshop on Algorithm Theory, 1998
Cited by 1 (0 self)
Comparator networks for constructing binary heaps of size n are presented which have size O(n log log n) and depth O(log n). A lower bound of n log log n - O(n) for the size of any heap construction network is also proven, implying that the networks presented are within a constant factor of optimal. We give a tight relation between the leading constants in the size of selection networks and in the size of heap construction networks.
Priority Queue Operations On EREW-PRAM
1997
Cited by 1 (1 self)
Using EREW-PRAM algorithms on a tournament-based complete binary tree, we implement the insert and extract-min operations with p = log N processors at costs O(1) and O(log log N) respectively. Previous solutions [4, 7] under the PRAM model and identical assumptions attain O(log log N) cost for both operations. We also improve the constant factors of the asymptotic bound for extract-min, since we reduce its use of communication-demanding primitives. The tournament tree enables the design of parallel algorithms that are noticeably simple. 1 Tournament trees Our data structure is a complete binary tree (CBT). Every item stored in the tree consists of a priority value and an identifier. We associate every leaf of the CBT with one item, and use the internal nodes to maintain a continuous binary tournament among the items. A match, at internal node n, consists of determining the item with higher priority (lower numerical value) between the two children of n and writing the identifier of ...
Lock Bypassing: An Efficient Algorithm For Concurrently Accessing Priority Heaps
1998
Yong Yan and Xiaodong Zhang. 1. Introduction. The heap is an important data structure used as a priority queue in many application areas, such as operating system scheduling, discrete event simulation, graph search, and the branch-and-bound method on shared-memory multiprocessors [Biswas and Browne 1993; Cormen et al. 1990; Jones 1986; Olariu and Wen 1991; Ranade et al. 1994]. In these applications, each processor repeatedly performs an "access-think" cycle. Every processor executes its current event or subproblem, then accesses the shared priority heap to insert new events or subproblems generated by the processor. The...
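The "access-think" cycle can be sketched with a shared heap guarded by a single lock. This is a deliberately naive baseline, not the lock-bypassing algorithm of the paper (whose details are not given in this excerpt); the names `LockedHeap`, `worker`, and `run`, and the rule that each subproblem spawns one child with priority 10 higher, are assumptions made only for illustration.

```python
import heapq
import threading


class LockedHeap:
    """Shared priority heap serialized by one global lock (naive baseline)."""

    def __init__(self, items=()):
        self._heap = list(items)
        heapq.heapify(self._heap)
        self._lock = threading.Lock()

    def insert(self, item):
        with self._lock:
            heapq.heappush(self._heap, item)

    def extract_min(self):
        # Returns None when the heap is currently empty.
        with self._lock:
            return heapq.heappop(self._heap) if self._heap else None


def worker(heap, processed, limit):
    while True:
        item = heap.extract_min()      # "access": take the best subproblem
        if item is None:
            return
        processed.append(item)         # "think": stand-in for real work
        if item + 10 < limit:
            heap.insert(item + 10)     # "access": insert a generated subproblem


def run(num_workers=4, limit=40):
    heap = LockedHeap(range(10))
    processed = []
    threads = [
        threading.Thread(target=worker, args=(heap, processed, limit))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(processed)
```

Every access in this sketch holds the same lock, so processors queue up behind each other at the heap; that serialization is the bottleneck concurrent heap algorithms aim to reduce.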

