Results 1 - 10 of 15
Communication Efficient Data Structures on the BSP model with Applications
In Proceedings of Euro-Par'96, 1996
Cited by 18 (8 self)
The implementation of data structures on distributed memory models such as the Bulk-Synchronous Parallel (BSP) model, rather than shared memory ones such as the Parallel Random Access Machine (PRAM), offers a serious challenge. In this work we undertake the architecture-independent study of the computation and communication requirements of searching ordered h-level graphs, which include many of the standard data structures. We propose multiway search as a general tool for the design, analysis and implementation of BSP algorithms. This technique allows elegant high-level design and analysis of algorithms, using data structures similar to those of sequential models. Applications to computational geometry and sorting are also presented. In particular, our new randomized sorting algorithm improves upon previously known BSP randomized sorting algorithms in the amount of parallel slackness required to achieve optimality. Moreover, our methods are within a 1 + o(1) multiplicative factor of the ...
A Randomized Sorting Algorithm on the BSP model
In Proceedings of IPPS, 1997
Cited by 15 (5 self)
We present a new randomized sorting algorithm on the Bulk-Synchronous Parallel (BSP) model. The algorithm reduces the parallel slack that previous algorithms require to achieve optimality. Tighter probabilistic bounds are also established. It uses sample sorting and utilizes recently introduced search algorithms for a class of data structures on the BSP model. Moreover, our methods are within a 1 + o(1) multiplicative factor of the respective sequential methods in terms of speedup for a wide range of the BSP parameters.
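The sample-sorting idea mentioned in this abstract can be illustrated with a minimal sequential sketch. It is a generic sample sort, not the paper's algorithm: the p "processors" are simulated as Python lists, the `oversample` factor and the `seed` argument are assumptions made for illustration, and a plain binary search (`bisect`) stands in for the BSP search structures the paper refers to.

```python
import bisect
import random


def sample_sort(keys, p, oversample=8, seed=0):
    """Sort `keys` by simulating a p-processor sample sort sequentially."""
    keys = list(keys)
    if p <= 1 or len(keys) <= p:
        return sorted(keys)

    rng = random.Random(seed)

    # Step 1: draw a random sample of the input and sort it.
    sample_size = min(len(keys), p * oversample)
    sample = sorted(rng.sample(keys, sample_size))

    # Step 2: pick p - 1 evenly spaced splitters from the sorted sample.
    splitters = [sample[(i * sample_size) // p] for i in range(1, p)]

    # Step 3: route every key to the bucket (simulated processor)
    # whose splitter interval contains it.
    buckets = [[] for _ in range(p)]
    for key in keys:
        buckets[bisect.bisect_right(splitters, key)].append(key)

    # Step 4: each simulated processor sorts its bucket locally;
    # concatenating the buckets in order gives the sorted output.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

On a real BSP machine, steps 1 to 3 would involve communication supersteps and step 4 would run in parallel; a larger sample generally gives more evenly sized buckets at the price of more work on the sample.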
Concurrent Heaps on the BSP Model
1996
Cited by 11 (11 self)
In this paper we present a new randomized selection algorithm on the Bulk-Synchronous Parallel (BSP) model of computation along with an application of this algorithm to dynamic data structures, namely Parallel Priority Queues (PPQs). We show that our algorithms improve upon previous results in both the communication requirements and the amount of parallel slack required to achieve optimal performance. We also establish that optimality to within small multiplicative constant factors can be achieved for a wide range of parallel machines. While these algorithms are fairly simple themselves, descriptions of their performance in terms of the BSP parameters are somewhat involved. The main reward of quantifying these complications is that it allows transportable software to be written for parallel machines that fit the model. We also present experimental results for the selection algorithm that reinforce our claims.
Randomized Priority Queues for Fast Parallel Access
Journal of Parallel and Distributed Computing, 1997
Cited by 11 (1 self)
Applications like parallel search or discrete event simulation often assign priority or importance to pieces of work. An effective way to exploit this for parallelization is to use a priority queue data structure for scheduling the work; but a bottleneck-free implementation of parallel priority queue access by many processors is required to make this approach scalable. We present simple and portable randomized algorithms for parallel priority queues on distributed memory machines with fully distributed storage. Accessing O(n) out of m elements on an n-processor network with diameter d requires amortized time O with high probability for many network types. On logarithmic diameter networks, the algorithms are as fast as the best previously known EREW-PRAM methods. Implementations demonstrate that the approach is already useful for medium-scale parallelism.
Two Topics in Applied Algorithmics
1998
Cited by 6 (0 self)
This thesis examines two largely unrelated problems in applied algorithmics, motivated by the search for efficient geometric algorithms. In the first part of the thesis, we consider the problem of finding efficient parallel algorithms for heterogeneous parallel computers, i.e., parallel computers in which different processors have different computational potential. To this end, we define a formal computational model for heterogeneous systems and develop algorithms for commonly used communication operations. As a result, many existing parallel algorithms which use these communication operations can be adapted to our model with little or no modification. In the second part of the thesis we consider the problem of geometric models which allow for varying levels of detail. To this end, we extend the progressive mesh representation introduced by Hoppe. The main technical contribution of this part is an efficient scheme for refining only selected regions of a progressive mesh. Using ...
Parallel Priority Queue and List Contraction: The BSP Approach
In Proc. Euro-Par 97. LNCS, 1997
Cited by 5 (0 self)
In this paper we present efficient and practical extensions of the randomized Parallel Priority Queue (PPQ) algorithms of Ranade et al., and efficient randomized and deterministic algorithms for the problem of list contraction on the Bulk-Synchronous Parallel (BSP) model. We also present an experimental study of their performance. We show that our algorithms are communication efficient and achieve small multiplicative constant factors for a wide range of parallel machines. 1 Introduction We present an architecture-independent study of the computation and communication requirements of an efficient Parallel Priority Queue (PPQ) implementation and list contraction algorithms along with an experimental study. The computational model adopted is the Bulk-Synchronous Parallel (BSP) model, proposed by L. G. Valiant [20], which deals explicitly with the notion of communication and synchronization among computational threads. A detailed discussion of the BSP model appears in [20]. The first a...
Coarse-Grained Parallel Computing on Heterogeneous Systems
1998
Cited by 4 (0 self)
We consider the problem of finding efficient parallel algorithms for heterogeneous parallel computers, i.e., parallel computers in which different processors have different computational potential. To this end, we define a formal computational model for heterogeneous systems and develop algorithms for commonly used communication operations. As a result, many existing parallel algorithms which use these communication operations can be adapted to our model with little or no modification. Experimental results are given which show that our algorithms are of considerable practical relevance.
Binary Tournaments and Priority Queues: PRAM and BSP
1997
Cited by 3 (3 self)
We use the old idea of a tournament-based complete binary tree (CBT) to implement parallel priority queues (PQs). We show that this data structure enables a more efficient implementation of the operations extract-min and insert in terms of communications and synchronizations among processors than similar operations on the implicit heap. In most cases we improve the asymptotic bounds only by constant factors. However, some operations can be twice as fast using simpler parallel algorithms upon the CBT. 1 Data structure and basic operations Every item stored in the PQ consists of a priority value and an identifier. We associate every leaf of the CBT with one item, and use the internal nodes to maintain a continuous binary tournament among the items. A match, at internal node n, consists of determining the item with greater priority (lower numerical value) between the two children of n and writing the identifier of the winner in n. The tournament is made up of a set of matches played in ever...
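The tournament structure described in this abstract can be sketched sequentially as follows. This is an illustrative sketch only, not the paper's PRAM or BSP algorithm: the class name `TournamentPQ`, the array layout (root at index 1, leaves at indices n to 2n - 1), and the free-leaf list are assumptions, and identifiers are assumed to be unique and hashable.

```python
class TournamentPQ:
    """Priority queue kept as a binary tournament over a complete binary tree."""

    def __init__(self, capacity):
        # Round the leaf count up to a power of two so the tree is complete.
        self.n = 1
        while self.n < capacity:
            self.n *= 2
        self.priority = {}   # identifier -> priority value
        self.leaf_of = {}    # identifier -> index of the leaf holding it
        # Unused leaf indices; pop() hands out the leftmost leaf first.
        self.free = list(range(2 * self.n - 1, self.n - 1, -1))
        # node[i] stores the identifier of the winner at node i, or None.
        self.node = [None] * (2 * self.n)

    def _match(self, a, b):
        # The winner is the item with the lower numerical priority value.
        if a is None:
            return b
        if b is None:
            return a
        return a if self.priority[a] <= self.priority[b] else b

    def _replay(self, pos):
        # Replay the matches on the path from a leaf up to the root.
        pos //= 2
        while pos >= 1:
            self.node[pos] = self._match(self.node[2 * pos], self.node[2 * pos + 1])
            pos //= 2

    def insert(self, ident, prio):
        if ident in self.priority:
            raise ValueError("identifier already present")
        if not self.free:
            raise IndexError("priority queue is full")
        pos = self.free.pop()
        self.priority[ident] = prio
        self.leaf_of[ident] = pos
        self.node[pos] = ident
        self._replay(pos)

    def extract_min(self):
        winner = self.node[1]
        if winner is None:
            raise IndexError("priority queue is empty")
        pos = self.leaf_of.pop(winner)
        prio = self.priority.pop(winner)
        self.node[pos] = None      # vacate the leaf, then replay its path
        self.free.append(pos)
        self._replay(pos)
        return winner, prio
```

Both operations touch only the nodes on one leaf-to-root path, so each costs a number of matches logarithmic in the number of leaves in this sequential setting.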
Discrete-Event Simulation on the Bulk-Synchronous Parallel Model
1998
Cited by 2 (0 self)
The bulk-synchronous parallel (BSP) model of computing has been proposed to enable the development of portable software which achieves scalable performance across diverse parallel architectures. A number of applications of computing science have been demonstrated to be efficiently supported by the BSP model in practice.
Coarse Grained Parallel Computing on Heterogeneous Systems
In Proceedings of ACM SAC, 1998
Cited by 2 (1 self)
Coarse grained parallel (CGP) computing models such as the coarse grained multicomputer (CGM), bulk synchronous parallel (BSP), and LogP models have received considerable attention recently from the parallel computing community. This paper examines a new application of CGP algorithms, namely in heterogeneous systems, and shows that this approach to heterogeneous computing has a number of advantages over traditional approaches. A heterogeneous CGP model of computation is defined, and a number of algorithms and basic communication operations are developed for this model. These algorithms have been implemented in the form of a reusable and extendable library which simplifies the task of programming heterogeneous systems.