## Randomized Parallel List Ranking For Distributed Memory Multiprocessors (1996)

Citations: 12 (6 self)

### BibTeX

@MISC{Dehne96randomizedparallel,

author = {Frank Dehne and Siang W. Song},

title = {Randomized Parallel List Ranking For Distributed Memory Multiprocessors},

year = {1996}

}

### Abstract

We present a randomized parallel list ranking algorithm for distributed memory multiprocessors, using a BSP-like model. We first describe a simple version which requires, with high probability, $\log(3p) + \log\ln(n) = \tilde{O}(\log p + \log\log n)$ communication rounds (h-relations with $h = \tilde{O}(n/p)$) and $\tilde{O}(n/p)$ local computation. We then outline an improved version which requires, with high probability, only $r \le (4k+6)\log(\frac{2}{3}p) + 8 = \tilde{O}(k \log p)$ communication rounds, where $k = \min\{i \ge 0 \mid \ln^{(i+1)} n \le (\frac{2}{3}p)^{2i+1}\}$. Note that $k < \ln^{*}(n)$ is an extremely small number. For $n \le 10^{10^{100}}$ and $p \ge 4$, the value of $k$ is at most 2. Hence, for a given number of processors, $p$, the number of communication rounds required is, for all practical purposes, independent of $n$. For $n \le 1{,}500{,}000$ and $4 \le p \le 2048$, the number of communication rounds in our algorithm is bounded, with high probability, by 78, but the actual number of communication rounds observed so far is 25 in the worst case. Fo...
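To get a feel for how small $k$ stays, the following minimal Python sketch evaluates the quantity from the abstract. It assumes the definition reads $k = \min\{i \ge 0 \mid \ln^{(i+1)} n \le (\frac{2}{3}p)^{2i+1}\}$, where $\ln^{(j)}$ is the natural logarithm applied $j$ times. The function names `iterations_k` and `round_bound` are illustrative and do not come from the paper, and the base-2 logarithm in `round_bound` is an assumption, since the abstract does not state the base.

```python
import math


def iterations_k(ln_n: float, p: int) -> int:
    """Return the smallest i >= 0 with ln^(i+1)(n) <= (2p/3)^(2i+1).

    Takes ln(n) rather than n, so inputs such as n = 10^(10^100) fit in a float.
    Hypothetical helper; the definition of k is the one assumed in the lead-in.
    """
    base = 2.0 * p / 3.0
    value = ln_n  # ln^(1)(n), the first iterated logarithm
    i = 0
    while value > base ** (2 * i + 1):
        value = math.log(value)  # move on to ln^(i+2)(n)
        i += 1
    return i


def round_bound(k: int, p: int) -> float:
    """Evaluate (4k + 6) * log(2p/3) + 8, ASSUMING the logarithm is base 2."""
    return (4 * k + 6) * math.log2(2.0 * p / 3.0) + 8


# n = 10^(10^100) has ln(n) = 10^100 * ln(10); the abstract states k <= 2 for p >= 4.
k_huge = iterations_k(1e100 * math.log(10), p=4)
k_small = iterations_k(math.log(1_500_000), p=2048)
print(k_huge, k_small, round_bound(k_small, 2048))
```

Passing $\ln n$ instead of $n$ is a design choice that keeps astronomically large inputs representable. Under the stated reading, the sketch reproduces the abstract's claim that $k$ is at most 2 for $n \le 10^{10^{100}}$ and $p = 4$. The numeric value of `round_bound` depends on the assumed logarithm base and should not be read as the paper's reported bound.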

### Citations

1129 |
A bridging model for parallel computation
- Valiant
Citation Context ...es PRAM algorithms optimally on distributed memory parallel systems. Valiant points out, however, that one may want to design algorithms that utilize local computations and minimize global operations [22] [23]. The BSP approach requires that g (= local computation speed/router bandwidth) is low, or fixed, even for increasing number of processors. Gerbessiotis and Valiant [14] describe circumstances where ... |

636 |
An Introduction to Parallel Algorithms
- JáJá
- 1992
Citation Context ...of its n/p nodes $x \in S$ the value dist(x). [Fig. 1. A Linear Linked List Stored In A Distributed Memory Multiprocessor (Proc. 1 to Proc. 4)] Several PRAM list ranking algorithms have been proposed [15] [20]. The first optimal O(log n) EREW PRAM algorithm is due to Cole and Vishkin [7]. Another optimal deterministic algorithm is given by Anderson and Miller [2]. Parallel list ranking algorithms using ... |

286 |
Computational Geometry: An Introduction Through Randomized Algorithms
- Mulmuley
- 1994
Citation Context ...n) denotes O(n) "with high probability". More precisely, $X = \tilde{O}(f(n))$ if and only if $(\forall c > c_0 > 1)\ \mathrm{Prob}\{X \ge c f(n)\} \le 1/n^{g(c)}$, where $c_0$ is a fixed constant and $g(c)$ is a polynomial in $c$ with $g(c) \to \infty$ for $c \to \infty$ [19]. 2 Random Sampling in Linear Linked Lists Consider a linear linked list with a set S of n nodes. In this section we will show that if we select n/p random elements (pivots) of S then, with high probabilit... |

202 |
General Purpose Parallel Architectures, Chapter 18 of Handbook of Theoretical
- Valiant
- 1990
Citation Context ...l Speedup results for theoretical PRAM algorithms do not necessarily match the speedups observed on real machines [3] [21]. Given sufficient slackness in the number of processors, Valiant's BSP approach [23] simulates PRAM algorithms optimally on distributed memory parallel systems. Valiant points out, however, that one may want to design algorithms that utilize local computations and minimize global ope... |

173 | A comparison of sorting algorithms for the Connection Machine CM-2
- Blelloch, Leiserson, et al.
- 1991
Citation Context ...also leads to improved portability across different parallel architectures ([13] [22] [23]). The above model has been used (explicitly or implicitly) in parallel algorithm design for various problems ([6], [8], [9], [11], [12], [16], [10]) and shown very good practical timing results. The List Ranking Problem Consider a linear linked list consisting of a set S of n nodes and, for each node $x \in S$, a point... |

165 | Direct bulk-synchronous parallel algorithms
- Gerbessiotis, Valiant
- 1994
Citation Context ...d minimize global operations [22] [23]. The BSP approach requires that g (= local computation speed/router bandwidth) is low, or fixed, even for increasing number of processors. Gerbessiotis and Valiant [14] describe circumstances where PRAM simulations can not be performed efficiently, among others if the factor g is high. Unfortunately, this is true for most currently available multiprocessors. Furthermor... |

86 |
Scalable parallel geometric algorithms for coarse grained multicomputers
- Dehne, Fabri, et al.
- 1993
Citation Context ...3], the cost of a message also contains a constant overhead cost s. The value of s can be fairly large and the total message overhead cost can have a considerable impact on the speedup observed (see e.g. [8]). We use a slightly enhanced version of the BSP model, referred to as coarse grained multicomputer model [8], [9], [10]. It is comprised of a set of p processors $P_1, \ldots, P_p$ with O(n/p) local memory pe... |

62 |
Type Architectures, Shared Memory, and the Corollary of Modest Potential
- Snyder
- 1986
Citation Context ... actual number of communications rounds will not exceed 50. 1 Introduction The Model Speedup results for theoretical PRAM algorithms do not necessarily match the speedups observed on real machines [3] [21]. Given sufficient slackness in the number of processors, Valiant's BSP approach [23] simulates PRAM algorithms optimally on distributed memory parallel systems. Valiant points out, however, that one ma... |

49 | A randomized parallel 3D convex hull algorithm for coarse grained multicomputers
- Dehne, Deng, et al.
- 1995
Citation Context ...ge overhead cost can have a considerable impact on the speedup observed (see e.g. [8]). We use a slightly enhanced version of the BSP model, referred to as coarse grained multicomputer model [8], [9], [10]. It is comprised of a set of p processors $P_1, \ldots, P_p$ with O(n/p) local memory per processor and an arbitrary communication network. All algorithms consist of alternating local computation and global ... |

48 | The complexity of parallel computation - Wyllie - 1979 |

46 |
Approximate parallel scheduling. Part I: The basic technique with applications to optimal parallel list ranking in logarithmic time
- Cole, Vishkin
- 1988
Citation Context ...ar Linked List Stored In A Distributed Memory Multiprocessor Several PRAM list ranking algorithms have been proposed [15] [20]. The first optimal O(log n) EREW PRAM algorithm is due to Cole and Vishkin [7]. Another optimal deterministic algorithm is given by Anderson and Miller [2]. Parallel list ranking algorithms using randomization were proposed by Miller and Reif [17] [18]. The algorithms use O(n) p... |

45 |
Deterministic parallel list ranking
- Anderson, Miller
- 1988
Citation Context ...t ranking algorithms have been proposed [15] [20]. The first optimal O(log n) EREW PRAM algorithm is due to Cole and Vishkin [7]. Another optimal deterministic algorithm is given by Anderson and Miller [2]. Parallel list ranking algorithms using randomization were proposed by Miller and Reif [17] [18]. The algorithms use O(n) processors. The optimal algorithm by Anderson and Miller [1] improves this by u... |

35 | Parallel sorting by over partitioning - Li, Sevcik - 1994 |

32 | Scalable and architecture independent parallel geometric algorithms with high probability optimal time - Dehne, Kenyon, et al. - 1994 |

31 | List ranking and list scan on the Cray C-90 - Reid-Miller - 1994 |

25 |
A Simple Randomized Parallel Algorithm for List-Ranking
- Anderson, Miller
- 1990
Citation Context ...nderson and Miller [2]. Parallel list ranking algorithms using randomization were proposed by Miller and Reif [17] [18]. The algorithms use O(n) processors. The optimal algorithm by Anderson and Miller [1] improves this by using an optimal number of processors. An $O(\sqrt{n})$ time mesh algorithm is described in [4]. [Footnote 3] $\tilde{O}(n)$ denotes O(n) "with high probability". More precisely, $X = \tilde{O}(f(n))$ if and only ... |

23 | A Comparison of Shared and Nonshared Memory Models of Computation
- Anderson, Snyder
Citation Context ... the actual number of communications rounds will not exceed 50. 1 Introduction The Model Speedup results for theoretical PRAM algorithms do not necessarily match the speedups observed on real machines [3] [21]. Given sufficient slackness in the number of processors, Valiant's BSP approach [23] simulates PRAM algorithms optimally on distributed memory parallel systems. Valiant points out, however, that o... |

19 |
Parallel tree contraction part 1: fundamentals
- Miller, Reif
Citation Context ...ithm is due to Cole and Vishkin [7]. Another optimal deterministic algorithm is given by Anderson and Miller [2]. Parallel list ranking algorithms using randomization were proposed by Miller and Reif [17] [18]. The algorithms use O(n) processors. The optimal algorithm by Anderson and Miller [1] improves this by using an optimal number of processors. An $O(\sqrt{n})$ time mesh algorithm is described in [4]. [Footnote 3] O... |

15 | Good Programming Style on Multiprocessors - Deng, Gu - 1994 |

13 |
Solving tree problems on a mesh-connected processor array
- Atallah, Hambrusch
- 1986
Citation Context ...eif [17] [18]. The algorithms use O(n) processors. The optimal algorithm by Anderson and Miller [1] improves this by using an optimal number of processors. An $O(\sqrt{n})$ time mesh algorithm is described in [4]. [Footnote 3] $\tilde{O}(n)$ denotes O(n) "with high probability". More precisely, $X = \tilde{O}(f(n))$ if and only if $(\forall c > c_0 > 1)\ \mathrm{Prob}\{X \ge c f(n)\} \le 1/n^{g(c)}$, where $c_0$ is a fixed constant and $g(c)$ is a polynomial in $c$ with $g(c) \to \infty$ fo... |

13 |
Efficient Routing and Message Bounds for Optimal Parallel Algorithms
- Deng, Dymond
- 1995
Citation Context ... rounds as well as the total local computation time. Furthermore, it has been shown that minimizing the number of supersteps also leads to improved portability across different parallel architectures ([13] [22] [23]). The above model has been used (explicitly or implicitly) in parallel algorithm design for various problems ([6], [8], [9], [11], [12], [16], [10]) and shown very good practical timing res... |

11 |
A convex hull algorithm on coarse grained multiprocessors
- Deng
- 1994
Citation Context ...d portability across different parallel architectures ([13] [22] [23]). The above model has been used (explicitly or implicitly) in parallel algorithm design for various problems ([6], [8], [9], [11], [12], [16], [10]) and shown very good practical timing results. The List Ranking Problem Consider a linear linked list consisting of a set S of n nodes and, for each node $x \in S$, a pointer $(x \to \mathrm{next}(x))$ to it... |

2 |
Introduction to parallel connectivity, list ranking, and Euler tour techniques
- Baase
- 1993
Citation Context ...above algorithm can obviously be generalized to compute prefix or suffix sums for associative operators. List ranking is a very popular tool for obtaining numerous parallel tree and graph algorithms [4] [5]. An important application outlined in [4] is to use list ranking for applying Euler tour techniques to tree problems: for an undirected forest of trees, rooting every tree at a given vertex chosen as... |

2 |
Scalable and Architecture Independent Parallel Geometric Algorithms with High Probability Optimal Time
- Dehne, Fabri, et al.
- 1994
Citation Context ...message overhead cost can have a considerable impact on the speedup observed (see e.g. [8]). We use a slightly enhanced version of the BSP model, referred to as coarse grained multicomputer model [8], [9], [10]. It is comprised of a set of p processors $P_1, \ldots, P_p$ with O(n/p) local memory per processor and an arbitrary communication network. All algorithms consist of alternating local computation and g... |

2 |
Parallel Sorting by Overpartitioning
- Li, Sevcik
- 1994
Citation Context ...ability across different parallel architectures ([13] [22] [23]). The above model has been used (explicitly or implicitly) in parallel algorithm design for various problems ([6], [8], [9], [11], [12], [16], [10]) and shown very good practical timing results. The List Ranking Problem Consider a linear linked list consisting of a set S of n nodes and, for each node $x \in S$, a pointer $(x \to \mathrm{next}(x))$ to its succe... |

2 | A Comparison of Sorting Algorithms for the Connection Machine CM-2 - Blelloch, Leiserson, et al. - 1991 |

1 |
Good Programming Style on Multiprocessors
- Gu
- 1994
Citation Context ...mproved portability across different parallel architectures ([13] [22] [23]). The above model has been used (explicitly or implicitly) in parallel algorithm design for various problems ([6], [8], [9], [11], [12], [16], [10]) and shown very good practical timing results. The List Ranking Problem Consider a linear linked list consisting of a set S of n nodes and, for each node $x \in S$, a pointer $(x \to \mathrm{next}(x))$... |

1 |
Parallel tree contraction part 2: Further applications
- Miller, Reif
- 1991
Citation Context ...is due to Cole and Vishkin [7]. Another optimal deterministic algorithm is given by Anderson and Miller [2]. Parallel list ranking algorithms using randomization were proposed by Miller and Reif [17] [18]. The algorithms use O(n) processors. The optimal algorithm by Anderson and Miller [1] improves this by using an optimal number of processors. An $O(\sqrt{n})$ time mesh algorithm is described in [4]. [Footnote 3] $\tilde{O}(n)$... |

1 |
List ranking and parallel tree contraction
- Reid-Miller, Miller, et al.
- 1993
Citation Context ...s n/p nodes $x \in S$ the value dist(x). [Fig. 1. A Linear Linked List Stored In A Distributed Memory Multiprocessor (Proc. 1 to Proc. 4)] Several PRAM list ranking algorithms have been proposed [15] [20]. The first optimal O(log n) EREW PRAM algorithm is due to Cole and Vishkin [7]. Another optimal deterministic algorithm is given by Anderson and Miller [2]. Parallel list ranking algorithms using rando... |

1 | General Purpose Parallel Architectures, Handbook of Theoretical Computer - al - 1990 |