Results 1–10 of 10
Optimal Doubly Logarithmic Parallel Algorithms Based On Finding All Nearest Smaller Values
, 1993
Abstract

Cited by 37 (7 self)
The all nearest smaller values problem is defined as follows. Let A = (a_1, a_2, ..., a_n) be n elements drawn from a totally ordered domain. For each a_i, 1 ≤ i ≤ n, find the two nearest elements in A that are smaller than a_i (if such exist): the left nearest smaller element a_j (with j < i) and the right nearest smaller element a_k (with k > i). We give an O(log log n) time optimal parallel algorithm for the problem on a CRCW PRAM. We apply this algorithm to achieve optimal O(log log n) time parallel algorithms for four problems: (i) triangulating a monotone polygon, (ii) preprocessing for answering range minimum queries in constant time, (iii) reconstructing a binary tree from its inorder and either preorder or postorder numberings, and (iv) matching a legal sequence of parentheses. We also show that any optimal CRCW PRAM algorithm for the triangulation problem requires Ω(log log n) time. Dept. of Computing, King's College London, The Strand, London WC2R 2LS, England. ...
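For reference, the ANSV problem defined above has a simple linear-time sequential solution using a stack; the paper's contribution is solving it in O(log log n) parallel time. A minimal Python sketch of the sequential version (function names are illustrative, not from the paper):

```python
def nearest_smaller_to_left(a):
    """Index of the nearest element strictly smaller than a[i] to its left."""
    res, stack = [None] * len(a), []
    for i, x in enumerate(a):
        while stack and a[stack[-1]] >= x:  # pop everything not smaller than x
            stack.pop()
        res[i] = stack[-1] if stack else None
        stack.append(i)
    return res

def all_nearest_smaller_values(a):
    """For each a[i], indices of the left and right nearest smaller values."""
    n = len(a)
    left = nearest_smaller_to_left(a)
    rev = nearest_smaller_to_left(a[::-1])  # right pass = left pass on the reversal
    right = [None if j is None else n - 1 - j for j in reversed(rev)]
    return left, right
```

Each index is pushed and popped at most once, so both passes run in O(n) total.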
Connected Components in O(log^{3/2} n) Parallel Time for the CREW PRAM
Abstract

Cited by 15 (2 self)
Finding the connected components of an undirected graph G = (V, E) on n = |V| vertices and m = |E| edges is a fundamental computational problem. The best known parallel algorithm for the CREW PRAM model runs in O(log^2 n) time using n^2 / log^2 n processors [6, 15]. For the CRCW PRAM model, in which concurrent writing is permitted, the best known algorithm runs in O(log n) time using slightly more than (n + m)/log n processors [26, 9, 5]. Simulating this algorithm on the weaker CREW model increases its running time to O(log^2 n) [10, 19, 29]. We present here a simple algorithm that runs in O(log^{3/2} n) time using n + m CREW processors. Finding an o(log^2 n) parallel connectivity algorithm for this model was an open problem for many years. 1 Introduction Let G = (V, E) be an undirected graph on n = |V| vertices and m = |E| edges. A path p of length k is a sequence of edges (e_1, ..., e_i, ..., e_k) such that e_i ∈ E for i = 1, ...
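As a sequential baseline for the problem this abstract addresses: connected components are computable in near-linear time with union-find, and the parallel algorithms above compete against this trivial work bound. A Python sketch (illustrative names, not the paper's algorithm):

```python
def connected_components(n, edges):
    """Union-find with path halving and union by size; returns a component id per vertex."""
    parent = list(range(n))
    size = [1] * n

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:   # attach smaller tree under larger
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
    return [find(v) for v in range(n)]
```

The difficulty the paper tackles is that this pointer-chasing structure does not parallelize directly on a CREW PRAM.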
Optimal Deterministic Approximate Parallel Prefix Sums and Their Applications
 In Proc. Israel Symp. on Theory and Computing Systems (ISTCS'95
, 1995
Abstract

Cited by 14 (0 self)
We show that extremely accurate approximations to the prefix sums of a sequence of n integers can be computed deterministically in O(log log n) time using O(n/log log n) processors in the Common CRCW PRAM model. This complements randomized approximation methods obtained recently by Goodrich, Matias and Vishkin and improves previous deterministic results obtained by Hagerup and Raman. Furthermore, our results completely match a lower bound obtained recently by Chaudhuri. Our results have many applications. Using them we improve upon the best known time bounds for deterministic approximate selection and for deterministic padded sorting. 1 Introduction The computation of prefix sums is one of the most basic tools in the design of fast parallel algorithms (see Blelloch [9] and JáJá [33]). Prefix sums can be computed in O(log n) time and linear work in the EREW PRAM model (Ladner and Fischer [34]) and in O(log n / log log n) time and linear work in the Common CRCW PRAM model (Cole and Vishkin...
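The classic exact parallel prefix-sums structure the abstract builds on (the Blelloch up-sweep/down-sweep scan) can be illustrated sequentially: within each level of either sweep the loop iterations are independent, so each level costs constant parallel time and the whole scan costs O(log n) parallel time with linear work. A minimal Python sketch, assuming a power-of-two input length:

```python
def exclusive_scan(a):
    """Work-efficient exclusive prefix sums (Blelloch scan), simulated level by level.
    Within each while-iteration, the inner loop's updates are independent and
    could execute in parallel."""
    n = len(a)
    assert n and n & (n - 1) == 0, "sketch assumes power-of-two length"
    t = list(a)
    d = 1
    while d < n:                         # up-sweep: build a reduction tree
        for i in range(0, n, 2 * d):
            t[i + 2 * d - 1] += t[i + d - 1]
        d *= 2
    t[n - 1] = 0
    while d > 1:                         # down-sweep: distribute prefixes back down
        d //= 2
        for i in range(0, n, 2 * d):
            tmp = t[i + d - 1]
            t[i + d - 1] = t[i + 2 * d - 1]
            t[i + 2 * d - 1] += tmp
    return t
```

The paper's point is that allowing tiny approximation errors lets the O(log n) depth drop to O(log log n) deterministically on a Common CRCW PRAM.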
A Case for the PRAM As a Standard Programmer's Model
, 1992
Abstract

Cited by 13 (3 self)
This position paper advocates that the PRAM model of parallel computation will be a standard (but not exclusive) programmer's model for computers whose hardware features various kinds of parallelism.
Implementation of Parallel Graph Algorithms on a Massively Parallel SIMD Computer with Virtual Processing
, 1995
Abstract

Cited by 12 (3 self)
We describe our implementation of several PRAM graph algorithms on the massively parallel computer MasPar MP-1 with 16,384 processors. Our implementation incorporates virtual processing, and we present extensive test data. In a previous project [13], we reported the implementation of a set of parallel graph algorithms under the constraint that the maximum input size could be no more than the physical number of processors on the MasPar. The MasPar language MPL that we used for our code does not support virtual processing. In this paper, we describe a method of simulating virtual processors on the MasPar. We recoded and fine-tuned our earlier parallel graph algorithms to incorporate virtual processors. Under the current implementation scheme, there is no limit on the number of virtual processors one can use in a program as long as there is enough main memory to store all the data required during the computation. We also give two general optimization techniq...
Thinking in parallel: Some basic data-parallel algorithms and techniques
 In use as class notes since
, 1993
Abstract

Cited by 7 (1 self)
Copyright 1992–2009, Uzi Vishkin. These class notes reflect the theoretical part in the Parallel
An Optimal Parallel Matching Algorithm for Cographs
 Journal of Parallel and Distributed Computing
, 1994
Abstract

Cited by 4 (1 self)
The class of cographs, or complement-reducible graphs, arises naturally in many different areas of applied mathematics and computer science. We show that the problem of finding a maximum matching in a cograph can be solved optimally in parallel by reducing it to parenthesis matching. With an n-vertex cograph G represented by its parse tree as input, our algorithm finds a maximum matching in G in O(log n) time using O(n/log n) processors in the EREW PRAM model. Key Words: list ranking, tree contraction, matching, parenthesis matching, scheduling, operating systems, cographs, parallel algorithms, EREW PRAM. 1. Introduction A well-known class of graphs arising in a wide spectrum of practical applications [1,2,7] is the class of cographs, or complement-reducible graphs. The cographs are defined recursively as follows: a single-vertex graph is a cograph; if G is a cograph, then its complement Ḡ is also a cograph; if G and H are cographs, then their union is also a cog...
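The reduction target, parenthesis matching, is easy to state sequentially; the paper's contribution is performing it (and the reduction from cograph matching) optimally in parallel. A minimal stack-based Python sketch of the sequential problem:

```python
def match_parentheses(s):
    """For a legal parenthesis sequence, return a dict mapping each position
    to the position of its matching parenthesis (in both directions)."""
    match, stack = {}, []
    for i, c in enumerate(s):
        if c == '(':
            stack.append(i)
        else:
            j = stack.pop()          # the most recent unmatched '('
            match[i], match[j] = j, i
    assert not stack, "sequence is not legal"
    return match
```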
Randomized RangeMaxima In NearlyConstant Parallel Time
, 1992
Abstract

Cited by 3 (1 self)
Given an array of n input numbers, the range-maxima problem is that of preprocessing the data so that queries of the type "what is the maximum value in subarray [i..j]?" can be answered quickly using one processor. We present a randomized preprocessing algorithm that runs in O(log n) time with high probability, using an optimal number of processors on a CRCW PRAM; each query can be processed in constant time by one processor. We also present a randomized algorithm for a parallel comparison model. Using an optimal number of processors, the preprocessing algorithm runs in O(α(n)) time with high probability; each query can be processed in O(α(n)) time by one processor. (As is standard, α(n) is the inverse of the Ackermann function.) A constant-time query can be achieved by some slowdown in the performance of the preprocessing stage. Key words: parallel algorithms; randomized algorithms; PRAM; comparison model; range maximum; prefix maximum. Subject classifications: 68Q20. ...
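A standard sequential baseline for range-maxima preprocessing is the sparse table: O(n log n) work in preprocessing, O(1) work per query (the paper reduces the preprocessing to near-constant parallel time with optimal, i.e. linear, work). A Python sketch with illustrative names:

```python
def preprocess_range_maxima(a):
    """Sparse table: table[k][i] = max of a[i .. i + 2**k - 1]."""
    n = len(a)
    table = [list(a)]
    k = 1
    while 2 * k <= n:
        prev = table[-1]
        # combine two overlapping-free windows of length k into one of length 2k
        table.append([max(prev[i], prev[i + k]) for i in range(n - 2 * k + 1)])
        k *= 2
    return table

def range_max(table, i, j):
    """Maximum of a[i..j] (inclusive) in constant time: cover [i, j]
    with two (possibly overlapping) power-of-two windows."""
    k = (j - i + 1).bit_length() - 1
    return max(table[k][i], table[k][j - (1 << k) + 1])
```

Overlap is harmless because max is idempotent, which is exactly what makes the two-window trick work.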
A Fast Parallel Algorithm For Finding The Convex Hull Of A Sorted Point Set
Abstract

Cited by 3 (0 self)
We present a parallel algorithm for finding the convex hull of a sorted point set. The algorithm runs in O(log log n) (doubly logarithmic) time using n/log log n processors on a Common CRCW PRAM. To break the Ω(log n / log log n) time barrier required to output the convex hull in a contiguous array, we introduce a novel data structure for representing the convex hull. The algorithm is optimal in two respects: (1) the time-processor product of the algorithm, which is linear, cannot be improved, and (2) the running time, which is doubly logarithmic, cannot be improved even by using a linear number of processors. The algorithm demonstrates the power of the "divide-and-conquer doubly logarithmic paradigm" by presenting a nontrivial extension to situations that previously were known to have only slower algorithms. Keywords: parallel algorithms, computational geometry, convex hull, PRAM. 1. Introduction We consider the following problem: given n points in the plane in a so...
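On x-sorted input, the sequential convex hull takes linear time via the monotone-chain scan, so linear work is the target the parallel algorithm must match. A Python sketch of the upper-hull half (the lower hull is symmetric; names are illustrative, and this is not the paper's parallel method):

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive means a left (counterclockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Upper convex hull of points already sorted by x-coordinate; O(n) time,
    since each point is pushed once and popped at most once."""
    hull = []
    for p in points:
        # discard the last hull point while it fails to make a right turn
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull
```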