Programming Parallel Algorithms
, 1996
Cited by 198 (8 self)
In the past 20 years there has been tremendous progress in developing and analyzing parallel algorithms. Researchers have developed efficient parallel algorithms to solve most problems for which efficient sequential solutions are known. Although some of these algorithms are efficient only in a theoretical framework, many are quite efficient in practice or have key ideas that have been used in efficient implementations. This research on parallel algorithms has not only improved our general understanding of parallelism but in several cases has led to improvements in sequential algorithms. Unfortunately there has been less success in developing good languages for programming parallel algorithms, particularly languages that are well suited for teaching and prototyping algorithms. There has been a large gap between languages
Parallel Construction of Quadtrees and Quality Triangulations
, 1999
Cited by 60 (7 self)
We describe efficient PRAM algorithms for constructing unbalanced quadtrees, balanced quadtrees, and quadtree-based finite element meshes. Our algorithms take time O(log n) for point set input and O(log n log k) time for planar straight-line graphs, using O(n + k/log n) processors, where n measures input size and k output size.
1. Introduction
A crucial preprocessing step for the finite element method is mesh generation, and the most general and versatile type of two-dimensional mesh is an unstructured triangular mesh. Such a mesh is simply a triangulation of the input domain (e.g., a polygon), along with some extra vertices, called Steiner points. Not all triangulations, however, serve equally well; numerical and discretization error depend on the quality of the triangulation, meaning the shapes and sizes of triangles. A typical quality guarantee gives a lower bound on the minimum angle in the triangulation. Baker et al. [1] first proved the existence of quality triangulations fo...
Optimal Doubly Logarithmic Parallel Algorithms Based On Finding All Nearest Smaller Values
, 1993
Cited by 37 (7 self)
The all nearest smaller values problem is defined as follows. Let A = (a_1, a_2, ..., a_n) be n elements drawn from a totally ordered domain. For each a_i, 1 ≤ i ≤ n, find the two nearest elements in A that are smaller than a_i (if such exist): the left nearest smaller element a_j (with j < i) and the right nearest smaller element a_k (with k > i). We give an O(log log n) time optimal parallel algorithm for the problem on a CRCW PRAM. We apply this algorithm to achieve optimal O(log log n) time parallel algorithms for four problems: (i) Triangulating a monotone polygon, (ii) Preprocessing for answering range minimum queries in constant time, (iii) Reconstructing a binary tree from its inorder and either preorder or postorder numberings, (iv) Matching a legal sequence of parentheses. We also show that any optimal CRCW PRAM algorithm for the triangulation problem requires Ω(log log n) time. Dept. of Computing, King's College London, The Strand, London WC2R 2LS, England. ...
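To make the problem statement above concrete, here is a minimal sequential sketch in Python. It is not the O(log log n) CRCW PRAM algorithm from the abstract; it is the standard linear-time stack scan, shown only to illustrate what the left and right nearest smaller elements are. The function name `all_nearest_smaller_values` and the choice to return 0-based indices (with `None` where no smaller element exists) are assumptions made for this illustration.

```python
from typing import List, Optional, Sequence, Tuple


def all_nearest_smaller_values(
    a: Sequence[int],
) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """Sequential illustration only (hypothetical helper, not the paper's method).

    Returns (left, right), where left[i] is the index j < i closest to i with
    a[j] < a[i], and right[i] is the index k > i closest to i with a[k] < a[i].
    An entry is None when no such element exists.
    """
    n = len(a)
    left: List[Optional[int]] = [None] * n
    right: List[Optional[int]] = [None] * n

    # Left-to-right scan: the stack holds indices whose values are strictly
    # increasing, so the top is the nearest candidate to the left.
    stack: List[int] = []
    for i in range(n):
        while stack and a[stack[-1]] >= a[i]:
            stack.pop()  # not strictly smaller, so it can never be an answer
        left[i] = stack[-1] if stack else None
        stack.append(i)

    # Right-to-left scan: the mirror image of the loop above.
    stack = []
    for i in range(n - 1, -1, -1):
        while stack and a[stack[-1]] >= a[i]:
            stack.pop()
        right[i] = stack[-1] if stack else None
        stack.append(i)

    return left, right
```

Each index is pushed and popped at most once per scan, so this sketch does O(n) total work, which is the work bound an optimal parallel algorithm has to match.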
The Complexity of Computation on the Parallel Random Access Machine
, 1993
Cited by 31 (3 self)
PRAMs also approximate the situation where communication to and from shared memory is much more expensive than local operations, for example, where each processor is located on a separate chip and access to shared memory is through a combining network. Not surprisingly, abstract PRAMs can be much more powerful than restricted instruction set PRAMs. THEOREM 21.16 Any function of n variables can be computed by an abstract EROW PRAM in O(log n) steps using n/log_2 n processors and n/(2 log_2 n) shared memory cells. PROOF Each processor begins by reading log_2 n input values and combining them into one large value. The information known by the processors is combined in a binary-tree-like fashion. In each round, the remaining processors are grouped into pairs. In each pair, one processor communicates the information it knows about the input to the other processor and then leaves the computation. After ⌈log_2 n⌉ rounds, one processor knows all n input values. Then this processor computes th...
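The pairing argument in the proof above can be sketched as a small sequential simulation. This is only an illustration under stated assumptions: the names `combine_rounds` and `evaluate` are hypothetical, each simulated processor starts with a block of floor(log_2 n) inputs (at least one), and the "one large value" a processor holds is modelled as a Python tuple. It models neither shared memory cells nor the EROW access rules.

```python
from typing import Any, Callable, List, Sequence, Tuple


def combine_rounds(inputs: Sequence[Any]) -> Tuple[Tuple[Any, ...], int]:
    """Simulate the binary-tree-like combining step (illustrative sketch).

    Returns the tuple of inputs known to the last remaining processor and
    the number of pairing rounds that were needed.
    """
    n = len(inputs)
    if n == 0:
        raise ValueError("need at least one input")

    # Assumption: block size is floor(log2 n), but never less than 1.
    block = max(1, n.bit_length() - 1)

    # Each processor first reads its own block of inputs.
    procs: List[Tuple[Any, ...]] = [
        tuple(inputs[i:i + block]) for i in range(0, n, block)
    ]

    rounds = 0
    while len(procs) > 1:
        survivors: List[Tuple[Any, ...]] = []
        for j in range(0, len(procs), 2):
            if j + 1 < len(procs):
                # One processor passes what it knows to its partner and leaves.
                survivors.append(procs[j] + procs[j + 1])
            else:
                # An unpaired processor simply waits for the next round.
                survivors.append(procs[j])
        procs = survivors
        rounds += 1

    return procs[0], rounds


def evaluate(f: Callable[[Tuple[Any, ...]], Any],
             inputs: Sequence[Any]) -> Tuple[Any, int]:
    """The surviving processor applies f locally to all n inputs."""
    known, rounds = combine_rounds(inputs)
    return f(known), rounds
```

For example, `evaluate(sum, list(range(16)))` starts with four simulated processors holding four inputs each and finishes after two pairing rounds, within the ⌈log_2 n⌉ bound stated in the proof.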
On Parallel Hashing and Integer Sorting
, 1991
Cited by 25 (8 self)
The problem of sorting n integers from a restricted range [1..m], where m is superpolynomial in n, is considered. An o(n log n) randomized algorithm is given. Our algorithm takes O(n log log m) expected time and O(n) space. (Thus, for m = n^polylog(n) we have an O(n log log n) algorithm.) The algorithm is parallelizable. The resulting parallel algorithm achieves optimal speedup. Some features of the algorithm make us believe that it is relevant for practical applications. A result of independent interest is a parallel hashing technique. The expected construction time is logarithmic using an optimal number of processors, and searching for a value takes O(1) time in the worst case. This technique enables drastic reduction of space requirements for the price of using randomness. Applicability of the technique is demonstrated for the parallel sorting algorithm, and for some parallel string matching algorithms. The parallel sorting algorithm is designed for a strong and nonstandard mo...
Parallel Dynamic Programming
, 1992
Cited by 18 (1 self)
We study the parallel computation of dynamic programming. We consider four important dynamic programming problems which have wide application, and that have been studied extensively in sequential computation: (1) the 1D problem, (2) the gap problem, (3) the parenthesis problem, and (4) the RNA problem. The parenthesis problem has fast parallel algorithms; almost no work has been done for parallelizing the other three. We present a unifying framework for the parallel computation of dynamic programming. We use two well-known methods, the closure method and the matrix product method, as general paradigms for developing parallel algorithms. Combined with various techniques, they lead to a number of new results. Our main results are optimal sublinear-time algorithms for the 1D, parenthesis, and RNA problems.
A randomized parallel algorithm for single-source shortest paths
 Journal of Algorithms
, 1997
Cited by 15 (1 self)
We give a randomized parallel algorithm for computing single-source shortest paths in weighted digraphs. We show that the exact shortest path problem can be efficiently reduced to solving a series of approximate shortest-path subproblems. Our algorithm for the approximate shortest-path problem is based on a technique used by Ullman and Yannakakis in a parallel algorithm for breadth-first search.
1 Introduction
One of the most fundamental and ubiquitous problems in combinatorial optimization is finding single-source shortest paths in a weighted graph. Aside from being important in its own right, the problem arises in algorithms for many other problems, especially those related to flow. In view of the importance of the single-source shortest paths problem, it is unfortunate that all known parallel algorithms for this problem are very inefficient on sparse graphs. This inability to make efficient use of parallelism in computing shortest paths is of both theoretical and practical significance. A fast and efficient parallel algorithm for this problem remains a major goal in the design of parallel graph algorithms.
A Case for the PRAM As a Standard Programmer's Model
, 1992
Cited by 13 (3 self)
This position paper advocates that the PRAM model of parallel computation will be a standard (but not exclusive) programmer's model for computers whose hardware features various kinds of parallelism.