On Parallel Hashing and Integer Sorting
, 1991
"... The problem of sorting n integers from a restricted range [1::m], where m is superpolynomial in n, is considered. An o(n log n) randomized algorithm is given. Our algorithm takes O(n log log m) expected time and O(n) space. (Thus, for m = n polylog(n) we have an O(n log log n) algorithm.) The al ..."
Abstract

Cited by 25 (9 self)
 Add to MetaCart
The problem of sorting n integers from a restricted range [1::m], where m is superpolynomial in n, is considered. An o(n log n) randomized algorithm is given. Our algorithm takes O(n log log m) expected time and O(n) space. (Thus, for m = n polylog(n) we have an O(n log log n) algorithm.) The algorithm is parallelizable. The resulting parallel algorithm achieves optimal speed up. Some features of the algorithm make us believe that it is relevant for practical applications. A result of independent interest is a parallel hashing technique. The expected construction time is logarithmic using an optimal number of processors, and searching for a value takes O(1) time in the worst case. This technique enables drastic reduction of space requirements for the price of using randomness. Applicability of the technique is demonstrated for the parallel sorting algorithm, and for some parallel string matching algorithms. The parallel sorting algorithm is designed for a strong and non standard mo...
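As a point of reference for range-restricted integer sorting, a textbook counting sort runs in O(n + m) time, which is only attractive when m is small; the abstract's O(n log log m) randomized algorithm (hashing plus range reduction) is what makes superpolynomial m tractable. A minimal sequential sketch of the baseline only (the function name and interface are illustrative, not from the paper):

```python
def counting_sort(a, m):
    """Sort integers drawn from [1..m] in O(n + m) time.

    A textbook baseline for range-restricted integer sorting; the
    paper's randomized O(n log log m)-time parallel algorithm relies
    on hashing and is far more involved than this sketch.
    """
    count = [0] * (m + 1)          # one counter per possible value
    for x in a:
        count[x] += 1
    out = []
    for v in range(1, m + 1):      # emit each value count[v] times
        out.extend([v] * count[v])
    return out
```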
Ultrafast expected time parallel algorithms
 Proc. of the 2nd SODA
, 1991
"... It has been shown previously that sorting n items into n locations with a polynomial number of processors requires Ω(log n/log log n) time. We sidestep this lower bound with the idea of Padded Sorting, or sorting n items into n + o(n) locations. Since many problems do not rely on the exact rank of s ..."
Abstract

Cited by 20 (3 self)
 Add to MetaCart
It has been shown previously that sorting n items into n locations with a polynomial number of processors requires Ω(log n/log log n) time. We sidestep this lower bound with the idea of Padded Sorting, or sorting n items into n + o(n) locations. Since many problems do not rely on the exact rank of sorted items, a Padded Sort is often just as useful as an unpadded sort. Our algorithm for Padded Sort runs on the Tolerant CRCW PRAM and takes Θ(log log n/log log log n) expected time using n log log log n/log log n processors, assuming the items are taken from a uniform distribution. Using similar techniques we solve some computational geometry problems, including Voronoi Diagram, with the same processor and time bounds, assuming points are taken from a uniform distribution in the unit square. Further, we present an Arbitrary CRCW PRAM algorithm to solve the Closest Pair problem in constant expected time with n processors regardless of the distribution of points. All of these algorithms achieve linear speedup in expected time over their optimal serial counterparts. 1 Research done while at the University of Michigan and supported by an AT&T Fellowship.
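The padded-sort idea can be illustrated sequentially: for uniform items in [0,1), an item's value predicts its output position, so each of n buckets can own a small fixed region of the padded output and be filled independently, as PRAM processors would fill them. A hedged sketch under those assumptions (a constant-factor pad stands in for the paper's n + o(n) bound, and the fallback for an overflowing bucket is omitted for brevity):

```python
def padded_sort(items, slack=2):
    """Sequential sketch of Padded Sorting for uniform items in [0,1).

    Each of n buckets owns a fixed region of `slack` output slots, so
    buckets can be written independently; unused slots remain None.
    Illustrative only: real padded sorting pads by o(n), not by a
    constant factor, and handles bucket overflow.
    """
    n = len(items)
    buckets = [[] for _ in range(n)]
    for x in items:                       # value predicts bucket index
        buckets[min(int(x * n), n - 1)].append(x)
    out = [None] * (n * slack)            # padded output array
    for i, b in enumerate(buckets):
        for j, x in enumerate(sorted(b)[:slack]):  # cheap local sort
            out[i * slack + j] = x
    return out
```

Reading the non-None entries left to right yields the items in sorted order, which is all that "padded" sortedness requires.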
Finding Least Common Ancestors in Directed Acyclic Graphs
 PROC. 12TH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS (SODA'01)
, 2001
"... ..."
Faster Entropy-Bounded Compressed Suffix Trees
, 2009
"... Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix t ..."
Abstract

Cited by 16 (9 self)
 Add to MetaCart
Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first achieving at the same time sublogarithmic complexity for the operations, and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest.
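The next/previous smaller value primitive mentioned above has a classic O(n) sequential stack formulation, shown below to make the abstract concrete; the paper's contribution is answering such queries within compressed space, not this scan (names here are illustrative):

```python
def next_smaller_values(seq):
    """For each position i, return the index of the next position
    j > i with seq[j] < seq[i], or len(seq) if none exists.

    Classic one-pass stack scan in O(n) time; NSV/PSV is one of the
    primitives the compressed suffix tree reduces its operations to.
    """
    n = len(seq)
    nsv = [n] * n
    stack = []                       # indices still awaiting a smaller value
    for i, x in enumerate(seq):
        while stack and seq[stack[-1]] > x:
            nsv[stack.pop()] = i     # x is the next smaller value for these
        stack.append(i)
    return nsv
```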
An(other) entropy-bounded compressed suffix tree
 In Proceedings of the 19th Annual Symposium on Combinatorial Pattern Matching, volume 5029 of LNCS
, 2008
"... Abstract. Suffix trees are among the most important data structures in stringology, with myriads of applications. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. We present a novel compressed suffix tree. Compare ..."
Abstract

Cited by 13 (10 self)
 Add to MetaCart
Suffix trees are among the most important data structures in stringology, with myriads of applications. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. We present a novel compressed suffix tree. Compared to the existing ones, ours is the first achieving at the same time sublogarithmic complexity for the operations, and space usage which goes to zero as the entropy of the text does. Our development contains several novel ideas, such as compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a new primitive called next/previous smaller value in a sequence.
Structural Parallel Algorithmics
, 1991
"... The first half of the paper is a general introduction which emphasizes the central role that the PRAM model of parallel computation plays in algorithmic studies for parallel computers. Some of the collective knowledgebase on nonnumerical parallel algorithms can be characterized in a structural way ..."
Abstract

Cited by 11 (4 self)
 Add to MetaCart
The first half of the paper is a general introduction which emphasizes the central role that the PRAM model of parallel computation plays in algorithmic studies for parallel computers. Some of the collective knowledge base on non-numerical parallel algorithms can be characterized in a structural way. Each structure relates a few problems and techniques to one another, from the basic to the more involved. The second half of the paper provides a bird's-eye view of such structures for: (1) list, tree and graph parallel algorithms; (2) very fast deterministic parallel algorithms; and (3) very fast randomized parallel algorithms.

1 Introduction

Parallelism is a concern that is missing from "traditional" algorithmic design. Unfortunately, it turns out that most efficient serial algorithms become rather inefficient parallel algorithms. The experience is that the design of parallel algorithms requires new paradigms and techniques, offering an exciting intellectual challenge. We note that it had...
Efficient Matrix Chain Ordering in Polylog Time
 IN PROC. OF INT'L PARALLEL PROCESSING SYMP
, 1998
"... The matrix chain ordering problem is to find the cheapest way to multiply a chain of n matrices, where the matrices are pairwise compatible but of varying dimensions. Here we give several new parallel algorithms including O(lg 3 n)time and n/lg nprocessor algorithms for solving the matrix chain o ..."
Abstract

Cited by 10 (3 self)
 Add to MetaCart
The matrix chain ordering problem is to find the cheapest way to multiply a chain of n matrices, where the matrices are pairwise compatible but of varying dimensions. Here we give several new parallel algorithms, including O(lg^3 n)-time and n/lg n-processor algorithms for solving the matrix chain ordering problem and for solving an optimal triangulation problem of convex polygons on the common CRCW PRAM model. Next, by using efficient algorithms for computing row minima of totally monotone matrices, this complexity is improved to O(lg^2 n) time with n processors on the EREW PRAM and to O(lg^2 n lg lg n) time with n/lg lg n processors on a common CRCW PRAM. A new algorithm for computing the row minima of totally monotone matrices improves our parallel MCOP algorithm to O(n lg^1.5 n) work and polylog time on a CREW PRAM. Optimal log-time algorithms for computing row minima of totally monotone matrices will improve our algorithm and enable it to have the same work as the sequential algorithm of Hu and
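For context, the matrix chain ordering problem parallelized above has a standard O(n^3)-time sequential dynamic program over chain lengths; the sketch below shows only that well-known baseline, not the paper's polylog-time PRAM algorithms:

```python
def matrix_chain_cost(dims):
    """Minimum number of scalar multiplications needed to multiply a
    chain of matrices, where matrix i has shape dims[i] x dims[i+1].

    The classic O(n^3) sequential dynamic program; the paper's
    contribution is computing this in polylog parallel time.
    """
    n = len(dims) - 1                    # number of matrices in the chain
    cost = [[0] * n for _ in range(n)]   # cost[i][j]: best for chain i..j
    for length in range(2, n + 1):       # grow chain length
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j]
                + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)     # try every split point
            )
    return cost[0][n - 1]
```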
Optimal Parallel Approximation Algorithms for Prefix Sums and Integer Sorting (Extended Abstract)
"... Parallel prefix computation is perhaps the most frequently used subroutine in parallel algorithms today. Its time complexity on the CRCWPRAM is \Theta(lg n= lg lg n) using a polynomial number of processors, even in a randomized setting. Nevertheless, there are a number of nontrivial applications t ..."
Abstract

Cited by 8 (5 self)
 Add to MetaCart
Parallel prefix computation is perhaps the most frequently used subroutine in parallel algorithms today. Its time complexity on the CRCW PRAM is Θ(lg n / lg lg n) using a polynomial number of processors, even in a randomized setting. Nevertheless, there are a number of nontrivial applications that have been shown to be solvable using only an approximate version of the prefix sums problem. In this paper we resolve the issue of approximating parallel prefix by introducing an algorithm that runs in O(lg n) time with very high probability, using n/lg n processors, which is optimal in terms of both work and running time. Our approximate prefix sums are guaranteed to come within a factor of (1 + ε) of the values of the true sums in a "consistent fashion", where ε is o(1). We achieve this result through the use of a number of interesting new techniques, such as overcertification and estimate-focusing, as well ...
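For comparison with the bounds above, the textbook pointer-doubling scan computes exact prefix sums in O(lg n) parallel rounds, simulated sequentially here. This is not the paper's approximation algorithm, just the exact primitive whose work/time trade-off it improves on:

```python
def prefix_sums_doubling(a):
    """Inclusive prefix sums via pointer doubling: O(lg n) rounds, each
    round executable by n processors in parallel on a PRAM (simulated
    sequentially here as a whole-array update per round).
    """
    s = list(a)
    step = 1
    while step < len(s):
        # one parallel round: position i adds the value `step` back
        s = [s[i] + (s[i - step] if i >= step else 0)
             for i in range(len(s))]
        step *= 2
    return s
```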
Thinking in parallel: Some basic data-parallel algorithms and techniques
 In use as class notes since
, 1993
"... Copyright 19922009, Uzi Vishkin. These class notes reflect the theorertical part in the Parallel ..."
Abstract

Cited by 7 (1 self)
 Add to MetaCart
Copyright 1992-2009, Uzi Vishkin. These class notes reflect the theoretical part in the Parallel
Efficient Parallel Dynamic Programming
, 1994
"... In 1983, Valiant, Skyum, Berkowitz and Racko# showed that many problems with simple O#n 3 # sequential dynamic programming solutions are in the class NC. They used straight line programs to show that these problems can be solved in O#lg 2 n# time with n 9 processors. In 1988, Rytter used pebbl ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
In 1983, Valiant, Skyum, Berkowitz and Rackoff showed that many problems with simple O(n^3) sequential dynamic programming solutions are in the class NC. They used straight-line programs to show that these problems can be solved in O(lg^2 n) time with n^9 processors. In 1988, Rytter used pebbling games to show that these same problems can be solved on a CREW PRAM in O(lg^2 n) time with n^6/lg n processors. Recently, Huang, Liu and Viswanathan [23] and Galil and Park [15] gave algorithms that improve this processor complexity by polylog factors. Using a graph structure that is analogous to the classical dynamic programming table, this paper improves these results. First, this graph characterization leads to a polylog-time and n^6/lg n-processor algorithm that solves these problems. Second, there follows a sub-polylog-time and sublinear-processor parallel approximation algorithm for the matrix chain ordering problem. Finally, this paper presents an n^3/lg n-processor and O...