Results 1 - 10 of 33
On RAM priority queues
, 1996
Cited by 70 (9 self)
Priority queues are some of the most fundamental data structures. They are used directly for, say, task scheduling in operating systems. Moreover, they are essential to greedy algorithms. We study the complexity of priority queue operations on a RAM with arbitrary word size. We present exponential improvements over previous bounds, and we show tight relations to sorting. Our first result is a RAM priority queue supporting insert and extract-min operations in worst-case time O(log log n), where n is the current number of keys in the queue. This is an exponential improvement over the O(√(log n)) bound of Fredman and Willard from STOC'90. Our algorithm is simple, and it only uses AC^0 operations, meaning that there is no hidden time dependency on the word size. Plugging this priority queue into Dijkstra's algorithm gives an O(m log log m) algorithm for the single source shortest path problem on a graph with m edges, as compared with the previous O(m √(log m)) bound based on Fredman...
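To make the Dijkstra remark concrete, here is a minimal sketch of how a priority queue that offers only insert and extract-min plugs into the algorithm. It uses Python's `heapq` (a binary heap with O(log n) operations) as a stand-in, not the RAM priority queue of the paper; the function name `dijkstra` and the adjacency-list format are assumptions made for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths with nonnegative edge weights.

    graph: dict mapping node -> list of (neighbor, weight) pairs
           (hypothetical input format chosen for this sketch).
    """
    dist = {source: 0}
    heap = [(0, source)]  # entries are (tentative distance, node)
    while heap:
        d, u = heapq.heappop(heap)  # extract-min
        if d > dist.get(u, float("inf")):
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))  # insert instead of decrease-key
    return dist
```

Because stale entries are skipped rather than updated, the queue sees at most one insert per edge relaxation, so the total cost is roughly m times the cost of one queue operation. That is the shape of the O(m log log m) bound quoted above once a faster queue is substituted.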
Sorting Selection and Routing on the Array with Reconfigurable Optical Buses
Cited by 32 (5 self)
In this paper we present efficient algorithms for sorting, selection and packet routing on the AROB (Array with Reconfigurable Optical Buses) model.
Optimal Parallel Algorithms for Periods, Palindromes and Squares (Extended Abstract)
, 1992
Cited by 32 (13 self)
Alberto Apostolico (Purdue University and Università di Padova), Dany Breslauer (Columbia University), Zvi Galil (Columbia University and Tel-Aviv University). Summary of results: Optimal concurrent-read concurrent-write parallel algorithms for two problems are presented: • Finding all the periods of a string. The period of a string can be computed by previous efficient parallel algorithms only if it is shorter than half of the length of the string. Our new algorithm computes all the periods in optimal O(log log n) time, even if they are longer. The algorithm can be used to compute all initial palindromes of a string within the same bounds. • Testing if a string is squarefree. We present an optimal O(log log n) time algorithm for testing if a string is squarefree, improving the previous bound of O(log n) given by Apostolico [1] and Crochemore and Rytter [12]. We show matching lower bounds for the optimal parallel algorithms that solve the problems above on a general alphab...
Integer Priority Queues with Decrease Key in . . .
 STOC'03
, 2003
Cited by 27 (2 self)
We consider Fibonacci heap style integer priority queues supporting insert and decrease key operations in constant time. We present a deterministic linear space solution that with n integer keys supports delete in O(log log n) time. If the integers are in the range [0,N), we can also support delete in O(log log N) time. Even for the special case of monotone priority queues, where the minimum has to be nondecreasing, the best previous bounds on delete were O((log n)^(1/(3−ε))) and O((log N)^(1/(4−ε))). These previous bounds used both randomization and amortization. Our new bounds are deterministic, worst-case, with no restriction to monotonicity, and exponentially faster. As a classical application, for a directed graph with n nodes and m edges with nonnegative integer weights, we get single source shortest paths in O(m + n log log n) time, or O(m + n log log C) if C is the maximal edge weight. The latter solves an open problem of Ahuja, Mehlhorn, Orlin, and
Ultrafast expected time parallel algorithms
 Proc. of the 2nd SODA
, 1991
Cited by 20 (3 self)
It has been shown previously that sorting n items into n locations with a polynomial number of processors requires Ω(log n/log log n) time. We sidestep this lower bound with the idea of Padded Sorting, or sorting n items into n + o(n) locations. Since many problems do not rely on the exact rank of sorted items, a Padded Sort is often just as useful as an unpadded sort. Our algorithm for Padded Sort runs on the Tolerant CRCW PRAM and takes Θ(log log n/log log log n) expected time using n log log log n/log log n processors, assuming the items are taken from a uniform distribution. Using similar techniques we solve some computational geometry problems, including Voronoi Diagram, with the same processor and time bounds, assuming points are taken from a uniform distribution in the unit square. Further, we present an Arbitrary CRCW PRAM algorithm to solve the Closest Pair problem in constant expected time with n processors regardless of the distribution of points. All of these algorithms achieve linear speedup in expected time over their optimal serial counterparts. 1 Research done while at the University of Michigan and supported by an AT&T Fellowship.
Randomized sorting in O(n log log n) time and linear space using addition, shift, and bitwise boolean operations
, 1996
Cited by 18 (3 self)
A randomized sorting algorithm is presented, doing as described in the title. 1 Introduction. In this paper we consider sorting on a very simple RAM where the only word operations are addition, shift, and bitwise boolean operations. Besides these word operations, we have direct and indirect addressing, jumps, and conditional statements. Such a RAM has been referred to as a Practical RAM [Mil96]. In this paper we show Theorem 1: On a Practical RAM, there is a randomized algorithm sorting n words in O(n log log n) time and linear space. The above algorithm only makes shifts by powers of two, and it only needs O(log n) random words. Our time bound matches that of the current fastest sorting algorithm by Andersson, Hagerup, Raman, and Nilsson [AHNR95]. Their algorithm has two variants: one is deterministic and uses space 2^(εw), where w is the word length and ε is a positive constant. Thus the space is unbounded in terms of n. The other variant is randomized and uses linear space like ou...
Modeling parallel bandwidth: Local vs. global restrictions
Cited by 15 (4 self)
Recently there has been an increasing interest in models of parallel computation that account for the bandwidth limitations in communication networks. Some models (e.g., BSP and LogP) account for bandwidth limitations using a per-processor parameter g > 1, such that each processor can send/receive at most h messages in g·h time. Other models (e.g., PRAM(m)) account for bandwidth limitations as an aggregate parameter m < p, such that the p processors can send at most m messages in total at each step. This paper provides the first detailed study of the algorithmic implications of modeling parallel bandwidth as a per-processor (local) limitation versus an aggregate (global) limitation. We consider a number of basic problems
Designing Checkers for Programs that Run in Parallel
 Algorithmica
, 1994
Cited by 14 (2 self)
Program correctness for parallel programs is an even more problematic issue than for serial programs. We extend the theory of program result checking to parallel programs, and find general techniques for designing such result checkers that work for many basic problems in parallel computation. These result checkers are simple to program and are more efficient than the actual computation of the result. For example, sorting, multiplication, parity, the all pairs shortest path problem and majority all have constant depth result checkers, and the result checkers for all but the last problem use a linear number of processors. We show that there are P-complete problems (evaluating straight-line programs, linear programming) that have very fast, even constant depth, result checkers. 1 Introduction. Verifying a program to see if it is correct is a problem that every programmer has encountered. Even the seemingly simplest of programs can be full of hidden bugs, and in the age of massive software...
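As an illustration of what a result checker does, the sketch below checks the output of a sorting program without re-sorting the input. It is a sequential toy, not the constant-depth parallel checker of the paper; the function name `check_sort` is an assumption made for this example.

```python
from collections import Counter

def check_sort(original, claimed):
    """Return True if `claimed` is a correctly sorted version of `original`.

    Two conditions are checked, neither of which requires sorting:
    1. adjacent elements of `claimed` are in nondecreasing order;
    2. `claimed` holds exactly the same items as `original`, counted
       with multiplicity.
    """
    in_order = all(claimed[i] <= claimed[i + 1] for i in range(len(claimed) - 1))
    same_items = Counter(original) == Counter(claimed)
    return in_order and same_items
```

Both conditions take expected linear time with hashing, which is cheaper than the O(n log n) comparison sort being checked. This mirrors the abstract's point that a checker can be more efficient than computing the result.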
Selection on the Reconfigurable Mesh
 Proc. Frontiers of Massively Parallel Computation
, 1992
Cited by 14 (0 self)
Our main result is a Θ(log n) time algorithm to select the kth smallest element in a set of n elements on a reconfigurable mesh with n processors. This improves on the previous fastest algorithm's running time by a factor of log n. We also show that some variants of this problem can be solved even faster. First we show that a good approximation to the median of n elements can be found in Θ(log log n) time. This can be used to solve two-dimensional linear programming over n equations in Θ(log n log log n) time, an improvement of log n / log log n time over the previous fastest algorithm. Next, we show that, for any constant ε > 0, selecting the kth smallest element in a set of n^(1−ε) elements evenly spaced throughout the mesh can be done in constant time. We also show that one can select the kth smallest element from n b-bit words in Θ((b / log b) max{log n − log b, 1}) time, which implies that if the elements come from a polynomial range, one can...