Results 1  10
of
115
A Comparison of Sorting Algorithms for the Connection Machine CM2
"... We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and Valiant's flashsort. We have also evaluated the implementation of many other sorting algorithms pro ..."
Abstract

Cited by 177 (5 self)
 Add to MetaCart
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and Valiant's flashsort. We have also evaluated the implementation of many other sorting algorithms proposed in the literature. Our computational experiments show that the sample sort algorithm, which is a theoretically efficient "randomized" algorithm, is the fastest of the three algorithms on large data sets. On a 64Kprocessor CM2, our sample sort implementation can sort 32 10 6 64bit keys in 5.1 seconds, which is over 10 times faster than the CM2 library sort. Our implementation of radix sort, although not as fast on large data sets, is deterministic, much simpler to code, stable, faster with small keys, and faster on small data sets (few elements per processor). Our implementation of bitonic sort, which is pipelined to use all the hypercube wires simultaneously, is the least efficient of the three on large data sets, but is the most efficient on small data sets, and is considerably more space efficient. This paper analyzes the three algorithms in detail and discusses many practical issues that led us to the particular implementations.
Efficient algorithms for geometric optimization
 ACM Comput. Surv
, 1998
"... We review the recent progress in the design of efficient algorithms for various problems in geometric optimization. We present several techniques used to attack these problems, such as parametric searching, geometric alternatives to parametric searching, pruneandsearch techniques for linear progra ..."
Abstract

Cited by 100 (12 self)
 Add to MetaCart
We review the recent progress in the design of efficient algorithms for various problems in geometric optimization. We present several techniques used to attack these problems, such as parametric searching, geometric alternatives to parametric searching, pruneandsearch techniques for linear programming and related problems, and LPtype problems and their efficient solution. We then describe a variety of applications of these and other techniques to numerous problems in geometric optimization, including facility location, proximity problems, statistical estimators and metrology, placement and intersection of polygons and polyhedra, and ray shooting and other querytype problems.
Lossless condensers, unbalanced expanders, and extractors
 In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing
, 2001
"... Abstract Trevisan showed that many pseudorandom generator constructions give rise to constructionsof explicit extractors. We show how to use such constructions to obtain explicit lossless condensers. A lossless condenser is a probabilistic map using only O(log n) additional random bitsthat maps n bi ..."
Abstract

Cited by 90 (19 self)
 Add to MetaCart
Abstract Trevisan showed that many pseudorandom generator constructions give rise to constructionsof explicit extractors. We show how to use such constructions to obtain explicit lossless condensers. A lossless condenser is a probabilistic map using only O(log n) additional random bitsthat maps n bits strings to poly(log K) bit strings, such that any source with support size Kis mapped almost injectively to the smaller domain. Our construction remains the best lossless condenser to date.By composing our condenser with previous extractors, we obtain new, improved extractors. For small enough minentropies our extractors can output all of the randomness with only O(log n) bits. We also obtain a new disperser that works for every entropy loss, uses an O(log n)bit seed, and has only O(log n) entropy loss. This is the best disperser construction to date,and yields other applications. Finally, our lossless condenser can be viewed as an unbalanced
Expanders that Beat the Eigenvalue Bound: Explicit Construction and Applications
 Combinatorica
, 1993
"... For every n and 0 ! ffi ! 1, we construct graphs on n nodes such that every two sets of size n ffi share an edge, having essentially optimal maximum degree n 1\Gammaffi+o(1) . Using known and new reductions from these graphs, we explicitly construct: 1. A k round sorting algorithm using n 1+1=k ..."
Abstract

Cited by 90 (26 self)
 Add to MetaCart
For every n and 0 ! ffi ! 1, we construct graphs on n nodes such that every two sets of size n ffi share an edge, having essentially optimal maximum degree n 1\Gammaffi+o(1) . Using known and new reductions from these graphs, we explicitly construct: 1. A k round sorting algorithm using n 1+1=k+o(1) comparisons. 2. A k round selection algorithm using n 1+1=(2 k \Gamma1)+o(1) comparisons. 3. A depth 2 superconcentrator of size n 1+o(1) . 4. A depth k widesense nonblocking generalized connector of size n 1+1=k+o(1) . All of these results improve on previous constructions by factors of n\Omega\Gamma37 , and are optimal to within factors of n o(1) . These results are based on an improvement to the extractor construction of Nisan & Zuckerman: our algorithm extracts an asymptotically optimal number of random bits from a defective random source using a small additional number of truly random bits. 1
Randomized routing and sorting on fixedconnection networks
 JOURNAL OF ALGORITHMS
, 1994
"... This paper presents a general paradigm for the design of packet routing algorithms for fixedconnection networks. Its basis is a randomized online algorithm for scheduling any set of N packets whose paths have congestion c on any boundeddegree leveled network with depth L in O(c + L + log N) steps ..."
Abstract

Cited by 89 (13 self)
 Add to MetaCart
This paper presents a general paradigm for the design of packet routing algorithms for fixedconnection networks. Its basis is a randomized online algorithm for scheduling any set of N packets whose paths have congestion c on any boundeddegree leveled network with depth L in O(c + L + log N) steps, using constantsize queues. In this paradigm, the design of a routing algorithm is broken into three parts: (1) showing that the underlying network can emulate a leveled network, (2) designing a path selection strategy for the leveled network, and (3) applying the scheduling algorithm. This strategy yields randomized algorithms for routing and sorting in time proportional to the diameter for meshes, butterflies, shuffleexchange graphs, multidimensional arrays, and hypercubes. It also leads to the construction of an areauniversal network: an Nnode network with area Θ(N) that can simulate any other network of area O(N) with slowdown O(log N).
Deterministic Sorting in Nearly Logarithmic Time on the Hypercube and Related Computers
 Journal of Computer and System Sciences
, 1996
"... This paper presents a deterministic sorting algorithm, called Sharesort, that sorts n records on an nprocessor hypercube, shuffleexchange, or cubeconnected cycles in O(log n (log log n) 2 ) time in the worst case. The algorithm requires only a constant amount of storage at each processor. Th ..."
Abstract

Cited by 68 (10 self)
 Add to MetaCart
This paper presents a deterministic sorting algorithm, called Sharesort, that sorts n records on an nprocessor hypercube, shuffleexchange, or cubeconnected cycles in O(log n (log log n) 2 ) time in the worst case. The algorithm requires only a constant amount of storage at each processor. The fastest previous deterministic algorithm for this problem was Batcher's bitonic sort, which runs in O(log 2 n) time. Supported by an NSERC postdoctoral fellowship, and DARPA contracts N0001487K825 and N00014 89J1988. 1 Introduction Given n records distributed uniformly over the n processors of some fixed interconnection network, the sorting problem is to route the record with the ith largest associated key to processor i, 0 i ! n. One of the earliest parallel sorting algorithms is Batcher's bitonic sort [3], which runs in O(log 2 n) time on the hypercube [10], shuffleexchange [17], and cubeconnected cycles [14]. More recently, Leighton [9] exhibited a boundeddegree,...
CommunicationEfficient Parallel Sorting
, 1996
"... We study the problem of sorting n numbers on a pprocessor bulksynchronous parallel (BSP) computer, which is a parallel multicomputer that allows for general processortoprocessor communication rounds provided each processor sends and receives at most h items in any round. We provide parallel sort ..."
Abstract

Cited by 65 (2 self)
 Add to MetaCart
We study the problem of sorting n numbers on a pprocessor bulksynchronous parallel (BSP) computer, which is a parallel multicomputer that allows for general processortoprocessor communication rounds provided each processor sends and receives at most h items in any round. We provide parallel sorting methods that use internal computation time that is O( n log n p ) and a number of communication rounds that is O( log n log(h+1) ) for h = \Theta(n=p). The internal computation bound is optimal for any comparisonbased sorting algorithm. Moreover, the number of communication rounds is bounded by a constant for the (practical) situations when p n 1\Gamma1=c for a constant c 1. In fact, we show that our bound on the number of communication rounds is asymptotically optimal for the full range of values for p, for we show that just computing the "or" of n bits distributed evenly to the first O(n=h) of an arbitrary number of processors in a BSP computer requires\Omega\Gammaqui n= log(h...
Online algorithms for path selection in a nonblocking network
 SIAM Journal on Computing
, 1996
"... This paper presents the first optimaltime algorithms for path selection in an optimalsize nonblocking network. In particular, we describe an Ninput, Noutput, nonblocking network with O(N log N) boundeddegree nodes, and an algorithm that can satisfy any request for a connection or disconnection ..."
Abstract

Cited by 63 (14 self)
 Add to MetaCart
This paper presents the first optimaltime algorithms for path selection in an optimalsize nonblocking network. In particular, we describe an Ninput, Noutput, nonblocking network with O(N log N) boundeddegree nodes, and an algorithm that can satisfy any request for a connection or disconnection between an input and an output in O(log N) bit steps, even if many requests are made at once. Viewed in a telephone switching context, the algorithm can put through any set of calls among N parties in O(log N) bit steps, even if many calls are placed simultaneously. Parties can hang up and call again whenever they like; every call is still put through O(log N) bit steps after being placed. Viewed in a distributed memory machine context, our algorithm allows any processor to access any idle block of memory within O(log N) bit steps, no matter what other connections have been made previously or are being made simultaneously.
Simulating Boolean Circuits on a DNA Computer
, 1997
"... We demonstrate that DNA computers can simulate Boolean circuits with a small overhead. Boolean circuits embody the notion of massively parallel signal processing and are jrequen,tly encountered in many parallel algorithms. Many important problems such as sorting, integer arithmetic, and matrix mult ..."
Abstract

Cited by 58 (9 self)
 Add to MetaCart
We demonstrate that DNA computers can simulate Boolean circuits with a small overhead. Boolean circuits embody the notion of massively parallel signal processing and are jrequen,tly encountered in many parallel algorithms. Many important problems such as sorting, integer arithmetic, and matrix multiplication are known to be computable by small size Boolean circuits much faster than by ordinary sequential digital computers. This paper shows that DNA chemistry allows one to simulate large semiunbounded janin Boolean circuits with a logarithmic slowdown in computation time. Also, for the class NC¹, the slowdown can be reduced to a constant. In this algorathm we have encoded the inputs, the Boolean AND gates, and the OR gates to DNA oligonucleotide sequences. We operate on the gates and the inputs by standard molecular techniques of sequencespecific annealing, ligation, separation by size, amplification, sequencespecific cleavage, and detection by size. Additional steps of amplification are not necessary for NC¹ circuits. Preliminary biochemical experiments on a small test circuit have produced encouraging results. Further confirmatory experiments are in progress.
Iterated Nearest Neighbors and Finding Minimal Polytopes
, 1994
"... Weintroduce a new method for finding several types of optimal kpoint sets, minimizing perimeter, diameter, circumradius, and related measures, by testing sets of the O(k) nearest neighbors to each point. We argue that this is better in a number of ways than previous algorithms, whichwere based o ..."
Abstract

Cited by 57 (6 self)
 Add to MetaCart
Weintroduce a new method for finding several types of optimal kpoint sets, minimizing perimeter, diameter, circumradius, and related measures, by testing sets of the O(k) nearest neighbors to each point. We argue that this is better in a number of ways than previous algorithms, whichwere based on high order Voronoi diagrams. Our technique allows us for the first time to efficiently maintain minimal sets as new points are inserted, to generalize our algorithms to higher dimensions, to find minimal convex kvertex polygons and polytopes, and to improvemany previous results. Weachievemany of our results via a new algorithm for finding rectilinear nearest neighbors in the plane in time O(n log n+kn). We also demonstrate a related technique for finding minimum area kpoint sets in the plane, based on testing sets of nearest vertical neighbors to each line segment determined by a pair of points. A generalization of this technique also allows us to find minimum volume and boundary measure sets in arbitrary dimensions.