Results 1-10 of 74
Methods and problems of communication in usual networks
, 1994
Cited by 118 (12 self)
This paper is a survey of existing methods of communication in usual networks. We particularly study the complete network, the ring, the torus, the grid, the hypercube, the cube-connected cycles, the undirected de Bruijn graph, the star graph, the shuffle-exchange graph, and the butterfly graph. Two different models of communication time are analysed, namely the constant model and the linear model. Other constraints like full-duplex or half-duplex links, processor-bound, DMA-bound or link-bound possibilities are separately studied. For each case we give references, upper bound (algorithms) and lower bounds. We have also proposed improvements or new results when possible. Hopefully, optimal results are not always known and we present a list of open problems.
Deterministic Sorting in Nearly Logarithmic Time on the Hypercube and Related Computers
 Journal of Computer and System Sciences
, 1996
Cited by 72 (10 self)
This paper presents a deterministic sorting algorithm, called Sharesort, that sorts n records on an n-processor hypercube, shuffle-exchange, or cube-connected cycles in $O(\log n (\log \log n)^2)$ time in the worst case. The algorithm requires only a constant amount of storage at each processor. The fastest previous deterministic algorithm for this problem was Batcher's bitonic sort, which runs in $O(\log^2 n)$ time. Supported by an NSERC postdoctoral fellowship, and DARPA contracts N00014-87-K-825 and N00014-89-J-1988. 1 Introduction. Given n records distributed uniformly over the n processors of some fixed interconnection network, the sorting problem is to route the record with the ith largest associated key to processor i, $0 \le i < n$. One of the earliest parallel sorting algorithms is Batcher's bitonic sort [3], which runs in $O(\log^2 n)$ time on the hypercube [10], shuffle-exchange [17], and cube-connected cycles [14]. More recently, Leighton [9] exhibited a bounded-degree,...
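As a point of reference for the baseline mentioned in this abstract, the following is a minimal sequential sketch of Batcher's bitonic sort (not Sharesort, which the abstract does not describe in enough detail to reproduce). The function name `bitonic_sort` and the step counter are illustrative assumptions. Each pass of the inner `while` loop corresponds to one parallel compare-exchange step in which element i is paired with element i XOR j, that is, with a neighbour along a single hypercube dimension.

```python
def bitonic_sort(keys):
    """Sort a power-of-two-length list with Batcher's bitonic network.

    Returns (sorted_list, steps), where steps counts the compare-exchange
    stages that would each run as one parallel step on n processors.
    """
    a = list(keys)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    steps = 0
    k = 2  # size of the bitonic sequences being merged in this phase
    while k <= n:
        j = k // 2  # distance between compared elements (one address bit)
        while j >= 1:
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks with bit k clear sort ascending, others descending.
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            steps += 1
            j //= 2
        k *= 2
    return a, steps
```

For n = 8 the sketch uses 1 + 2 + 3 = 6 stages, which matches the log n (log n + 1) / 2 stage count behind the $O(\log^2 n)$ bound.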
Parallel Algorithmic Techniques for Combinatorial Computation
 Ann. Rev. Comput. Sci
, 1988
Cited by 35 (3 self)
this paper and supplied many helpful comments. This research was supported in part by NSF grants DCR-8511713, CCR-8605353, and CCR-8814977, and by DARPA contract N00039-84-C-0165.
The Complexity of Computation on the Parallel Random Access Machine
, 1993
Cited by 34 (3 self)
PRAMs also approximate the situation where communication to and from shared memory is much more expensive than local operations, for example, where each processor is located on a separate chip and access to shared memory is through a combining network. Not surprisingly, abstract PRAMs can be much more powerful than restricted instruction set PRAMs. THEOREM 21.16 Any function of n variables can be computed by an abstract EROW PRAM in $O(\log n)$ steps using $n / \log_2 n$ processors and $n / (2 \log_2 n)$ shared memory cells. PROOF Each processor begins by reading $\log_2 n$ input values and combining them into one large value. The information known by the processors is combined in a binary-tree-like fashion. In each round, the remaining processors are grouped into pairs. In each pair, one processor communicates the information it knows about the input to the other processor and then leaves the computation. After $\lceil \log_2 n \rceil$ rounds, one processor knows all n input values. Then this processor computes th...
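The pairing argument in this proof can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions: it is not a PRAM implementation, the name `combine_in_rounds` is hypothetical, and each list stands in for the information one processor currently knows. In every round the remaining participants are paired, one member of each pair hands over what it knows and drops out, so p participants need ceil(log2 p) rounds.

```python
import math


def combine_in_rounds(blocks):
    """Merge per-processor blocks pairwise until one participant holds all.

    Returns (all_values, rounds). Each round halves (rounding up) the
    number of active participants, mirroring the binary-tree-like merge.
    """
    active = [list(b) for b in blocks]
    rounds = 0
    while len(active) > 1:
        merged = []
        # Pair neighbours: the second of each pair passes its data on and leaves.
        for i in range(0, len(active) - 1, 2):
            merged.append(active[i] + active[i + 1])
        # An unpaired participant simply survives to the next round.
        if len(active) % 2 == 1:
            merged.append(active[-1])
        active = merged
        rounds += 1
    return (active[0] if active else []), rounds


values, rounds = combine_in_rounds([[i] for i in range(5)])
assert rounds == math.ceil(math.log2(5))
```

Since the number of participants never exceeds n, the round count stays within the bound quoted in the proof.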
Basic Operations on the OTIS-Mesh Optoelectronic Computer
 IEEE Transactions on Parallel and Distributed Systems
, 1999
Cited by 32 (5 self)
In this paper we develop algorithms for some basic operations (broadcast, window broadcast, prefix sum, data sum, rank, shift, data accumulation, consecutive sum, adjacent sum, concentrate, distribute, generalize, sorting, random access read and write) on the OTIS-Mesh [1] model. These operations are useful in the development of efficient algorithms for numerous applications [2].
Image Processing On The OTIS-Mesh Optoelectronic Computer
 IEEE Transactions on Parallel and Distributed Systems
, 2000
Cited by 22 (2 self)
We develop algorithms for histogramming, histogram modification, Hough transform, and image shrinking and expanding on an OTIS-Mesh optoelectronic computer. Our algorithm for the Hough transform is based upon a mesh algorithm for the Hough transform which is also developed in this paper. This new mesh algorithm improves upon the mesh Hough transform algorithms of [4] and [14].
Implementing data structures on a hypercube multiprocessor, and applications in parallel computational geometry
 Journal of Parallel and Distributed Computing
, 1990
Cited by 18 (6 self)
In this paper, we study the problem of implementing standard data structures on a hypercube multiprocessor. We present a technique for efficiently executing multiple independent search processes on a class of graphs called ordered h-level graphs. We show how this technique can be utilized to implement a segment tree on a hypercube, thereby obtaining $O(\log^2 n)$ time algorithms for solving the next element search problem, the trapezoidal decomposition problem, the triangulation problem, and the (multiple) planar point location problem.
Fundamental Algorithms for the Star and Pancake Interconnection Networks with Applications to Computational Geometry
, 1993
Cited by 17 (3 self)
The star and pancake networks were recently proposed as attractive alternatives to the hypercube topology for interconnecting processors in a parallel computer. However, few parallel algorithms are known for these networks. In this paper, we present several data communication schemes and basic algorithms for these two networks. These algorithms are then used to develop parallel solutions to various computational geometric problems on both networks. Computational geometry is just one area where the algorithms proposed here can be applied. Indeed, we believe that these algorithms are interesting and important in their own right and are fundamental to the design of solutions on the star and pancake networks to a host of other problems.
Optimal Routing of Parentheses on the Hypercube
 In Proceedings of the Symposium on Parallel Architectures and Algorithms
, 1994
Cited by 15 (6 self)
We consider a new class of routing requests or partial permutations for which we give optimal online routing algorithms on the hypercube and shuffle-exchange network. For well-formed words of parentheses our algorithm establishes communication between all matching pairs in logarithmic time. It can be applied to the membership problem for Dyck languages and a number of problems for algebraic expressions.
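To make the routing request concrete, the sketch below computes, sequentially and with a plain stack, which positions of a well-formed word of parentheses must communicate with each other. It is not the paper's parallel routing algorithm, whose details the abstract does not give; the function name `matching_pairs` is a hypothetical choice, and the result is the partial permutation (each position mapped to its partner) that such a router would have to realise.

```python
def matching_pairs(word):
    """Return a dict mapping each position to the position of its match.

    Raises ValueError if the word is not a well-formed word of parentheses.
    """
    stack = []    # positions of currently unmatched "("
    partner = {}  # symmetric map: opener -> closer and closer -> opener
    for pos, ch in enumerate(word):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unmatched ')' at position %d" % pos)
            open_pos = stack.pop()
            partner[open_pos] = pos
            partner[pos] = open_pos
        else:
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unmatched '(' at position %d" % stack[-1])
    return partner
```

For the word `(()())`, position 0 pairs with 5, 1 with 2, and 3 with 4; rejecting ill-formed input is also the sequential analogue of the Dyck-language membership test mentioned in the abstract.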
Scalable Data Parallel Implementations of Object Recognition using Geometric Hashing
, 1994
Cited by 11 (7 self)
Object recognition involves identifying known objects in a given scene. It plays a key role in image understanding. Geometric hashing has been proposed as a technique for model-based object recognition in occluded scenes. However, parallel techniques are needed to realize real-time vision systems employing geometric hashing. In this paper, we present scalable parallel algorithms for object recognition using geometric hashing. We define a realistic abstract model of CM-5 in which explicit cost is associated with data routing and synchronization. We develop a load-balancing technique that results in scalable processor-time optimal algorithms for performing a probe on this model. Given a model of CM-5 with P PNs and a set S of feature points in a scene, a probe of the recognition phase can be performed in $O(|V(S)| / P)$ time, where V(S) is the set of votes cast by feature points in S. This algorithm is scalable in the range $1 \le P \le |V(S)|^{1/3}$. On a mesh processor array of size $\sqrt{P}$...