Randomized routing and sorting on fixed-connection networks
- Journal of Algorithms, 1994
- Cited by 84 (13 self)
This paper presents a general paradigm for the design of packet routing algorithms for fixed-connection networks. Its basis is a randomized on-line algorithm for scheduling any set of N packets whose paths have congestion c on any bounded-degree leveled network with depth L in O(c + L + log N) steps, using constant-size queues. In this paradigm, the design of a routing algorithm is broken into three parts: (1) showing that the underlying network can emulate a leveled network, (2) designing a path selection strategy for the leveled network, and (3) applying the scheduling algorithm. This strategy yields randomized algorithms for routing and sorting in time proportional to the diameter for meshes, butterflies, shuffle-exchange graphs, multidimensional arrays, and hypercubes. It also leads to the construction of an area-universal network: an N-node network with area Θ(N) that can simulate any other network of area O(N) with slowdown O(log N).
Special Purpose Parallel Computing
- Lectures on Parallel Computation, 1993
- Cited by 77 (5 self)
A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given. 1. Introduction Turing [365] demonstrated that, in principle, a single general purpose sequential machine could be designed which would be capable of efficiently performing any computation which could be performed by a special purpose sequential machine. The importance of this universality result for subsequent practical developments in computing cannot be overstated. It showed that, for a given computational problem, the additional efficiency advantages which could be gained by designing a special purpose sequential machine for that problem would not be great. Around 1944, von Neumann produced a proposal [66, 389] for a general purpose stored-program sequential computer which captured the fundamental principles of...
Deterministic Sorting in Nearly Logarithmic Time on the Hypercube and Related Computers
- Journal of Computer and System Sciences, 1996
- Cited by 67 (10 self)
This paper presents a deterministic sorting algorithm, called Sharesort, that sorts n records on an n-processor hypercube, shuffle-exchange, or cube-connected cycles in O(log n (log log n)^2) time in the worst case. The algorithm requires only a constant amount of storage at each processor. The fastest previous deterministic algorithm for this problem was Batcher's bitonic sort, which runs in O(log^2 n) time. Supported by an NSERC postdoctoral fellowship, and DARPA contracts N00014-87-K-825 and N00014-89-J-1988. 1 Introduction Given n records distributed uniformly over the n processors of some fixed interconnection network, the sorting problem is to route the record with the ith largest associated key to processor i, 0 ≤ i < n. One of the earliest parallel sorting algorithms is Batcher's bitonic sort [3], which runs in O(log^2 n) time on the hypercube [10], shuffle-exchange [17], and cube-connected cycles [14]. More recently, Leighton [9] exhibited a bounded-degree,...
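For orientation, the baseline this abstract compares against, Batcher's bitonic sort, can be sketched sequentially as follows. This is a minimal illustration and is not the Sharesort algorithm; the function name `bitonic_sort` and the stage counter are assumptions made for this example. Each pass of the inner `while` loop compares positions whose indices differ in one bit, which corresponds to one parallel compare-exchange step along a single hypercube dimension.

```python
def bitonic_sort(keys):
    """Sort a list whose length is a power of two with Batcher's bitonic network.

    Returns the sorted list and the number of compare-exchange stages used.
    (Illustrative sketch; names are hypothetical, not taken from the paper.)
    """
    a = list(keys)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    stages = 0
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # index distance between compared positions
            for i in range(n):
                partner = i ^ j   # neighbor across one hypercube dimension
                if partner > i:
                    ascending = (i & k) == 0
                    # Swap when the pair is out of order for its direction.
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            stages += 1
            j //= 2
        k *= 2
    return a, stages
```

For n = 2^m keys the network uses m(m + 1)/2 stages, which is where the O(log^2 n) running time quoted above comes from.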
On-line algorithms for path selection in a nonblocking network
- SIAM Journal on Computing, 1996
- Cited by 62 (14 self)
This paper presents the first optimal-time algorithms for path selection in an optimal-size nonblocking network. In particular, we describe an N-input, N-output, nonblocking network with O(N log N) bounded-degree nodes, and an algorithm that can satisfy any request for a connection or disconnection between an input and an output in O(log N) bit steps, even if many requests are made at once. Viewed in a telephone switching context, the algorithm can put through any set of calls among N parties in O(log N) bit steps, even if many calls are placed simultaneously. Parties can hang up and call again whenever they like; every call is still put through O(log N) bit steps after being placed. Viewed in a distributed memory machine context, our algorithm allows any processor to access any idle block of memory within O(log N) bit steps, no matter what other connections have been made previously or are being made simultaneously.
Eigenvalues and Expansion of Regular Graphs
- Journal of the ACM, 1995
- Cited by 46 (1 self)
The spectral method is the best currently known technique to prove lower bounds on expansion. Ramanujan graphs, which have asymptotically optimal second eigenvalue, are the best known explicit expanders. The spectral method yielded a lower bound of k/4 on the expansion of linear-sized subsets of k-regular Ramanujan graphs. We improve the lower bound on the expansion of Ramanujan graphs to approximately k/2. Moreover, we construct a family of k-regular graphs with asymptotically optimal second eigenvalue and linear expansion equal to k/2. This shows that k/2 is the best bound one can obtain using the second eigenvalue method. We also show an upper bound of roughly 1 + √(k − 1) on the average degree of linear-sized induced subgraphs of Ramanujan graphs. This compares positively with the classical bound 2√(k − 1). As a byproduct, we obtain improved results on random walks on expanders and construct selection networks (resp. extrovert graphs) of smaller size (resp. degree) th...
Tight Analyses of Two Local Load Balancing Algorithms
- SIAM Journal on Computing, 1999
- Cited by 45 (5 self)
This paper presents an analysis of the following load balancing algorithm. At each step, each node in a network examines the number of tokens at each of its neighbors and sends a token to each neighbor with at least 2d + 1 fewer tokens, where d is the maximum degree of any node in the network. We show that within O(∆/α) steps, the algorithm reduces the maximum difference in tokens between any two nodes to at most O((d^2 log n)/α), where ∆ is the global imbalance in tokens (i.e., the maximum difference between the number of tokens at any node initially and the average number of tokens), n is the number of nodes in the network, and α is the edge expansion of the network. The time bound is tight in the sense that for any graph with edge expansion α, and for any value ∆, there exists an initial distribution of tokens with imbalance ∆ for which the time to reduce the imbalance to even ∆/2 is at least Ω(∆/α). The bound on the final imbalance is tight in the sense that there exists a class of networks that can be locally balanced everywhere (i.e., the maximum difference in tokens between any two neighbors is at most 2d), while the global imbalance remains Ω((d^2 log n)/α). Furthermore, we show that upon reaching a state with a global imbalance of O((d^2 log n)/α), the time for this algorithm to locally balance the network can be as large as Ω(n^(1/2)). We extend our analysis to a variant of this algorithm for dynamic and asynchronous
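The local rule described in this abstract can be simulated in a few lines. The sketch below assumes a synchronous model in which every node decides from the token counts at the start of the step; the adjacency-dictionary representation, the function names `balance_step` and `balance`, and the `max_steps` safety cap are assumptions made for this illustration rather than details from the paper.

```python
def balance_step(adj, tokens):
    """Run one synchronous step of the local rule on an undirected graph.

    adj maps each node to a list of neighbors; tokens maps each node to its
    token count. Every node sends one token to each neighbor that holds at
    least 2d + 1 fewer tokens, where d is the maximum degree.
    """
    d = max(len(nbrs) for nbrs in adj.values())
    updated = dict(tokens)
    moved = 0
    for u, nbrs in adj.items():
        for v in nbrs:
            # Decisions use the counts from the start of the step.
            if tokens[u] - tokens[v] >= 2 * d + 1:
                updated[u] -= 1
                updated[v] += 1
                moved += 1
    return updated, moved


def balance(adj, tokens, max_steps=10_000):
    """Repeat balance_step until no token moves (or the hypothetical cap is hit)."""
    steps = 0
    while steps < max_steps:
        tokens, moved = balance_step(adj, tokens)
        if moved == 0:
            break
        steps += 1
    return tokens, steps
```

When the loop stops because no token moved, every pair of neighbors differs by at most 2d tokens, which is the locally balanced state the abstract refers to. The total number of tokens is unchanged by every step.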
Parallel Algorithms with Processor Failures and Delays
- 1995
- Cited by 40 (8 self)
We study efficient deterministic parallel algorithms on two models: restartable fail-stop CRCW PRAMs and asynchronous PRAMs. In the first model, synchronous processors are subject to arbitrary stop failures and restarts determined by an on-line adversary and involving loss of private but not shared memory; the complexity measures are completed work (where processors are charged for completed fixed-size update cycles) and overhead ratio (completed work amortized over necessary work and failures). In the second model, the result of the computation is a serialization of the actions of the processors determined by an on-line adversary; the complexity measure is total work (number of steps taken by all processors). Despite their differences, the two models share key algorithmic techniques. We present new algorithms for the Write-All problem (in which P processors write ones into an array of size N) for the two models. These algorithms can be used to implement a simulation strategy for any N ...
Scalable Network Architectures Using The Optical Transpose Interconnection System (OTIS)
- 1996
- Cited by 38 (0 self)
The Optical Transpose Interconnection System (OTIS) proposed in [14] makes use of free-space optical interconnects to augment an electronic system with non-local interconnections. In this paper, we show how these connections can be used to implement a large-scale system with a given network topology using small copies of a similar topology. In particular, we show that, using OTIS, an N^2-node 4-D mesh can be constructed from N copies of the N-node 2-D mesh, an N^2-node hypercube can be constructed from N copies of the N-node hypercube, and an (N^2, α^2, c/2) expander can be constructed from N copies of an (N, α, c) expander, all with small slowdown. We also show how this expander construction can be used to build multibutterfly networks in a scalable fashion. Finally, we demonstrate how the OTIS connections can be used to produce a bit-parallel crossbar using many copies of bit-serial crossbars with minimal overhead. 1 Introduction In principle, optical interconnect tec...
Protocols and impossibility results for gossip-based communication mechanisms
- 2002
- Cited by 38 (2 self)
In recent years, gossip-based algorithms have gained prominence as a methodology for designing robust and scalable communication schemes in large distributed systems. The premise underlying distributed gossip is very simple: in each time step, each node v in the system selects some other node w as a communication partner, generally by a simple randomized rule, and exchanges information with w; over a period of time, information spreads through the system in an “epidemic fashion”. A fundamental issue which is not well understood is the following: how does the underlying low-level gossip mechanism (the means by which communication partners are chosen) affect one’s ability to design efficient high-level gossip-based protocols? We establish one of the first concrete results addressing this question, by showing a fundamental limitation on the power of the commonly used uniform gossip mechanism for solving nearest-resource location problems. In contrast, very efficient protocols for this problem can be designed using a non-uniform spatial gossip mechanism, as established in earlier work with Alan Demers. We go on to consider the design of protocols for more complex problems, providing an efficient distributed gossip-based protocol for a set of nodes in Euclidean space to construct an approximate minimum spanning tree. Here too, we establish a contrasting limitation on the power of uniform gossip for solving this problem. Finally, we investigate gossip-based packet routing as a primitive that underpins the communication patterns in many protocols, and as a way to understand the capabilities of different gossip mechanisms at a general level.
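The uniform gossip mechanism described in this abstract can be illustrated with a small simulation. This is a sketch under stated assumptions and not the paper's protocol: it spreads a single rumor, treats an exchange as informing both partners if either one already knows the rumor, and uses hypothetical names (`uniform_gossip_rounds`, `seed`, `max_rounds`) chosen for this example.

```python
import random


def uniform_gossip_rounds(n, source=0, seed=0, max_rounds=10_000):
    """Count the rounds until a rumor starting at `source` reaches all n nodes.

    In each round every node picks a partner uniformly at random among the
    other nodes and the two exchange information. Exchanges within a round
    use the set of informed nodes from the start of that round.
    """
    rng = random.Random(seed)
    informed = {source}
    rounds = 0
    while len(informed) < n and rounds < max_rounds:
        newly_informed = set()
        for v in range(n):
            w = rng.randrange(n - 1)
            if w >= v:
                w += 1  # shift so that w is uniform over nodes other than v
            if (v in informed) != (w in informed):
                newly_informed.update((v, w))
        informed |= newly_informed
        rounds += 1
    return rounds
```

A non-uniform mechanism such as the spatial gossip mentioned in the abstract would replace only the partner-selection line; the rest of the loop would stay the same.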
Short Paths in Expander Graphs
- In Proceedings of the 37th Annual Symposium on Foundations of Computer Science, 1996
- Cited by 36 (1 self)
Graph expansion has proved to be a powerful general tool for analyzing the behavior of routing algorithms and the interconnection networks on which they run. We develop new routing algorithms and structural results for bounded-degree expander graphs. Our results are unified by the fact that they are all based upon, and extend, a body of work asserting that expanders are rich in short, disjoint paths. In particular, our work has consequences for the disjoint paths problem, multicommodity flow, and graph minor containment. We show: (i) A greedy algorithm for approximating the maximum disjoint paths problem achieves a polylogarithmic approximation ratio in bounded-degree expanders. Although our algorithm is both deterministic and on-line, its performance guarantee is an improvement over previous bounds in expanders. (ii) For a multicommodity flow problem with arbitrary demands on a bounded-degree expander, there is a (1 + ε)-optimal solution using only flow paths of polylogarithmi...

