On the Fault Tolerance of Some Popular Bounded-Degree Networks
SIAM Journal on Computing, 1992
Cited by 43 (6 self)
In this paper, we analyze the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things, we show that an N-node butterfly containing $N^{1-\epsilon}$ worst-case faults (for any constant $\epsilon > 0$) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proved for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance. We also show that an N-node butterfly whose nodes fail with some constant probability p can emulate a fault-free version of itself with a slowdown of $2^{O(\log^* N)}$, which is a very slowly increasing function of N. The proofs of these results combine the technique of redundant computation with new algorithms for (packet) routing around faults in hypercubic networks. Tech...
Approximate Load Balancing on Dynamic and Asynchronous Networks
In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, 1993
Cited by 39 (3 self)
This paper presents a simple local algorithm for load balancing in a distributed network. The algorithm makes no assumption about the structure of the network. It can be executed on a synchronous network with fixed topology, a synchronous network with dynamically changing topology, or an asynchronous network. It works quickly and balances well when the network has an expansion property. In particular, we show that in an n-node network with maximum degree d whose live edges, at every time step, form a $\mu$-expander, the algorithm will balance the load to within an additive $O(d \log n / \mu)$ term in $O(\Delta \log(n\Delta) / \mu)$ time, where $\Delta$ is the initial imbalance. The algorithm improves upon previous approaches that yield O(n) time bounds in dynamic and asynchronous networks.
1 Introduction
One of the most fundamental problems to solve on a parallel computer or distributed network is to balance the load or work that must be performed among the various processors. This paper analyzes a sim...
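To make the idea of a local load-balancing rule concrete, here is a minimal Python sketch. It is an illustration only and is not taken from the paper: the abstract does not state the rule, so the threshold of 2d, the one-token-per-edge transfer, and the names `balance_step` and `balance` are all assumptions. Each node looks only at its neighbors across live edges and needs no knowledge of the global topology.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def balance_step(load: Dict[int, int], live_edges: List[Edge],
                 max_degree: int) -> Dict[int, int]:
    """One synchronous round of a hypothetical local rule.

    Across every live edge, one token moves from the heavier endpoint to the
    lighter one if the gap (measured at the start of the round) exceeds
    2 * max_degree. The threshold is an assumption made for this sketch.
    """
    threshold = 2 * max_degree
    new_load = dict(load)
    for u, v in live_edges:
        gap = load[u] - load[v]
        if gap > threshold:
            new_load[u] -= 1
            new_load[v] += 1
        elif -gap > threshold:
            new_load[v] -= 1
            new_load[u] += 1
    return new_load


def balance(load: Dict[int, int], live_edges: List[Edge],
            max_degree: int, max_rounds: int = 10_000) -> Dict[int, int]:
    """Repeat balance_step on a fixed topology until no token moves."""
    for _ in range(max_rounds):
        new_load = balance_step(load, live_edges, max_degree)
        if new_load == load:
            break
        load = new_load
    return load


if __name__ == "__main__":
    # Path 0 - 1 - 2 with all 20 tokens initially on node 0.
    print(balance({0: 20, 1: 0, 2: 0}, [(0, 1), (1, 2)], max_degree=2))
```

Because a node sends at most one token per incident edge and only when it holds more than 2d tokens, its load never goes negative, and the total load is conserved. A dynamic network could be modeled by passing a different `live_edges` list to `balance_step` in each round.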
Fast Algorithms for Bit-Serial Routing on a Hypercube
1991
Cited by 36 (9 self)
In this paper, we describe an O(log N)-bit-step randomized algorithm for bit-serial message routing on a hypercube. The result is asymptotically optimal, and improves upon the best previously known algorithms by a logarithmic factor. The result also solves the problem of on-line circuit switching in an O(1)-dilated hypercube (i.e., the problem of establishing edge-disjoint paths between the nodes of the dilated hypercube for any one-to-one mapping). Our algorithm is adaptive and we show that this is necessary to achieve the logarithmic speedup. We generalize the Borodin-Hopcroft lower bound on oblivious routing by proving that any randomized oblivious algorithm on a polylogarithmic degree network requires at least $\Omega(\log^2 N / \log \log N)$ bit steps with high probability for almost all permutations.
1 Introduction
Substantial effort has been devoted to the study of store-and-forward packet routing algorithms for hypercubic networks. The fastest algorithms are randomized, and c...
Fault-tolerant data structures
In Proceedings of the 37th IEEE FOCS, 1996
Cited by 29 (1 self)
We consider the tolerance of data structures to memory faults. We observe that many pointer-based data structures (e.g., linked lists and trees) are highly nonresilient to faults. A single fault in a linked list or tree may result in the loss of the entire set of data. In this paper we present a formal framework for studying the fault tolerance properties of pointer-based data structures, and we provide fault-tolerant versions of the stack, the linked list, and the dictionary tree.
Packet Routing In Fixed-Connection Networks: A Survey
1998
Cited by 26 (3 self)
We survey routing problems on fixed-connection networks. We consider many aspects of the routing problem and provide known theoretical results for various communication models. We focus on (partial) permutation routing, k-relation routing, routing to random destinations, dynamic routing, isotonic routing, fault-tolerant routing, and related sorting results. We also provide a list of unsolved problems and numerous references.
Fast Algorithms for Routing Around Faults in Multibutterflies and Randomly-Wired Splitter Networks
IEEE Transactions on Computers, 1992
Cited by 25 (8 self)
This paper describes simple deterministic O(log N)-step algorithms for routing permutations of packets in multibutterflies and randomly-wired splitter networks. The algorithms are robust against faults (even in the worst case), and are efficient from a practical point of view. As a consequence, we find that the multibutterfly is an excellent candidate for a high-bandwidth low-diameter switching network underlying a shared-memory machine.
Index Terms: Fault tolerance, interconnection network, multibutterfly, multistage network, routing algorithm.
1 Introduction
Networks derived from hypercubes form the architectural basis of most parallel computers, including machines such as the BBN Butterfly, the Connection Machine, the IBM RP3 and GF11, the Intel iPSC, and the NCUBE. The butterfly, in particular, is quite popular, and has been demonstrated to perform reasonably well in practice. An example of an 8-input butterfly is illustrated in Figure 1. The nodes in this graph represent switches,...
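Since the figure is not reproduced in this listing, the following Python sketch builds the ordinary (single) butterfly that the excerpt mentions, not the multibutterfly studied in the paper. It assumes the common textbook convention in which switch (level, row) connects to (level + 1, row) and to (level + 1, row with bit `level` flipped); other texts order the bits differently. The function names `butterfly_edges` and `bit_fixing_path` are hypothetical.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (level, row)


def num_levels(n_inputs: int) -> int:
    """Return log2(n_inputs); n_inputs must be a power of two >= 2."""
    if n_inputs < 2 or n_inputs & (n_inputs - 1):
        raise ValueError("n_inputs must be a power of two >= 2")
    return n_inputs.bit_length() - 1


def butterfly_edges(n_inputs: int) -> List[Tuple[Node, Node]]:
    """List the edges of an n_inputs-input butterfly (assumed bit order)."""
    edges = []
    for level in range(num_levels(n_inputs)):
        for row in range(n_inputs):
            edges.append(((level, row), (level + 1, row)))                 # straight edge
            edges.append(((level, row), (level + 1, row ^ (1 << level))))  # cross edge
    return edges


def bit_fixing_path(src_row: int, dst_row: int, n_inputs: int) -> List[Node]:
    """Follow the unique input-to-output path by fixing one bit per level."""
    row = src_row
    path = [(0, row)]
    for level in range(num_levels(n_inputs)):
        bit = 1 << level
        if (row ^ dst_row) & bit:
            row ^= bit  # take the cross edge to correct this bit
        path.append((level + 1, row))
    return path


if __name__ == "__main__":
    print(len(butterfly_edges(8)))    # 3 levels of edges * 8 rows * 2 edges
    print(bit_fixing_path(0, 5, 8))
```

An 8-input butterfly built this way has 4 levels of 8 switches each. The single path between any input and output is what makes the plain butterfly fragile under faults, which is the motivation for the redundant paths of the multibutterfly.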
Fault-Tolerant Meshes with Small Degree
1993
Cited by 17 (0 self)
This paper presents constructions for fault-tolerant two-dimensional mesh architectures. The constructions are designed to tolerate k faults while maintaining a healthy n by n mesh as a subgraph. They utilize several novel techniques for obtaining trade-offs between the number of spare nodes and the degree of the fault-tolerant network. We consider both worst-case and random fault distributions. In terms of worst-case faults, we give a construction that has constant degree and $O(k^3)$ spare nodes. This is the first construction known in which the degree is constant and the number of spare nodes is independent of n. In terms of random faults, we present several new degree-6 and degree-8 constructions and show (both analytically and through simulations) that they can tolerate large numbers of randomly placed faults.
A preliminary version of this paper appeared in Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993. California Institute of...
Optimal Routing of Parentheses on the Hypercube
In Proceedings of the Symposium on Parallel Architectures and Algorithms, 1994
Cited by 14 (6 self)
We consider a new class of routing requests, or partial permutations, for which we give optimal on-line routing algorithms on the hypercube and shuffle-exchange network. For well-formed words of parentheses, our algorithm establishes communication between all matching pairs in logarithmic time. It can be applied to the membership problem for Dyck languages and a number of problems for algebraic expressions.
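The routing request described above is defined by the matching pairs of a well-formed word. The Python sketch below computes those pairs with an ordinary sequential stack, purely to show which source-destination pairs must communicate. It is not the paper's parallel routing algorithm, and the name `matching_pairs` is a hypothetical choice.

```python
from typing import List, Tuple


def matching_pairs(word: str) -> List[Tuple[int, int]]:
    """Return (open_index, close_index) for every matching pair in `word`.

    Raises ValueError if `word` is not a well-formed word of parentheses,
    which also makes this a membership test for the one-bracket Dyck language.
    """
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for i, ch in enumerate(word):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


if __name__ == "__main__":
    print(matching_pairs("(()())"))  # [(0, 5), (1, 2), (3, 4)]
```

If position i of the word is held by processor i, each returned pair names two processors that need a communication path between them.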
Tolerating Faults in Hypercubes using Subcube Partitioning
IEEE Transactions on Computers, 1992
Cited by 11 (1 self)
We examine the issue of running algorithms on a hypercube which has both node and edge faults, and we assume a worst-case distribution of the faults. We prove that for any constant c, an n-dimensional hypercube (n-cube) with $n^c$ faulty components contains a fault-free subgraph that can implement a large class of hypercube algorithms with only a constant-factor slowdown. In addition, our approach yields practical implementations for small numbers of faults. For example, we show that any regular algorithm can be implemented on an n-cube that has at most $n - 1$ faults with slowdowns of at most 2 for computation and at most 4 for communication. To the best of our knowledge, this is the first result showing that an n-cube can tolerate more than O(n) arbitrarily placed faults with a constant-factor slowdown.
This work was done while the author was with IBM Almaden Research Center.
1 Introduction
The n-dimensional hypercube (n-cube) is one of the most popular interconnection topolog...

