On the Fault Tolerance of Some Popular Bounded-Degree Networks
SIAM Journal on Computing, 1992
Cited by 44 (7 self)
In this paper, we analyze the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things, we show that an $N$-node butterfly containing $N^{1-\epsilon}$ worst-case faults (for any constant $\epsilon > 0$) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proved for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance. We also show that an $N$-node butterfly whose nodes fail with some constant probability $p$ can emulate a fault-free version of itself with a slowdown of $2^{O(\log^* N)}$, which is a very slowly increasing function of $N$. The proofs of these results combine the technique of redundant computation with new algorithms for (packet) routing around faults in hypercubic networks. Tech...
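For orientation, the butterfly network discussed in this abstract can be built in a few lines. The Python sketch below uses one common convention (2^k rows, k + 1 levels, a straight edge and a cross edge leaving each node); the function name `butterfly_edges` and the choice of which bit is flipped at each level are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter


def butterfly_edges(k):
    """Edges of a butterfly with 2**k rows and k + 1 levels.

    Node (level, row) is joined to (level + 1, row) by a straight edge and
    to (level + 1, row with bit `level` flipped) by a cross edge.
    """
    edges = []
    for level in range(k):
        for row in range(2 ** k):
            edges.append(((level, row), (level + 1, row)))
            edges.append(((level, row), (level + 1, row ^ (1 << level))))
    return edges


def max_degree(edges):
    """Largest number of edges incident to any single node."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(degree.values())
```

Under this convention the network has (k + 1) * 2^k nodes and no node has degree above 4 regardless of k, which is what "bounded degree" refers to.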
Approximate Load Balancing on Dynamic and Asynchronous Networks
In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, 1993
Cited by 41 (3 self)
This paper presents a simple local algorithm for load balancing in a distributed network. The algorithm makes no assumption about the structure of the network. It can be executed on a synchronous network with fixed topology, a synchronous network with dynamically changing topology, or an asynchronous network. It works quickly and balances well when the network has an expansion property. In particular, we show that in an $n$-node network with maximum degree $d$ whose live edges, at every time step, form a $\mu$-expander, the algorithm will balance the load to within an additive $O(d \log n / \mu)$ term in $O(\Delta \log(n\Delta) / \mu)$ time, where $\Delta$ is the initial imbalance. The algorithm improves upon previous approaches that yield $O(n)$ time bounds in dynamic and asynchronous networks. 1 Introduction One of the most fundamental problems to solve on a parallel computer or distributed network is to balance the load or work that must be performed among the various processors. This paper analyzes a sim...
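As a rough illustration of what a local load-balancing rule can look like, the Python sketch below moves one unit of load across an edge whenever its endpoints differ by more than a threshold. The abstract does not spell out the paper's actual rule, so the threshold rule, the names `balance_step` and `balance`, and the ring example are all assumptions made for illustration.

```python
def balance_step(load, edges, threshold):
    """One synchronous round of a hypothetical local rule.

    For every edge, one unit moves from the heavier endpoint to the lighter
    one if their loads at the start of the round differ by more than
    `threshold`. Returns the new loads and the number of units moved.
    """
    delta = [0] * len(load)
    moved = 0
    for u, v in edges:
        if load[u] - load[v] > threshold:
            delta[u] -= 1
            delta[v] += 1
            moved += 1
        elif load[v] - load[u] > threshold:
            delta[v] -= 1
            delta[u] += 1
            moved += 1
    return [x + d for x, d in zip(load, delta)], moved


def balance(load, edges, threshold, max_rounds=10_000):
    """Run rounds until no edge is over the threshold (or max_rounds)."""
    for rounds in range(max_rounds):
        load, moved = balance_step(load, edges, threshold)
        if moved == 0:
            return load, rounds
    return load, max_rounds


# Example: 8 nodes on a ring (degree d = 2), all load starting on node 0.
ring = [(i, (i + 1) % 8) for i in range(8)]
final_load, rounds_used = balance([40, 0, 0, 0, 0, 0, 0, 0], ring, threshold=4)
```

Each node only inspects its own neighbours, so the rule needs no global view of the network. The example sets the threshold to twice the degree, which keeps integer loads non-negative and guarantees that the rounds eventually stop.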
Fault-tolerant data structures
In Proceedings of 37th IEEE FOCS, 1996
Cited by 38 (1 self)
We consider the tolerance of data structures to memory faults. We observe that many pointer-based data structures (e.g. linked lists, trees, etc.) are highly non-resilient to faults. A single fault in a linked list or tree may result in the loss of the entire set of data. In this paper we present a formal framework for studying the fault tolerance properties of pointer-based data structures, and we provide fault-tolerant versions of the stack, the linked list, and the dictionary tree.
Fast Algorithms for Bit-Serial Routing on a Hypercube
1991
Cited by 36 (9 self)
In this paper, we describe an $O(\log N)$-bit-step randomized algorithm for bit-serial message routing on a hypercube. The result is asymptotically optimal, and improves upon the best previously known algorithms by a logarithmic factor. The result also solves the problem of online circuit switching in an $O(1)$-dilated hypercube (i.e., the problem of establishing edge-disjoint paths between the nodes of the dilated hypercube for any one-to-one mapping). Our algorithm is adaptive and we show that this is necessary to achieve the logarithmic speedup. We generalize the Borodin-Hopcroft lower bound on oblivious routing by proving that any randomized oblivious algorithm on a polylogarithmic degree network requires at least $\Omega(\log^2 N / \log\log N)$ bit steps with high probability for almost all permutations. 1 Introduction Substantial effort has been devoted to the study of store-and-forward packet routing algorithms for hypercubic networks. The fastest algorithms are randomized, and c...
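For readers unfamiliar with the term, an oblivious routing algorithm chooses each packet's path from its source and destination alone, ignoring all other traffic. A standard textbook example is bit-fixing on a hypercube, sketched in Python below. This is not the adaptive algorithm the abstract describes, and the function name `bit_fixing_path` is hypothetical.

```python
def bit_fixing_path(src, dst, n):
    """Node sequence from src to dst in an n-dimensional hypercube.

    Nodes are integers in [0, 2**n). The path corrects the address bits in
    which src and dst differ, from dimension 0 up to dimension n - 1, so it
    depends only on (src, dst): the routing is oblivious.
    """
    path = [src]
    cur = src
    for i in range(n):
        bit = 1 << i
        if (cur ^ dst) & bit:
            cur ^= bit  # cross the edge along dimension i
            path.append(cur)
    return path


# Example: route from node 000 to node 101 in a 3-cube.
example_path = bit_fixing_path(0b000, 0b101, 3)  # [0, 1, 5]
```

The path length is always the Hamming distance between source and destination, but because the path never adapts to congestion, some permutations force many packets through the same edges. That weakness is what lower bounds on oblivious routing quantify.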
Packet Routing In Fixed-Connection Networks: A Survey
1998
Cited by 29 (3 self)
We survey routing problems on fixed-connection networks. We consider many aspects of the routing problem and provide known theoretical results for various communication models. We focus on (partial) permutation, $k$-relation routing, routing to random destinations, dynamic routing, isotonic routing, fault-tolerant routing, and related sorting results. We also provide a list of unsolved problems and numerous references.
Fast Algorithms for Routing Around Faults in Multibutterflies and Randomly-Wired Splitter Networks
IEEE Transactions on Computers, 1992
Cited by 27 (8 self)
This paper describes simple deterministic $O(\log N)$-step algorithms for routing permutations of packets in multibutterflies and randomly-wired splitter networks. The algorithms are robust against faults (even in the worst case), and are efficient from a practical point of view. As a consequence, we find that the multibutterfly is an excellent candidate for a high-bandwidth, low-diameter switching network underlying a shared-memory machine. Index Terms: Fault tolerance, interconnection network, multibutterfly, multistage network, routing algorithm. 1 Introduction Networks derived from hypercubes form the architectural basis of most parallel computers, including machines such as the BBN Butterfly, the Connection Machine, the IBM RP3 and GF11, the Intel iPSC, and the NCUBE. The butterfly, in particular, is quite popular, and has been demonstrated to perform reasonably well in practice. An example of an 8-input butterfly is illustrated in Figure 1. The nodes in this graph represent switches,...
Fault-Tolerant Meshes with Small Degree
1993
Cited by 17 (0 self)
This paper presents constructions for fault-tolerant two-dimensional mesh architectures. The constructions are designed to tolerate $k$ faults while maintaining a healthy $n$ by $n$ mesh as a subgraph. They utilize several novel techniques for obtaining trade-offs between the number of spare nodes and the degree of the fault-tolerant network. We consider both worst-case and random fault distributions. In terms of worst-case faults, we give a construction that has constant degree and $O(k^3)$ spare nodes. This is the first construction known in which the degree is constant and the number of spare nodes is independent of $n$. In terms of random faults, we present several new degree-6 and degree-8 constructions and show (both analytically and through simulations) that they can tolerate large numbers of randomly placed faults. A preliminary version of this paper appeared in Proceedings of the Fifth Annual ACM Symposium on Parallel Algorithms and Architectures, 1993. † California Institute of...
Optimal Routing of Parentheses on the Hypercube
In Proceedings of the Symposium on Parallel Architectures and Algorithms, 1994
Cited by 14 (6 self)
We consider a new class of routing requests or partial permutations for which we give optimal online routing algorithms on the hypercube and shuffle-exchange network. For well-formed words of parentheses our algorithm establishes communication between all matching pairs in logarithmic time. It can be applied to the membership problem for Dyck languages and a number of problems for algebraic expressions.
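To make the routing requests concrete: in a well-formed word, each opening parenthesis has to communicate with its matching closing parenthesis. The sequential Python sketch below computes those pairs with a stack. It only illustrates the input to the routing problem, not the parallel algorithm of the paper, and the name `matching_pairs` is hypothetical.

```python
def matching_pairs(word):
    """Return sorted (open_index, close_index) pairs for a well-formed word.

    Raises ValueError if the word contains other characters or is not
    well formed.
    """
    stack = []  # positions of opening parentheses not yet matched
    pairs = []
    for pos, ch in enumerate(word):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        else:
            raise ValueError(f"unexpected character {ch!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Example: positions 0..5 of the word "(()())".
example_pairs = matching_pairs("(()())")  # [(0, 5), (1, 2), (3, 4)]
```

If position i of the word is stored at processor i, each pair (i, j) is one source-destination request, and the set of pairs forms the partial permutation to be routed.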
Tolerating Faults in Hypercubes using Subcube Partitioning
IEEE Transactions on Computers, 1992
Cited by 13 (1 self)
We examine the issue of running algorithms on a hypercube which has both node and edge faults, and we assume a worst-case distribution of the faults. We prove that for any constant $c$, an $n$-dimensional hypercube ($n$-cube) with $n^c$ faulty components contains a fault-free subgraph that can implement a large class of hypercube algorithms with only a constant-factor slowdown. In addition, our approach yields practical implementations for small numbers of faults. For example, we show that any regular algorithm can be implemented on an $n$-cube that has at most $n - 1$ faults with slowdowns of at most 2 for computation and at most 4 for communication. To the best of our knowledge this is the first result showing that an $n$-cube can tolerate more than $O(n)$ arbitrarily placed faults with a constant-factor slowdown. This work was done while the author was with IBM Almaden Research Center. 1 Introduction The $n$-dimensional hypercube ($n$-cube) is one of the most popular interconnection topolog...