## Efficient Parallel Divide-and-Conquer for a Class of Interconnection Topologies. (1991)

Venue: Proceedings of the 2nd International Symposium on Algorithms, number 557 in Lecture Notes in Computer Science

Citations: 7 (1 self)

### BibTeX

@INPROCEEDINGS{Wu91efficientparallel,

author = {I-chen Wu},

title = {Efficient Parallel Divide-and-Conquer for a Class of Interconnection Topologies.},

booktitle = {In Proceedings of the 2nd International Symposium on Algorithms, number 557 in Lecture Notes in Computer Science},

year = {1991},

pages = {229--240},

publisher = {Springer-Verlag}

}

### Abstract

In this paper, we propose an efficient scheduling algorithm for expanding any divide-and-conquer (D&C) computation tree on $k$-dimensional mesh, hypercube, and perfect shuffle networks with $p$ processors. Assume that it takes $t_n$ time steps to expand one node of the tree and $t_c$ time steps to transmit one datum or convey one node. For any D&C computation tree with $N$ nodes, height $h$, and degree $d$ (the maximal number of children of any node), our algorithm requires at most $(N/p + h)\,t_n + \varphi d h\, t_c$ time steps, where $\varphi$ is $O(\log^2 p)$ on a hypercube or perfect shuffle network and $O(\sqrt[k]{p})$ on an $n_{k-1} \times \cdots \times n_0$ mesh network with $n_{k-1} = \cdots = n_0 = \sqrt[k]{p}$. The algorithm is general in the sense that it requires no a priori knowledge of the values of $N$, $h$, and $d$, or of the shape of the computation tree. Most importantly, we can easily obtain a linear speedup by nearly a factor of $p$, especially when $N \gg ph(1 + \varphi d t_c / t_n)$.
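To make the bound concrete, the following is a minimal sketch that evaluates it numerically. The function names are hypothetical, the network-dependent factor is passed in as a plain number `phi` (the abstract gives it only up to a constant), and the sequential baseline of `N * t_n` is an assumption made here for illustration, not a figure from the paper.

```python
def dc_time_bound(N, h, d, p, t_n, t_c, phi):
    """Upper bound (N/p + h) * t_n + phi * d * h * t_c from the abstract.

    phi is the network-dependent factor; its value here is an assumed
    input, since the abstract states it only in O-notation.
    """
    return (N / p + h) * t_n + phi * d * h * t_c


def speedup_estimate(N, h, d, p, t_n, t_c, phi):
    """Assumed sequential time N * t_n divided by the parallel bound.

    Algebraically this equals p / (1 + p*h*(1 + phi*d*t_c/t_n) / N),
    so it approaches p once N is much larger than p*h*(1 + phi*d*t_c/t_n).
    """
    return (N * t_n) / dc_time_bound(N, h, d, p, t_n, t_c, phi)


if __name__ == "__main__":
    # Illustrative values only: a binary tree with a million nodes on 64 processors.
    args = dict(N=10**6, h=20, d=2, p=64, t_n=1, t_c=1, phi=10)
    print(dc_time_bound(**args))
    print(speedup_estimate(**args))
```

With these illustrative inputs, $ph(1 + \varphi d t_c / t_n) = 26880$, which is small relative to $N = 10^6$, so the estimated speedup comes out close to $p$.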

### Citations

1894 | Computational Geometry: An Introduction
- Preparata, Shamos
- 1985
Citation Context ... is obtained by solving subproblems recursively. Examples of D&C computations include various sorting methods such as quick sort [8], computational geometry procedures such as convex hull calculation [14], AI search heuristics such as constraint satisfaction techniques [6], adaptive data classification procedures such as generation and maintenance of quadtrees [16], and numerical methods such as multi...

488 | Increasing tree search efficiency for constraint satisfaction problems
- Haralick, Elliott
- 1980
Citation Context ...tations include various sorting methods such as quick sort [8], computational geometry procedures such as convex hull calculation [14], AI search heuristics such as constraint satisfaction techniques [6], adaptive data classification procedures such as generation and maintenance of quadtrees [16], and numerical methods such as multigrid algorithms [12] for solving partial differential equations. A D&C...

226 | The Cosmic Cube
- Seitz
- 1985
Citation Context ...rk is called a $k$-D torus network. 2. Hypercube Network: Let $p = 2^q$. This network can be viewed as a $q$-D mesh network with size two in each dimension. Examples are the hypercube systems described in [7, 17]. 3. Perfect Shuffle Network: Let $p = 2^q$ and $i_{q-1} i_{q-2} \cdots i_0$ be the binary representation of $i$. Every processor $P_i$ is connected to $P_{i_{q-2} \cdots i_0 i}$...

134 | iWarp: An integrated solution of high-speed parallel computing
- Borkar, Cohn, et al.
- 1988
Citation Context ...te $(i_{k-1}, \cdots, i_j \pm 1, \cdots, i_0)$, if they exist. For simplicity, assume $n_{k-1} = \cdots = n_0 = \sqrt[k]{p}$. An example is the iWarp system [3] with an $8 \times 8$ mesh network. If in each dimension the last element is connected back to the first one, the network is called a $k$-D torus network. 2. Hypercube Network: Let $p = 2^q$. This network c...

108 | DIB - a distributed implementation of backtracking
- Finkel, Manber
- 1987
Citation Context ...chieve good load balancing between the processors, then parallelizing D&C becomes nontrivial. In fact, doing efficient D&C on any real parallel machines has been a major challenge to many researchers [4, 5, 9, 18] for many years. The difficulties are due to the fact that many D&C computations are highly dynamic in the sense that these computations are data-dependent. During computation, a problem instance can ...

85 | The design of Nectar: a network backplane for heterogeneous multicomputers
- Arnould, Bitz, et al.
- 1989
Citation Context ...here c(! 1) may be a very small constant. Hence, our algorithm is practical for implementation. We will use this algorithm as a basis to develop a programming system on the 26-processor Nectar system [1] developed at Carnegie Mellon University. In the rest of the paper, Wu and Kung's algorithm [19] on which our algorithm is based will be reviewed in Section 2. To avoid the "hot spot" problem, our alg...

84 | Architecture of a hypercube supercomputer
- Hayes, Mudge, et al.
- 1986
Citation Context ...rk is called a $k$-D torus network. 2. Hypercube Network: Let $p = 2^q$. This network can be viewed as a $q$-D mesh network with size two in each dimension. Examples are the hypercube systems described in [7, 17]. 3. Perfect Shuffle Network: Let $p = 2^q$ and $i_{q-1} i_{q-2} \cdots i_0$ be the binary representation of $i$. Every processor $P_i$ is connected to $P_{i_{q-2} \cdots i_0 i}$...

80 | Applications of spatial data structures: Computer graphics, image processing, and GIS
- Samet
- 1990
Citation Context ...ures such as convex hull calculation [14], AI search heuristics such as constraint satisfaction techniques [6], adaptive data classification procedures such as generation and maintenance of quadtrees [16], and numerical methods such as multigrid algorithms [12] for solving partial differential equations. A D&C computation can be viewed as a process of expanding and shrinking a tree. In this paper, we o...

73 | Parallel Permutation and Sorting Algorithms and a New Generalized Connection Network
- Nassimi, Sahni
- 1982
Citation Context ... of their discussion about communication complexity). Obviously, the processor may become a "hot spot". This paper will present an efficient D&C algorithm which uses the concentration route technique [11] to reduce the communication overhead on various networks. 1.3. Main Result Theorem 1 A scheduling algorithm can be devised such that the total time for a computation tree is Ts $(N/p + h)\,t_n + \varphi d h\, t_c$ w...

54 | Solution of partial differential equations on vector and parallel computers
- Ortega, Voigt
- 1985
Citation Context ...istics such as constraint satisfaction techniques [6], adaptive data classification procedures such as generation and maintenance of quadtrees [16], and numerical methods such as multigrid algorithms [12] for solving partial differential equations. A D&C computation can be viewed as a process of expanding and shrinking a tree. In this paper, we only consider expanding a tree for simplicity of discussio...

37 | Parallel depth first search. Part I. Implementation
- Rao, Kumar
- 1987
Citation Context ...chieve good load balancing between the processors, then parallelizing D&C becomes nontrivial. In fact, doing efficient D&C on any real parallel machines has been a major challenge to many researchers [4, 5, 9, 18] for many years. The difficulties are due to the fact that many D&C computations are highly dynamic in the sense that these computations are data-dependent. During computation, a problem instance can ...

37 | Optimal speedup for backtrack search on a butterfly network
- Ranade
- 1991
Citation Context ...des above a fixed level on one processor and then distribute nodes at this level to other processors. Load balancing would be done poorly in this approach when the tree is irregular. Another approach [10, 15, 18] is to distribute generated nodes to balance the load. For this scheme, the communication overhead can be very high. For example, the execution times for the algorithms in [10, 15] are all O(N(t_n + ...

32 | Dynamic tree embeddings in butterflies and hypercubes
- Leighton, Newman, et al.
- 1992
Citation Context ...des above a fixed level on one processor and then distribute nodes at this level to other processors. Load balancing would be done poorly in this approach when the tree is irregular. Another approach [10, 15, 18] is to distribute generated nodes to balance the load. For this scheme, the communication overhead can be very high. For example, the execution times for the algorithms in [10, 15] are all O(N(t_n + ...

30 | Distributed tree search and its application to alpha-beta pruning
- Ferguson, Korf
- 1988
Citation Context ...chieve good load balancing between the processors, then parallelizing D&C becomes nontrivial. In fact, doing efficient D&C on any real parallel machines has been a major challenge to many researchers [4, 5, 9, 18] for many years. The difficulties are due to the fact that many D&C computations are highly dynamic in the sense that these computations are data-dependent. During computation, a problem instance can ...

30 | Communication complexity for parallel divide-and-conquer
- Wu, Kung
- 1991
Citation Context ...the execution times for the algorithms in [10, 15] are all $O(N(t_n + t_c)/p)$ with high probability. Recently, some researchers have made efforts to reduce communication overhead. A popular approach [4, 5, 9, 19, 20] is based on the "donate-highest-subtree" strategy, in which an idle processor will be given frontier nodes near the root. Since a subtree rooted near the top usually has many descendants and these de...

23 | Efficient Computation on Sparse Interconnection Networks
- Plaxton
- 1989
Citation Context ...ro hop Figure 3: Embedding an IL-tree on a $3 \times 3$ network. 3.2. Token Concentration Token concentration is an important technique used for load balancing (see Leighton's algorithm in Chapter 4 of [13]). Before the operation of token concentration, each processor may create one token. (Note that one processor has no more than one token at any time.) The problem of token concentration is defined as ...

22 | Parallel Algorithms for Combinatorial Search Problems
- Zhang
- 1989
Citation Context ...the execution times for the algorithms in [10, 15] are all $O(N(t_n + t_c)/p)$ with high probability. Recently, some researchers have made efforts to reduce communication overhead. A popular approach [4, 5, 9, 19, 20] is based on the "donate-highest-subtree" strategy, in which an idle processor will be given frontier nodes near the root. Since a subtree rooted near the top usually has many descendants and these de...

19 | A dynamic scheduling strategy for the chare-kernel system
- Shu, Kale
- 1989

17 | The Shared Data-Object Model as a Paradigm for Programming Distributed Systems
- Bal
- 1989
Citation Context ...me step, the aggregate overhead for processor idling is $p - 1$ within this time step. 1.2. Previous Work There have been several approaches in performing parallel D&C. A simple approach (e.g., in [2]) is to expand all the nodes above a fixed level on one processor and then distribute nodes at this level to other processors. Load balancing would be done poorly in this approach when the tree is irr...
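The mesh, torus, and hypercube definitions quoted in the citation contexts above can be illustrated with a short sketch. It assumes, as the quoted text does for simplicity, that every dimension has the same side length; the function name `mesh_neighbors` and the tuple representation of processor coordinates are hypothetical choices made here, not taken from the paper. The perfect shuffle network is left out because its definition is truncated in the quoted context.

```python
def mesh_neighbors(coord, n, torus=False):
    """Neighbors of a processor in a k-D mesh with side length n per dimension.

    coord is a tuple (i_{k-1}, ..., i_0). A neighbor differs by +1 or -1 in
    exactly one coordinate, if such a processor exists. With torus=True the
    last element of each dimension is connected back to the first.
    """
    result = []
    for j in range(len(coord)):
        for step in (-1, 1):
            v = coord[j] + step
            if torus:
                v %= n  # wrap around in a torus
            elif not 0 <= v < n:
                continue  # no neighbor past the mesh boundary
            nb = coord[:j] + (v,) + coord[j + 1:]
            # Skip self-links and duplicates, which can arise when n <= 2.
            if nb != coord and nb not in result:
                result.append(nb)
    return result


if __name__ == "__main__":
    print(mesh_neighbors((0, 0), 3))              # corner of a 3 x 3 mesh
    print(mesh_neighbors((0, 0), 3, torus=True))  # same processor in a torus
    print(mesh_neighbors((0, 1, 0), 2))           # hypercube with q = 3
```

Calling the function with `n=2` treats a hypercube as a $q$-D mesh with size two in each dimension, so each processor gets exactly $q$ neighbors, one per flipped coordinate.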