## Processor–Time Tradeoffs under Bounded-Speed Message Propagation: Part I, Upper Bounds (1997)

### BibTeX

@MISC{I97newyork,

author = {G. Bilardi and F. P. Preparata},

title = {Processor–Time Tradeoffs under Bounded-Speed Message Propagation: Part I, Upper Bounds},

year = {1997}

}

### Abstract

Upper bounds are derived for the processor–time tradeoffs of machines such as linear arrays and two-dimensional meshes, which are compatible with the physical limitation expressed by bounded-speed propagation of messages (due to the finiteness of the speed of light). It is shown that parallelism and locality combined may yield speedups superlinear in the number of processors. The speedups are inherent, due to the optimality of the obtained tradeoffs as established in a companion paper. Simulations of multiprocessor machines are developed by analogous machines with fewer processors. A crucial role is played by the hierarchical nature of the memory system. A divide-and-conquer technique for hierarchical memories is developed, based on the graph-theoretic notion of a topological separator. For multiprocessors, this technique also requires a careful balance of memory access and interprocessor communication costs, which leads to nonintuitive orchestrations of the simulation process.

### Citations

667 | An Introduction to Parallel Algorithms
- JáJá
- 1992

Citation Context: ... locality, an important property of many applications. Moreover, the processor–time tradeoff in the limiting technology can be different from the classical tradeoff embodied by Brent’s Principle [B], [J], whereby a computation running for T steps on n processors can be emulated in at most ⌈n/p⌉T steps on p < n processors of the same type. A corollary of Brent’s Principle is that the best parallel alg...

240 | The parallel evaluation of generic arithmetic expressions
- Brent
- 1974

Citation Context: ... data locality, an important property of many applications. Moreover, the processor–time tradeoff in the limiting technology can be different from the classical tradeoff embodied by Brent’s Principle [B], [J], whereby a computation running for T steps on n processors can be emulated in at most ⌈n/p⌉T steps on p < n processors of the same type. A corollary of Brent’s Principle is that the best paralle...
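The emulation bound quoted in this citation context can be illustrated with a short sketch. The function names `brent_emulation_steps` and `round_robin_rounds`, and the round-robin assignment of emulated processors to physical ones, are assumptions made for illustration and are not taken from the cited works. The sketch only counts steps in the classical setting, where every emulated step costs one unit regardless of where its data lives.

```python
def brent_emulation_steps(n: int, p: int, t: int) -> int:
    """Upper bound ceil(n/p) * t on the steps p processors need
    to emulate a t-step computation of n processors (classical model)."""
    if not (0 < p <= n):
        raise ValueError("expected 0 < p <= n")
    if t < 0:
        raise ValueError("expected t >= 0")
    rounds_per_step = -(-n // p)  # integer ceiling of n / p
    return rounds_per_step * t


def round_robin_rounds(n: int, p: int) -> int:
    """Rounds needed to emulate ONE step of n processors on p processors
    when emulated processor v is assigned to physical processor v mod p
    (a hypothetical schedule chosen for illustration)."""
    load = [0] * p
    for v in range(n):
        load[v % p] += 1
    # The busiest physical processor determines the number of rounds.
    return max(load)


if __name__ == "__main__":
    n, p, t = 10, 3, 5
    print(round_robin_rounds(n, p))        # rounds per emulated step
    print(brent_emulation_steps(n, p, t))  # total bound for t steps
```

Under this uniform-cost assumption the slowdown is at most ⌈n/p⌉, which is the classical tradeoff that the context contrasts with the bounded-speed setting.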

132 | A model for hierarchical memory
- Aggarwal, Alpern, et al.
- 1987

Citation Context: ... cell from the CPU. In this scenario the speedup of the n-processor mesh is Θ(n^{3/2}). The preceding estimate refers to a straightforward implementation of matrix multiplication; however, as noted in [AACS], careful exploitation of locality enables the access overhead in the uniprocessor to be contained to within a factor Θ(log n). ...

82 | Time bounded random access machines
- Cook, Reckhow
- 1973

Citation Context: ... consider parallel machines built as interconnections of (processing-element, memory-module) pairs. Such a pair is modeled as a Hierarchical Random Access Machine, or H-RAM, a generalization of the RAM [CR] introduced in [AACS] (under the name of the Hierarchical Memory Model) to capture the higher cost of remote memory access. (See also [S1] and its bibliography.) Definition 1. An f(x)-H-RAM is a rand...

64 | Type architectures, shared memory, and the corollary of modest potential
- Snyder
- 1986

Citation Context: ...llary of Brent’s Principle is that the best parallel algorithm on p processors cannot be more than p times faster than the best sequential algorithm (the Fundamental Principle of Parallel Computation [S2]). Informally, when communication delays are proportional to physical distances, the deployment of p processors can lead to speedups in two ways. A p-fold parallelism in the computation translates int...

9 | Space-time tradeoffs in memory hierarchies
- Savage
- 1993

Citation Context: ... Random Access Machine, or H-RAM, a generalization of the RAM [CR] introduced in [AACS] (under the name of the Hierarchical Memory Model) to capture the higher cost of remote memory access. (See also [S1] and its bibliography.) Definition 1. An f(x)-H-RAM is a random access machine where an access to address x takes time f(x). A...
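Definition 1 in this citation context lends itself to a small sketch of how access time accumulates on an f(x)-H-RAM. The helper `hram_access_time`, the address trace, and the two cost functions below are hypothetical choices for illustration. In particular, the square-root cost is only one example of an access function and is not claimed to be the one used in the paper.

```python
import math
from typing import Callable, Iterable


def hram_access_time(addresses: Iterable[int], f: Callable[[int], int]) -> int:
    """Total time to touch each address in order when an access
    to address x takes time f(x), as in an f(x)-H-RAM."""
    total = 0
    for x in addresses:
        if x < 0:
            raise ValueError("addresses must be non-negative")
        total += f(x)
    return total


def unit_cost(x: int) -> int:
    """Ordinary RAM: every access costs one step."""
    return 1


def sqrt_cost(x: int) -> int:
    """Illustrative distance-based cost (assumed): grows like sqrt(x)."""
    return math.isqrt(x) + 1


if __name__ == "__main__":
    trace = [0, 1, 4, 9, 16]  # hypothetical address trace
    print(hram_access_time(trace, unit_cost))
    print(hram_access_time(trace, sqrt_cost))
```

Comparing the two totals for the same trace shows how remote addresses dominate the running time once the access function grows with the address.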

1 | Lower Bounds on Processor–Time Tradeoffs under Bounded-Speed Message Propagation
- Bilardi, Preparata
- 1995

Citation Context: ... 1. The significance of Theorem 1 is in relation to applications for which the larger machine computation exactly reflects the locality. Such computations exist, as we have shown in a companion paper [BP2] yielding matching lower bounds (for most values of n, p, and m). Therefore, no improvement is possible beyond the results of Theorem 1, thereby showing that locality slowdown is an inherent feature o...