Results 1-10 of 48
Scalable Computing
 Computer Science Today: Recent Trends and Developments
, 1996
Cited by 91 (3 self)
Scalable computing will, over the next few years, become the normal form of computing. In this paper we present a unified framework, based on the BSP model, which aims to serve as a foundation for this evolutionary development. A number of important techniques, tools and methodologies for the design of sequential algorithms and programs have been developed over the past few decades. In the transition from sequential to scalable computing we will find that new requirements such as universality and predictable performance will necessitate significant changes of emphasis in these areas. Programs for scalable computing, in addition to being fully portable, will have to be efficiently universal, offering high performance, in a predictable way, on any general purpose parallel architecture. The BSP model provides a discipline for the design of scalable programs of this kind. We outline the approach and discuss some of the issues involved. 1 Introduction For fifty years, sequential computin...
Special Purpose Parallel Computing
 Lectures on Parallel Computation
, 1993
Cited by 80 (6 self)
A vast amount of work has been done in recent years on the design, analysis, implementation and verification of special purpose parallel computing systems. This paper presents a survey of various aspects of this work. A long, but by no means complete, bibliography is given. 1. Introduction Turing [365] demonstrated that, in principle, a single general purpose sequential machine could be designed which would be capable of efficiently performing any computation which could be performed by a special purpose sequential machine. The importance of this universality result for subsequent practical developments in computing cannot be overstated. It showed that, for a given computational problem, the additional efficiency advantages which could be gained by designing a special purpose sequential machine for that problem would not be great. Around 1944, von Neumann produced a proposal [66, 389] for a general purpose stored-program sequential computer which captured the fundamental principles of...
Adversarial contention resolution for simple channels
 In: 17th Annual Symposium on Parallelism in Algorithms and Architectures
, 2005
Cited by 49 (1 self)
This paper analyzes the worst-case performance of randomized backoff on simple multiple-access channels. Most previous analysis of backoff has assumed a statistical arrival model. For batched arrivals, in which all n packets arrive at time 0, we show the following tight high-probability bounds. Randomized binary exponential backoff has makespan Θ(n lg n), and more generally, for any constant r, r-exponential backoff has makespan Θ(n log^(lg r) n). Quadratic backoff has makespan Θ((n/lg n)^(3/2)), and more generally, for r > 1, r-polynomial backoff has makespan Θ((n/lg n)^(1+1/r)). Thus, for batched inputs, both exponential and polynomial backoff are highly sensitive to backoff constants. We exhibit a monotone superpolynomial subexponential backoff algorithm, called loglog-iterated backoff, that achieves makespan Θ(n lg lg n / lg lg lg n). We provide a matching lower bound showing that this strategy is optimal among all monotone backoff algorithms. Of independent interest is that this lower bound was proved with a delay sequence argument. In the adversarial-queuing model, we present the following stability and instability results for exponential backoff and loglog-iterated backoff. Given a (λ,T)-stream, in which at most n = λT packets arrive in any interval of size T, exponential backoff is stable for arrival rates of λ = O(1/lg n) and unstable for arrival rates of λ = Ω(lg lg n / lg n); loglog-iterated backoff is stable for arrival rates of λ = O(1/(lg lg n · lg n)) and unstable for arrival rates of λ = Ω(1/lg n). Our instability results show that bursty input is close to being worst-case for exponential backoff and variants and that even small bursts can create instabilities in the channel.
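The batched-arrival setting in this abstract can be explored with a small simulation. The sketch below is not the paper's analysis: it assumes a simplified windowed model in which time is slotted, all n packets start together, every undelivered packet picks one uniformly random slot in the current window, a slot succeeds only if exactly one packet chose it, and the k-th window has ceil(r^k) slots. The function name `backoff_makespan` and its parameters are hypothetical.

```python
import math
import random
from collections import Counter


def backoff_makespan(n, r=2.0, seed=0):
    """Makespan (number of slots until the last delivery) for n batched packets.

    Assumed model: window k (k = 0, 1, 2, ...) has ceil(r**k) slots, and every
    undelivered packet transmits in one uniformly random slot of each window.
    Requires r > 1, otherwise colliding packets never separate.
    """
    if r <= 1:
        raise ValueError("r must be greater than 1")
    rng = random.Random(seed)
    remaining = n
    window_start = 0   # absolute index of the first slot of the current window
    k = 0
    last_success = 0   # one past the absolute index of the latest successful slot
    while remaining > 0:
        width = math.ceil(r ** k)
        choices = Counter(rng.randrange(width) for _ in range(remaining))
        # A slot delivers a packet only if exactly one packet chose it.
        winners = [slot for slot, count in choices.items() if count == 1]
        if winners:
            last_success = window_start + max(winners) + 1
            remaining -= len(winners)
        window_start += width
        k += 1
    return last_success
```

Running `backoff_makespan(n)` for growing n and comparing the results with n lg n gives a rough feel for the binary exponential bound, although a single simulation in this simplified model says nothing about the high-probability guarantees proved in the paper.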
Doubly Logarithmic Communication Algorithms for Optical Communication Parallel Computers
 In Proceedings of the 5th Annual ACM Symposium on Parallel Algorithms and Architectures
, 1994
Cited by 41 (5 self)
In this paper we consider the problem of interprocessor communication on parallel computers that have optical communication networks. We consider the Completely Connected Optical Communication Parallel Computer (OCPC), which has a completely connected optical network, and also the Mesh of Optical Buses Parallel Computer (MOBPC), which has a mesh of optical buses as its communication network. The particular communication problem that we study is that of realizing an h-relation. In this problem, each processor has at most h messages to send and at most h messages to receive. It is clear that any 1-relation can be realized in one communication step on an OCPC. However, the best previously known p-processor OCPC algorithm for realizing an arbitrary h-relation for h > 1 requires Θ(h + log p) expected communication steps. (This algorithm is due to Valiant and is based on earlier work of Anderson and Miller.) Valiant's algorithm is optimal only for h = Ω(log p) and it is an op...
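The h-relation definition and the remark about 1-relations can be made concrete with a short sketch. It assumes the usual OCPC delivery rule, which this abstract does not spell out: in one step each processor sends at most one message, and a processor receives a message only if exactly one was sent to it. The names `relation_h` and `ocpc_step` are hypothetical.

```python
from collections import Counter


def relation_h(messages):
    """Smallest h for which the (src, dst) pairs in `messages` form an h-relation."""
    sent = Counter(src for src, _ in messages)
    received = Counter(dst for _, dst in messages)
    return max(max(sent.values(), default=0), max(received.values(), default=0))


def ocpc_step(attempts):
    """Simulate one communication step under the assumed OCPC rule.

    `attempts` maps each sending processor to the destination it targets in
    this step. A message is delivered only if its destination is targeted by
    exactly one sender; colliding messages are all lost.
    """
    load = Counter(attempts.values())
    return [(src, dst) for src, dst in attempts.items() if load[dst] == 1]
```

For a permutation such as `[(0, 2), (1, 0), (2, 1)]`, `relation_h` returns 1 and `ocpc_step` delivers every message in a single step. Once two senders target the same destination, neither message arrives, which is why realizing an h-relation with h > 1 needs a scheduling or randomized retry strategy.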
An optical simulation of shared memory
, 1994
Cited by 35 (3 self)
We present a work-optimal randomized algorithm for simulating a shared memory machine (PRAM) on an optical communication parallel computer (OCPC). The OCPC model is motivated by the potential of optical communication for parallel computation. The memory of an OCPC is divided into modules, one module per processor. Each memory module only services a request on a timestep if it receives exactly one memory request. Our algorithm simulates each step of an n lg lg n-processor EREW PRAM on an n-processor OCPC in O(lg lg n) expected delay. (The probability that the delay is longer than this is at most n^(-α) for any constant α.) The best previous simulation, due to Valiant, required Θ(lg n) expected delay.
On Contention Resolution Protocols and Associated Probabilistic Phenomena
 In Proceedings of the 26th Annual ACM Symposium on Theory of Computing
, 1994
Contention Resolution with Constant Expected Delay
Cited by 29 (3 self)
We study the contention resolution problem in a multiple-access channel such as the Ethernet...
Scalable Parallel Computing: A Grand Unified Theory and its Practical Development
Cited by 25 (1 self)
The Bulk Synchronous Parallel (BSP) model provides a unified framework for the design and programming of general purpose parallel computing systems. In this paper we describe the BSP model and discuss some of the developments in architecture, algorithms and programming languages which are currently being pursued as part of this new, unified approach to scalable parallel computing.
A Combining Mechanism for Parallel Computers
 In Proceedings of the First Heinz Nixdorf Symposium
, 1992
Cited by 22 (0 self)
In a multiprocessor computer, communication among the components may be based either on a simple router, which delivers messages point-to-point like a mail service, or on a more elaborate combining network that, in return for a greater investment in hardware, can combine messages to the same address prior to delivery. This paper describes a mechanism for recirculating messages in a simple router so that the added functionality of a combining network, for arbitrary access patterns, can be achieved by it with provable efficiency. The method brings together the messages with the same destination address in more than one stage, and at a set of components that is determined by a hash function and decreases in number at each stage.
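One possible reading of the staged combining idea is sketched below. It is an illustration under stated assumptions, not the paper's mechanism: each message picks a random member of a per-address candidate set whose size shrinks from stage to stage, a hash of (address, stage, index) selects the component, messages for the same address that meet at a component are merged, and merging is assumed to be addition. The function `combine_in_stages` and the default `set_sizes` are hypothetical.

```python
import random
from collections import defaultdict


def combine_in_stages(messages, p, set_sizes=(4, 2, 1), seed=0):
    """Combine (address, value) messages over several recirculation stages.

    `p` is the number of components. `set_sizes` gives, per stage, how many
    candidate components each address has; it should shrink and end at 1 so
    that all messages for an address finally meet at a single component.
    Returns (totals per address, number of messages left after each stage).
    """
    rng = random.Random(seed)
    current = list(messages)
    counts = []
    for stage, size in enumerate(set_sizes):
        merged = defaultdict(int)  # (component, address) -> combined value
        for address, value in current:
            index = rng.randrange(size)                  # random member of the set
            component = hash((address, stage, index)) % p
            merged[(component, address)] += value        # combine on arrival
        current = [(address, value) for (_, address), value in merged.items()]
        counts.append(len(current))
    totals = defaultdict(int)
    for address, value in current:
        totals[address] += value
    return dict(totals), counts
```

Because the last stage uses a candidate set of size 1, every address ends with exactly one combined message, while the earlier, larger sets spread the load for heavily requested addresses across several components.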