Exploiting Storage Redundancy to Speed Up Randomized Shared Memory Simulations
 IN PROCEEDINGS OF THE 12TH ANNUAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE
, 1996
Cited by 33 (9 self)
Assume that a set U of memory locations is distributed among n memory modules, using some number a of hash functions $h_1, \ldots, h_a$, randomly and independently drawn from a high performance universal class of hash functions. Thus each memory location has a copies. Consider the task of accessing b out of the a copies for each of given keys $x_1, \ldots, x_n \in U$, $b < a$. The paper presents and analyses a simple process executing the above task on distributed memory machines (DMMs) with n processors. Efficient implementations are presented, implying
• a simulation of an n-processor PRAM on an n-processor optical crossbar DMM with delay $O(\log\log n)$,
• a simulation as above on an arbitrary-DMM with delay $O(\log\log n / \log\log\log n)$,
• an implementation of a static dictionary on an arbitrary-DMM with parallel access time $O(\log^* n + \log\log n / \log a)$, if a hash functions are used.
In particular, an access time of $O(\log^* n)$ can be reached if $(\log n)^{1/\log^* n}$ hash funct...
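The setting in this abstract (a copies per key, placed by a hash functions, with b copies to be reached) can be pictured with a small simulation. The following is a minimal sketch under stated assumptions, not the process analysed in the paper: `make_hash` is a hypothetical stand-in for the universal hash class, and the rule that each module answers exactly one request per round is a simplified reading of an arbitrary-DMM. All function names are illustrative.

```python
import random


def make_hash(n, rng, prime=2_147_483_647):
    """Hypothetical stand-in hash: x -> ((c1 * x + c0) mod prime) mod n."""
    c1 = rng.randrange(1, prime)
    c0 = rng.randrange(prime)
    return lambda x: ((c1 * x + c0) % prime) % n


def rounds_to_access(keys, hashes, b):
    """Count rounds until every key has accessed b of its len(hashes) copies.

    Simplified rule (an assumption): in each round every unfinished key
    requests all copies it has not accessed yet, and every module answers
    exactly one of the requests that reach it.
    """
    a = len(hashes)
    if not 0 <= b <= a:
        raise ValueError("need 0 <= b <= number of hash functions")
    # Copy indices that each key has not accessed yet.
    pending = {x: set(range(a)) for x in keys}
    rounds = 0
    while any(a - len(pending[x]) < b for x in keys):
        rounds += 1
        requests = {}  # module -> list of (key, copy index)
        for x in keys:
            if a - len(pending[x]) < b:
                for i in pending[x]:
                    requests.setdefault(hashes[i](x), []).append((x, i))
        for waiting in requests.values():
            x, i = waiting[0]  # the module answers one arbitrary request
            pending[x].discard(i)
    return rounds


if __name__ == "__main__":
    rng = random.Random(0)
    n, a, b = 16, 3, 2
    hashes = [make_hash(n, rng) for _ in range(a)]
    print(rounds_to_access(range(n), hashes, b))
```

Choosing b smaller than a lets a key finish as soon as any b of its copies get through, which is the redundancy the title refers to; the sketch makes no claim about the delay bounds quoted above.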
Shared Memory Simulations with Triple-Logarithmic Delay (Extended Abstract)
, 1995
Cited by 21 (4 self)
Artur Czumaj (1), Friedhelm Meyer auf der Heide (2), and Volker Stemann (1). (1) Heinz Nixdorf Institute, University of Paderborn, D-33095 Paderborn, Germany. (2) Heinz Nixdorf Institute and Department of Computer Science, University of Paderborn, D-33095 Paderborn, Germany.
Abstract. We consider the problem of simulating a PRAM on a distributed memory machine (DMM). Our main result is a randomized algorithm that simulates each step of an n-processor CRCW PRAM on an n-processor DMM with $O(\log\log\log n \cdot \log^* n)$ delay, with high probability. This is an exponential improvement on all previously known simulations. It can be extended to a simulation of an $(n \log\log\log n \cdot \log^* n)$-processor EREW PRAM on an n-processor DMM with optimal delay $O(\log\log\log n \cdot \log^* n)$, with high probability. Finally, a lower bound of $\Omega(\log\log\log n / \log\log\log\log n)$ expected time is proved for a large class of randomized simulations that includes all known simulations. 1 Introduction Para...
PRAM Computations Resilient to Memory Faults
 2nd European Symposium on Algorithms ESA’94
Cited by 16 (6 self)
PRAMs with faults in their shared memory are investigated. Efficient general simulations on such machines of algorithms designed for fully reliable PRAMs are developed. The PRAM we work with is the Concurrent-Read Concurrent-Write (CRCW) variant. Two possible settings for error occurrence are considered: the errors may be either static (once a memory cell is checked to be operational it remains so during the computation) or dynamic (a potentially faulty cell may crash at any time, the total number of such cells being bounded). A simulation consists of two phases: memory formatting and the proper part done in a step-by-step way. For each error setting (static or dynamic), two simulations are presented: one with an O(1)-time per-step cost, the other with an O(log n)-time per-step cost. The other parameters of these simulations (number of processors, memory size, formatting time) are shown in Table 1 in Section 6. The simulations are randomized and Monte Carlo: they always operate within ...
Constructive Deterministic PRAM Simulation on a Mesh-Connected Computer
 In Proc. 6th ACM Symp. on Parallel Algorithms and Architectures
, 1993
Cited by 12 (10 self)
The PRAM model of computation consists of a collection of sequential RAM machines accessing a shared memory in lockstep fashion. The PRAM is a very high-level abstraction of a parallel computer, and its direct realization in hardware is beyond reach of the current (or even foreseeable) technology. In this paper we present a deterministic simulation scheme to emulate PRAM computation on a mesh-connected computer, a feasible machine where each processor has its own memory module and is connected to at most four other processors via point-to-point links. In order to achieve a good worst-case performance, any deterministic simulation scheme has to replicate each variable in a number of copies. Such copies are stored in the local memory modules according to a Memory Organization Scheme (MOS), which is known to all the processors. A variable is then accessed by routing packets to its copies. All deterministic schemes in the literature make use of a MOS whose existence is proved via the prob...
Fast Deterministic Simulation of Computations on Faulty Parallel Machines
 in Proc. of the 3rd Ann. European Symp. on Algorithms, 1995, Springer Verlag LNCS 979
, 1995
Cited by 9 (4 self)
A method of deterministic simulation of fully operational parallel machines on the analogous machines prone to errors is developed. The simulation is presented for the exclusive-read exclusive-write (EREW) PRAM and the Optical Communication Parallel Computer (OCPC), but it applies to a large class of parallel computers. It is shown that simulations of operational multiprocessor machines on faulty ones can be performed with logarithmic slowdown in the worst case. More precisely, we prove that both a PRAM with a bounded fraction of faulty processors and memory cells and an OCPC with a bounded fraction of faulty processors can simulate deterministically their fault-free counterparts with $O(\log n)$ slowdown and preprocessing done in time $O(\log^2 n)$. The fault model is as follows. The faults are deterministic (worst-case distribution) and static (do not change in the course of a computation). If a processor attempts to communicate with some other processor (in the case of an OCPC) or re...
Shared-Memory Simulations on a Faulty-Memory DMM
, 1996
Cited by 9 (1 self)
this paper are synchronous, and the time performance is our major efficiency criterion. We consider a DMM with faulty memory words, otherwise everything is assumed to be operational. In particular the communication between the processors and the MUs is reliable, and a processor may always attempt to obtain an access to any MU, and, having been granted it, may access any memory word in it, even if all of them are faulty. The only restriction on the distribution of faults among memory words is that their total number is bounded from above by a fraction of the total number of memory words in all the MUs. In particular, some MUs may contain only operational cells, some only faulty cells, and some mixed cells. This report presents fast simulations of the PRAM on a DMM with faulty memory.
A practical constructive scheme for deterministic shared-memory access
 In Proc. 5th ACM Symp. on Parallel Algorithms and Architectures
, 1993
Cited by 8 (5 self)
We present three explicit schemes for distributing M variables among N memory modules, where $M = \Theta(N^{1.5})$, $M = \Theta(N^2)$, and $M = \Theta(N^3)$, respectively. Each variable is replicated into a constant number of copies stored in distinct modules. We show that N processors, directly accessing the memories through a complete interconnection, can read/write any set of N variables in worst-case time $O(N^{1/3})$, $O(N^{1/2})$, and $O(N^{2/3})$, respectively for the three schemes. The access times for the last two schemes are optimal with respect to the particular redundancy values used by such schemes. The address computation can be carried out efficiently by each processor without recourse to a complete memory map and requiring only O(1) internal storage.
Parallel Alternating-Direction Access Machine
, 1996
Cited by 7 (3 self)
This paper presents a theoretical study of a model of parallel computations called Parallel Alternating-Direction Access Machine (PADAM). PADAM is an abstraction of the multiprocessor computers ADENA/ADENART and a prototype architecture USC/OMP. The main feature of PADAM is the organization of access to the global memory: (1) the memory modules are arranged as a 2-dimensional array, (2) each processor is assigned to a row and a column, (3) the processors switch synchronously between row and column access modes, and can access any of the assigned modules in each mode without conflicts. Since the PADAM processors have such a restricted access to the partially shared memory, developing tools to enhance flexibility of access to the memory is important. The paper concentrates on these issues. 1 Introduction An important goal in the study of parallel computation is to develop models which are close to real machines but abstract from technical details and provide a vehicle to design and ...
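The row/column access rule described in this abstract can be illustrated with a short sketch. It assumes, purely for illustration, n processors and an n × n array of modules, with processor i assigned to row i and column i; the abstract does not state this assignment, and the function names below are hypothetical.

```python
def accessible_modules(proc, mode, n):
    """Modules (row, col) that processor `proc` may touch in the given mode.

    Assumption: processor i owns row i in row mode and column i in
    column mode, on an n x n array of memory modules.
    """
    if mode == "row":
        return {(proc, col) for col in range(n)}
    if mode == "column":
        return {(row, proc) for row in range(n)}
    raise ValueError("mode must be 'row' or 'column'")


def send(memory, src, dst, value):
    """Row-mode step: processor src writes into module (src, dst)."""
    memory[(src, dst)] = value


def receive(memory, src, dst):
    """Column-mode step: processor dst reads module (src, dst)."""
    return memory.get((src, dst))


if __name__ == "__main__":
    memory = {}
    send(memory, 1, 2, "hello")    # all processors are in row mode
    print(receive(memory, 1, 2))   # all processors are in column mode
```

Under this assignment no two processors share a module within one mode, and module (i, j) lies in both the row of processor i and the column of processor j, so a row-mode write followed by a column-mode read passes a value from i to j.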
Deterministic Computations on a PRAM with Static Faults
Cited by 6 (0 self)
We develop a deterministic simulation of a fully operational Parallel Random Access Machine (PRAM) on a PRAM with some faulty processors and memory cells. The faults considered are static, i.e., once the machine starts to operate, the operational/faulty status of PRAM components does not change. The simulating machine can tolerate a constant fraction of faults among processors and memory cells. The simulating PRAM has n processors and m memory cells, and simulates a PRAM with n processors and m) memory cells. The simulation is in three phases: (1) preprocessing, followed by (2) retrieving the input by the processors active in the simulation, followed by (3) the proper part of the simulation performed in a step-by-step fashion. Preprocessing is performed in time $O((m/n + \log n) \log n)$. The input is retrieved in time $O(\log^2 n)$. The slowdown of the proper part of the simulation is $O(\log m)$.
Work-Optimal Simulation of PRAM Models on Meshes
 Nordic Journal on Computing, 2(1):51
, 1994
Cited by 5 (3 self)
In this paper we consider work-optimal simulations of PRAM models on coated meshes. Coated meshes consist of a mesh-connected routing machinery with processors on the surface of the mesh. We prove that coated meshes with 2-dimensional or 3-dimensional routing machinery can work-optimally simulate EREW, CREW, and CRCW PRAM models. The general idea behind this simulation is to use Valiant's XPRAM approach, and ignore the work-complexity of simple nodes of the routing machinery. 1 Introduction There is a wide variety of approaches to parallelism in general [40], and even to general purpose parallelism [39], reflecting the prevailing uncertainty of the correct approach. One model aiming at general purpose parallelism is the PRAM (Parallel Random Access Machine) model, which is a natural generalization of the classical RAM model. It consists of N processors, each of which may have some local memory and registers, and a global shared memory of size m. A step of PRAM is often seen to con...