Results 1 - 10 of 1,348
Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors
ACM Transactions on Computer Systems, 1991
"... Busy-wait techniques are heavily used for mutual exclusion and barrier synchronization in shared-memory parallel programs. Unfortunately, typical implementations of busy-waiting tend to produce large amounts of memory and interconnect contention, introducing performance bottlenecks that become marke ..."
Cited by 573 (32 self)
Fast Parallel Algorithms for Short-Range Molecular Dynamics
Journal of Computational Physics, 1995
"... Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of interatomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dyn ..."
Cited by 653 (7 self)
Synchronous data flow
1987
"... Data flow is a natural paradigm for describing DSP applications for concurrent implementation on parallel hardware. Data flow programs for signal processing are directed graphs where each node represents a function and each arc represents a signal path. Synchronous data flow (SDF) is a special case ..."
Cited by 622 (45 self)
with data flow evaporates. Multiple sample rates within the same system are easily and naturally handled. Conditions for correctness of SDF graph are explained and scheduling algorithms are described for homogeneous parallel processors sharing memory. A preliminary SDF software system for automatically
Direct Bulk-Synchronous Parallel Algorithms
Journal of Parallel and Distributed Computing, 1992
"... We describe a methodology for constructing parallel algorithms that are transportable among parallel computers having different numbers of processors, different bandwidths of interprocessor communication and different periodicity of global synchronisation. We do this for the bulk-synchronous paralle ..."
Cited by 174 (27 self)
synchronous parallel (BSP) model, which abstracts the characteristics of a parallel machine into three numerical parameters p, g, and L, corresponding to processors, bandwidth, and periodicity respectively. The model differentiates memory that is local to a processor from that which is not, but, for the sake
A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment
Proceedings of the SIGMOD Conference, 1989
"... In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the alg ..."
Cited by 183 (15 self)
relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison.
Scans as primitive parallel operations
IEEE Trans. Computers, 1989
"... In most parallel random access machine (PRAM) models, memory references are assumed to take unit time. In practice, and in theory, certain scan operations, also known as prefix computations, can execute in no more time than these parallel memory references. This paper outlines an extensive ..."
Cited by 187 (13 self)
Parallel Graph Generation Algorithms for Shared and Distributed Memory Machines
1997
"... In this paper we give an overview and a comparison of two parallel algorithms for the state space generation in stochastic modeling on common classes of multiprocessors. In this context state space generation simply means constructing a graph, which usually gets extremely large. On shared memory mac ..."
Cited by 8 (1 self)
An Empirical Analysis of Parallel Random Permutation Algorithms on
"... We compare parallel algorithms for random permutation generation on symmetric multiprocessors (SMPs). Algorithms considered are the sorting-based algorithm, Anderson’s shuffling algorithm, the dart-throwing algorithm, and Sanders’ algorithm. We investigate the impact of synchronization method, memor ..."
Cited by 3 (0 self)
Efficient Low-Contention Parallel Algorithms
1996
"... The queue-read, queue-write (QRQW) parallel random access machine (PRAM) model permits concurrent reading and writing to shared memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. The QRQW PRAM model reflects the contention propert ..."
Cited by 34 (14 self)
Optimistic parallelism requires abstractions
In PLDI, 2007
"... Irregular applications, which manipulate large, pointer-based data structures like graphs, are difficult to parallelize manually. Automatic tools and techniques such as restructuring compilers and runtime speculative execution have failed to uncover much parallelism in these applications, in spite o ..."
Cited by 179 (24 self)
) assertions about methods in class libraries, and (3) a runtime scheme for detecting and recovering from potentially unsafe accesses to shared memory made by an optimistic computation. We show that Delaunay mesh generation and agglomerative clustering can be parallelized in a straightforward way using