Results 1 - 5 of 5
ParC - An Extension of C for Shared Memory Parallel Processing
Cited by 15 (11 self)
In this paper we describe the features and semantics of ParC. The rest of this section explains the motivation for designing a new language, the effect of the motivating forces on the design, and the structure of the software environment that surrounds it. The next section describes the parallel constructs and scoping rules. The exact semantics of parallel constructs when there are more activities than processors have been widely neglected in the literature. We discuss this issue and provide guidelines for acceptable implementations. We then describe the innovative instructions for forced termination, which are based on analogies with C instructions that break out of a construct, followed by a discussion of synchronization mechanisms. A discussion of the programming methodology of ParC is then given and is followed by a discussion of our experiences with ParC. A comparison of ParC with other parallel programming languages is delayed until the end of the paper, after we have described all of its features.
A Library-Based Program Development Environment for Parallel Image Processing
- Proceedings of Scalable Parallel Libraries Conference, Mississippi State Univ, 1993
Cited by 7 (3 self)
Cloner is an image processing prototyping environment that helps users design new parallel image processing algorithms for a target machine by building on and modifying existing library algorithms. In this paper we show the Cloner user interface, discuss how guided access is accomplished, and provide an example of how Cloner supports the rapid development of high performance codes. The example demonstrates how menu options and queries are used to guide a user to select an appropriate 2-dimensional FFT algorithm based on image size and available machine resources.
Data Parallel Programming: A Survey and a Proposal for a New Model
, 1993
Cited by 5 (0 self)
We give a brief description of what we consider to be data parallel programming and processing, trying to pinpoint the typical problems and pitfalls that occur. We then proceed with a short annotated history of data parallel programming, and sketch a taxonomy in which data parallel languages can be classified. Finally we present our own model of data parallel programming, which is based on the view of parallel data collections as functions. We believe that this model has a number of distinct advantages, such as being abstract, independent of implicitly assumed machine models, and general.
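The idea of treating a parallel data collection as a function can be illustrated with a minimal sketch. This is not the model proposed in the paper; it only assumes that a collection is represented as a function from an index to a value, so that element-wise operations become function composition. The names `pmap`, `zip_with`, and `materialise` are hypothetical.

```python
from typing import Callable, List

# Assumption for illustration: a "collection" is a function from index to value.
Collection = Callable[[int], float]


def pmap(f: Callable[[float], float], xs: Collection) -> Collection:
    """Element-wise application: compose f with the collection."""
    return lambda i: f(xs(i))


def zip_with(op: Callable[[float, float], float],
             xs: Collection, ys: Collection) -> Collection:
    """Combine two collections point-wise at each index."""
    return lambda i: op(xs(i), ys(i))


def materialise(xs: Collection, n: int) -> List[float]:
    """Evaluate the collection at indices 0..n-1.

    Each index is independent of the others, so a parallel machine
    could evaluate them all at once; here they are evaluated in a loop.
    """
    return [xs(i) for i in range(n)]


naturals: Collection = lambda i: i + 1
squares = pmap(lambda x: x * x, naturals)
sums = zip_with(lambda a, b: a + b, naturals, squares)
```

Nothing in this representation mentions processors or memory layout, which is one way to read the abstract's claim of independence from an assumed machine model.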
Achieving Speedups for APL on an SIMD Distributed Memory Machine
, 1990
Cited by 5 (3 self)
The potential speedup for SIMD parallel implementations of APL programs is considered. Both analytical and (simulated) empirical studies are presented. The approach is to recognize that nearly 95% of the operators appearing in APL programs are either scalar primitive, reduction or indexing and so the performance of these operators gives a good estimate of the amount of speedup a full program might receive. Substantial speedups are demonstrated for these operators and the empirical evidence accords with the analytical estimates.
Keywords: APL, data parallel, parallelism, parallel programming, SIMD computers
This research has been funded by the Office of Naval Research Contract No. N00014-86-K-0264 and the National Science Foundation Grant No. DCR 8416878.
List of Figures: 1. 4 × 4 mesh. 2. Algorithm for performing the scalar primitive operator in parallel on the mesh. 3. Algorithm for per...
Performance Implications of Virtualisation of Massively Parallel Algorithm Implementation
, 1994
In this paper we investigate the accuracy of performance prediction for virtualised implementations of parallel algorithms on massively parallel SIMD architectures. Virtualisation is the process by which algorithms which assume n processors are implemented in a system with p processors, where n > p. Virtualisation is implemented in some form by any parallel environment that allows algorithms to assume more processors than are physically available on the machine. The main contributions of this paper are the adaptation and practical evaluation of the best known algorithms for merging and sorting. We show that the Valiant/Kruskal merging algorithm can be implemented efficiently on the MasPar system; the actual running times shadow the theoretical bounds. Our results also show that some algorithms perform closer to their theoretically predicted performance than others. This work has implications for both algorithm designers and compiler writers since it provides insights into the effects of ...
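The definition of virtualisation given above can be made concrete with a small sketch. It assumes one common scheme, block assignment, in which each of the p physical processors steps through a contiguous block of roughly n/p virtual processors. The paper may use a different mapping; the function name `virtualise` and its parameters are hypothetical.

```python
from typing import Callable, List, Tuple


def virtualise(n: int, p: int, step: Callable[[int], int]) -> Tuple[List[int], int]:
    """Run one step of an n-processor algorithm on p physical processors.

    Assumption: block assignment, where physical processor k handles
    virtual processors k*block .. (k+1)*block - 1. Returns the per-virtual
    processor results and the block size, i.e. the number of sequential
    sub-steps each physical processor performs.
    """
    block = -(-n // p)  # ceiling of n / p
    results = [0] * n
    # On a real SIMD machine the outer loop runs on all p processors at once;
    # it is written sequentially here only to keep the sketch runnable.
    for phys in range(p):
        start = phys * block
        stop = min(start + block, n)
        for virt in range(start, stop):
            results[virt] = step(virt)
    return results, block


values, slowdown = virtualise(10, 4, lambda v: v * v)
```

Under this scheme one parallel step of the original algorithm costs about `block` sequential steps per physical processor, which is the kind of overhead a performance prediction for a virtualised implementation has to account for.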

