Results 1 - 6 of 6
A Provably Time-Efficient Parallel Implementation of Full Speculation
- In Proceedings of the 23rd ACM Symposium on Principles of Programming Languages, 1996
Abstract - Cited by 15 (4 self)
Speculative evaluation, including leniency and futures, is often used to produce high degrees of parallelism. Existing speculative implementations, however, may serialize computation because of their implementation of queues of suspended threads. We give a provably efficient parallel implementation of a speculative functional language on various machine models. The implementation includes proper parallelization of the necessary queuing operations on suspended threads. Our target machine models are a butterfly network, hypercube, and PRAM. To prove the efficiency of our implementation, we provide a cost model using a profiling semantics and relate the cost model to implementations on the parallel machine models.
1 Introduction
Futures, lenient languages, and several implementations of graph reduction for lazy languages all use speculative evaluation (call-by-speculation [15]) to expose parallelism. The basic idea of speculative evaluation, in this context, is that the evaluation of a...
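As a rough illustration of the futures idea mentioned in the abstract above, the sketch below starts evaluating an argument before the function body needs it, and the body blocks only at the point where it touches the value. It uses Python's standard `concurrent.futures` module; the names `speculative_apply`, `body`, `slow_square`, and `demo` are hypothetical, and nothing here reproduces the paper's parallel handling of suspended-thread queues.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    # Stand-in for an expensive argument computation (hypothetical).
    return x * x


def speculative_apply(f, arg_thunk, pool):
    # Start evaluating the argument right away, then run the body
    # without waiting for the argument to finish.
    arg_future = pool.submit(arg_thunk)
    return f(arg_future)


def body(arg_future):
    # Work that does not depend on the argument can proceed first.
    partial = sum(range(10))
    # The body suspends only here, when the value is actually needed.
    return partial + arg_future.result()


def demo():
    with ThreadPoolExecutor(max_workers=2) as pool:
        return speculative_apply(body, lambda: slow_square(7), pool)
```

Calling `demo()` returns 94 (45 from the independent work plus 49 from the argument).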
A Parallel Complexity Model for Functional Languages
- In Proc. ACM Conf. on Functional Programming Languages and Computer Architecture, 1994
Abstract - Cited by 5 (2 self)
A complexity model based on the λ-calculus with an appropriate operational semantics is presented and related to various parallel machine models, including the PRAM and hypercube models. The model is used to study parallel algorithms in the context of "sequential" functional languages, and to relate these results to algorithms designed directly for parallel machine models. For example, the paper shows that equally good upper bounds can be achieved for merging two sorted sequences in the pure λ-calculus with some arithmetic constants as in the EREW PRAM, when they are both mapped onto a more realistic machine such as a hypercube or butterfly network. In particular, for n keys and p processors, they both result in an O(n/p + log^2 p) time algorithm. These results argue that it is possible to get good parallelism in functional languages without adding explicitly parallel constructs. In fact, the lack of random access seems to be a bigger problem than the lack of parallelism. This research...
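To make the merging example in the abstract above more concrete, here is a minimal sketch of a divide-and-conquer merge whose two recursive calls are independent and could therefore be evaluated in parallel. It is a textbook-style technique written sequentially in Python, not the paper's λ-calculus algorithm, and the name `dc_merge` is hypothetical.

```python
from bisect import bisect_left


def dc_merge(a, b):
    """Merge two sorted lists by splitting around a pivot.

    The two recursive calls share no data, so a parallel runtime
    could evaluate them at the same time.
    """
    if len(a) < len(b):
        a, b = b, a              # keep `a` as the longer list
    if not a:
        return []                # both lists are empty
    mid = len(a) // 2
    pivot = a[mid]
    # Binary search: every element of b[:cut] is smaller than the pivot.
    cut = bisect_left(b, pivot)
    left = dc_merge(a[:mid], b[:cut])
    right = dc_merge(a[mid + 1:], b[cut:])
    return left + [pivot] + right
```

For example, `dc_merge([1, 4, 9], [2, 3, 10, 11])` returns `[1, 2, 3, 4, 9, 10, 11]`.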
The Concurrent Execution of Non-communicating Programs on SIMD Processors
- In The Fourth Symposium on the Frontiers of Massively Parallel Computation, 1992
Abstract - Cited by 3 (0 self)
This paper explores the use of SIMD (or SIMD-like) hardware to support the efficient interpretation of concurrent, non-communicating programs. This approach places compiled programs into the local memory space of each distinct processing element (PE). Within each PE, a local program counter is initialized and the instructions are interpreted in parallel across all of the PEs by control signals emanating from the central control unit. Initial experiments have been conducted with two distinct software architectures (MINTABs and MIPS R2000) on the MasPar MP-1 and two distinct applications (program mutation analysis and Monte Carlo simulation). While these experiments have shown only marginal performance improvement, it appears that with several minor hardware modifications, SIMD-like hardware can be constructed that will cost-effectively support both SIMD and MIMD processing.
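The interpretation scheme described in the abstract above can be sketched as a small simulation: each PE keeps its own program, program counter, and accumulator, while a central control unit broadcasts one opcode at a time and only the PEs whose current instruction matches take part. The three-opcode instruction set and the name `simd_interpret` are hypothetical and far simpler than the MINTABs or MIPS R2000 architectures used in the paper.

```python
def simd_interpret(programs, max_cycles=1000):
    """Run one program per PE under a single broadcast instruction stream.

    Each program is a list of (opcode, argument) pairs and is assumed
    to end with ("HALT", 0). Returns the final accumulator of every PE.
    """
    n = len(programs)
    pc = [0] * n            # local program counter per PE
    acc = [0] * n           # local accumulator per PE
    halted = [False] * n
    for _ in range(max_cycles):
        if all(halted):
            break
        # Every active PE fetches its own next instruction locally.
        fetched = [None if halted[pe] else programs[pe][pc[pe]]
                   for pe in range(n)]
        # The control unit broadcasts each opcode in turn; a PE is
        # masked off unless its fetched instruction matches.
        for op in ("ADD", "MUL", "HALT"):
            for pe in range(n):
                if fetched[pe] is None or fetched[pe][0] != op:
                    continue
                arg = fetched[pe][1]
                if op == "ADD":
                    acc[pe] += arg
                elif op == "MUL":
                    acc[pe] *= arg
                else:
                    halted[pe] = True
                pc[pe] += 1
    return acc
```

Running it on `[[("ADD", 2), ("MUL", 3), ("HALT", 0)], [("MUL", 5), ("ADD", 1), ("ADD", 1), ("HALT", 0)]]` returns `[6, 2]`. Each cycle pays for a broadcast of every opcode even when few PEs match, which hints at why such interpretation carries overhead.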
Asynchronous Problems on SIMD Parallel Computers
- IEEE Trans. Parallel and Distributed Systems, 1995
Abstract - Cited by 2 (0 self)
One of the essential problems in parallel computing is: can SIMD machines handle asynchronous problems? This is a difficult, unsolved problem because of the mismatch between asynchronous problems and SIMD architectures. We propose a solution to let SIMD machines handle general asynchronous problems. Our approach is to implement a runtime support system which can run MIMD-like software on SIMD hardware. The runtime support system, named P kernel, is thread-based. There are two major advantages of the thread-based model. First, for application problems with irregular and/or unpredictable features, automatic scheduling can move some threads from overloaded processors to underloaded processors. Second, and more importantly, the granularity of threads can be controlled to reduce system overhead. The P kernel is also able to handle bookkeeping and message management, as well as to make these low-level tasks transparent to users. Substantial performance has been obtained on the MasPar MP-1.
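The thread-migration idea in the abstract above can be illustrated with a minimal load-balancing sketch: threads move from the longest per-processor queue to the shortest until queue lengths differ by at most one. The abstract does not describe the P kernel's actual scheduling policy, so the greedy rule and the name `rebalance` are assumptions made only for illustration.

```python
def rebalance(queues):
    """Return new per-processor thread queues with lengths within 1.

    `queues` is a non-empty list of lists of thread identifiers, one
    list per processor. The input is not modified.
    """
    queues = [list(q) for q in queues]   # work on copies
    while True:
        longest = max(queues, key=len)   # most overloaded processor
        shortest = min(queues, key=len)  # most underloaded processor
        if len(longest) - len(shortest) <= 1:
            return queues
        # Migrate one thread from the overloaded to the underloaded queue.
        shortest.append(longest.pop())
```

For instance, queues of lengths 5, 0, and 1 end up with two threads each, and no thread is lost or duplicated.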
Optimizing Fortran 90D Programs for SIMD Execution
- 1993
Abstract
SIMD architectures offer an alternative to MIMD architectures for obtaining high performance computation through parallelism. These architectures can offer impressive price/performance ratios for certain classes of problems. However, the effectiveness of such machines is greatly affected by the capabilities of the compilers which produce code for them. Current compilers have many weaknesses that introduce inefficiencies in the code that they produce. It is our thesis that advanced compiler techniques can produce more efficient SIMD code and exploit the massively parallel hardware closer to its full potential. To validate our thesis, we are designing and implementing compiler transformations that optimize computation and communication given the constraint of a single instruction stream.
1 Introduction
Parallel computing has become increasingly popular as a method of obtaining high performance. This trend will continue as parallel computers become less expensive and more readily ...

