Results 1 - 10 of 13
Supporting Dynamic Data Structures on Distributed-Memory Machines (1995)
"... this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy ..."
Abstract
-
Cited by 166 (8 self)
- Add to MetaCart
In this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.
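The futures-with-lazy-task-creation idea can be approximated in a few lines. Below is a rough Go sketch (my construction, not Olden's actual mechanism): a bounded token count decides whether a recursive traversal step becomes a new thread or degrades into an ordinary sequential call, so parallelism is introduced only when capacity is available.

```go
// Hypothetical sketch of future-based tree traversal with lazy task
// creation, in the spirit of the Olden model (not its actual API).
package main

import (
	"fmt"
	"sync/atomic"
)

type Tree struct {
	Value       int
	Left, Right *Tree
}

// tokens bounds how many extra threads we spawn; when none are
// available, the "future" degrades into an ordinary sequential call,
// which is the essence of lazy task creation.
var tokens int64 = 4

func sum(t *Tree) int {
	if t == nil {
		return 0
	}
	if atomic.AddInt64(&tokens, -1) >= 0 {
		// Parallel path: evaluate the left subtree as a future.
		ch := make(chan int, 1)
		go func() {
			ch <- sum(t.Left)
			atomic.AddInt64(&tokens, 1) // release the token
		}()
		right := sum(t.Right)
		return t.Value + <-ch + right
	}
	atomic.AddInt64(&tokens, 1) // undo the failed acquire
	// Sequential path: plain recursive calls, no thread created.
	return t.Value + sum(t.Left) + sum(t.Right)
}

func main() {
	t := &Tree{1, &Tree{2, nil, nil}, &Tree{3, nil, nil}}
	fmt.Println(sum(t)) // 6
}
```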
Lazy Threads: Implementing a Fast Parallel Call (1996)
Journal of Parallel and Distributed Computing
"... In this paper we describe lazy threads, a new approach for implementing multi-threaded execution models on conventional machines. We show how they can implement a parallel call at nearly the efficiency of a sequential call. The central idea is to specialize the representation of a parallel call so t ..."
Abstract
-
Cited by 58 (3 self)
- Add to MetaCart
In this paper we describe lazy threads, a new approach for implementing multi-threaded execution models on conventional machines. We show how they can implement a parallel call at nearly the efficiency of a sequential call. The central idea is to specialize the representation of a parallel call so that it can execute as a parallel-ready sequential call. This allows excess parallelism to degrade into sequential calls, with the attendant efficient stack management and direct transfer of control and data, while a call that truly needs to execute in parallel gets its own thread of control. The efficiency of lazy threads is achieved through careful attention to storage management and a code generation strategy that allows us to represent potential parallel work with no overhead.
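A minimal way to see the "parallel-ready sequential call" is a fork primitive whose fast path is a plain call. The Go sketch below is hypothetical (the paper works at the level of compiler-generated code and stack management, which Go hides); `pcall` and `needParallelism` are invented names.

```go
// A simplified illustration of the lazy-threads idea: a "pcall" runs
// the child inline as an ordinary call, and only promotes it to a real
// thread when the scheduler signals that parallelism is needed.
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var needParallelism atomic.Bool // set by an idle worker in a real system

// pcall is a parallel-ready sequential call: the fast path is a plain
// function call with direct transfer of control and data.
func pcall(wg *sync.WaitGroup, f func()) {
	if needParallelism.Load() {
		wg.Add(1)
		go func() { defer wg.Done(); f() }() // slow path: real thread
		return
	}
	f() // fast path: sequential call on the current stack
}

func main() {
	var wg sync.WaitGroup
	pcall(&wg, func() { fmt.Println("inline: no one is idle") })
	needParallelism.Store(true) // pretend a worker just went idle
	pcall(&wg, func() { fmt.Println("promoted: runs in its own thread") })
	wg.Wait()
}
```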
Computation Migration: Enhancing Locality for Distributed-Memory Parallel Systems
"... We describe computation migration, a new technique that is based on compile-time program transformations, for accessing remote data in a distributed-memory parallel system. In contrast with RPC-style access, where the access is performed remotely, and with data migration, where the data is moved so ..."
Abstract
-
Cited by 55 (4 self)
- Add to MetaCart
We describe computation migration, a new technique that is based on compile-time program transformations, for accessing remote data in a distributed-memory parallel system. In contrast with RPC-style access, where the access is performed remotely, and with data migration, where the data is moved so that it is local, computation migration moves part of the current thread to the processor where the data resides. The access is performed at the remote processor, and the migrated thread portion continues to run on that same processor; this makes subsequent accesses in the thread portion local. We describe an implementation of computation migration that consists of two parts: an implementation that migrates single activation frames, and a high-level language annotation that allows a programmer to express when migration is desired. We performed experiments using two applications; these experiments demonstrate that computation migration is a valuable alternative to RPC and data migration.
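As an illustration of the contrast drawn above, here is a hedged Go sketch (names like `node` and `inbox` are invented, and goroutines stand in for processors): rather than fetching `x` with a remote read, the rest of the thread is shipped to the data's owner and runs there, so follow-up accesses are local.

```go
// Sketch of computation migration: instead of fetching remote data
// RPC-style, we ship the remainder of the computation to the node that
// owns the data.
package main

import "fmt"

type node struct {
	data  map[string]int
	inbox chan func(*node) // migrated continuations arrive here
}

func (n *node) run() {
	for k := range n.inbox {
		k(n) // the migrated frame executes where the data lives
	}
}

func main() {
	owner := &node{data: map[string]int{"x": 41}, inbox: make(chan func(*node))}
	go owner.run()

	done := make(chan int)
	// Migrate the rest of this thread to `owner`; both the read of "x"
	// and any follow-up accesses now happen locally on that node.
	owner.inbox <- func(n *node) {
		done <- n.data["x"] + 1
	}
	fmt.Println(<-done) // 42
}
```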
Enabling Primitives for Compiling Parallel Languages (1995)
"... This paper presents three novel languageimplementation primitives---lazy threads,stacklets, and synchronizers---andshows how they combine to provide a parallel call at nearly the efficiency of a sequential call. The central idea is to transform parallel calls into parallel-ready sequential calls. Ex ..."
Abstract
-
Cited by 23 (1 self)
- Add to MetaCart
(Show Context)
This paper presents three novel language-implementation primitives (lazy threads, stacklets, and synchronizers) and shows how they combine to provide a parallel call at nearly the efficiency of a sequential call. The central idea is to transform parallel calls into parallel-ready sequential calls. Excess parallelism degrades into sequential calls with the attendant efficient stack management and direct transfer of control and data, unless a call truly needs to execute in parallel, in which case it gets its own thread of control. We show how these techniques can be applied to distribute work efficiently on multiprocessors.
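Of the three primitives, the synchronizer is the easiest to sketch in isolation. The Go code below is a loose simplification (the paper's synchronizers live in compiler-managed frames, not in a library type): a join counter whose last arrival resumes the parent's continuation.

```go
// Hypothetical sketch of a synchronizer: a join counter that resumes
// the parent's continuation exactly once, when the last child finishes.
package main

import (
	"fmt"
	"sync/atomic"
)

type synchronizer struct {
	pending int32
	resume  func()
}

func (s *synchronizer) childDone() {
	if atomic.AddInt32(&s.pending, -1) == 0 {
		s.resume() // last arrival transfers control back to the parent
	}
}

func main() {
	done := make(chan struct{})
	s := &synchronizer{pending: 2, resume: func() { close(done) }}
	for i := 0; i < 2; i++ {
		go func(id int) {
			fmt.Println("child", id, "finished")
			s.childDone()
		}(i)
	}
	<-done // the parent continuation runs after both children
	fmt.Println("parent resumes")
}
```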
Supporting a Dynamic SPMD Model in a Multi-Threaded Architecture (1993)
"... The SPMD (Single-Program Multiple-Data) model has gained acceptance as a programming model for scientific array-intensive computations on distributed memory machines. Recently, researchers have been extending the SPMD model to handle programs which operate on recursively-defined dynamic data structu ..."
Abstract
-
Cited by 10 (5 self)
- Add to MetaCart
The SPMD (Single-Program Multiple-Data) model has gained acceptance as a programming model for scientific array-intensive computations on distributed memory machines. Recently, researchers have been extending the SPMD model to handle programs which operate on recursively-defined dynamic data structures; such models are commonly referred to as Dynamic SPMD (DSPMD) models. In this paper, we examine existing Dynamic SPMD models and investigate how to efficiently exploit temporal and physical locality in the traversal of and operations on dynamic data structures on a multithreaded architecture. In particular, we propose an extension of a DSPMD model and present a new multi-threaded architecture to support the model.
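A toy Go rendering of the SPMD idea applied to a dynamic structure (hypothetical code, not the paper's DSPMD model): every worker runs the same program over a shared linked list, but each operates only on the nodes it owns.

```go
// Every worker executes the same program; ownership selects each
// worker's slice of the dynamic data structure.
package main

import (
	"fmt"
	"sync"
)

type cell struct {
	val   int
	owner int // which "processor" holds this node
	next  *cell
}

// program is the single program all workers run; me identifies the worker.
func program(me int, head *cell, out chan<- int) {
	sum := 0
	for c := head; c != nil; c = c.next {
		if c.owner == me {
			sum += c.val // operate only on locally owned nodes
		}
	}
	out <- sum
}

func main() {
	// Build a 4-node list distributed round-robin over 2 workers.
	var head *cell
	for i := 4; i >= 1; i-- {
		head = &cell{val: i, owner: i % 2, next: head}
	}
	out := make(chan int, 2)
	var wg sync.WaitGroup
	for p := 0; p < 2; p++ {
		wg.Add(1)
		go func(me int) { defer wg.Done(); program(me, head, out) }(p)
	}
	wg.Wait()
	close(out)
	total := 0
	for s := range out {
		total += s
	}
	fmt.Println(total) // 10
}
```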
Lazy Threads: Compiler and Runtime Structures for Fine-Grained Parallel Programming (1997)
"... Many modern parallel languages support dynamic creation of threads or require multithreading in their implementations. The threads describe the logical parallelism in the program. For ease of expression and better resource utilization, the logical parallelism in a program often exceeds the physical ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
(Show Context)
Many modern parallel languages support dynamic creation of threads or require multithreading in their implementations. The threads describe the logical parallelism in the program. For ease of expression and better resource utilization, the logical parallelism in a program often exceeds the physical parallelism of the machine and leads to applications with many fine-grained threads. In practice, however, most logical threads need not be independent threads. Instead, they could be run as sequential calls, which are inherently cheaper than independent threads. The challenge is that one cannot generally predict which logical threads can be implemented as sequential calls. In lazy multithreading systems each logical thread begins execution sequentially (with the attendant effic...
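The gap between logical and physical parallelism that motivates this work can be seen in a small Go sketch (illustrative only): many fine-grained logical tasks multiplexed over a handful of physical workers through a shared queue.

```go
// A thousand "logical threads" served by four physical workers.
package main

import (
	"fmt"
	"sync"
)

func main() {
	const logical, physical = 1000, 4
	work := make(chan int, logical)
	for i := 0; i < logical; i++ {
		work <- i // one logical task per item
	}
	close(work)

	var wg sync.WaitGroup
	sums := make([]int, physical)
	for w := 0; w < physical; w++ {
		wg.Add(1)
		go func(w int) { // one of the few physical workers
			defer wg.Done()
			for i := range work {
				sums[w] += i
			}
		}(w)
	}
	wg.Wait()
	total := 0
	for _, s := range sums {
		total += s
	}
	fmt.Println(total) // 499500, the sum of 0..999
}
```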
A Calculus for Exploiting Data Parallelism on Recursively Defined Data (1994)
In Proc. International Workshop on Theory and Practice of Parallel Programming, LNCS
"... Array based data parallel programming can be generalized in two ways to make it an appropriate paradigm for parallel processing of general recursively defined data. The first is the introduction of a parallel evaluation mechanism for dynamically allocated recursively defined data. It achieves the ef ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
(Show Context)
Array-based data parallel programming can be generalized in two ways to make it an appropriate paradigm for parallel processing of general recursively defined data. The first is the introduction of a parallel evaluation mechanism for dynamically allocated recursively defined data. It achieves the effect of applying the same function to all the subterms of a given datum in parallel. The second is a new notion of recursion, which we call parallel recursion, for parallel evaluation of recursively defined data. In contrast with ordinary recursion, which only uses the final results of the recursive calls on its immediate subterms, the new recursion repeatedly transforms a recursive datum represented by a system of equations into another recursive datum by applying the same function to each of the equations simultaneously, until the final result is obtained. This mechanism exploits more parallelism and achieves significant speedup compared to the conventional parallel evaluation of recursive ...
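The first mechanism, applying the same function to all subterms of a datum in parallel, can be sketched as a parallel map over a recursive term type. The Go code below is my illustration, not the paper's calculus.

```go
// parMap applies f to a term and, in parallel, to all of its subterms.
package main

import (
	"fmt"
	"sync"
)

type term struct {
	label int
	subs  []*term
}

func parMap(t *term, f func(*term)) {
	var wg sync.WaitGroup
	var walk func(*term)
	walk = func(t *term) {
		defer wg.Done()
		f(t)
		for _, s := range t.subs {
			wg.Add(1)
			go walk(s) // one logical thread per subterm
		}
	}
	wg.Add(1)
	walk(t)
	wg.Wait()
}

func main() {
	leaf := func(n int) *term { return &term{label: n} }
	t := &term{0, []*term{leaf(1), leaf(2), leaf(3)}}
	var mu sync.Mutex
	sum := 0
	parMap(t, func(x *term) { mu.Lock(); sum += x.label; mu.Unlock() })
	fmt.Println(sum) // 6
}
```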
Manticore: A heterogeneous parallel language
"... The Manticore project is an effort to design and implement a new functional language for parallel programming. Unlike many earlier parallel languages, Manticore is a heterogeneous language that supports parallelism at multiple levels. Specifically, we combine CML-style explicit concurrency with NESL ..."
Abstract
- Add to MetaCart
The Manticore project is an effort to design and implement a new functional language for parallel programming. Unlike many earlier parallel languages, Manticore is a heterogeneous language that supports parallelism at multiple levels. Specifically, we combine CML-style explicit concurrency with NESL/Nepal-style data parallelism. In this paper, we describe and motivate the design of the Manticore language. We also describe a flexible runtime model that supports multiple scheduling disciplines (e.g., for both fine-grain and coarse-grain parallelism) in a uniform framework. Work on a prototype implementation is ongoing and we give a status report.
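Manticore's two levels can be loosely mimicked in Go (Manticore itself is an ML-family language, so this is only an analogy, not its syntax or semantics): channel-based explicit concurrency on the outside, a data-parallel map on the inside.

```go
// Explicit concurrency via channels, wrapping a data-parallel map.
package main

import (
	"fmt"
	"sync"
)

// parMap is the data-parallel level: apply f to every element in parallel.
func parMap(xs []int, f func(int) int) []int {
	out := make([]int, len(xs))
	var wg sync.WaitGroup
	for i, x := range xs {
		wg.Add(1)
		go func(i, x int) { defer wg.Done(); out[i] = f(x) }(i, x)
	}
	wg.Wait()
	return out
}

func main() {
	// The explicit-concurrency level: two communicating threads.
	req := make(chan []int)
	res := make(chan []int)
	go func() {
		for xs := range req {
			res <- parMap(xs, func(x int) int { return x * x })
		}
	}()
	req <- []int{1, 2, 3}
	fmt.Println(<-res) // [1 4 9]
}
```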