Results 1 - 10
of
322
Active Messages: a Mechanism for Integrated Communication and Computation
, 1992
"... The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high com ..."
Abstract
-
Cited by 911 (72 self)
- Add to MetaCart
The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high communication costs. Research prototypes of message driven machines demonstrate low communication overhead, but poor processor cost/performance. We introduce a simple communication mechanism, Active Messages, show that it is intrinsic to both architectures, allows cost effective use of the hardware, and offers tremendous flexibility. Implementations on nCUBE/2 and CM-5 are described and evaluated using a split-phase shared-memory extension to C, Split-C.We further show that active messages are sufficient to implement the dynamically scheduled languages for which message driven machines were designed. With this mechanism, latency tolerance becomes a programming/compiling concern. Hardware suppor...
Cilk: An Efficient Multithreaded Runtime System
- JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
, 1995
"... Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and analytically. We show that on real and synthetic applications, the "work" and "critical-path length" of a C ..."
Abstract
-
Cited by 430 (34 self)
- Add to MetaCart
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and analytically. We show that on real and synthetic applications, the "work" and "critical-path length" of a Cilk computation can be used to model performance accurately. Consequently, a Cilk programmer can focus on reducing the computation's work and critical-path length, insulated from load balancing and other runtime scheduling issues. We also prove that for the class of "fully strict" (well-structured) programs, the Cilk scheduler achieves space, time, and communication bounds all within a constant factor of optimal. The Cilk
Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism
- ACM Transactions on Computer Systems
, 1992
"... Threads are the vehicle,for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expr ..."
Abstract
-
Cited by 420 (18 self)
- Add to MetaCart
Threads are the vehicle,for concurrency in many approaches to parallel programming. Threads separate the notion of a sequential execution stream from the other aspects of traditional UNIX-like processes, such as address spaces and I/O descriptors. The objective of this separation is to make the expression and control of parallelism sufficiently cheap that the programmer or compiler can exploit even fine-grained parallelism with acceptable overhead. Threads can be supported either by the operating system kernel or by user-level library code in the application address space, but neither approach has been fully satisfactory. This paper addresses this dilemma. First, we argue that the performance of kernel threads is inherently worse than that of user-level threads, rather than this being an artifact of existing implementations; we thus argue that managing par- allelism at the user level is essential to high-performance parallel computing. Next, we argue that the lack of system integration exhibited by user-level threads is a consequence of the lack of kernel support for user-level threads provided by contemporary multiprocessor operating systems; we thus argue that kernel threads or processes, as currently conceived, are the wrong abstraction on which to support user- level management of parallelism. Finally, we describe the design, implementation, and performance of a new kernel interface and user-level thread package that together provide the same functionality as kernel threads without compromis- ing the performance and flexibility advantages of user-level management of parallelism.
A methodology for implementing highly concurrent data structures
- In 2nd Symp. Principles & Practice of Parallel Programming
, 1990
"... A con.curren.t object is a data structure shared by concurrent processes. Conventional techniques for implementing concurrent objects typically rely on criticaI sections: ensuring that only one process at a time can operate on the object. Nevertheless, critical sections are poorly suited for asynchr ..."
Abstract
-
Cited by 295 (12 self)
- Add to MetaCart
A con.curren.t object is a data structure shared by concurrent processes. Conventional techniques for implementing concurrent objects typically rely on criticaI sections: ensuring that only one process at a time can operate on the object. Nevertheless, critical sections are poorly suited for asynchronous systems: if one process is halted or delayed in a critical section, other, non-faulty processes will be unable to progress. By contrast, a concurrent object implementation is non-blocking if it always guarantees that some process will complete an operation in a finite number of steps, and it is wait-free if it guarantees that each process will complete an operation in a finite number of steps. This paper proposes a new methodology for constructing non-blocking aud wait-free implementations of concurrent objects. The object’s representation and operations are written as st,ylized sequential programs, with no explicit synchronization. Each sequential operation is automatically transformed into a non-blocking or wait-free operation usiug novel synchronization and memory management algorithms. These algorithms are presented for a multiple instruction/multiple data (MIM D) architecture in which n processes communicate by applying read, write, and comparekYswa,p operations to a shared memory. 1
Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs
- IEEE Transactions on Parallel and Distributed Systems
, 1991
"... Many parallel algorithms are naturally expressed at a fine level of granularity, often finer than a MIMD parallel system can exploit efficiently. Most builders of parallel systems have looked to either the programmer or a parallelizing compiler to increase the granularity of such algorithms. In this ..."
Abstract
-
Cited by 212 (7 self)
- Add to MetaCart
Many parallel algorithms are naturally expressed at a fine level of granularity, often finer than a MIMD parallel system can exploit efficiently. Most builders of parallel systems have looked to either the programmer or a parallelizing compiler to increase the granularity of such algorithms. In this paper we explore a third approach to the granularity problem by analyzing two strategies for combining parallel tasks dynamically at run-time. We reject the simpler load-based inlining method, where tasks are combined based on dynamic load level, in favor of the safer and more robust lazy task creation method, where tasks are created only retroactively as processing resources become available. These strategies grew out of work on Mul-T [15], an efficient parallel implementation of Scheme, but could be used with other languages as well. We describe our Mul-T implementations of lazy task creation for two contrasting machines, and present performance statistics which show the method's effectiveness. Lazy task creation allows efficient execution of naturally expressed algorithms of a substantially finer grain than possible with previous parallel Lisp systems.
Programming languages for distributed computing systems
- ACM Computing Surveys
, 1989
"... When distributed systems first appeared, they were programmed in traditional sequential languages, usually with the addition of a few library procedures for sending and receiving messages. As distributed applications became more commonplace and more sophisticated, this ad hoc approach became less sa ..."
Abstract
-
Cited by 211 (15 self)
- Add to MetaCart
When distributed systems first appeared, they were programmed in traditional sequential languages, usually with the addition of a few library procedures for sending and receiving messages. As distributed applications became more commonplace and more sophisticated, this ad hoc approach became less satisfactory. Researchers all over the world began designing new programming languages specifically for implementing distributed applications. These languages and their history, their underlying principles, their design, and their use are the subject of this paper. We begin by giving our view of what a distributed system is, illustrating with examples to avoid confusion on this important and controversial point. We then describe the three main characteristics that distinguish distributed programming languages from traditional sequential languages, namely, how they deal with parallelism, communication, and partial failures. Finally, we discuss 15 representative distributed languages to give the flavor of each. These examples include languages based on message passing, rendezvous, remote procedure call, objects, and atomic transactions, as well as functional languages, logic languages, and distributed data structure languages. The paper concludes with a comprehensive bibliography listing over 200 papers on nearly 100 distributed programming languages.
Supporting Dynamic Data Structures on Distributed-Memory Machines
, 1995
"... this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy ..."
Abstract
-
Cited by 143 (8 self)
- Add to MetaCart
this article, we describe an execution model for supporting programs that use pointer-based dynamic data structures. This model uses a simple mechanism for migrating a thread of control based on the layout of heap-allocated data and introduces parallelism using a technique based on futures and lazy task creation. We intend to exploit this execution model using compiler analyses and automatic parallelization techniques. We have implemented a prototype system, which we call Olden, that runs on the Intel iPSC/860 and the Thinking Machines CM-5. We discuss our implementation and report on experiments with five benchmarks.
Active Object -- An Object Behavioral Pattern for . . .
, 1996
"... This paper describes the Active Object pattern, which decouples method execution from method invocation in order to simplify synchronized access to an object that resides in its own thread of control. The Active Object pattern allows one or more independent threads of execution to interleave their a ..."
Abstract
-
Cited by 121 (41 self)
- Add to MetaCart
This paper describes the Active Object pattern, which decouples method execution from method invocation in order to simplify synchronized access to an object that resides in its own thread of control. The Active Object pattern allows one or more independent threads of execution to interleave their access to data modeled as a single object. A broad class of producer/consumer and reader/writer applications are wellsuited to this model of concurrency. This pattern is commonly used in distributed systems requiring multi-threaded servers. In addition, client applications, such as windowing systems and network browsers, employ active objects to simplify concurrent, asynchronous network operations.
The Design and Implementation of the SELF Compiler, an Optimizing Compiler for Object-Oriented Programming Languages
, 1992
"... Object-oriented programming languages promise to improve programmer productivity by supporting abstract data types, inheritance, and message passing directly within the language. Unfortunately, traditional implementations of object-oriented language features, particularly message passing, have been ..."
Abstract
-
Cited by 120 (15 self)
- Add to MetaCart
Object-oriented programming languages promise to improve programmer productivity by supporting abstract data types, inheritance, and message passing directly within the language. Unfortunately, traditional implementations of object-oriented language features, particularly message passing, have been much slower than traditional implementations of their non-object-oriented counterparts: the fastest existing implementation of Smalltalk-80 runs at only a tenth the speed of an optimizing C implementation. The dearth of suitable implementation technology has forced most object-oriented languages to be designed as hybrids with traditional non-object-oriented languages, complicating the languages and making programs harder to extend and reuse. This dissertation describes a collection of implementation techniques that can improve the run-time performance of object-oriented languages, in hopes of reducing the need for hybrid languages and encouraging wider spread of purely object-oriented langu...
On the Expressive Power of Programming Languages
- Science of Computer Programming
, 1990
"... The literature on programming languages contains an abundance of informal claims on the relative expressive power of programming languages, but there is no framework for formalizing such statements nor for deriving interesting consequences. As a first step in this direction, we develop a formal noti ..."
Abstract
-
Cited by 116 (4 self)
- Add to MetaCart
The literature on programming languages contains an abundance of informal claims on the relative expressive power of programming languages, but there is no framework for formalizing such statements nor for deriving interesting consequences. As a first step in this direction, we develop a formal notion of expressiveness and investigate its properties. To validate the theory, we analyze some widely held beliefs about the expressive power of several extensions of functional languages. Based on these results, we believe that our system correctly captures many of the informal ideas on expressiveness, and that it constitutes a foundation for further research in this direction. 1 Comparing Programming Languages The literature on programming languages contains an abundance of informal claims on the expressive power of programming languages. Arguments in these contexts typically assert the expressibility or non-expressibility of programming constructs relative to a language. Unfortunately, pro...

