Exploiting Task and Data Parallelism on a Multicomputer
In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 1993
Cited by 86 (21 self)
For many applications, achieving good performance on a private memory parallel computer requires exploiting data parallelism as well as task parallelism. Depending on the size of the input data set and the number of nodes (i.e., processors), different tradeoffs between task and data parallelism are appropriate for a parallel system. Most existing compilers focus on only one of data parallelism and task parallelism. Therefore, to achieve the desired results, the programmer must separately program the data and task parallelism. We have taken a unified approach to exploiting both kinds of parallelism in a single framework with an existing language. This approach eases the task of programming and exposes the tradeoffs between data and task parallelism to the compiler. We have implemented a parallelizing Fortran compiler for the iWarp system based on this approach. We discuss the design of our compiler, and present performance results to validate our approach.
Commutativity Analysis: A New Analysis Technique for Parallelizing Compilers
ACM Transactions on Programming Languages and Systems, 1997
Cited by 61 (7 self)
This article presents a new analysis technique, commutativity analysis, for automatically parallelizing computations that manipulate dynamic, pointer-based data structures. Commutativity analysis views the computation as composed of operations on objects. It then analyzes the program at this granularity to discover when operations commute (i.e., generate the same final result regardless of the order in which they execute). If all of the operations required to perform a given computation commute, the compiler can automatically generate parallel code. We have implemented a prototype compilation system that uses commutativity analysis as its primary analysis technique
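The notion of commuting operations in the abstract above can be illustrated with a small sketch. This is not the article's static compiler analysis: it is a brute-force dynamic check that applies a set of operations in every order and compares the final states. The names `Accumulator`, `Log`, `final_states`, and `commutes` are hypothetical and chosen only for illustration.

```python
from itertools import permutations


class Accumulator:
    """Hypothetical object: `add` operations commute (integer addition)."""

    def __init__(self):
        self.total = 0

    def add(self, x):
        self.total += x


class Log:
    """Hypothetical object: `append` operations do not commute (order is kept)."""

    def __init__(self):
        self.items = []

    def append(self, x):
        self.items.append(x)


def final_states(make_obj, method, args, state):
    """Apply one operation per argument in every possible order.

    Returns the set of distinct final states observed. `state` extracts a
    hashable summary of the object after all operations have run.
    """
    results = set()
    for order in permutations(args):
        obj = make_obj()  # fresh object for each ordering
        for a in order:
            getattr(obj, method)(a)
        results.add(state(obj))
    return results


def commutes(make_obj, method, args, state):
    """True if every execution order produced the same final state."""
    return len(final_states(make_obj, method, args, state)) == 1


if __name__ == "__main__":
    print(commutes(Accumulator, "add", [1, 2, 3], lambda o: o.total))
    print(commutes(Log, "append", [1, 2, 3], lambda o: tuple(o.items)))
```

When every order yields the same final state, as with `Accumulator.add`, the operations could run in parallel given atomic updates; `Log.append` fails the check because the final list depends on execution order. A compiler would have to establish this property statically for all inputs rather than test a sample of them.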
Portable Run-Time Support for Dynamic Object-Oriented Parallel Processing
ACM Transactions on Computer Systems, 1993
Cited by 57 (24 self)
Mentat is an object-oriented parallel processing system designed to simplify the task of writing portable parallel programs for parallel machines and workstation networks. The Mentat compiler and run-time system work together to automatically manage the communication and synchronization between objects. The run-time system marshals member function arguments, schedules objects on processors, and dynamically constructs and executes large grain data dependence graphs. In this paper we present the Mentat run-time system. We focus on three aspects: the software architecture, including the interface to the compiler and the structure and interaction of the principal components of the run-time system; the run-time overhead on a component by component basis for two platforms, a Sun SparcStation 2 and an Intel Paragon; and an analysis of the minimum granularity required for application programs to overcome the run-time overhead. Keywords: object-oriented, parallel processing, dataflow, distribu...
The Design, Implementation, and Evaluation of Jade
ACM Transactions on Programming Languages and Systems, 1998
Cited by 47 (2 self)
In this article we discuss the design goals and decisions that determined the final form of Jade and present an overview of the Jade implementation. We also present our experience using Jade to implement several complete scientific and engineering applications. We use this experience to evaluate how the different Jade language features were used in practice and how well Jade as a whole supports the process of developing parallel applications. We find that the basic idea of preserving the serial semantics simplifies the program development process, and that the concept of using data access specifications to guide the parallelization offers significant advantages over more traditional control-based approaches. We also find that the Jade data model can interact poorly with concurrency patterns that write disjoint pieces of a single aggregate data structure, although this problem arises in only one of the applications. Categories and Subject Descriptors: D.1.3 [Programming Te
Towards a New Model of Abstraction in the Engineering of Software
In Proceedings International Workshop on New Models for Software Architecture (IMSA): Reflection and Meta-Level Architecture, 1992
Cited by 40 (0 self)
The view of abstraction on which software engineering is based does not support the reality of practic...
Towards a New Model of Abstraction
1992
Cited by 33 (0 self)
We now come to the decisive step of mathematical abstraction: we forget about what the symbols stand for... [The mathematician] need not be idle; there are many operations he can carry out with these symbols, without ever having to look at the things they stand for. Hermann Weyl, “The Mathematical Way of Thinking” (This appears at the beginning of the Building Abstractions With Data chapter of “Structure and Interpretation
The Design, Implementation and Evaluation of Jade, a Portable, Implicitly Parallel Programming Language
Dept. of Computer Science, Stanford Univ., 1994
Communication and Memory Requirements as the Basis for Mapping Task and Data Parallel Programs
1994
Cited by 30 (7 self)
For a wide variety of applications, both task and data parallelism must be exploited to achieve the best possible performance on a multicomputer. Recent research has underlined the importance of exploiting task and data parallelism in a single compiler framework, and such a compiler can map a single source program in many different ways onto a parallel machine. The tradeoffs between task and data parallelism are complex and depend on the characteristics of the program to be executed, most significantly the memory and communication requirements, and the performance parameters of the target parallel machine. In this paper, we present a framework to isolate and examine the specific characteristics of programs that determine the performance for different mappings. Our focus is on applications that process a stream of input, and whose computation structure is fairly static and predictable. We describe three such applications that were developed with our compiler: fast Fourier transforms, nar...
CoSy Compiler Phase Embedding with the CoSy Compiler Model
1994
Cited by 29 (2 self)
In this article we introduce a novel model for compilation and compiler construction, the CoSy (COmpiler SYstem) model. CoSy provides a framework for flexible combination and embedding of compiler phases (called engines in the sequel) such that the construction of parallel and (inter-procedural) optimizing compilers is facilitated. In CoSy a compiler writer may program some phase in a target language and embed it transparently, without source code changes, into different compiler contexts, such as with alternative phase order, speculative evaluation, parallel evaluation, and generate-and-test evaluation. Compilers constructed with CoSy can be tuned for different host systems (the system the compiler runs on, not the system it produces code for) and are transparently scalable for (shared memory) multiprocessor host configurations.
Analyzing Stores and References in a Parallel Symbolic Language
In LFP, 1994
Cited by 23 (4 self)
We describe an analysis of a parallel language in which processes communicate via first-class mutable shared locations. The sequential core of the language defines a higher-order strict functional language with list data structures. The parallel extensions permit processes and shared locations to be dynamically created; synchronization among processes occurs exclusively via shared locations. The analysis is defined by an abstract interpretation on this language. The interpretation is efficient and useful, facilitating a number of important optimizations related to synchronization, processor/thread mapping, and storage management.

