An Object-Oriented Concurrent Reflective Language ABCL/R3
, 2000
Cited by 56 (11 self)
This article presents the design principles and efficient implementation techniques for ABCL/R3, an object-oriented concurrent reflective language. One of the most distinctive features of ABCL/R3 is its use of compilation techniques based on partial evaluation, which effectively remove interpretation from meta-level programs. The meta-level objects are designed so that they can be partially evaluated effectively. Benchmark programs show that our compilation frameworks make object execution drastically faster than interpreter-based implementations and achieve performance close to that of nonreflective compilers.
StackThreads/MP: Integrating Futures into Calling Standards
- PPOPP'99
, 1999
Cited by 19 (5 self)
An implementation scheme of fine-grain multithreading that needs no changes to current calling standards for sequential languages and modest extensions to sequential compilers is described. Like previous similar systems, it performs an asynchronous call as if it were an ordinary procedure call, and detaches the callee from the caller when the callee suspends or either of them migrates to another processor. Unlike previous similar systems, it detaches and connects arbitrary frames generated by off-the-shelf sequential compilers obeying calling standards. As a consequence, it requires neither a frontend preprocessor nor a native code generator that has a builtin notion of parallelism. The system practically works with unmodified GNU C compiler (GCC). Desirable extensions to sequential compilers for guaranteeing portability and correctness of the scheme are clarified and claimed modest. Experiments indicate that sequential performance is not sacrificed for practical applications and both sequential and parallel performance are comparable to Cilk[B], whose current implementation requires a fairly sophisticated preprocessor to C. These results show that efficient asynchronous calls (a.k.a. future calls) can be integrated into current calling standard with a very small impact both on sequential performance and compiler engineering.
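As a rough illustration of what an asynchronous (future) call looks like from the programmer's side, here is a minimal Python sketch. It uses the standard library `concurrent.futures` module as a stand-in and does not reproduce the StackThreads/MP mechanism, which works on native stack frames produced by a C compiler. The names `fib` and `fib_with_future` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fib(n: int) -> int:
    """Plain sequential Fibonacci, standing in for any ordinary procedure."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_with_future(n: int, pool: ThreadPoolExecutor) -> int:
    """Compute fib(n) with one branch issued as a future call (illustrative only)."""
    if n < 2:
        return n
    # Future call: the callee may run on another worker thread.
    left = pool.submit(fib, n - 1)
    # The caller continues with its own work instead of waiting.
    right = fib(n - 2)
    # Touching the future blocks only if the result is not ready yet.
    return left.result() + right


with ThreadPoolExecutor(max_workers=2) as pool:
    answer = fib_with_future(15, pool)
```

The call site reads almost like an ordinary call followed by a later use of the result, which is the programming model the abstract refers to; the cost model of a thread pool is of course very different from the frame-detaching scheme described there.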
An Efficient Compilation Framework for Languages Based on a Concurrent Process Calculus
, 1997
Cited by 10 (7 self)
We propose a framework for compiling programming languages based on concurrent process calculi, in which computation is expressed by a combination of processes and communication channels. Our framework realizes a compile-time process scheduling and unboxed channels. The compile-time scheduling enables us to execute multiple independent processes without a scheduling pool operation. Unboxed channels allow us to create a channel without memory allocations and to communicate values on registers. The framework is given as a set of translation rules from a concurrent calculus to an ML-like sequential program. Experimental results show that our compiler can execute sequential programs written in the process calculus only a few times slower than equivalent C programs. This indicates that pure process calculi like ours and programming languages based on them can be implemented efficiently, without losing their simplicity, purity, and elegance.
EXECUTING PARALLEL PROGRAMS WITH SYNCHRONIZATION BOTTLENECKS EFFICIENTLY
Cited by 9 (3 self)
We propose a scheme within which parallel programs with potential synchronization bottlenecks run efficiently. In the straightforward implementations which use basic locking schemes, the execution time for the program parts with bottlenecks increases significantly when the number of processors increases. Our scheme makes the parallel performance for the bottleneck parts of programs close to the sequential performance while maintaining the efficiency with which the nonbottleneck parts run. Experiments with a 64-processor SMP and a 128-processor DSM machine confirmed that parallel programs implemented with our scheme perform much better than parallel programs implemented with other widely-used locking schemes.
Type-Based Analysis Of Usage Of Values For Concurrent Programming Languages
, 1997
Cited by 5 (3 self)
We propose a type-based technique to analyze how many times each value, including communication channels, is used during execution of concurrent programs. This work is closely related to the recent work by Kobayashi, Pierce, and Turner on a linear channel system on a process calculus. They introduced a type system that ensures that certain channels (called linear channels) are used just once, and showed how linear channels help reasoning about program behaviors. However, they only deal with a pure message passing calculus, and more importantly, the type reconstruction problem is left open. This thesis develops a type reconstruction algorithm for a variant of a linear channel type system for a concurrent language with data constructors such as records, and let-polymorphism. We can detect not only linear channels but also other used-once values (closures, records, etc.) by the type reconstruction algorithm. Computational cost of our analysis (excluding cost of ordinary type reconstruc...
An Implementation and Performance Evaluation of Language with Fine-Grain Thread Creation on Shared Memory Parallel Computer
, 1998
Cited by 4 (4 self)
We implemented two applications with irregular parallelism in (1) C and a thread library and (2) our concurrent language Schematic, which supports efficient fine-grain dynamic thread creation and dynamic load balancing. We compared the two approaches focusing on program description cost and performance. Schematic not only achieves common programming practices seen in C such as task queue management with much smaller description cost, but also incorporates some advanced optimizations for synchronization such as inter-thread communication on registers. The case study shows that Schematic can describe irregular applications more naturally and can achieve high performance: Schematic runs about 2.8 times slower than C in a sequential environment, and its speedup in a 64-processor environment is comparable to that of C.
Online Computation of Critical Paths for Multithreaded Languages
- In Proceedings of the 5th International Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2000), volume 1800 of Lecture Notes in Computer Science
, 2000
Cited by 2 (2 self)
We have developed an instrumentation scheme that enables programs written in multithreaded languages to compute a critical path at runtime. Our scheme gives not only the length (execution time) of the critical path but also the lengths and locations of all the subpaths making up the critical path. Although the scheme is like Cilk's algorithm in that it uses a "longest path" computation, it allows more flexible synchronization. We implemented our scheme on top of the concurrent object-oriented language Schematic and confirmed its effectiveness through experiments on a 64-processor symmetric multiprocessor.
1 Introduction
The scalability expected in parallel programming is often not obtained in the first run, and then performance tuning is necessary. In the early stages of this tuning it is very useful to know what the critical path is and how long it is. The length of an execution path is defined as the amount of time needed to execute it, and the critical path is the longest o...
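The "longest path" idea behind a critical path can be shown with a small offline Python sketch. This is not the online instrumentation scheme of the paper: it assumes the whole task graph is already known, and the function name `critical_path`, the task names, and the costs are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(cost, deps):
    """Return (length, path) of the longest path in a task DAG.

    cost: task -> execution time of that task
    deps: task -> set of tasks that must finish before it starts
    """
    finish = {}  # earliest finish time of each task
    prev = {}    # predecessor on the longest path into each task
    for task in TopologicalSorter(deps).static_order():
        # Pick the predecessor that finishes last (None if there is none).
        best = max(deps.get(task, ()), key=lambda p: finish[p], default=None)
        prev[task] = best
        finish[task] = cost[task] + (finish[best] if best is not None else 0)
    end = max(finish, key=finish.get)
    length = finish[end]
    path = []
    while end is not None:  # walk back along the recorded predecessors
        path.append(end)
        end = prev[end]
    return length, path[::-1]


# Hypothetical fork/join program: main forks a and b, join waits for both.
cost = {"main": 2, "a": 3, "b": 5, "join": 1}
deps = {"a": {"main"}, "b": {"main"}, "join": {"a", "b"}}
length, path = critical_path(cost, deps)
```

For this made-up graph the longest chain runs through `b`, so tuning `a` alone would not shorten the run; that is the kind of information a critical path report is meant to give.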
Reasoning-conscious Meta-object Design of a Reflective Concurrent Language
, 1997
Cited by 1 (0 self)
Computational reflection gives programming languages high flexibility, which is useful for parallel/distributed programming. On the other hand, its interpreter-based execution model makes efficient implementation difficult. In particular, meta-objects in concurrent languages are described with explicit state transitions, which makes program reasoning (such as partial evaluation) difficult. In this paper, we propose a new meta-object design, which exploits reader/writer methods in our concurrent object-oriented language Schematic. The crux of the design is the separation of state-related operations from others, which allows us to optimize meta-objects using an existing partial evaluator because most methods in the meta-objects can be regarded as sequential programs.
1 Introduction
1.1 Reflection in Parallel/Distributed Programs
Practical parallel and distributed programs often have complicated computation and communication structures for achieving efficiency, fault-tolerance, portabili...
A Compilation Framework for Languages with Dynamic Thread Creation
, 1996
Cited by 1 (1 self)
The efficiency of multithreading is essential to the overall performance of concurrent object-oriented languages. It is very inefficient to implement such languages by using thread libraries. In this paper, we propose a framework that efficiently compiles languages that support dynamic thread creation. In the framework, we designed and implemented a programming language Schematic, which is a concurrent object-oriented extension to Scheme. As an intermediate language of Schematic, we use a subset of the process calculus HACL, which is simple but has enough power to express higher-level constructs. We developed several optimization techniques in the intermediate language, and achieved very high performance in benchmark programs.
1 Introduction
High performance concurrent programming languages are becoming more and more important as parallel computers and workstation clusters spread widely. Of these languages, concurrent object-oriented languages receive much attention because ...
Achieving High Performance for Parallel Programs that Contain Unscalable Modules
, 2000
Cited by 1 (1 self)
This thesis is a description of a compiler and runtime technique for the efficient management of threads including their mutual exclusion. The target area for this work is parallel languages for shared-memory multiprocessors. The goal of this work is to achieve a situation in which the execution time either decreases or remains unchanged as the number of processors is increased. We call this performance model the satisfactory performance model. Existing parallel programming systems do not always perform according to this satisfactory model. This is the case when there are modules in the program such that concurrent invocations of the modules are serialized. We call these modules bottleneck modules. When bottleneck modules are present they prevent operation according to the satisfactory performance model since the overhead incurred because of bottleneck modules increases with the number of processors. This overhead includes communications with memory for the sharing of memory objects am...
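The "satisfactory performance model" defined in the abstract above (execution time either decreases or stays unchanged as processors are added) can be written as a simple check. The Python sketch below is only an illustration of that definition; the function name `is_satisfactory` is hypothetical and the timing figures are invented for the example, not measurements from the thesis.

```python
def is_satisfactory(times_by_procs):
    """Check the satisfactory performance model on a set of timings.

    times_by_procs: dict mapping processor count -> execution time.
    Returns True if execution time never increases as processors are added.
    """
    ordered = [times_by_procs[p] for p in sorted(times_by_procs)]
    return all(later <= earlier for earlier, later in zip(ordered, ordered[1:]))


# Invented timings (seconds), for illustration only.
scales_well = {1: 10.0, 2: 5.5, 4: 5.5}  # flattens out but never gets worse
bottlenecked = {1: 10.0, 2: 6.0, 4: 9.0}  # slows down again at 4 processors

ok = is_satisfactory(scales_well)
bad = is_satisfactory(bottlenecked)
```

The second data set shows the behavior the abstract attributes to bottleneck modules: overhead that grows with the processor count eventually makes the program slower.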

