ICC++ -- A C++ Dialect for High Performance Parallel Computing
In Proceedings of the 2nd International Symposium on Object Technologies for Advanced Software, 1996
Cited by 55 (10 self)
ICC++ is a new C++ concurrent dialect which allows sequential/parallel program versions to be maintained with single source, the construction of concurrent data abstractions, convenient expression of irregular and fine-grained concurrency, and supports high performance implementations. ICC++ provides annotations for potential concurrency, facilitating both sharing source with sequential programs and grain size tuning for efficient execution. ICC++ has a notion of object consistency which can be extended structurally and procedurally to implement larger data abstractions. Finally, ICC++ integrates arrays into the object system and hence the concurrency model. In short, ICC++ addresses concurrency and its relation to abstractions -- whether they are implemented by single objects, several objects, or object collections. The design of the language, its rationale, and current status are all described. Keywords concurrent object-oriented programming, concurrent languages, parallel...
Obtaining Sequential Efficiency for Concurrent Object-Oriented Languages
In Proceedings of the ACM Symposium on the Principles of Programming Languages, 1995
Cited by 47 (15 self)
Concurrent object-oriented programming (COOP) languages focus the abstraction and encapsulation power of abstract data types on the problem of concurrency control. In particular, pure fine-grained concurrent object-oriented languages (as opposed to hybrid or data parallel) provide the programmer with a simple, uniform, and flexible model while exposing maximum concurrency. While such languages promise to greatly reduce the complexity of large-scale concurrent programming, the popularity of these languages has been hampered by efficiency which is often many orders of magnitude less than that of comparable sequential code. We present a sufficient set of techniques which enables the efficiency of fine-grained concurrent object-oriented languages to equal that of traditional sequential languages (like C) when the required data is available. These techniques are empirically validated by the application to a COOP implementation of the Livermore Loops. 1 Introduction The increasing use of ...
Iteration Abstraction in Sather
ACM Transactions on Programming Languages and Systems, 1996
Cited by 21 (1 self)
Stephan Murer, Stephen Omohundro, David Stoutamire, and Clemens Szyperski (International Computer Science Institute). Sather extends the notion of an iterator in a powerful new way. We argue that iteration abstractions belong in class interfaces on an equal footing with routines. Sather iterators were derived from CLU iterators but are much more flexible and better suited for object-oriented programming. We retain the property that iterators are structured, i.e. strictly bound to a controlling structured statement. We motivate and describe the construct along with several simple examples. We compare it with iteration based on CLU iterators, cursors, riders, streams, series, generators, coroutines, blocks, closures, and lambda expressions. Finally, we describe experiences with iterators in the Sather compiler and libraries. Categories and Subject Descriptors: D.1.5 [Programming Techniques]: Object-oriented Programming; D.3.3 [Programming Languages]: Language Constructs and Fe...
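As a rough illustration of iteration abstractions sitting in a class interface next to ordinary routines, the sketch below uses Python generators. It is not Sather syntax and only approximates the abstract's idea; the `Stack` class and the `elts`, `upto`, and `pair_up` names are hypothetical, chosen for this example.

```python
class Stack:
    """Hypothetical container whose interface has both a routine and an iterator."""

    def __init__(self, items):
        self._items = list(items)

    def size(self):
        # Ordinary routine: returns a single value.
        return len(self._items)

    def elts(self):
        # Iteration abstraction: yields elements from top to bottom.
        for i in range(len(self._items) - 1, -1, -1):
            yield self._items[i]


def upto(lo, hi):
    # Free-standing iterator yielding lo, lo + 1, ..., hi.
    i = lo
    while i <= hi:
        yield i
        i += 1


def pair_up(stack, limit):
    # Both iterators are driven by one loop; the loop ends as soon as
    # either iterator is exhausted (zip stops at the shorter one).
    return [(i, x) for i, x in zip(upto(1, limit), stack.elts())]
```

Here the loop in `pair_up` plays the part of the controlling statement: neither iterator is advanced outside it.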
Distributed data structures and algorithms for Gröbner basis computation
Lisp and Symbolic Computation, 1994
Cited by 11 (4 self)
We present the design and implementation of a parallel algorithm for computing Gröbner bases on distributed memory multiprocessors. The parallel algorithm is irregular both in space and time: the data structures are dynamic pointer-based structures and the computations on the structures have unpredictable duration. The algorithm is presented as a series of refinements on a transition rule program, in which computation proceeds by nondeterministic invocations of guarded commands. Two key data structures, a set and a priority queue, are distributed across processors in the parallel algorithm. The data structures are designed for high throughput and latency tolerance, as appropriate for distributed memory machines. The programming style represents a compromise between shared-memory and message-passing models. The distributed nature of the data structures shows through their interface in that the semantics are weaker than with shared atomic objects, but they still provide a shared abstraction that can be used for reasoning about program correctness. In the data structure design there is a classic trade-off between locality and load balance. We argue that this is best solved by designing scheduling structures in tandem with the state data structures, since the decision to replicate or partition state affects the overhead of dynamically moving tasks.
Thal: An Actor System For Efficient And Scalable Concurrent Computing
1997
Cited by 11 (0 self)
Actors are a model of concurrent objects which unify synchronization and data abstraction boundaries. Because they hide details of parallel execution and present an abstract view of the computation, actors provide a promising building block for easy-to-use parallel programming systems. However, the practical success of the concurrent object model requires two conditions be satisfied. Flexible communication abstractions and their efficient implementations are the necessary conditions for the success of actors. This thesis studies how to support communication between actors efficiently. First, we discuss communication patterns commonly arising in many parallel applications in the context of an experimental actor-based language, THAL. The language provides as communication abstractions concurrent call/return communication, delegation, broadcast, and local synchronization constraints. The thesis shows how the abstractions are efficiently implemented on stock-hardware distributed memory mul...
Supporting High Level Programming with High Performance: The Illinois Concert System
In Proceedings of the Second International Workshop on High-level Parallel Programming Models and Supportive Environments, 1997
Cited by 11 (4 self)
Programmers of concurrent applications are faced with a complex performance space in which data distribution and concurrency management exacerbate the difficulty of building large, complex applications. To address these challenges, the Illinois Concert system provides a global namespace, implicit concurrency control and granularity management, implicit storage management, and object-oriented programming features. These features are embodied in a language ICC++ (derived from C++) which has been used to build a number of kernels and applications. As high level features can potentially incur overhead, the Concert system employs a range of compiler and runtime optimization techniques to efficiently support the high level programming model. The compiler techniques include type inference, inlining and specialization; and the runtime techniques include caching, prefetching and hybrid stack/heap multithreading. The effectiveness of these techniques permits the construction of complex parallel ...
Composites: Trees for Data Parallel Programming
In Proceedings of the 1994 International Conference on Computer Languages, 1994
Cited by 7 (3 self)
Data parallel programming languages offer ease of programming and debugging and scalability of parallel programs to increasing numbers of processors. Unfortunately, the usefulness of these languages for non-scientific programmers and loosely coupled parallel machines is currently limited. In this paper, we present the composite tree model which seeks to provide greater flexibility via parallel data types, support for more general, hierarchical parallelism, parallel control flow, and efficient execution on loosely coupled, coarse grained parallel machines such as workstation networks. The composite tree model is a new model of parallel programming based on merging data parallelism with object oriented programming languages, and can be implemented as a small set of extensions to any pure, statically typed, object oriented programming language. 1 Introduction Data parallel programming achieves parallelism through the simultaneous execution of the same operation across a set of data [19]. In a...
Evolving Software Tools for New Distributed Computing Environments
1997
Cited by 4 (4 self)
In the future, parallel and distributed computing paradigms will replace today's predominant sequential and centralized ones. Facing the challenge to support the construction of complex but high-quality distributed applications, appropriate software environments have to be provided. Experience has shown that these new paradigms require support by the resource management system, comprising compiler, linker and operating system, to name only a few of the tools involved. As the implementation of new and high-quality tools demands tremendous effort, modification, but essentially reuse, of existing tools as far as possible is desirable. This paper
Thread Migration in Distributed Memory Multicomputers
1998
Cited by 4 (0 self)
While the paradigm offered by SMP designs is a relatively clean one, programming paradigms offered on distributed memory platforms rarely offer the same simplicity of comprehension and ease of use. The most common paradigms for programming distributed memory computers offer either distributed shared memory, or a complex and error prone message passing library. We have implemented a runtime system for distributed memory platforms, Nomad, which offers not only the transparency of data location provided by distributed shared memory systems, but also transparency of processing location by a highly optimised and lightweight thread migration mechanism. We argue that thread migration is in many cases actually a more efficient computation strategy than the data migration in conventional distributed shared memory systems. Introduction Symmetric Multi-Processor (SMP) machines typically provide a number of CPUs all of which address the same local memory. Programmers of these machines do not have c...
High Level Parallel Programming: The Illinois Concert System
1998
Cited by 3 (0 self)
Programmers of concurrent applications are faced with complex performance trade-offs, since data distribution and concurrency management exacerbate the difficulty of building large, complex applications. To address these challenges, the Illinois Concert system provides a global namespace, implicit concurrency control and granularity management, implicit storage management, and object-oriented programming features. These features are embodied in a language ICC++ (derived from C++) which has been used to build a number of kernels and applications. As high level features can potentially incur overhead, the Concert system employs a range of compiler and runtime optimization techniques to efficiently support the high level programming model. The compiler techniques include type inference, inlining and specialization; and the runtime techniques include caching, prefetching and hybrid stack/heap multithreading. The effectiveness of these techniques permits the construction of complex parallel...

