Results 1 - 10
of
70
CHARM++: A Portable Concurrent Object Oriented System Based On C++
- IN PROCEEDINGS OF THE CONFERENCE ON OBJECT ORIENTED PROGRAMMING SYSTEMS, LANGUAGES AND APPLICATIONS
, 1993
"... We describe Charm++, an object oriented portable parallel programming language based on C++. Its design philosophy, implementation, sample applications and their performance on various parallel machines are described. Charm++ is an explicitly parallel language consisting of C++ with a few extensions ..."
Abstract
-
Cited by 155 (17 self)
- Add to MetaCart
We describe Charm++, an object oriented portable parallel programming language based on C++. Its design philosophy, implementation, sample applications and their performance on various parallel machines are described. Charm++ is an explicitly parallel language consisting of C++ with a few extensions. It provides a clear separation between sequential and parallel objects. The execution model of Charm++ is message driven, thus helping one write programs that are latencytolerant. The language supports multiple inheritance, dynamic binding, overloading, strong typing, and reuse for parallel objects. Charm++ provides specific modes for sharing information between parallel objects. Extensive dynamic load balancing strategies are provided. It is based on the Charm parallel programming system, and its runtime system implementation reuses most of the runtime system for Charm.
Falcon: On-line Monitoring for Steering Parallel Programs
- In Ninth International Conference on Parallel and Distributed Computing and Systems (PDCS’97
, 1998
"... Advances in high performance computing, communications, and user interfaces enable developers to construct increasingly interactive high performance applications. The Falcon system presented in this paper supports such interactivity by providing runtime libraries, tools, and user interfaces that per ..."
Abstract
-
Cited by 51 (13 self)
- Add to MetaCart
Advances in high performance computing, communications, and user interfaces enable developers to construct increasingly interactive high performance applications. The Falcon system presented in this paper supports such interactivity by providing runtime libraries, tools, and user interfaces that permit the on-line monitoring and steering of large-scale parallel codes. The principal aspects of Falcon described in this paper are its abstractions and tools for capture and analysis of application-specific program information, performed on-line, with controlled latencies and scalable to parallel machines of substantial size. In addition, Falcon provides support for the on-line graphical display of monitoring information, and it allows programs to be steered during their execution, by human users or algorithmically. This paper presents our basic research motivation, outlines the Falcon system's functionality, and includes a detailed evaluation of its performance characteristics in light of i...
Runtime and language support for compiling adaptive irregular programs on distributed memory machines
- SOFTWARE—PRACTICE AND EXPERIENCE
, 1995
"... In many scientific applications, arrays containing data are indirectly indexed through indirection arrays. Such scientific applications are called irregular programs and are a distinct class of applications that require special techniques for parallelization. This paper presents a library called CHA ..."
Abstract
-
Cited by 42 (3 self)
- Add to MetaCart
In many scientific applications, arrays containing data are indirectly indexed through indirection arrays. Such scientific applications are called irregular programs and are a distinct class of applications that require special techniques for parallelization. This paper presents a library called CHAOS, which helps users implement irregular programs on distributed-memory message-passing machines, such as the Paragon, Delta, CM-5 and SP-1. The CHAOS library provides efficient runtime primitives for distributing data and computation over processors; it supports efficient index translation mechanisms and provides users high-level mechanisms for optimizing communication. CHAOS subsumes the previous PARTI library and supports a larger class of applications. In particular, it provides efficient support for parallelization of adaptive irregular programs where indirection arrays are modified during the course of computation. To demonstrate the efficacy of CHAOS, two challenging real-life adaptive applications were parallelized using CHAOS primitives: a molecular dynamics code, CHARMM, and a particle-in-cell code, DSMC. Besides providing runtime support to users, CHAOS can also be used by compilers to automatically parallelize irregular applications. This paper demonstrates how CHAOS can be effectively used in such a framework. By embedding CHAOS primitives in the Syracuse Fortran 90D/HPF compiler, kernels taken from the CHARMM and DSMC codes have been automatically parallelized. KEY WORDS: distributed-memory multiprocessors; runtime compilation; adaptive irregular programs; parallelizing compilers
Efficient Run-time Support for Irregular Block-Structured Applications
, 1998
"... Parallel implementations of scientific applications often rely on elaborate dynamic data structures with complicated communication patterns. We describe a set of intuitive geometric programming abstractions that simplify coordination of irregular block-structured scientific calculations without sacr ..."
Abstract
-
Cited by 38 (14 self)
- Add to MetaCart
Parallel implementations of scientific applications often rely on elaborate dynamic data structures with complicated communication patterns. We describe a set of intuitive geometric programming abstractions that simplify coordination of irregular block-structured scientific calculations without sacrificing performance. We have implemented these abstractions in KeLP, a C++ run-time library. KeLP's abstractions enable the programmer to express complicated communication patterns for dynamic applications, and to tune communication activity with a high-level, abstract interface. We show that KeLP's flexible communication model effectively manages elaborate data motion patterns arising in structured adaptive mesh refinement, and achieves performance comparable to hand-coded message-passing on several structured numerical kernels. to appear in J. Parallel and Distributed Computing 1 Introduction Many scientific numerical methods employ structured irregular representations to improve accura...
Java Programming for High-Performance Numerical Computing
, 2000
"... Class Figure 5 Simple Array construction operations //Simple 3 x 3 array of integers intArray2D A = new intArray2D(3,3); //This new array has a copy of the data in A, //and the same rank and shape. ..."
Abstract
-
Cited by 35 (8 self)
- Add to MetaCart
Class Figure 5 Simple Array construction operations //Simple 3 x 3 array of integers intArray2D A = new intArray2D(3,3); //This new array has a copy of the data in A, //and the same rank and shape.
FALCON: A MATLAB Interactive Restructuring Compiler
- IN LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING
, 1995
"... The development of efficient numerical programs and library routines for high-performance parallel computers is a complex task requiring not only an understanding of the algorithms to be implemented, but also detailed knowledge of the target machine and the software environment. In this paper, w ..."
Abstract
-
Cited by 28 (10 self)
- Add to MetaCart
The development of efficient numerical programs and library routines for high-performance parallel computers is a complex task requiring not only an understanding of the algorithms to be implemented, but also detailed knowledge of the target machine and the software environment. In this paper, we describe a programming environment that can utilize such knowledge for the development of high-performance numerical programs and libraries. This environment uses an existing highlevel array language (MATLAB) as source language and performs static, dynamic, and interactive analysis to generate Fortran 90 programs with directives for parallelism. It includes capabilities for interactive and automatic transformations at both the operation-level and the functional- or algorithm-level. Preliminary experiments, comparing interpreted MATLAB programs with their compiled versions, show that compiled programs can perform up to 48 times faster on a serial machine, and up to 140 times fas...
Interoperability of Data Parallel Runtime Libraries with Meta-Chaos
- In Proceedings of the Eleventh International Parallel Processing Symposium. IEEE Computer
, 1997
"... This paper describes a framework for providing the ability to use multiple specialized data parallel libraries and/or languages within a single application. The ability to use multiple libraries is required in many application areas, such as multidisciplinary complex physical simulations and remote ..."
Abstract
-
Cited by 27 (11 self)
- Add to MetaCart
This paper describes a framework for providing the ability to use multiple specialized data parallel libraries and/or languages within a single application. The ability to use multiple libraries is required in many application areas, such as multidisciplinary complex physical simulations and remote sensing image database applications. An application can consist of one program or multiple programs that use different libraries to parallelize operations on distributed data structures. The framework is embodied in a runtime library called Meta- -Chaos that has been used to exchange data between data parallel programs written using High Performance Fortran, the Chaos and Multiblock Parti libraries developed at Maryland for handling various types of unstructured problems, and the runtime library for pC++, a data parallel version of C++ from Indiana University. Experimental results show that Meta-Chaos is able to move data between libraries efficiently, and that Meta-Chaos provides effective ...
A Parallel Software Infrastructure for Dynamic Block-Irregular Scientific Calculations
, 1995
"... ..."
Parallel Program Archetypes
- In Proceedings of the Scalable Parallel Library Conference
, 1997
"... A parallel program archetype is an abstraction that captures the common features of a class of problems with similar computational structure and combines them with a parallelization strategy to produce a pattern of dataflow and communication. Such abstractions are useful in application developme ..."
Abstract
-
Cited by 18 (0 self)
- Add to MetaCart
A parallel program archetype is an abstraction that captures the common features of a class of problems with similar computational structure and combines them with a parallelization strategy to produce a pattern of dataflow and communication. Such abstractions are useful in application development, both as a conceptual framework and as a basis for tools and techniques. This paper describes an approach to parallel application development based on archetypes and presents two example archetypes with applications. 1 Introduction This paper proposes a specific method of exploiting computational and dataflow patterns to help in developing reliable parallel programs. A great deal of work has been done on methods of exploiting design patterns in program development. This paper restricts attention to one kind of pattern that is relevant in parallel programming: the pattern of the parallel computation and communication structure. Methods of exploiting design patterns in program develop...
A Portable Run-Time System for Object-Parallel Systems
, 1995
"... This paper describes a parallel run-time system (RTS) that is used as part of the pC++ parallel programming language. The RTS has been implemented on a variety of scalable, MPP computers including the IBM SP2, Intel Paragon, Meiko CS2, SGI Power Challenge, and Cray T3D. This system differs from othe ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
This paper describes a parallel run-time system (RTS) that is used as part of the pC++ parallel programming language. The RTS has been implemented on a variety of scalable, MPP computers including the IBM SP2, Intel Paragon, Meiko CS2, SGI Power Challenge, and Cray T3D. This system differs from other data-parallel RTS implementations; it is designed to support the operations from object-parallel programming that require remote member function execution and load and store operations on remote data. The implementation is designed to provide the thinnest possible layer atop the vendor-supplied machine interface. That thin veneer can then be used by other run-time layers to build machine independent class libraries, compiler back ends, and more This research is supported in part by: ARPA DABT63-94-C-0029 and Rome Labs Contract AF 3060292 -C-0135 1 sophisticated run-time support. Some preliminary measurements of the RTS are given for the IBM SP2, SGI Power Challenge, and Cray T3D. 1 Intr...

