Advances, Applications and Performance of the Global Arrays Shared Memory Programming Toolkit
- INTERN. J. HIGH PERF. COMP. APPLICATIONS
, 2005
Cited by 13 (8 self)
This paper describes capabilities, evolution, performance, and applications of the Global Arrays (GA) toolkit. GA was created to provide application programmers with an interface that allows them to distribute data while maintaining the type of global index space and programming syntax similar to that available when programming on a single processor. The goal of GA is to free the programmer from the low level management of communication and allow them to deal with their problems at the level at which they were originally formulated. At the same time, compatibility of GA with MPI enables the programmer to take advantage of the existing MPI software/libraries when available and appropriate. The variety of applications that have been implemented using Global Arrays attests to the
Protocols and Strategies for Optimizing Performance of Remote Memory Operations on Clusters
- In: Proc. Workshop Communication Architecture for Clusters (CAC02) of IPDPS’02, Ft
, 2002
Cited by 12 (2 self)
In this paper, we describe a software architecture for supporting remote memory operations on clusters with networks such as Myrinet or cLAN. When combined with protocols and strategies for efficient management of network and host resources, this architecture can both deliver high performance and match network protocols with the requirements of remote memory operations. The protocols and strategies address issues such as buffer memory consumption, management of GM tokens, dynamic memory registration, zero-copy data transfers and adaptive data streaming. For example, the adaptive data streaming technique bridges the performance gap between remote memory operations that target registered memory and those that use regular memory. Our approach relies on the standard unmodified system software and drivers for Myrinet and cLAN rather than on custom/alternative drivers and interfaces (e.g., AM [1], PM [2], BIP [3], and FM [4]) that replace the standard Myrinet Control Program (MCP) on the network interface card
One-sided Communication on the Myrinet-based SMP Clusters using the GM Message-Passing Library
- In Proceedings of the Workshop on Communication Architecture for Clusters (CAC) held in conjunction with IPDPS ’01
, 2001
Cited by 4 (1 self)
In the past five years, the Myrinet network has received much attention in the literature on high-performance communication. Multiple projects have focused on developing efficient messaging middleware (e.g., AM [1], PM [2], BIP [3], and HPVM FM [4]) by exploiting the programmable network interface card (NIC) of Myrinet. The most common use of these interfaces has been support of MPI and other internal research projects. The HPVM project used its FM system to implement MPI and other programming interfaces, including two one-sided interfaces: Cray SHMEM and Global Arrays. In the last few years, because of its good scalability and rather moderate cost, Myrinet has become the primary network for building medium and large-scale clusters based on commodity processing nodes (e.g., Intel or Alpha Linux systems). To the best of our knowledge, with the exception of the NCSA Windows NT SuperCluster that operates in the HPVM environment, the majority of medium and large Myrinet-based clusters used in
Combining Distributed and Shared Memory Models: Approach and Evolution of the Global Arrays Toolkit
- in Proceedings of the Workshop on Performance Optimization via High-Level Languages and Libraries (POHLL-02
, 2002
Cited by 1 (0 self)
This paper describes the characteristics of the Global Arrays programming model and the capabilities of the toolkit, and discusses its evolution.
A Parallel Communication Infrastructure for STAPL
Communication is an important but difficult aspect of parallel programming. This paper describes a parallel communication infrastructure, based on remote method invocation, to simplify parallel programming by abstracting low-level shared-memory or message-passing details while maintaining high performance and portability. STAPL, the Standard Template Adaptive Parallel Library, builds upon this infrastructure to make communication transparent to the user. The basic design is discussed, as well as the mechanisms used in the current Pthreads and MPI implementations. Performance comparisons between STAPL and explicit Pthreads or MPI are given on a variety of machines, including an HP V2200, an Origin 3800 and a Linux cluster.
Co-array Python: A Parallel Extension to the Python Language
A parallel extension to the Python language is introduced that is modeled after the Co-Array Fortran extensions to Fortran 95. A new Python module, CoArray, has been developed to provide co-array syntax that allows a Python programmer to address co-array data on a remote processor. An example of Jacobi iteration using the CoArray module is shown and corresponding performance results are presented.
Object-Oriented Abstractions for Communication in Parallel Programs
, 2003
ARMI: A High Level Communication Library for STAPL
- Parallel Processing Letters
, 2004
ARMI is a communication library that provides a framework for expressing fine-grain parallelism and mapping it to a particular machine using shared-memory and message-passing library calls. The library is an advanced implementation of the RMI protocol and handles low-level details such as scheduling incoming communication and aggregating outgoing communication to coarsen parallelism. These details can be tuned for different platforms to allow user codes to achieve the highest performance possible without manual modification. ARMI is used by STAPL, our generic parallel library, to provide a portable, user-transparent communication layer. We present the basic design as well as the mechanisms used in the current Pthreads/OpenMP, MPI implementations and/or a combination thereof. Performance comparisons between ARMI and explicit use of Pthreads or MPI are given on a variety of machines, including an HP-V2200, Origin
Classifications General Terms
ARMI is a communication library that provides a framework for expressing fine-grain parallelism and mapping it to a particular machine using shared-memory and message-passing library calls. The library is an advanced implementation of the RMI protocol and handles low-level details such as scheduling incoming communication and aggregating outgoing communication to coarsen parallelism when necessary. These details can be tuned for different platforms to allow user codes to achieve the highest performance possible without manual modification. ARMI is used by STAPL, our generic parallel library, to provide a portable, user-transparent communication layer. We present the basic design as well as the mechanisms used in the current Pthreads/OpenMP, MPI implementations and/or a combination thereof. Performance comparisons between ARMI and explicit use of Pthreads or MPI are given on a variety of machines, including an HP V2200, SGI Origin 3800, IBM Regatta-HPC and IBM RS6000 SP cluster.

