Results 1 -
5 of
5
A hardware acceleration unit for mpi queue processing
- In Proceedings of IPDPS ’05
, 2005
"... With the heavy reliance of modern scientific applications upon the MPI Standard, it has become critical for the implementation of MPI to be as capable and as fast as possible. This has led some of the fastest modern networks to introduce the capability to offload aspects of MPI processing to an embe ..."
Abstract
-
Cited by 7 (3 self)
- Add to MetaCart
With the heavy reliance of modern scientific applications upon the MPI Standard, it has become critical for the implementation of MPI to be as capable and as fast as possible. This has led some of the fastest modern networks to introduce the capability to offload aspects of MPI processing to an embedded processor on the network interface. With this important capability has come significant performance implications. Most notably, the time to process long queues of posted receives or unexpected messages is substantially longer on embedded processors. This paper presents an associative list matching structure to accelerate the processing of moderate length queues in MPI. Simulations are used to compare the performance of an embedded processor augmented with this capability to a baseline implementation. The proposed enhancement significantly reduces latency for moderate length queues while adding virtually no overhead for extremely short queues. 1.
An Evaluation of Open MPI’s Matching Transport Layer on the Cray XT
"... Abstract. Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently, Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the M ..."
Abstract
-
Cited by 1 (1 self)
- Add to MetaCart
Abstract. Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently, Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the MPI library to manage message matching, which is sub-optimal for some networks that inherently support matching. We describes a new matching transport layer in Open MPI, present results of micro-benchmarks and several applications on the Cray XT platform, and compare performance of the new and the existing transport layers, as well as the vendor-supplied implementation of MPI. 1
Developing Custom Firmware for the Red Storm
"... The Red Storm SeaStar network interface contains an embedded PowerPC 440 CPU that, out-of-thebox, is used solely to handle network protocol processing. Since this is a fully programmable general purpose CPU, network researchers may wish to program it to perform additional tasks such as NIC-based col ..."
Abstract
- Add to MetaCart
The Red Storm SeaStar network interface contains an embedded PowerPC 440 CPU that, out-of-thebox, is used solely to handle network protocol processing. Since this is a fully programmable general purpose CPU, network researchers may wish to program it to perform additional tasks such as NIC-based collective operations and NIC-level computation. This paper describes our experiences developing custom firmware for the SeaStar. In order to make the SeaStar more accessible to non-experts, we have developed a C version of the assembly-based firmware provided by Cray. This high-level language firmware should be much easier to understand and to quickly extend with new features. A detailed overview of the SeaStar programming environment and our C firmware will be presented along with optimization techniques that we have found beneficial.
Sandia National Laboratories
"... Abstract. Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently, Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the M ..."
Abstract
- Add to MetaCart
Abstract. Open MPI was initially designed to support a wide variety of high-performance networks and network programming interfaces. Recently, Open MPI was enhanced to support networks that have full support for MPI matching semantics. Previous Open MPI efforts focused on networks that require the MPI library to manage message matching, which is sub-optimal for some networks that inherently support matching. We describes a new matching transport layer in Open MPI, present results of micro-benchmarks and several applications on the Cray XT platform, and compare performance of the new and the existing transport layers, as well as the vendor-supplied implementation of MPI. 1
Instrumentation and Analysis of MPI Queue Times on the SeaStar High-Performance Network
"... Abstract—Understanding the communication behavior and network resource usage of parallel applications is critical to achieving high performance and scalability on systems with tens of thousands of network endpoints. The need for better understanding is not only driven by the desire to identify poten ..."
Abstract
- Add to MetaCart
Abstract—Understanding the communication behavior and network resource usage of parallel applications is critical to achieving high performance and scalability on systems with tens of thousands of network endpoints. The need for better understanding is not only driven by the desire to identify potential performance optimization opportunities for current networks, but is also a necessity for designing next-generation networking hardware. In this paper, we describe our approach to instrumenting the SeaStar interconnect on the Cray XT series of massively parallel processing machines to gather low-level network timing data. This data provides a new perspective on performance evaluation, both in terms of evaluating the resource usage patterns of applications as well as evaluating different implementation strategies in the network protocol stack. I.

