Virtual Machine Aware Communication Libraries for High Performance Computing
Cited by 13 (2 self)
As the size and complexity of modern computing systems keep increasing to meet the demanding requirements of High Performance Computing (HPC) applications, manageability is becoming a critical concern to achieve both high performance and high productivity computing. Meanwhile, virtual machine (VM) technologies have become popular in both industry and academia due to various features designed to ease system management and administration. While a VM-based environment can greatly help manageability on large-scale computing systems, concerns over performance have largely blocked the HPC community from embracing VM technologies. In this paper, we follow three steps to demonstrate the ability to achieve near-native performance in a VM-based environment for HPC. First, we propose Inter-VM Communication
Automatic Path Migration over InfiniBand: Early Experiences
Cited by 2 (1 self)
The high computational power of commodity PCs, combined with the emergence of low-latency, high-bandwidth interconnects, has accelerated the trend toward cluster computing. Clusters with InfiniBand are being deployed, as reflected in the TOP 500 Supercomputer rankings. However, the increasing scale of these clusters has reduced the Mean Time Between Failures (MTBF) of components. The network is one such component, where failure of Network Interface Cards (NICs), cables and/or switches breaks existing communication paths. InfiniBand provides a hardware mechanism, Automatic Path Migration (APM), which allows user-transparent detection of and recovery from network faults without application restart. In this paper, we design a set of modules that work together to provide network fault tolerance for user-level applications by leveraging the APM feature. Our performance evaluation at the MPI layer shows that APM incurs negligible overhead in the absence of faults in the system. In the presence of network faults, APM incurs negligible overhead for reasonably long-running applications.
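The link between cluster scale and MTBF mentioned in this abstract can be illustrated with a rough sketch. It assumes that components fail independently at a constant rate and that the system is considered failed as soon as any one component fails; neither assumption comes from the paper. The function name and the 500,000-hour component figure are hypothetical and chosen only for illustration.

```python
def system_mtbf_hours(component_mtbf_hours: float, n_components: int) -> float:
    """Approximate MTBF of a system that fails when any single component fails.

    Assumes independent components with constant (exponential) failure
    rates, so the failure rates add and the system MTBF is the component
    MTBF divided by the number of components.
    """
    if component_mtbf_hours <= 0:
        raise ValueError("component_mtbf_hours must be positive")
    if n_components <= 0:
        raise ValueError("n_components must be positive")
    return component_mtbf_hours / n_components


if __name__ == "__main__":
    # Hypothetical per-component MTBF, not a figure from the paper.
    for n in (100, 1_000, 10_000):
        print(n, system_mtbf_hours(500_000.0, n))
```

Under these assumptions the system-level MTBF shrinks in proportion to the component count, which is the pressure that motivates fault-tolerance mechanisms such as APM.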
Hybrid Design of MPI over SCTP, 2007
Message Passing Interface (MPI) is a popular interface for writing parallel applications. It has been designed to run over many different types of network interconnects, ranging from commodity Ethernet to more specialized hardware, including shared memory and Remote Direct Memory Access (RDMA) devices such as InfiniBand and the recently standardized Internet Wide Area RDMA Protocol (iWARP). The API itself provides both point-to-point and remote memory access (RMA) operations to the application. However, it is often implemented on top of one kind of underlying network device, namely entirely RDMA or entirely point-to-point. As a result, it is often not possible to provide a direct mapping from the software semantics to the underlying hardware. In this work, we propose a hybrid approach to designing MPI in which the choice of network device can depend on the functional requirement. This allows the MPI API to exploit the potential performance benefits of the underlying hardware more directly. Another highlight of this work is that the MPI middleware is designed to be IP-based in order to support both cluster and wide area network environments; this is achieved through a commodity transport layer protocol, namely Stream Control Transmission Protocol (SCTP). We will demonstrate how SCTP can be used to support MPI with different kinds of network devices and to provide multirailing support from the transport layer.
Dissertation Committee: Approved by
In the past decade, rapid advances have taken place in the field of computer and network design, enabling us to connect thousands of computers together to form high performance clusters. These clusters are used to solve computationally challenging scientific problems. The Message Passing Interface (MPI) is a popular model for writing applications for these clusters. There is a vast array of scientific applications which use MPI on clusters. As the applications operate on larger and more complex data, the size of the compute clusters is scaling higher and higher. The scalability and the performance of the MPI library are very important for the end application performance. InfiniBand is a cluster interconnect which is based on open standards and is gaining rapid acceptance. This dissertation explores the different transports provided by InfiniBand to determine the scalability and performance aspects of each. Further, new MPI designs have been proposed and implemented for transports that have never been used for MPI in the past. These designs have significantly decreased the resource consumption, increased the performance and increased the reliability of ultra-scale InfiniBand clusters. A
Designing High-Performance and Resilient Message Passing on InfiniBand
Clusters featuring the InfiniBand interconnect are continuing to scale. As an example, the “Ranger” system at the Texas Advanced Computing Center (TACC) includes over 60,000 cores with nearly 4,000 InfiniBand ports. The latest Top500 list shows that 30% of systems and over 50% of the top 100 are now using InfiniBand as the compute node interconnect. As these systems continue to scale, the Mean-Time-Between-Failure (MTBF) is decreasing, and additional resiliency must be provided to the important components of HPC systems, including the MPI library. In this paper we present a design that leverages the reliability semantics of InfiniBand but provides a higher level of resiliency. We are able to avoid aborting jobs in the case of network failures as well as failures on the endpoints in the InfiniBand Host Channel Adapters (HCA). We propose reliability designs for rendezvous protocols using both Remote DMA (RDMA) read and write operations. We implement a prototype of our design and show that performance is near-identical to that of a non-resilient design. This shows that we can have both the performance and the network reliability needed for large-scale systems.

