Results 1 -
8 of
8
Large-Scale Experiment of Co-allocation Strategies for Peer-to-Peer SuperComputing in P2P-MPI
"... High Performance computing generally involves some parallel applications to be deployed on the multiples resources used for the computation. The problem of scheduling the application across distributed resources is termed as co-allocation. In a grid context, co-allocation is difficult since the grid ..."
Abstract
-
Cited by 4 (0 self)
- Add to MetaCart
High Performance computing generally involves some parallel applications to be deployed on the multiples resources used for the computation. The problem of scheduling the application across distributed resources is termed as co-allocation. In a grid context, co-allocation is difficult since the grid middleware must face a dynamic environment. Middleware architecture on a Peer-to-Peer (P2P) basis have been proposed to tackle most limitations of centralized systems. Some of the issues addressed by P2P systems are fault tolerance, ease of maintenance, and scalability in resource discovery. However, the lack of global knowledge makes scheduling difficult in P2P systems. In this paper, we present the new developments concerning locality awareness as well as co-allocation strategies available in the latest release of P2P-MPI. i) The spread strategy tries to map processes on hosts so as to maximize the total amount of available memory while maintaining locality of processes as a secondary objective. ii) The concentrate strategy tries to maximize locality between processes by using as many cores as hosts offer. The co-allocation scheme has been devised to be simple for the user and meets the main high performance computing requirement which is locality. Extensive experiments have been conducted on Grid5000 with up to 600 processes on 6 sites throughout France. Results show that we achieved the targeted goals in these real conditions. 1
Satin: a High-Level and Efficient Grid Programming Model
"... Computational grids have an enormous potential to provide compute power. However, this power remains largely unexploited today for most applications, except trivially parallel programs. Developing parallel grid applications simply is too difficult. Grids introduce several problems not encountered be ..."
Abstract
-
Cited by 4 (3 self)
- Add to MetaCart
Computational grids have an enormous potential to provide compute power. However, this power remains largely unexploited today for most applications, except trivially parallel programs. Developing parallel grid applications simply is too difficult. Grids introduce several problems not encountered before, mainly due to the highly heterogeneous and dynamic computing and networking environment. Furthermore, failures occur frequently, and resources may be claimed by higher priority jobs at any time. In this paper, we solve these problems for an important class of applications: divide-andconquer. We introduce a system called Satin that simplifies the development of parallel grid applications by providing a rich high-level programming model that completely hides communication. All grid issues are transparently handled in the run time system, not by the programmer. Satin’s programming model is based on Java, features spawn-sync primitives and shared objects, and uses asynchronous exceptions and an abort mechanism to support speculative parallelism. To allow an efficient implementation, Satin consistently exploits the idea that grids are hierarchically structured. Dynamic load-balancing is done with a novel cluster-aware scheduling algorithm that hides the long wide-area latencies by overlapping them with useful local work.
Non-blocking Java Communications Support on Clusters
- 13th European PVM/MPI Users’ Group Meeting, EuroPVM/MPI’06, Bonn (Alemania). In Recent Advances in Parallel Virtual Machine and Message Passing Interface. Lecture Notes in Computer Science no
, 2006
"... Abstract. This paper presents communication strategies for supporting efficient non-blocking Java communication on clusters. The communication performance is critical for the overall cluster performance. It is possible to use non-blocking communications to reduce the communication overhead. Previous ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Abstract. This paper presents communication strategies for supporting efficient non-blocking Java communication on clusters. The communication performance is critical for the overall cluster performance. It is possible to use non-blocking communications to reduce the communication overhead. Previous efforts to efficiently support non-blocking communication in Java have led to the introduction of the Java NIO API. Although the Java NIO package addresses scalability issues by providing select() like functionality, it lacks support for high speed interconnects. To solve this issue, this paper introduces a non-blocking communication library to efficiently support specialized communication hardware. This library focuses on reducing the startup communication time, avoiding unnecessary copying, and overlapping computation with communication. This project provides the basis for a Java Message-passing library to be implemented on top of it. Towards the end, this paper evaluates the proposed approach on a Scalable Coherent Interface (SCI) and Gigabit Ethernet (GbE) testbed cluster. Experimental results show that the proposed library reduces the communication overhead and increases computation and communication overlapping. 1
The Quest for Parallel Reasoning on the Semantic Web
"... Abstract. Traditional reasoning tools for the Semantic Web cannot cope with Web scale data. One major direction to improve performance is parallelization. This article surveys existing studies, basic ideas and mechanisms for parallel reasoning, and introduces three major parallel applications on the ..."
Abstract
- Add to MetaCart
Abstract. Traditional reasoning tools for the Semantic Web cannot cope with Web scale data. One major direction to improve performance is parallelization. This article surveys existing studies, basic ideas and mechanisms for parallel reasoning, and introduces three major parallel applications on the Semantic Web: LarKC, MaRVIN, and Reasoning-Hadoop. Furthermore, this paper lays the ground for parallelizing unified search and reasoning at Web scale. 1
unknown title
"... Abstract. This chapter describes the P2P-MPI project, a software framework aimed at the development of message-passing programs in large scale distributed networks of computers. Our goal is to provide a light-weight, self-contained software package that requires minimum effort to use and maintain. P ..."
Abstract
- Add to MetaCart
Abstract. This chapter describes the P2P-MPI project, a software framework aimed at the development of message-passing programs in large scale distributed networks of computers. Our goal is to provide a light-weight, self-contained software package that requires minimum effort to use and maintain. P2P-MPI relies on three features to reach this goal: i) its installation and use does not require administrator privileges, ii) available resources are discovered and selected for a computation without intervention from the user, iii) program executions can be fault-tolerant on user demand, in a completely transparent fashion (no checkpoint server to configure). P2P-MPI is typically suited for organizations having spare individual computers linked by a high speed network, possibly running heterogeneous operating systems, and having Java applications. From a technical point of view, the framework has three layers: an infrastructure management layer at the bottom, a middleware layer containing the services, and the communication layer implementing an MPJ (Message Passing for Java) communication library at the top. We explain the design and the implementation of these layers, and we discuss the allocation strategy
Int J Parallel Prog (2009) 37:433–461 DOI 10.1007/s10766-009-0115-8 Fault-Management in P2P-MPI
"... Abstract We present in this paper a study on fault management in a grid middleware. The middleware is our home-grown software called P2P-MPI. This framework is MPJ compliant, allows users to execute message passing parallel programs, and its objective is to support environments using commodity hardw ..."
Abstract
- Add to MetaCart
Abstract We present in this paper a study on fault management in a grid middleware. The middleware is our home-grown software called P2P-MPI. This framework is MPJ compliant, allows users to execute message passing parallel programs, and its objective is to support environments using commodity hardware. Hence, running programs is failure prone and a particular attention must be paid to fault management. The fault management covers two issues: fault-tolerance and fault detection. Fault-tolerance deals with the program execution: P2P-MPI provides a transparent fault tolerance facility based on replication of computations. Fault detection concerns the monitoring of the program execution by the system. The monitoring is done through a distributed set of modules called failure detectors. The contribution of this paper is twofold. The first contribution is the evaluation of the failure probability of an application depending on the replication degree. The failure probability depends on the execution length, and we propose a model to evaluate the duration of a replicated parallel program. Then, we give an expression of the replication degree required to keep the failure probability of an execution under a given threshold. The second contribution is a study of the advantages and drawbacks of several fault detection systems found in the literature. The criteria of our evaluation are the reliability of the failure detection service and
Java Fast Sockets: Enabling high-speed Java communications on high
"... journal homepage: www.elsevier.com/locate/comcom ..."
A Peer-to-Peer Framework for Message Passing Parallel Programs
, 2009
"... Abstract. This chapter describes the P2P-MPI project, a software framework aimed at the development of message-passing programs in large scale distributed networks of computers. Our goal is to provide a light-weight, self-contained software package that requires minimum effort to use and maintain. P ..."
Abstract
- Add to MetaCart
Abstract. This chapter describes the P2P-MPI project, a software framework aimed at the development of message-passing programs in large scale distributed networks of computers. Our goal is to provide a light-weight, self-contained software package that requires minimum effort to use and maintain. P2P-MPI relies on three features to reach this goal: i) its installation and use does not require administrator privileges, ii) available resources are discovered and selected for a computation without intervention from the user, iii) program executions can be fault-tolerant on user demand, in a completely transparent fashion (no checkpoint server to configure). P2P-MPI is typically suited for organizations having spare individual computers linked by a high speed network, possibly running heterogeneous operating systems, and having Java applications. From a technical point of view, the framework has three layers: an infrastructure management layer at the bottom, a middleware layer containing the services, and the communication layer implementing an MPJ (Message Passing for Java) communication library at the top. We explain the design and the implementation of these layers, and we discuss the allocation strategy

