Ibis: A Flexible and Efficient Java-based Grid Programming Environment
- Concurrency & Computation: Practice & Experience, 2005
Cited by 45 (15 self)
In computational grids, performance-hungry applications need to simultaneously tap the computational power of multiple, dynamically available sites. The crux of designing grid programming environments stems exactly from the dynamic availability of compute cycles: grid programming environments (a) need to be portable to run on as many sites as possible, (b) need to be flexible to cope with different network protocols and dynamically changing groups of compute nodes, and (c) need to provide efficient (local) communication that enables high-performance computing in the first place. Existing programming environments are either portable (Java), or flexible (Jini, Java RMI), or highly efficient (MPI). No system combines all three properties that are necessary for grid computing. In this paper, we present Ibis, a new programming environment that combines Java’s “run everywhere” portability both with flexible treatment of dynamically available networks and processor pools, and with highly efficient, object-based communication. Ibis can transfer Java objects very efficiently by combining streaming object serialization with a zero-copy protocol. Using RMI as a simple test case, we show that Ibis outperforms existing RMI implementations, achieving up to 9 times higher throughput with trees of objects.
Satin: Simple and efficient java-based grid programming
- In AGridM 2003 Workshop on Adaptive Grid Middleware, 2005
Cited by 29 (9 self)
Grid programming environments need to be both portable and efficient to exploit the computational power of dynamically available resources. In previous work, we have presented the divide-and-conquer based Satin model for parallel computing on clustered wide-area systems. In this paper, we present the Satin implementation on top of our new Ibis platform which combines Java’s write once, run everywhere with efficient communication between JVMs. We evaluate Satin/Ibis on the testbed of the EU-funded GridLab project, showing that Satin’s load-balancing algorithm automatically adapts both to heterogeneous processor speeds and varying network performance, resulting in efficient utilization of the computing resources. Our results show that when the wide-area links suffer from congestion, Satin’s load-balancing algorithm can still achieve around 80% efficiency, while an algorithm that is not grid aware drops to 26% or less.
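As a rough illustration of the divide-and-conquer model mentioned in this abstract, the sketch below divides a problem recursively, solves the leaf subproblems on a thread pool, and combines the partial results. It is written in Python rather than Java, it does not use Satin's or Ibis's API, and it performs no load balancing; the names `split` and `dc_sum` and the `threshold` and `workers` parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, threshold):
    """Divide step: recursively halve `data` until pieces are small enough.

    Assumes threshold >= 1.
    """
    if len(data) <= threshold:
        return [data]
    mid = len(data) // 2
    return split(data[:mid], threshold) + split(data[mid:], threshold)


def dc_sum(data, threshold=4, workers=4):
    """Sum `data` by dividing it, solving leaves in parallel, and combining."""
    leaves = split(data, threshold)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(sum, leaves))  # conquer: one task per leaf
    return sum(partial)  # combine step
```

Dividing up front and submitting only the leaves keeps the sketch free of nested blocking waits inside a bounded pool; a real divide-and-conquer runtime would instead spawn subproblems recursively and balance them across machines.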
Ibis: an efficient Java-based Grid programming environment
- in Joint ACM Java Grande - ISCOPE 2002 Conference, 2002
Load Balancing of Autonomous Actors over Dynamic Networks
- In Proceedings of the Hawaii International Conference on System Sciences, HICSS-37 Software Technology Track, 2004
Cited by 24 (11 self)
The Internet is constantly growing as a ubiquitous platform for high-performance distributed computing. In this paper, we propose a new software framework for distributed computing over large-scale dynamic and heterogeneous systems. Our framework wraps computation into autonomous actors, self-organizing computing entities, which freely roam over the network to find their optimal target execution environments. We introduce the architecture of our worldwide computing framework, which consists of an actor-oriented programming language (SALSA), a distributed runtime environment (WWC), and a middleware infrastructure for autonomous reconfiguration and load balancing (IO). Load balancing is completely transparent to application programmers. The middleware triggers actor migration based on profiling resources in a completely decentralized manner. Our infrastructure also allows for the dynamic addition and removal of nodes from the computation, while continuously balancing the load given the changing resources. To balance computational load, we introduce three variations of random work stealing: load-sensitive (RS), actor topology-sensitive (ARS), and network topology-sensitive (NRS) random stealing. We evaluated RS and ARS with several actor interconnection topologies in a local area network. While RS performed worse than static round-robin (RR) actor placement, ARS outperformed both RS and RR in the sparse connectivity and hypercube connectivity tests, by a full order of magnitude.
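The general idea of random work stealing named in this abstract can be sketched with a small single-process simulation. This is plain random stealing only, not the paper's RS, ARS, or NRS variants, and it is unrelated to the SALSA, WWC, or IO software; the function name `simulate_random_stealing`, the unit-cost task model, and the round-based scheduling are all assumptions made for illustration.

```python
import random
from collections import deque


def simulate_random_stealing(initial_loads, seed=0):
    """Run unit-cost tasks on simulated workers; idle workers steal at random.

    `initial_loads[i]` is the number of tasks initially placed on worker i.
    Returns (tasks completed per worker, number of successful steals).
    """
    rng = random.Random(seed)
    queues = [deque(range(n)) for n in initial_loads]
    done = [0] * len(queues)
    steals = 0
    while any(queues):
        for me, queue in enumerate(queues):
            if queue:
                queue.pop()  # execute one local task
                done[me] += 1
                continue
            # Idle: pick a random victim other than ourselves.
            victims = [v for v in range(len(queues)) if v != me]
            if not victims:
                continue
            victim = rng.choice(victims)
            if queues[victim]:
                queues[victim].popleft()  # steal from the far end and run it
                done[me] += 1
                steals += 1
    return done, steals
```

A thief executes the stolen task immediately, which guarantees progress in every round. For example, `simulate_random_stealing([12, 0, 0, 0])` starts with all work on one worker and lets the three idle workers pull tasks from it.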
Programming Environments for High-Performance Grid Computing: the Albatross Project
- 2002
Cited by 12 (2 self)
The aim of the Albatross project is to study applications and programming environments for computational Grids. We focus on high performance applications, running in parallel on multiple clusters or MPPs that are connected by wide-area networks (WANs). We briefly present three Grid programming environments developed in the context of the Albatross project: the MagPIe library for collective communication with MPI, the Replicated Method Invocation mechanism for Java (RepMI), and the Java-based Satin system for running divide-and-conquer programs on Grid platforms.
RMIX: A multiprotocol RMI framework for Java
- In Java Parallel Distributed Computing Workshop, 2003
Cited by 11 (4 self)
With the increasing adoption of Java for parallel and distributed computing, there is a strong motivation for enhancing the expressive elegance of the RMI paradigm with flexible and adaptable communication substrates. Java RMI is an especially powerful and semantically comprehensive framework for distributed Java applications – but the default Java RMI implementation is bound to a concrete wire protocol, JRMP, that is neither interoperable nor very efficient. To address the first issue, libraries have been proposed that provide RMI semantics over different wire protocols such as SOAP or IIOP, making Java interoperable with Web Services and CORBA. Similarly, alternative high performance RMI implementations have been developed. However, none of these solutions are designed to work cooperatively, and each imposes specific constraints on developers. This paper describes RMIX: an RMI framework that supports a variety of dynamically pluggable wire transports underlying a common and uniform RMI facade. RMIX facilitates dynamic protocol negotiation in loosely coupled parallel and distributed systems, and enables the development and deployment of applications that are multiprotocol by nature. Additionally, RMIX offers some enhancements to RMI semantics that are particularly useful in multiuser environments. We describe the design and preliminary implementation of RMIX, present two prototype protocol providers based on the JRMP and SOAP protocols, and outline a transition path from legacy RMI applications to RMIX.
Fault-tolerance, Malleability and Migration for Divide-and-Conquer Applications on the Grid
- In Proc. of 19th International Parallel and Distributed Processing Symposium, 2005
Cited by 10 (5 self)
Grid applications have to cope with dynamically changing computing resources as machines may crash or be claimed by other, higher-priority applications. In this paper, we propose a mechanism that enables fault-tolerance, malleability (e.g. the ability to cope with a dynamically changing number of processors) and migration for divide-and-conquer applications on the Grid. The novelty of our approach is restructuring the computation tree, which eliminates redundant computation and salvages partial results computed by the processors leaving the computation. This enables the applications to adapt to dynamically changing numbers of processors and to migrate the computation without loss of work. Our mechanism is easy to implement and deploy in a grid environment. The overhead it incurs is close to zero. We have implemented our mechanism in the Satin system. We have evaluated the performance of our system on the DAS-2 wide-area system and on the testbed of the European GridLab project.
Adaptive Allocation of Independent Tasks to Maximize Throughput
Cited by 10 (0 self)
Hierarchical master-worker skeletons
- 2007
Cited by 6 (4 self)
Master-worker systems are a well-known and often applicable scheme for the parallel evaluation of a pool of tasks, a work pool. The system consists of a master process managing a set of worker processes. After an initial phase with a fixed amount of tasks for each worker, further tasks are distributed in reply to results sent back by the workers. As this setup quickly leads to a bottleneck in the master process, the paper investigates techniques for hierarchically nesting the basic master-worker scheme. We present implementations of hierarchical master-worker skeletons, and how to automatically calculate parameters of the nested skeleton for good performance. Nesting master-worker systems is nontrivial especially in cases where new tasks are dynamically created from previous results (typically breadth- or depth-first tree search algorithms). We discuss how to handle dynamically growing pools in a hierarchy and present a declarative implementation for nested master-worker systems with dynamic task creation. The skeletons are experimentally evaluated with two typical test programs. We analyse their runtime behaviour and the effects of different hierarchies on runtimes via trace visualisations.
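The basic (flat, non-hierarchical) master-worker scheme with dynamic task creation described in this abstract might look roughly like the following Python sketch. It is not the paper's skeleton implementation and does no hierarchical nesting; the `expand` worker function, the `(value, depth)` task encoding, and the `master` function are hypothetical names chosen for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def expand(node, depth_limit):
    """Hypothetical worker function: process one task, return (result, new tasks)."""
    value, depth = node
    if depth >= depth_limit:
        return value, []
    children = [(2 * value, depth + 1), (2 * value + 1, depth + 1)]
    return value, children


def master(initial_tasks, depth_limit, workers=4):
    """Flat master: hand tasks to workers and enqueue tasks created by results."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(expand, t, depth_limit) for t in initial_tasks}
        while pending:
            finished, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in finished:
                value, new_tasks = future.result()
                results.append(value)
                # Dynamic task creation: results feed the work pool.
                pending |= {pool.submit(expand, t, depth_limit) for t in new_tasks}
    return results
```

Every result passes through the single `master` loop, which is the bottleneck the abstract refers to; the order of `results` depends on thread scheduling, so callers should not rely on it.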
Fault-tolerant Scheduling of Fine-grained Tasks in Grid Environments
- International Journal of High Performance Applications
Cited by 5 (0 self)
Divide-and-conquer is a well-suited programming paradigm for parallel Grid applications. Our Satin system efficiently schedules the fine-grained tasks of a divide-and-conquer application across multiple clusters in a grid. To accommodate long-running applications, we present a fault-tolerance mechanism for Satin that has negligible overhead during normal execution, while minimizing the amount of redundant work done after a crash of one or more nodes. We study the impact of our fault-tolerance mechanism on application efficiency, both on the Dutch DAS-2 system and using the European testbed of the EC-funded project GridLab.

