Scalable Work Stealing
Cited by 5 (1 self)
Irregular and dynamic parallel applications pose significant challenges to achieving scalable performance on large-scale multicore clusters. These applications often require ongoing, dynamic load balancing in order to maintain efficiency. Scalable dynamic load balancing on large clusters is a challenging problem that can be addressed with distributed dynamic load balancing systems. Work stealing is a popular approach to distributed dynamic load balancing; however, its performance on large-scale clusters is not well understood. Prior work on work stealing has largely focused on shared memory machines. In this work we investigate the design and scalability of work stealing on modern distributed memory systems. We demonstrate high efficiency and low overhead when scaling to 8,192 processors for three benchmark codes: a producer-consumer benchmark, the unbalanced tree search benchmark, and a multiresolution analysis kernel.
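To make the work-stealing idea in this abstract concrete, here is a minimal single-process simulation in Python. It is only a sketch under stated assumptions, not the paper's distributed-memory implementation: the function name `run_work_stealing`, the integer "task size" model, and the round-based scheduling are all hypothetical choices made for illustration. Each worker takes the newest task from its own queue, and an idle worker tries to steal the oldest task from one randomly chosen victim.

```python
import random
from collections import deque


def run_work_stealing(num_workers, root_size, seed=0):
    """Simulate work stealing in rounds (illustrative model, not the paper's code).

    A task is an integer size >= 1. Executing a task of size > 1 splits it
    into two child tasks, so a root of size n produces 2n - 1 tasks in total.
    Returns (tasks executed per worker, number of successful steals).
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root_size)  # all work starts on worker 0
    executed = [0] * num_workers
    steals = 0

    while any(queues):  # an empty deque is falsy
        for w in range(num_workers):
            if queues[w]:
                # The owner takes the newest task from its own queue.
                task = queues[w].pop()
            else:
                # An idle worker picks one random victim per round.
                victim = rng.randrange(num_workers)
                if victim == w or not queues[victim]:
                    continue  # failed steal attempt; retry next round
                # The thief takes the oldest task from the victim.
                task = queues[victim].popleft()
                steals += 1

            executed[w] += 1
            if task > 1:
                half = task // 2
                queues[w].append(half)
                queues[w].append(task - half)

    return executed, steals


if __name__ == "__main__":
    per_worker, steals = run_work_stealing(num_workers=4, root_size=64)
    print("tasks per worker:", per_worker, "steals:", steals)
```

Stealing from the opposite end of the queue tends to move the oldest, and in this model largest, pending tasks, which keeps the number of steals small relative to the number of tasks. A real distributed implementation would also need remote queue access and termination detection, which this sketch omits.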
Experiences Deploying Parallel Applications on a Large-scale Grid
- In Proc. of EXPGRID - Experimental Grid Testbeds for the Assessment of Large-scale Distributed Applications and Tools. Workshop in conjunction with HPDC-15, 2006
Cited by 4 (2 self)
We describe our experiences with integrating several Grid software components into a single coherent system that is used to write and run parallel applications on the Grid. The integrated components are the Grid Application Toolkit (GAT), ProActive, Satin and Ibis. We experimented with this Java-based system by participating in the N-Queens contest of the Grids@work event in October 2005. In addition to integrating available components, we wrote a ProActive plugin for the GAT, a parallel N-Queens solver, and an application to manage Grid deployment of N-Queens. We identified several connectivity issues and scalability problems in the components we use. We show how we modified some of the components to solve some of these problems. We successfully ran experiments on 960 processors across Grid’5000, with a parallel efficiency of around 85%, winning the prize for the largest number of nodes deployed during the contest.
Satin: a High-Level and Efficient Grid Programming Model
Cited by 4 (3 self)
Computational grids have an enormous potential to provide compute power. However, this power remains largely unexploited today for most applications, except trivially parallel programs. Developing parallel grid applications is simply too difficult. Grids introduce several problems not encountered before, mainly due to the highly heterogeneous and dynamic computing and networking environment. Furthermore, failures occur frequently, and resources may be claimed by higher priority jobs at any time. In this paper, we solve these problems for an important class of applications: divide-and-conquer. We introduce a system called Satin that simplifies the development of parallel grid applications by providing a rich high-level programming model that completely hides communication. All grid issues are transparently handled in the run time system, not by the programmer. Satin’s programming model is based on Java, features spawn-sync primitives and shared objects, and uses asynchronous exceptions and an abort mechanism to support speculative parallelism. To allow an efficient implementation, Satin consistently exploits the idea that grids are hierarchically structured. Dynamic load-balancing is done with a novel cluster-aware scheduling algorithm that hides the long wide-area latencies by overlapping them with useful local work.
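The spawn-sync style of divide-and-conquer mentioned in this abstract can be illustrated with a small Python analogue. This is not Satin's API (Satin is Java-based and the abstract does not show its syntax). Here "spawn" is approximated by `ThreadPoolExecutor.submit` and "sync" by `Future.result`. The names `fib`, `SPAWN_DEPTH`, and `pool` are assumptions made for this sketch, and none of Satin's grid features (cluster-aware scheduling, fault tolerance, shared objects) are modeled.

```python
from concurrent.futures import ThreadPoolExecutor

# Only the top SPAWN_DEPTH levels of the recursion spawn parallel tasks.
# That creates 2**(SPAWN_DEPTH + 1) - 2 tasks, so a pool of
# 2**(SPAWN_DEPTH + 1) threads gives every task its own thread and a
# parent blocked in "sync" can never starve its children.
SPAWN_DEPTH = 2
pool = ThreadPoolExecutor(max_workers=2 ** (SPAWN_DEPTH + 1))


def fib(n, depth=0):
    """Divide-and-conquer Fibonacci with spawn/sync-like structure."""
    if n < 2:
        return n
    if depth >= SPAWN_DEPTH:
        # Below the cutoff, recurse sequentially to limit task overhead.
        return fib(n - 1, depth) + fib(n - 2, depth)
    left = pool.submit(fib, n - 1, depth + 1)   # "spawn" first subproblem
    right = pool.submit(fib, n - 2, depth + 1)  # "spawn" second subproblem
    return left.result() + right.result()       # "sync": wait for both


if __name__ == "__main__":
    print(fib(20))  # prints 6765
```

The depth cutoff is a design choice of this sketch: it bounds the number of blocked threads so the fixed-size pool cannot deadlock. A work-stealing runtime avoids that restriction by letting a waiting worker run other tasks instead of blocking.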
On the Benefit of Processor Co-Allocation in Multicluster Grid Systems
- IEEE Transactions on Parallel and Distributed Systems
Cited by 4 (2 self)
In multicluster grid systems, parallel applications may benefit from processor co-allocation, that is, the simultaneous allocation of processors in multiple clusters. Although co-allocation allows the allocation of more processors than available in a single cluster, it may severely increase the execution time of applications due to the relatively slow wide-area communication. The aim of this paper is to investigate the benefit of co-allocation in multicluster grid systems, despite this drawback. To this end, we have conducted experiments in a real multicluster grid environment, as well as in a simulated environment, and we evaluate the performance of co-allocation for various applications that range from computation-intensive to communication-intensive and for various system load settings. In addition, we compare the performance of scheduling policies that are specifically designed for co-allocation. We demonstrate that considering latency in the resource selection phase improves the performance of co-allocation, especially for communication-intensive parallel applications.
Dynamically reconfigurable scientific computing on large-scale heterogeneous grids
- Proc. Parallel Processing and Applied Mathematics, Czestochowa, 2003
Cited by 3 (2 self)
Many scientific applications require computational capabilities not easily supported by current computing environments. We propose a scalable computing environment based on autonomous actors. In this approach, a wide range of computational resources, ranging from clusters to desktops and laptops, can run an application programmed using actors as program components in an actor language: SALSA. SALSA actors have the ability to execute autonomously in dynamically reconfigurable computing environments. We develop the corresponding “Internet Operating system” (IO) to address run-time middleware issues such as permanent storage for results produced by actors, inter-actor communication and synchronization, and fault-tolerance in a manner transparent to the end-user. We are using this worldwide computing software infrastructure to solve a long outstanding problem in particle physics: the missing baryons, originally identified over thirty years ago.
Supporting reconfigurable parallel multimedia applications
- In: Euro-Par’06, 2006, pp. 765–776
Cited by 3 (2 self)
Programming multimedia applications for System-on-Chip (SoC) architectures is difficult because streaming communication, user event handling, reconfiguration, and parallelism all have to be dealt with. We present Hinch, a runtime system for multimedia applications that efficiently exploits parallelism by running the application in a dataflow style. The application has to be implemented as components that communicate using streams. Reconfigurability is supported by a generic component interface. Measurements have been performed on a Space-Cake SoC architecture simulator. Hinch can easily be ported to other shared-memory architectures.
Mobility and security in worldwide computing
- In Proceedings of the 9th ECOOP Workshop on Mobile Object Systems, 2003
Cited by 3 (1 self)
Modern distributed computing requires a secure framework capable of free code mobility. In this paper, we present a simple lambda-based actor language with extensions for mobility and security, as well as the operational semantics to reason about these topics in distributed systems. Finally, we describe our preliminary implementation results.
The Virtual Instrument: support for grid-enabled MCell Simulations
- Int J High Perform Computing Appl
Cited by 3 (1 self)
Ensembles of widely distributed, heterogeneous resources, or Grids, have emerged as popular platforms for large-scale scientific applications. In this paper we present the Virtual Instrument project, which provides an integrated application execution environment that enables end-users to run and interact with running scientific simulations on Grids. This work is performed in the specific context of MCell, a computational biology application. While MCell provides the basis for running simulations, its capabilities are currently limited in terms of scale, ease-of-use, and interactivity. These limitations preclude usage scenarios that are critical for scientific advances. Our goal is to create a scientific “Virtual Instrument” from MCell by allowing its users to transparently access Grid resources while being able to steer running simulations. In this paper, we motivate the Virtual Instrument project and discuss a number of relevant issues and accomplishments in the area of Grid software development and application scheduling. We then describe our software design and report on the current implementation. We verify and evaluate our design via experiments with MCell on a real-world Grid testbed. Key words: grid computing, computational neuroscience
Persistent Fault-tolerance for Divide-and-Conquer Applications on the Grid
Cited by 1 (1 self)
Grid applications need to be fault tolerant, malleable, and migratable. In previous work, we have presented orphan saving, an efficient mechanism addressing these issues for divide-and-conquer applications. In this paper, we present a mechanism for writing partial results to checkpoint files, adding the capability to also tolerate the total loss of all processors, and to allow suspending and later resuming an application. Both mechanisms have only negligible overheads in the absence of faults. In the case of faults, the new checkpointing mechanism outperforms orphan saving by 10% to 15%. Also, suspending and resuming an application incurs little overhead, making our approach very attractive for writing grid applications.
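The idea of writing partial results of a divide-and-conquer computation to a checkpoint file, so that a run can be suspended and later resumed, can be sketched in Python as follows. This is an illustrative assumption-laden example, not the mechanism from the paper: the JSON file format, the function names (`load_checkpoint`, `save_checkpoint`, `range_sum`), the `"lo:hi"` key scheme, and the choice to checkpoint after every finished subtask are all hypothetical.

```python
import json
import os


def load_checkpoint(path):
    """Return previously saved partial results, or an empty dict."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {}


def save_checkpoint(path, results):
    """Write results to a temp file, then rename, so a crash mid-write
    never leaves a truncated checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(results, f)
    os.replace(tmp, path)


def range_sum(lo, hi, results, path, stats, grain=4):
    """Sum the integers in [lo, hi) by divide and conquer, reusing any
    subresult already present in the checkpoint."""
    key = f"{lo}:{hi}"
    if key in results:
        stats["reused"] += 1
        return results[key]
    if hi - lo <= grain:
        value = sum(range(lo, hi))
    else:
        mid = (lo + hi) // 2
        value = (range_sum(lo, mid, results, path, stats, grain)
                 + range_sum(mid, hi, results, path, stats, grain))
    results[key] = value
    save_checkpoint(path, results)  # persist each finished subtask
    stats["computed"] += 1
    return value


if __name__ == "__main__":
    import tempfile
    ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    for attempt in (1, 2):  # the second pass resumes from the file
        stats = {"computed": 0, "reused": 0}
        total = range_sum(0, 32, load_checkpoint(ckpt), ckpt, stats)
        print(attempt, total, stats)
```

On the second pass the root result is already in the file, so nothing is recomputed. Checkpointing after every subtask is the simplest policy to show. A real system would likely write less often to keep overhead low in the fault-free case.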
SP@CE - An SP-based Programming Model for Consumer Electronics Streaming Applications
Cited by 1 (1 self)
Consumer Electronics (CE) devices are becoming the favorite target platforms for multimedia streaming applications, but finding the right solutions for efficient programming, both in terms of development time and application performance, is not trivial. In this context, we present

