Results 1 - 10 of 13
Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations (Extended Abstract)
Cited by 84 (6 self)
This paper reports on the architecture and design of Starfish, an environment for executing dynamic (and static) MPI-2 programs on a cluster of workstations. Starfish is unique in being efficient, fault-tolerant, highly available, and dynamic as a system internally, and in supporting fault-tolerance and dynamicity for its application programs as well. Starfish achieves these goals by combining group communication technology with checkpoint/restart. It uses a novel architecture that is both flexible and portable and that keeps group communication outside the critical data path for maximum performance.
A Load Balancing Framework for Adaptive and Asynchronous Applications
- IEEE Transactions on Parallel and Distributed Systems, 2004
Cited by 21 (11 self)
This paper describes the design of a flexible load balancing framework and runtime software system for supporting the development of adaptive applications on distributed-memory parallel computers. The runtime system supports a global namespace, transparent object migration, automatic message forwarding and routing, and automatic load balancing. These features can be used at the discretion of the application developer in order to simplify program development and to eliminate complex bookkeeping associated with mobile data objects. An evaluation of this system in the context of a three-dimensional tetrahedral advancing front parallel mesh generator shows that, on large processor configurations, overall runtime improvements of 15 percent compared to common stop-and-repartition load balancing methods, 30 percent compared to explicit intrusive load balancing methods, and 42 percent compared to no load balancing are possible. At the same time, the overheads attributable to the runtime system are a fraction of 1 percent of the total runtime. The parallel advancing front method is a coarse-grained and highly adaptive application and therefore exercises all of the features of the runtime system. Index Terms: Dynamic load balancing, adaptive and irregular applications, runtime support software, multithreading, message passing, parallel, distributed, and grid computing, scientific computing, parallel mesh generation.
Parallel propositional satisfiability checking with distributed dynamic learning
- Parallel Computing, 2003
Cited by 17 (4 self)
We address the parallelization and distributed execution of an algorithm from the area of symbolic computation: propositional satisfiability (SAT) checking with dynamic learning. Our parallel programming models are strict multithreading for the core SAT checking procedure, complemented by mobile agents realizing a distributed dynamic learning process. Individual threads treat dynamically created subproblems, while mobile agents collect and distribute pertinent knowledge obtained during the learning process. The parallel algorithm runs on top of our parallel system platform DOTS (Distributed Object-Oriented Threads System), which provides support for our parallel programming models in highly heterogeneous distributed systems. We present measurements evaluating the performance gains of our approach in different application domains of practical significance. Key words: parallel symbolic computation, parallel propositional satisfiability checking, distributed multithreading
An object-oriented platform for distributed high-performance Symbolic Computation
- Mathematics and Computers in Simulation 49, 1999
Cited by 14 (9 self)
We describe the Distributed Object-Oriented Threads System (DOTS), a programming environment designed to support object-oriented fork/join parallel programming in a heterogeneous distributed environment. A mixed network of Windows NT PCs and UNIX workstations is transformed by DOTS into a homogeneous pool of anonymous compute servers that together form a multicomputer. DOTS is a complete redesign of the Distributed Threads System (DTS) using the object-oriented paradigm both in its internal implementation and in the programming paradigm it supports. It has been used for the parallelization of applications in the field of computer algebra and in the field of computer graphics. We also give a brief account of applications in the domain of symbolic computation that were developed using DTS. Key words: distributed threads system, heterogeneous networks, Windows NT
Dynamic Layout of Distributed Applications
- In Proceedings of the 3rd International Software Architecture Workshop, in conjunction with ACM SIGSOFT'98, 1998
Cited by 8 (1 self)
We propose a novel capability for the architects of modern large-scale distributed applications: manipulating the location of their components at runtime. We argue that such "dynamic application layout" greatly elevates system scalability. Around this capability, we provide a model for dynamic layout programming that is separate from the programming of the application's logic. We show that this model improves both programming and system scalability. The FarGo system overviewed here realizes this model by providing a compiler and runtime support for layout programming, including component mobility and intercomponent references, automation and enforcement of relative co-location invariants, and monitoring facilities that allow the layout to be programmed based on runtime information.
A multithreaded runtime environment with thread migration for a HPF data-parallel compiler
- 1998
Cited by 7 (2 self)
This paper studies the benefits of compiling data-parallel languages onto a multithreaded runtime environment providing dynamic thread migration facilities. Each abstract process is mapped onto a thread, so that dynamic load balancing can be achieved by migrating threads among the processing nodes. We describe and evaluate an implementation of this idea in the Adaptor HPF compiler. We show that no deep modifications of the compiler are needed, and that the overhead of managing threads can be kept small. As an experimental validation, we report on an HPF implementation of the Gauss Partial Pivoting algorithm. We show that using an initial BLOCK data distribution with our dynamic load balancing scheme can reach the performance of the optimal CYCLIC distribution. Introduction: Data-parallel languages are now recognized as major tools for high performance computing. Considerable effort has been put into designing sophisticated methods to compile them efficiently onto a variety of architectures...
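The BLOCK versus CYCLIC comparison in the abstract above can be illustrated with a rough Python sketch. It is not taken from the paper: the cost model (one unit of work per updated matrix entry, pivot search ignored) and all function names are assumptions made for illustration. It only shows why a static BLOCK row distribution leaves Gaussian elimination unbalanced, which is the imbalance that thread migration is meant to correct at run time; it does not model migration itself.

```python
# Illustrative cost model (an assumption, not the paper's measurements):
# at elimination step k, every row i > k is updated across columns k..n-1,
# costing n - k units of work charged to the processor that owns row i.

def owner_block(row, n, p):
    """BLOCK distribution: contiguous chunks of ceil(n / p) rows per processor."""
    block = -(-n // p)  # ceiling division
    return row // block

def owner_cyclic(row, n, p):
    """CYCLIC distribution: rows dealt out round-robin."""
    return row % p

def work_per_processor(n, p, owner):
    """Total update work each of the p processors performs on an n x n matrix."""
    work = [0] * p
    for k in range(n - 1):            # elimination step
        for i in range(k + 1, n):     # rows still being updated
            work[owner(i, n, p)] += n - k
    return work

def imbalance(work):
    """Ratio of the busiest processor's work to the mean (1.0 is perfect)."""
    return max(work) / (sum(work) / len(work))

if __name__ == "__main__":
    n, p = 64, 4
    for name, owner in (("BLOCK", owner_block), ("CYCLIC", owner_cyclic)):
        work = work_per_processor(n, p, owner)
        print(name, work, round(imbalance(work), 3))
```

Under this model the processors holding the first blocks of rows run out of work early with BLOCK, whereas CYCLIC keeps every processor busy until the end.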
Toward Real-Time Image Guided Neurosurgery Using Distributed and Grid Computing
Cited by 4 (3 self)
Neurosurgical resection is a therapeutic intervention in the treatment of brain tumors. Precision of the resection can be improved by utilizing Magnetic Resonance Imaging (MRI) as an aid in decision making during Image Guided Neurosurgery (IGNS). Image registration adjusts pre-operative data according to intra-operative tissue deformation. Some of the approaches increase the registration accuracy by tracking image landmarks through the whole brain volume. High computational cost used to render these techniques inappropriate for clinical applications. In this paper we present a parallel implementation of a state-of-the-art registration method, and a number of needed incremental improvements. Overall, we reduced the response time for registration of an average dataset from about an hour (in some cases more) to less than seven minutes, which is within the time constraints imposed by neurosurgeons. For the first time in clinical practice, we demonstrated that, with the help of distributed computing, non-rigid MRI registration based on volume tracking can be computed intra-operatively.
Millipede: a User-Level NT-Based Distributed Shared Memory System with Thread Migration and Dynamic Run-Time Optimization of Memory References
- Proc. of the USENIX Windows NT Workshop, 1997
Cited by 3 (1 self)
Millipede is an all-user-mode, no-kernel-patches "add-on" software tool for standard corporate environments that takes advantage of idle system resources and efficiently utilizes idle processor time in available distributed environments of personal workstations. Millipede presents to the user a powerful virtual parallel machine which abstracts away the underlying hardware configuration. In this way Millipede supports mapping of the applications to dynamically varying levels of parallelism according to both changes in the underlying hardware and changes in the application requirements. Millipede is multi-threaded, thus taking full advantage of SMPs. Millipede provides a true distributed shared memory with several coherence protocols
Compiling Data-Parallel Programs to a Distributed Runtime Environment With Thread Isomigration
- 2000
Cited by 3 (1 self)
The compilation of data-parallel languages is traditionally targeted to low-level runtime environments: abstract processors are mapped onto static system processes, which directly address the low-level communication library. Alternatively, we propose to map each HPF abstract processor onto a "lightweight process" (thread) which can be dynamically migrated between nodes together with the data it manages, under the supervision of some external scheduler. We discuss the pros and cons of such an approach and the facilities which must be provided by the multithreaded runtime. We describe a prototype HPF compiling system built along these lines, based on the Adaptor HPF compiler and using the PM2 multithreaded runtime environment. Keywords: Parallel languages, load balancing, cluster of SMP, distributed multithreaded runtime, thread migration, HPF, Adaptor, MPI. Introduction: Data-parallel languages are now recognized as major tools for high performance computing. Considerable effort has ...
Supporting Multiple Programming Paradigms for Distributed Clusters on top of a Single Virtual Parallel Machine -- The MILLIPEDE Concept
- 1997
Cited by 1 (0 self)
In this paper we propose Millipede: a small yet powerful interface for Virtual Parallel Machines (VPMs) on top of distributed computing environments. Millipede is a convenient environment for porting various existing parallel programming languages, for the design of new parallel programming languages, and for the development of parallel applications. Millipede exhibits various novel features as well as some that were previously suggested in the literature. These include Distributed Shared Memory (DSM) along with strong and flexible support for weak coherency protocols, and dynamic thread migration with a built-in run-time optimization for the locality of memory references. In particular, Millipede defines a novel mechanism for inter-mobile-job communication, called MJEC (Millipede Job Event Control), which makes it easy to implement a variety of synchronization and communication methods. Systems that follow the MJEC convention may support a diversity of programming paradigms, either ...

