Results 1 - 10 of 97
Nimrod: A Tool for Performing Parametised Simulations using Distributed Workstations
4th IEEE Symposium on High Performance Distributed Computing, 1995
Cited by 65 (26 self)
This paper discusses Nimrod, a tool for performing parametised simulations over networks of loosely coupled workstations. Using Nimrod the user interactively generates a parametised experiment. Nimrod then controls the distribution of jobs to machines and the collection of results. A simple graphical user interface which is built for each application allows the user to view the simulation in terms of their problem domain. The current version of Nimrod is implemented above OSF DCE and runs on DEC Alpha and IBM RS6000 workstations (including a 22 node SP2). Two different case studies are discussed as an illustration of the utility of the system.
1 INTRODUCTION
A wide range of scientific and engineering experiments can be solved using numeric simulation. Examples include finite element analysis, computational fluid dynamics, electromagnetic and electronic simulation, pollution transport, granular flow and digital logic simulation. Accordingly, some very large codes have been written over ...
Experimental Assessment of Workstation Failures and Their Impact on Checkpointing Systems
28th International Symposium on Fault-Tolerant Computing, 1997
Cited by 42 (5 self)
In the past twenty years, there has been a wealth of theoretical research on minimizing the expected running time of a program in the presence of failures by employing checkpointing and rollback recovery. In the same time period, there has been little experimental research to corroborate these results. In this paper, we study the results of three separate projects that monitor failure in workstation networks. Our goals are twofold. The first is to see how these results correlate with the theoretical results, and the second is to assess their impact on strategies for checkpointing long-running computations on workstations and networks of workstations. A surprising result of our work is that although the base assumptions of the theoretical research do not hold, many of the results are still applicable.
BONITA: A set of tuple space primitives for distributed coordination
1997
Cited by 36 (6 self)
In the last few years the use of distributed structured shared memory paradigms for coordination between parallel processes has become common. One of the most well known implementations of this paradigm is the shared tuple space model (as used in Linda). In this paper we describe a new set of primitives for fully distributed coordination of processes and agents using tuple spaces, called the Bonita primitives. The Linda primitives provide synchronous access to tuple spaces, whereas the Bonita primitives provide asynchronous access to tuple spaces. The proposed primitives are able to mimic the Linda primitives, therefore providing the ease of use and expressibility of Linda together with a number of advantages for the coordination of agents or processes in distributed environments. The primitives allow user processes to perform computation concurrently with tuple space accesses, and provide new coordination constructs which lead to more efficient programs. In this paper we present the ...
Scalable Networked Information Processing Environment (SNIPE)
Proceedings of SuperComputing '97, 1997
Cited by 32 (10 self)
SNIPE is a metacomputing system that aims to provide a reliable, secure, fault-tolerant environment for long-term distributed computing applications and data stores across the global InterNet. This system combines global naming and replication of both processing and data to support large scale information processing applications leading to better availability and reliability than currently available with typical cluster computing and/or distributed computer environments.
Keywords: SNIPE, RCDS, MetaComputing, scalable, secure, reliable
Acknowledgements
This work was supported in part by the Office of Scientific Computing, U.S. Department of Energy, under Contract DE-AC05-96OR22464, by DARPA under Contract DAAH 04-95-1-0595, and by the National Science Foundation's Center for Research on Parallel Computation, Science and Technology Center Cooperative Agreement No. CCR-8809615.
1. Introduction
The beginning of the 21st century will present new challenges for large-scale applications i...
An Efficient Distributed Tuple Space Implementation for Networks of Heterogenous Workstations
1996
Cited by 30 (12 self)
The distributed tuple space concept, on which the Linda process coordination model is founded, has given rise to several implementations on parallel machines and networks of heterogenous workstations. However, the fundamental techniques used in these systems have remained largely unchanged from the original Linda implementations. This paper describes a novel implementation which, using extensions to the original Linda model and recently developed bulk access primitives for tuple spaces, is able to demonstrate 10 to 70 times speed improvements over the best available commercial system. This is achieved dynamically without any compile time optimisations.
Resource Management and Checkpointing for PVM
Proceedings of the 2nd European PVM Users' Group Meeting, 1995
Cited by 28 (7 self)
Checkpoints can be used not only to increase fault tolerance, but also to migrate processes. Migration is particularly useful in workstation environments where machines become dynamically available and unavailable. We introduce the CoCheck environment, which not only allows the creation of checkpoints but also provides process migration. The creation of checkpoints of PVM applications is explained and we show how this service can be used in a resource manager.
The Nimrod Computational Workbench: A Case Study in Desktop Metacomputing
1997
Cited by 22 (12 self)
Published in: Proceedings of the 20th Australasian Computer Science Conference, Sydney, Australia, February 5-7, 1997.
The coordinated use of geographically distributed computers, or metacomputing, can in principle provide more accessible and cost-effective supercomputing than do conventional high-performance systems. However, we lack evidence that metacomputing systems can be made easily usable or that large numbers of applications are able to exploit metacomputing resources. In this article, we present work that addresses both these concerns. The basis for this work is a system called Nimrod that provides a desktop problem-solving environment for parametric experiments. We describe how Nimrod has been extended to support the scheduling of computational resources located in a wide-area environment and report on an experiment in which Nimrod was used to schedule a large parametric study across the Australian Internet. The experiment provided both new scientific results and insights into Nimrod capabi...
A Network Genetic Algorithm for Concept Learning
Proceedings of the Sixth International Conference on Genetic Algorithms, 1997
Cited by 22 (3 self)
This paper presents a highly parallel genetic algorithm, designed for concept induction in propositional and first order logics. The system exploits niches and species for learning multimodal concepts; it differs markedly from other systems in its distributed architecture, which entirely eliminates the concept of common memory. A first implementation of the system, designed to test whether parallel processing can be exploited on networked computers, is evaluated on standard benchmarks. The experimental results show that the system achieves good performance both in the quality of the learned knowledge and in the speedup on a workstation cluster.
1 INTRODUCTION
In the recent literature, Genetic Algorithms (GAs) emerged as valuable search tools in the field of concept induction (De Jong et al., 1993; Janikow, 1993; Greene & Smith, 1993; Giordana & Neri, 1996). Specifically, two of their features look particularly attractive, namely the exploration p...
Designing Parallel Programs by the Graphical Language GRAPNEL
1996
Cited by 22 (7 self)
We propose a new visual programming language, called GRAPNEL (GRAphical Process's NEt Language), for designing distributed parallel programs based on the message passing programming paradigm. GRAPNEL is a high level graphical interface for creating distributed applications, and can be useful for both non-professional and professional programmers dealing with parallel programming. GRAPNEL provides high level abstraction mechanisms to support structured design at the level of processes. These mechanisms include the Process Group abstraction and the automatic generation of several regular process topologies based on predefined topology templates. Dynamic process creation and destruction are possible, but can be applied only in a well structured manner. GRAPNEL is a hybrid language, where the communication-related parts of the program are described using graphical symbols but textual descriptions are applied where they are more appropriate. It makes it possible to incorporate large ordin...
An Overview of Message Passing Environments
Parallel Computing, 1994
Cited by 21 (0 self)
A majority of the MPP systems designed to date have been MIMD distributed memory systems. For almost all of these systems, message passing environments have provided the primary mechanism for programming multiprocessor applications. In this paper we provide an introduction to MPP systems in general. We then introduce current MPP message passing interfaces, by tracing their historical development over the last 10 years. In addition to their use within a single MPP architecture, we discuss the use of message passing systems to interconnect more loosely coupled processors in heterogeneous environments. Finally we review the development of "portability platforms" - message passing systems that have been devised solely to allow portability of message passing programs between different systems.
* Research supported in part by NSF Grand Challenges Applications Group grant ASC-9217394 and by NASA HPCC Group Grant NAG5-2218.
+ To appear in Parallel Computing, April 1994.

