Exploiting Process Lifetime Distributions for Dynamic Load Balancing
- ACM Transactions on Computer Systems
, 1996
Cited by 290 (30 self)
We measure the distribution of lifetimes for UNIX processes and propose a functional form that fits this distribution well. We use this functional form to derive a policy for preemptive migration, and then use a trace-driven simulator to compare our proposed policy with other preemptive migration policies and with a non-preemptive load-balancing strategy. We find that, contrary to previous reports, the performance benefits of preemptive migration are significantly greater than those of non-preemptive migration, even when the memory-transfer cost is high. Using a model of migration costs representative of current systems, we find that preemptive migration reduces the mean delay (queueing and migration) by 35-50% compared to non-preemptive migration. 1 Introduction Most systems that perform load balancing use remote execution (i.e., non-preemptive migration) based on a priori knowledge of process behavior, often in the form of a list of process names eligible for migration. Althoug...
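The abstract above does not state the fitted functional form. As an illustrative sketch only, the Python below assumes a heavy-tailed survival function P(L > t) = t^k with k = -1 for t >= 1 second. Both the form and the value of k are assumptions made for this example, and every function name is hypothetical. Under this assumption a process that has already run for `age` seconds survives past `horizon` with probability `age / horizon`, so its median remaining lifetime equals its current age, which is why older processes look like better candidates for preemptive migration.

```python
def survival(t: float, k: float = -1.0) -> float:
    """Assumed P(lifetime > t) = t**k for t >= 1 second (illustrative form)."""
    if t < 1.0:
        return 1.0
    return t ** k


def conditional_survival(age: float, horizon: float, k: float = -1.0) -> float:
    """P(lifetime > horizon | lifetime > age) for 1 <= age <= horizon."""
    if not (1.0 <= age <= horizon):
        raise ValueError("expected 1 <= age <= horizon")
    return survival(horizon, k) / survival(age, k)


def median_remaining_lifetime(age: float, k: float = -1.0) -> float:
    """Solve (b / age)**k = 0.5 for b, then return b - age."""
    return age * (0.5 ** (1.0 / k)) - age


def worth_migrating(age: float, migration_cost: float, k: float = -1.0) -> bool:
    """Hypothetical rule: migrate if the median remaining lifetime exceeds the cost."""
    return median_remaining_lifetime(age, k) > migration_cost
```

With these assumptions, a process that has run for 3 seconds is a poor candidate when migration costs 5 seconds, while one that has run for 30 seconds is a good one.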
Sumatra: A Language for Resource-aware Mobile Programs
, 1997
Cited by 115 (2 self)
Programs that use mobility as a mechanism to adapt to resource changes have three requirements that are not shared with other mobile programs. First, they need to monitor the level and quality of resources in their operating environment. Second, they need to be able to react to changes in resource availability. Third, they need to be able to control the way in which resources are used on their behalf (by libraries and other support code). In this chapter, we describe the design and implementation of Sumatra, an extension of Java that supports resource-aware mobile programs. We also describe the design and implementation of a distributed resource monitor that provides the information required by Sumatra programs. 1 Introduction Mobile programs can move an active thread of control from one site to another during execution. This flexibility has many potential advantages. For example, a program that searches distributed data repositories can improve its performance by migrating to the re...
Network-aware Mobile Programs
- In Proceedings of the 1997 USENIX Technical Conference
, 1997
Cited by 70 (6 self)
In this paper, we investigate network-aware mobile programs, programs that can use mobility as a tool to adapt to variations in network characteristics. We present infrastructural support for mobility and network monitoring and show how adaptalk, a Java-based mobile Internet chat application, can take advantage of this support to dynamically place the chat server so as to minimize response time. Our conclusion was that on-line network monitoring and adaptive placement of shared data structures can significantly improve the performance of distributed applications on the Internet.
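The placement idea in this abstract can be sketched as follows. This is a minimal illustration, not adaptalk's actual algorithm: it assumes the objective is to minimize the worst-case measured latency from the server to any participant, and the function names and the latency table layout are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable


def link_latency(latency_ms: Dict[FrozenSet[str], float], a: str, b: str) -> float:
    """Measured latency between two hosts; a host reaches itself at no cost."""
    return 0.0 if a == b else latency_ms[frozenset((a, b))]


def best_server_host(latency_ms: Dict[FrozenSet[str], float],
                     hosts: Iterable[str]) -> str:
    """Pick the participant host whose worst-case latency to the others is lowest."""
    hosts = list(hosts)
    return min(
        hosts,
        key=lambda h: max(link_latency(latency_ms, h, c) for c in hosts),
    )


# Example measurements (made-up numbers) between three chat participants.
measurements = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "C")): 80.0,
    frozenset(("B", "C")): 30.0,
}
chosen = best_server_host(measurements, ["A", "B", "C"])
```

Re-running the selection whenever the monitor reports new measurements gives a simple form of adaptive placement; a real system would also weigh the cost of moving the server.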
Process migration
- ACM Computing Surveys
, 2000
Cited by 62 (1 self)
A process is an operating system abstraction representing an instance of a running computer program. Process migration is the act of transferring a process between two machines during its execution. Several implementations
MIST: PVM with Transparent Migration and Checkpointing
- In 3rd Annual PVM Users' Group Meeting
, 1995
Cited by 36 (0 self)
We are currently involved in research to enable PVM to take advantage of shared networks of workstations (NOWs) more effectively. In such a computing environment, it is important to utilize workstations unobtrusively and recover from machine failures. Towards this goal, we have enhanced PVM with transparent task migration, checkpointing, and global scheduling. These enhancements are part of the MIST project which takes an open systems approach in developing a cohesive, distributed parallel computing environment. This open systems approach promotes plug-and-play integration of independently developed modules, such as Condor, DQS, AVS, Prospero, XPVM, PIOUS, Ptools, etc. Transparent task migration, in conjunction with a global scheduler, facilitates the use of shared NOWs by allowing parallel jobs to unobtrusively utilize nodes that are currently unused. PVM tasks can be moved onto nodes that are otherwise idle, and moved off when the node is no longer free. Experiments show that migrati...
A Performance Oriented Migration Framework For The Grid
, 2003
Cited by 33 (2 self)
At least three factors in existing migration systems make them less suitable for Grid systems, especially when the goal is to improve the response times of individual applications: separate policies for suspension and migration of executing applications, the use of pre-defined conditions for suspension and migration, and the lack of knowledge of the remaining execution time of the applications. In this paper we describe a migration framework for performance-oriented Grid systems that implements tightly coupled policies for both suspension and migration of executing applications. The suspension and migration policies take into account both the load changes on systems and the remaining execution times of the applications, thereby accounting for both system load and application characteristics. The main goal of our migration framework is to improve the response times of individual applications. We also present some results that demonstrate the usefulness of our migration system.
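A decision rule of the kind this abstract describes, one that weighs remaining execution time against migration cost, might look like the sketch below. It is an assumption-laden illustration rather than the framework's actual policy: the function name, its parameters, and the 10% minimum-gain threshold are all hypothetical.

```python
def should_migrate(remaining_here_s: float,
                   remaining_there_s: float,
                   migration_cost_s: float,
                   min_gain_fraction: float = 0.1) -> bool:
    """Migrate only if the predicted saving is a meaningful share of the remaining work.

    remaining_here_s:  estimated remaining run time on the current (loaded) resources
    remaining_there_s: estimated remaining run time on the candidate resources
    migration_cost_s:  estimated time to checkpoint, transfer, and restart
    """
    gain = remaining_here_s - (remaining_there_s + migration_cost_s)
    return gain > min_gain_fraction * remaining_here_s
```

For example, an application with 1000 s left that would need 400 s elsewhere plus 100 s to move is worth migrating, while one with only 100 s left is not, because the migration cost outweighs the saving.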
Process Hijacking
, 1999
Cited by 31 (7 self)
Process checkpointing is a basic mechanism required for providing High Throughput Computing service on distributively owned resources. We present a new process checkpoint and migration technique, called process hijacking, that uses dynamic program re-writing techniques to add checkpointing capability to a running program. Process hijacking makes it possible to checkpoint and migrate proprietary applications that cannot be re-linked with a checkpoint library, and it makes it possible to dynamically hand off an ordinary running process to a distributed resource management system such as Condor. We discuss the problems of adding checkpointing capability to a program already in execution: (1) loading new code into the running process, and (2) replacing functions of the process with calls to dynamically loaded functions. We use the DynInst API process editing library, augmented with a new call for replacing functions, to solve these problems. We discuss problems associated with migrating a ...
Resource Management and Checkpointing for PVM
- In Proceedings of the 2nd European PVM Users' Group Meeting
, 1995
Cited by 28 (7 self)
Checkpoints can be used not only to increase fault tolerance but also to migrate processes. Migration is particularly useful in workstation environments where machines become dynamically available and unavailable. We introduce the CoCheck environment, which not only allows the creation of checkpoints but also provides process migration. The creation of checkpoints of PVM applications is explained, and we show how this service can be used in a resource manager.
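The link between checkpointing and migration can be illustrated with a small application-level sketch. This is not how CoCheck works (it checkpoints PVM processes); the example only shows the general idea of saving a computation's state, moving the bytes elsewhere, and resuming. The `SumState` class and the helper names are hypothetical, and Python's standard `pickle` module stands in for a real checkpoint format.

```python
import pickle
from dataclasses import dataclass


@dataclass
class SumState:
    """Everything needed to resume the computation: loop index and running total."""
    next_i: int = 0
    total: int = 0


def run(state: SumState, limit: int, max_steps: int) -> SumState:
    """Advance the summation of 0..limit-1 by at most max_steps iterations."""
    steps = 0
    while state.next_i < limit and steps < max_steps:
        state.total += state.next_i
        state.next_i += 1
        steps += 1
    return state


def checkpoint(state: SumState) -> bytes:
    """Serialize the state so it can be stored or sent to another machine."""
    return pickle.dumps(state)


def restore(blob: bytes) -> SumState:
    """Rebuild the state from a checkpoint (only unpickle data you trust)."""
    return pickle.loads(blob)


# Run part of the work, checkpoint, then resume as if on another host.
partial = run(SumState(), limit=10, max_steps=4)
blob = checkpoint(partial)
finished = run(restore(blob), limit=10, max_steps=100)
```

The same checkpoint serves both purposes named in the abstract: restoring it on the original machine after a failure gives fault tolerance, and restoring it on a different machine gives migration.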
Self Adaptivity in Grid Computing
- Concurrency & Computation: Practice & Experience
, 2005
Cited by 28 (2 self)
Optimizing a given piece of software to exploit the features of the underlying system has been an area of research for many years. Recently, a number of self-adapting software systems have been designed and developed for various computing environments. In this paper, we discuss the design and implementation of a software system that dynamically adjusts the parallelism of applications executing on computational Grids in accordance with the changing load characteristics of the underlying resources. The migration framework implemented by our software is aimed at performance-oriented Grid systems and implements tightly coupled policies for both suspension and migration of executing applications. The suspension and migration policies take into account both the load changes on systems and the remaining execution times of the applications, thereby accounting for both system load and application characteristics. The main goal of our migration framework is to improve the response times of individual applications. We also present some results that demonstrate the usefulness of our migration framework.
SRS - A Framework for Developing Malleable and Migratable Parallel Applications for Distributed Systems
- In Parallel Processing Letters
, 2002
Cited by 27 (0 self)
The ability to produce malleable parallel applications that can be stopped and reconfigured during execution can offer attractive benefits for both the system and the applications. The reconfiguration can take the form of varying the parallelism of the applications, changing the data distributions during execution, or dynamically changing the software components involved in the application execution. In distributed and Grid computing systems, migration and reconfiguration of such malleable applications across distributed heterogeneous sites that do not share common file systems provides flexibility for scheduling and resource management. Present reconfiguration systems do not support migration of parallel applications to distributed locations. In this paper, we discuss a framework for developing malleable and migratable MPI message-passing parallel applications for distributed systems. The framework includes a user-level checkpointing library called SRS and a runtime support system that manages the checkpointed data for distribution to distributed locations. Our experimental results indicate that parallel applications instrumented with the SRS library were able to achieve reconfigurability while incurring about 15-35% overhead.

