Load Balancing and Unbalancing for Power and Performance in Cluster-Based Systems
2001
Cited by 87 (7 self)
In this paper we address power conservation for clusters of workstations or PCs. Our approach is to develop systems that dynamically turn cluster nodes on -- to be able to handle the load imposed on the system efficiently -- and off -- to save power under lighter load. The key component of our systems is an algorithm that makes load balancing and unbalancing decisions by considering both the total load imposed on the cluster and the power and performance implications of turning nodes off. The algorithm is implemented in two different ways: (1) at the application level for a cluster-based, locality-conscious network server; and (2) at the operating system level for an operating system for clustered cycle servers. Our experimental results are very favorable, showing that our systems conserve both power and energy in comparison to traditional systems.
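As an illustration of the kind of decision this abstract describes, the following Python sketch picks how many nodes to keep powered on for a given total load. It is not the authors' algorithm: the function names, the fixed per-node capacity, and the 20% headroom are all assumptions made for this example.

```python
import math


def nodes_needed(total_load, node_capacity, headroom=0.2):
    """Return the number of nodes needed to serve total_load.

    Assumption: every node has the same capacity, and a fraction
    `headroom` of each node is kept free to absorb load spikes.
    """
    usable = node_capacity * (1.0 - headroom)
    return max(1, math.ceil(total_load / usable))


def reconfigure(active, total_load, node_capacity, max_nodes, headroom=0.2):
    """Decide whether to turn nodes on, turn them off, or keep the cluster as is.

    Returns a (action, count) pair; the action names are hypothetical.
    """
    target = min(max_nodes, nodes_needed(total_load, node_capacity, headroom))
    if target > active:
        return ("turn_on", target - active)
    if target < active:
        return ("turn_off", active - target)
    return ("keep", 0)
```

A real system would also have to weigh the time and energy spent booting or shutting down a node, which this sketch ignores.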
An opportunity cost approach for job assignment in a scalable computing cluster
IEEE Transactions on Parallel and Distributed Systems, 2000
Cited by 44 (2 self)
A new method is presented for job assignment to and reassignment between machines in a computing cluster. Our method is based on a theoretical framework that has been experimentally tested and shown to be useful in practice. This “opportunity cost” method converts the usage of several heterogeneous resources in a machine to a single homogeneous “cost”. Assignment and reassignment is then performed based on that cost. This is in contrast to previous methods for job assignment and reassignment, which treat each resource as an independent entity with its own constraints. These previous methods were intrinsically ad hoc, as there was no clean way to balance one resource against another.
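To make the idea of a single homogeneous cost concrete, here is a minimal Python sketch that collapses CPU and memory utilization into one number and assigns a job to the machine whose cost rises the least. The exponential cost function, the base of 2, and all names are assumptions chosen for illustration; the abstract does not specify the formula.

```python
def utilization_cost(used, capacity, base=2.0):
    """Collapse several resource utilizations into one scalar cost.

    Assumption: each resource contributes base ** utilization, a convex
    function, so nearly full resources are penalized more heavily.
    """
    return sum(base ** (used[r] / capacity[r]) for r in capacity)


def marginal_cost(machine, job, base=2.0):
    """Return the increase in a machine's cost if the job were placed on it."""
    before = utilization_cost(machine["used"], machine["capacity"], base)
    after_used = {
        r: machine["used"][r] + job.get(r, 0.0) for r in machine["capacity"]
    }
    after = utilization_cost(after_used, machine["capacity"], base)
    return after - before


def assign(machines, job):
    """Pick the name of the machine with the smallest marginal cost."""
    return min(machines, key=lambda name: marginal_cost(machines[name], job))
```

Because the two resources are folded into one cost, a machine with spare memory but a busy CPU can be compared directly with one in the opposite state, without per-resource rules.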
Dynamic Cluster Reconfiguration For Power And Performance
2002
Cited by 39 (8 self)
In this paper we address power conservation for clusters of workstations or PCs. Our approach is to develop systems that dynamically turn cluster nodes on -- to be able to handle the load imposed on the system efficiently -- and off -- to save power under lighter load. The key component of our systems is an algorithm that makes cluster reconfiguration decisions by considering the total load imposed on the system and the power and performance implications of changing the current configuration. The algorithm is implemented in two common cluster-based systems: a network server and an operating system for clustered cycle servers. Our experimental results are very favorable, showing that our systems conserve both power and energy in comparison to traditional systems.
JESSICA2: A Distributed Java Virtual Machine with Transparent Thread Migration Support
In IEEE Fourth International Conference on Cluster Computing, 2002
Cited by 39 (6 self)
A distributed Java Virtual Machine (DJVM) spanning multiple cluster nodes can provide a true parallel execution environment for multi-threaded Java applications. Most existing DJVMs suffer from the slow Java execution in interpretive mode and thus may not be efficient enough for solving computation-intensive problems. We present JESSICA2, a new DJVM running in JIT compilation mode that can execute multi-threaded Java applications transparently on clusters. JESSICA2 provides a single system image (SSI) illusion to Java applications via an embedded global object space (GOS) layer. It implements a cluster-aware Java execution engine that supports transparent Java thread migration for achieving dynamic load balancing. We discuss the issues of supporting transparent Java thread migration in a JIT compilation environment and propose several lightweight solutions. An adaptive migrating-home protocol used in the implementation of the GOS is introduced. The system has been implemented on x86-based Linux clusters, and significant performance improvements over the previous JESSICA system have been observed.
Self-migration of Operating Systems
2004
Cited by 38 (4 self)
This paper is about on-the-fly migration of entire operating systems between physically different host computers. Resource allocation is often static
A Taxonomy of Market-Based Resource Management Systems for Utility-Driven Cluster Computing
2004
Cited by 33 (10 self)
In utility-driven cluster computing, cluster systems need to know the specific needs of different users so as to allocate resources according to their needs. They are also vital in supporting service-oriented Grid computing that harnesses resources distributed worldwide based on users' objectives. Market-based resource management systems make use of real-world market concepts and behavior to assign resources to users. This paper outlines a taxonomy that describes how market-based resource management systems can support utility-driven cluster computing. The taxonomy is used to survey existing market-based resource management systems to better understand how they can be utilized.
The architecture and performance of security protocols in the ensemble group communication system
ACM Transactions on Information and System Security, 2001
Cited by 30 (1 self)
Ensemble is a Group Communication System built at Cornell and the Hebrew universities. It allows processes to create process groups within which scalable reliable FIFO-ordered multicast and point-to-point communication are supported. The system also supports other communication properties, such as causal and total multicast ordering, flow control, etc. This paper describes the security protocols and infrastructure of Ensemble. Applications using Ensemble with the extensions described here benefit from strong security properties. Under the assumption that trusted processes will not be corrupted, all communication is secured from tampering by outsiders. Our work extends previous work performed in the Horus system (Ensemble’s predecessor) by adding support for multiple partitions, efficient rekeying, and application-defined security policies. Unlike Horus, which used its own security infrastructure with non-standard key distribution and timing services, Ensemble’s security mechanism is based on off-the-shelf authentication systems, such as PGP and Kerberos. We extend previous results on group rekeying, with a novel protocol that makes use of diamond-like data structures. Our Diamond protocol allows the removal of untrusted members within milliseconds.
Bypass: A Tool for Building Split Execution Systems
In Proceedings of the Ninth IEEE Symposium on High Performance Distributed Computing, 2000
Cited by 25 (6 self)
Split execution is a common model for providing a friendly environment on a foreign machine. In this model, a remotely executing process sends some or all of its system calls back to a home environment for execution. Unfortunately, hand-coding split execution systems for experimentation and research is difficult and error-prone. We have built a tool, Bypass, for quickly producing portable and correct split execution systems for unmodified legacy applications. We demonstrate Bypass by using it to transparently connect a POSIX application to a simple data staging system based on the Globus toolkit.
1. Introduction
The split execution model allows a process running on a foreign machine to behave as if it were running on its home machine. Split execution generally involves three software components: an application, an agent, and a shadow. Figure 1 shows these components.
[Figure 1: application, agent, and shadow; diagram labels include Kernel, Local System Calls, and Trapped System Calls] ...
CRAK: Linux Checkpoint/Restart As a Kernel Module
2001
Cited by 24 (1 self)
Process checkpoint/restart is a very useful technology for process migration, load balancing, crash recovery, rollback transaction, job controlling and many other purposes. Although process migration has not yet been widely used and is not widely available in commercial systems, the growing shift of computing facilities from supercomputers to networked workstations and distributed systems is increasing the importance and demand for migration technologies. In this paper, we describe the design and implementation of CRAK, an innovative transparent checkpoint/restart package for Linux. CRAK provides transparent migration of Linux networked applications and computing environments without modifying, recompiling, or relinking applications or the operating system. CRAK is the first system for Unix/Linux that provides transparent checkpoint/restart with the following properties: (1) it does not require any modifications of existing operating system or application code and (2) it supports migrating network sockets. Prototype implementations are available for Linux 2.2 and Linux 2.4 kernels.
A Cost-Benefit Framework for Online Management of a Metacomputing System
The International Journal for Decision Support Systems, Elsevier Science
Cited by 21 (0 self)
Managing a large collection of networked machines, with a series of incoming jobs, requires that the jobs be assigned to machines wisely. A new approach to this problem is presented, inspired by economic principles: the Cost-Benefit Framework. This framework simplifies complex assignment and admission control decisions, and performs well in practice. We demonstrate this framework in two different environments: an Internet-wide market for computational services and the classic network of workstations.
1.1 Keywords
Networks, resource allocation, metacomputing, markets.
2. INTRODUCTION
Collections of networked machines are common in the modern world. Using each individual machine as a completely independent computer is obviously inefficient -- one machine could be working on a dozen jobs while the others sit idle. A metacomputing system is a set of networked machines that can pool their computational resources to avoid this problem. Each machine has several computational resources ass...

